Hadoop Developer Resume

PA

SUMMARY:

  • 7+ years of professional IT experience, including 2+ years in Big Data ecosystem technologies such as Hadoop, Pig, Hive, Sqoop, Spark and Scala, and 5 years in Java and Oracle PL/SQL development.
  • In-depth knowledge of Hadoop architecture and its components, including HDFS, NameNode, DataNode, JobTracker, ApplicationMaster, ResourceManager, TaskTracker and the MapReduce programming paradigm.
  • Experience in cluster planning, designing, deploying, performance tuning, administering and monitoring the Hadoop ecosystem.
  • Solid experience importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
  • Experience developing MapReduce jobs to process large data sets.
  • Experience working with datasets from multiple disparate data sources and customizing Spark on the Hadoop platform.
  • Experience writing interactive queries on large datasets.
  • Proficient in developing Spark applications to process large data sets for better decision-making.
  • Good understanding of functional programming.
  • Applied partitioning concepts in Spark code.
  • Wrote pattern-matching code in Scala and applied transformations in Scala code.
  • Good knowledge of caching, broadcast variables, accumulators and join strategies in Spark.
  • Good understanding of core Spark and Scala concepts.
  • Used Spark SQL for ad hoc querying of data (see the sketch after this list).
  • Developed and executed shell scripts to automate jobs.
  • Good understanding of cloud configuration in Amazon Web Services (AWS).
  • Involved in analysis, design and testing phases, and responsible for documenting technical specifications.
  • Experience in database design; used PL/SQL to write stored procedures, functions and triggers, with strong experience writing complex queries for Oracle 10g/11g/12c.
  • Proficient in writing SQL, PL/SQL stored procedures, functions, constraints, packages and triggers.
  • Good experience in Hive table design and loading data into Hive tables.
  • Good knowledge of Hadoop cluster architecture and cluster monitoring.
  • Experience with Hadoop shell commands, writing MapReduce programs and verifying Hadoop log files.
  • Exposure to the query programming model of Hadoop.
  • Experience in system study, analysis of business requirements, preparation of technical designs, UTPs and UTCs, coding, unit testing, integration testing, system testing and implementation.
  • Experience in Object-Oriented Analysis and Design (OOAD) and development of software using UML methodology.
  • Hands-on experience with Core Java, including multithreading, concurrency, exception handling, file handling, I/O, generics and the Java Collections framework.
  • Experience working with different SDLC methodologies such as Waterfall and Agile (Scrum).
  • Extensive experience working with Oracle database.
  • Expertise in writing Packages, Stored Procedures, Functions, Views and Database Triggers using SQL and PL/SQL in Oracle.
  • Hands-on experience designing and developing end-user screens and reports using Oracle Developer/2000 (Forms, Reports), Forms and Reports 9i, Oracle Developer Suite 10g and other front-end tools.
  • Worked with query tools like Toad, SQL*Plus, SQL Developer.
  • Manipulated Stored Procedures, Triggers, Views, Functions and Packages using TOAD.
  • Experience in Performance Tuning & Optimization of SQL statements.
  • Good interpersonal and communication skills; an excellent team player and self-starter.
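
To illustrate the Spark and Scala concepts above, a minimal sketch (all names, paths and data are hypothetical) showing pattern matching during parsing, a broadcast lookup and an ad hoc Spark SQL query:

    import org.apache.spark.sql.SparkSession

    object SparkScalaSketch {
      // Hypothetical record type; real schemas were project-specific.
      case class Txn(id: Long, state: String, amount: Double)

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("sketch").getOrCreate()
        import spark.implicits._

        val raw = spark.sparkContext.textFile("hdfs:///data/txns") // assumed path

        // Pattern matching in Scala: parse CSV lines, dropping malformed records.
        val txns = raw.flatMap { line =>
          line.split(",") match {
            case Array(id, state, amount) => Some(Txn(id.toLong, state, amount.toDouble))
            case _                        => None
          }
        }

        // Broadcast a small lookup table to every executor for a map-side join.
        val regionByState = spark.sparkContext.broadcast(Map("PA" -> "East", "CA" -> "West"))
        val totalsByRegion = txns
          .map(t => (regionByState.value.getOrElse(t.state, "Other"), t.amount))
          .reduceByKey(_ + _)
        totalsByRegion.collect().foreach(println)

        // Ad hoc querying with Spark SQL over a registered temporary view.
        txns.toDF().createOrReplaceTempView("txns")
        spark.sql("SELECT state, SUM(amount) AS total FROM txns GROUP BY state").show()

        spark.stop()
      }
    }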

TECHNICAL SKILLS:

Programming: Hadoop, MapReduce, Scala, Spark, HDFS, Hive, Pig, Java, Cassandra, SQL, PL/SQL, Oracle Forms.

Middleware: Apache Tomcat, Maven, Hortonworks Data Platform.

Databases: Oracle 12c/11g/10g, HBase, Cassandra.

Querying/Reporting: PL/SQL, SQL, Forms

Oracle Tools: ANSI SQL, Oracle PL/SQL, SQL*Plus, TOAD, iSQL*Plus, SQL*Loader, Oracle Procedure Builder

Development Tools: Developer 11g/10g/9i, Forms 9i/10g

Operating Systems: UNIX (Sun Solaris/HP-UX/IBM AIX), Red Hat Linux, Oracle Enterprise Linux and Windows

Tools: SQL TRACE, EXPLAIN PLAN, Eclipse, Toad, Forms D2K, HP ALM, CVS.

PROFESSIONAL EXPERIENCE:

Confidential, PA

Hadoop Developer

Responsibilities:

  • Used Agile methodology in developing the application, including iterative development, weekly sprints, stand-up meetings and backlog reporting to the customer.
  • Worked on a live 450-node Hadoop cluster running on the Hortonworks Data Platform.
  • Worked with approximately 2 PB of data (raw data stored with a replication factor of 3).
  • Ingested data from Oracle and Teradata into HDFS using Sqoop.
  • Created and maintained Sqoop jobs with incremental loads to populate Hive tables.
  • Developed aggregations over large flat-file data using MapReduce code.
  • Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
  • Developed Hive scripts for end-user/analyst requirements to perform ad hoc analysis.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance (see the table-design sketch after this list).
  • Solved performance issues in MapReduce jobs.
  • Developed UDFs in Java as needed for use in Pig and Hive queries.
  • Very good experience with both MapReduce 1 (Job Tracker) and MapReduce 2 (YARN).
  • Solved performance issues in Hive and Pig scripts by understanding joins, grouping and aggregation and how they translate to MapReduce jobs.
  • Monitored and supported scheduled Sqoop jobs on the workflow scheduler.
  • Used GitHub to store the code.
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
  • Used ORC and Parquet file formats in Hive.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce and Spark, and loaded data into HDFS.
  • Developed Spark code in Scala for faster testing and processing of data.
  • Migrated MapReduce jobs to Spark RDD transformations using Scala (see the migration sketch after this list).
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Used the DataFrame API in Scala to work with distributed collections of data organized into named columns.
  • Used Spark to improve the performance of existing Hadoop algorithms, working with SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
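
To illustrate the Hive table design above: a minimal sketch of an external table over raw ingested files alongside a managed, partitioned and bucketed ORC table, issued as HiveQL through Spark in Scala. Table names, columns and locations are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HiveTableDesignSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("hive-ddl-sketch")
          .enableHiveSupport() // requires a Hive metastore on the cluster
          .getOrCreate()

        // External table over raw ingested files: Hive manages metadata only,
        // so dropping the table leaves the HDFS data in place.
        spark.sql("""
          CREATE EXTERNAL TABLE IF NOT EXISTS raw_claims (
            claim_id BIGINT, member_id BIGINT, amount DOUBLE)
          PARTITIONED BY (load_date STRING)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
          LOCATION 'hdfs:///data/raw/claims'
        """)

        // Managed table in ORC, partitioned by state and bucketed by member_id,
        // so queries can prune partitions and use bucketed joins.
        spark.sql("""
          CREATE TABLE IF NOT EXISTS claims_curated (
            claim_id BIGINT, member_id BIGINT, amount DOUBLE)
          PARTITIONED BY (state STRING)
          CLUSTERED BY (member_id) INTO 32 BUCKETS
          STORED AS ORC
        """)

        spark.stop()
      }
    }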
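And a sketch of the MapReduce-to-Spark migration pattern referenced above: the mapper's emit becomes a map over an RDD, the reducer becomes reduceByKey, and a DataFrame with named columns is derived for SQL-style work. Field positions and paths are assumptions.

    import org.apache.spark.sql.SparkSession

    object MrToSparkSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("mr-to-spark").getOrCreate()
        import spark.implicits._

        // What a Mapper emitted as (key, 1) and a Reducer summed...
        val lines = spark.sparkContext.textFile("hdfs:///data/events") // assumed path
        val counts = lines
          .map(_.split("\t"))
          .filter(_.length > 2)           // mapper-side filtering
          .map(fields => (fields(1), 1L)) // emit (key, 1)
          .reduceByKey(_ + _)             // reducer-side aggregation

        // ...becomes a named-column DataFrame for further SQL-style work.
        val df = counts.toDF("event_type", "cnt")
        df.orderBy($"cnt".desc).show(10)

        spark.stop()
      }
    }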

Environment: RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Teradata, Oracle SQL, SQL Server, UC4, GitHub, Hortonworks Data Platform distribution, Spark, Scala.

Confidential, Oakland, CA

Hadoop Developer

Responsibilities:

  • Involved in requirement gathering and business analysis, and translated business requirements into technical designs in Hadoop and Big Data.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest behavioral data into HDFS for analysis.
  • Imported and exported data between HDFS and databases using Sqoop.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Set up, monitored and managed the Hadoop cluster using the Hortonworks Data Platform.
  • Developed workflows in Control-M to automate loading data into HDFS and preprocessing it with Pig.
  • Provided cluster coordination services through ZooKeeper.
  • Used Maven extensively to build JAR files of MapReduce programs and deploy them to the cluster.
  • Created a customized BI tool for the management team to perform query analytics using HiveQL.
  • Created partitions and buckets based on state for further processing using bucket-based Hive joins.
  • Developed a suite of unit test cases for Mapper, Reducer and Driver classes using the MRUnit testing library.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig and Sqoop.
  • Designed and implemented a Cassandra NoSQL database that persists high-volume user profile data.
  • Migrated high-volume OLTP transactions from Oracle to Cassandra.
  • Created a data pipeline of MapReduce programs using chained mappers.
  • Implemented optimized joins over different data sets to get the top claims per state using MapReduce.
  • Implemented MapReduce programs to perform map-side joins using the distributed cache in Java (see the sketch after this list).
  • Modeled Hive partitions extensively for data separation and faster processing, and followed Pig and Hive best practices for tuning.
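
A sketch of the map-side join pattern above, written in Scala against the Hadoop MapReduce API: the small data set is loaded from the distributed cache once per task in setup(), and the large data set streams through map() with no reduce phase. File layout and delimiters are assumptions.

    import java.io.File
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Mapper
    import scala.collection.mutable
    import scala.io.Source

    // Map-side join: the small table travels via the distributed cache and is
    // loaded into memory once per task; the large table streams through map().
    class MapSideJoinMapper extends Mapper[LongWritable, Text, Text, Text] {
      type Ctx = Mapper[LongWritable, Text, Text, Text]#Context
      private val lookup = mutable.Map.empty[String, String]

      override def setup(context: Ctx): Unit = {
        // Assumes the driver registered the file with job.addCacheFile(uri)
        // and that it is localized into the task's working directory.
        val cached = context.getCacheFiles()(0)
        val local = new File(new org.apache.hadoop.fs.Path(cached.getPath).getName)
        for (line <- Source.fromFile(local).getLines()) {
          val Array(key, value) = line.split(",", 2) // assumed CSV layout
          lookup(key) = value
        }
      }

      override def map(key: LongWritable, value: Text, context: Ctx): Unit = {
        val fields = value.toString.split(",", 2)
        // Emit only records that match the small side (inner-join semantics).
        lookup.get(fields(0)).foreach { small =>
          context.write(new Text(fields(0)), new Text(s"${fields(1)},$small"))
        }
      }
    }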

Environment: RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Mahout, HBase, Hortonworks Data Platform distribution, Cassandra, UC4.

Confidential, NYC, NY

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources (see the UDF sketch after this list).
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer data and purchase histories into HDFS for analysis.
  • Implemented optimization and performance tuning in Hive and Pig.
  • Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
  • Used Pig as an ETL tool for transformations, event joins, filters and some pre-aggregations before storing the data in HDFS.
  • Optimized MapReduce code and Pig scripts; performed user-interface analysis, performance tuning and analysis.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Loaded the aggregated data onto DB2 for reporting on the dashboard.
  • Implemented Partitioning and bucketing in Hive.
  • Experience in managing and reviewing Hadoop log files.
  • Extensively used Pig for data cleansing.
  • Configured Flume to extract the data from the web server output files to load into HDFS.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Loaded business account/customer details into HDFS from an RDBMS using Sqoop.
  • Created a POC on the existing HDFS data using Apache Spark.
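
Hive UDFs such as those above are plain JVM classes, so the same pattern can be written in Scala. A hypothetical example in the classic org.apache.hadoop.hive.ql.exec.UDF style (matching Hive versions of that era) that normalizes free-text state names:

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical Hive UDF: maps free-text state names to two-letter codes.
    // Hive discovers the evaluate() method by reflection.
    class NormalizeState extends UDF {
      private val codes = Map("pennsylvania" -> "PA", "california" -> "CA", "new york" -> "NY")

      def evaluate(input: Text): Text = {
        if (input == null) return null
        val key = input.toString.trim.toLowerCase
        new Text(codes.getOrElse(key, input.toString.trim))
      }
    }

Packaged into a JAR, such a class would be registered with ADD JAR and CREATE TEMPORARY FUNCTION before being used like a built-in in HiveQL.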

Environment: Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Spark, Scala, Hue, Linux, MapReduce, Cloudera's distribution of Hadoop (CDH3) and Flume.

Confidential

Oracle PL/SQL Developer

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
  • Generated server-side PL/SQL scripts for data manipulation and validation.
  • Worked with the business to automate many daily reports using UNIX shell scripts.
  • Used bulk collections for better performance and easier retrieval of data by reducing context switching between the SQL and PL/SQL engines (see the batching sketch after this list).
  • Coded many generic routines (as functions), which could be called from other procedures.
  • Worked on change requests for the user interface.
  • Developed various reports for end users per their requirements, and created many reports to match the company's preprinted formats.
  • Created user-defined exceptions for exception handling.
  • Wrote stored procedures, functions and database triggers using PL/SQL.
  • Designed, developed and maintained data extraction and transformation processes, ensuring data was properly loaded into and extracted from our systems.
  • Identified and implemented programming enhancements.
  • Wrote unit test cases for the modules to verify functionality against the requirements.
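
The bulk-collection point above is about reducing engine round trips; the closest JVM-side analogue is JDBC fetch batching, sketched below in Scala. The connection string, credentials and table are made up.

    import java.sql.DriverManager

    // Analogue of PL/SQL BULK COLLECT ... LIMIT from the client side: a larger
    // fetch size means fewer round trips to the database engine per result set.
    object BatchFetchSketch {
      def main(args: Array[String]): Unit = {
        val conn = DriverManager.getConnection(
          "jdbc:oracle:thin:@//dbhost:1521/ORCL", "scott", "tiger") // assumed DSN
        try {
          val stmt = conn.createStatement()
          stmt.setFetchSize(500) // fetch 500 rows per round trip, not the default 10
          val rs = stmt.executeQuery("SELECT id, name FROM customers") // assumed table
          while (rs.next()) {
            println(s"${rs.getLong("id")} ${rs.getString("name")}")
          }
        } finally conn.close()
      }
    }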

Environment: Oracle 11g, SQL, PL/SQL, .NET Framework, Visual Studio, Toad, HP ALM.

Confidential

Oracle PL/SQL Developer

Responsibilities:

  • Involved in the SDLC, gathering requirements from end users.
  • Developed views to facilitate easy interface implementation and enforce security on critical customer information.
  • Involved in GUI design using Oracle Developer 10g (Forms 10g and Reports 10g).
  • Developed stored procedures and triggers to facilitate consistent data entry into the database.
  • Wrote stored procedures and functions in PL/SQL for common utilities.
  • Participated in system analysis and data modeling, which included creating tables, views, indexes, synonyms, triggers, functions, procedures, cursors and packages.
  • Created programming code using advanced concepts such as records, collections and dynamic SQL.
  • Developed database triggers for audit and validation purposes.
  • Used PL/SQL to validate data and to populate billing tables.
  • Developed Installation scripts for all the deliverables.
  • Performed functional testing for different Oracle Forms application functionalities.
  • Created and manipulated stored procedures, functions, packages and triggers using TOAD.
  • Wrote complex stored procedures using dynamic SQL to populate temp tables from fact and dimension tables.
  • Involved in migrating the database from Oracle 9i to 10g.
  • Involved in developing screens and generating reports.
  • Developed Forms and Reports.

Environment: Oracle 9i/10g, SQL, PL/SQL, Forms 9i, SQL*Loader, SQL Navigator, Toad, HP ALM.

Confidential

Oracle PL/SQL Developer

Responsibilities:

  • Created and maintained PL/SQL scripts and stored procedures.
  • Coded many generic routines (as functions), which could be called from other procedures.
  • Developed user interface screens, master-detail relations and report screens.
  • Developed various reports for end users per their requirements, and created many reports to match the company's preprinted formats.
  • Created user-defined exceptions for exception handling.
  • Wrote stored procedures, functions and database triggers using PL/SQL.
  • Conducted training sessions for the end-users.
  • Developed the presentation layer using JSP and Servlets.
  • Used JavaScript for client-side validations.
  • Created XML-based configuration and property files for the application.
  • Used JDBC to connect to the database.
  • Designed database tables.
  • Designed, developed and maintained data extraction and transformation processes, ensuring data was properly loaded into and extracted from our systems.
  • Identified and implemented programming enhancements.

Environment: Oracle 9i/10g, SQL*Plus, Forms D2K, SharePoint, CVS, Core Java, Eclipse, Apache Tomcat.
