
Hadoop Developer Resume


Boston, MA

SUMMARY

  • Around 7 years of IT experience, including 2.5 years in Hadoop development, with strong object-oriented programming skills.
  • Good knowledge of Hadoop development and its core components: HDFS, JobTracker, TaskTracker, DataNode, NameNode, and MapReduce concepts.
  • Responsible for writing MapReduce programs (a minimal sketch appears after this summary).
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, Spark, MapReduce, Sqoop, YARN, Oozie, and ZooKeeper.
  • Good exposure to the design, development, and support of Apache Spark, Hadoop, and the big data ecosystem using Apache Spark 1.6 (SQL + DataFrames, Spark Streaming, MLlib, GraphX), IBM InfoSphere BigInsights 4.0, Cloudera CDH 5.5, Hortonworks HDP 2.3, and MapR 5.0.
  • Experience in installing, configuring, managing, supporting, and monitoring Hadoop clusters using distributions such as Hortonworks and Cloudera.
  • Experience in analyzing data using HiveQL, Beeline, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Excellent programming skills with experience in Java, C, and SQL.
  • Experience in using Cloudera Manager for installation and management of single-node and multi-node Hadoop clusters (CDH3, CDH4 & CDH5).
  • Experience with cloud stacks such as Amazon AWS and the VMware stack.
  • Expertise in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Configured and performance-tuned Sqoop jobs for importing raw data from the data warehouse.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase.
  • Good knowledge of querying data from HBase for searching, grouping, and sorting (a query sketch also appears after this summary).
  • Expertise in writing Linux shell scripts, setting up Autosys jobs, and writing Pig scripts, Hive queries, Oozie workflows, and MapReduce programs.
  • Experienced in using development environments and editors such as Eclipse, NetBeans, Kate, and gedit.
  • Migrated data from various databases (e.g., Oracle, DB2, Cassandra, MongoDB) to Hadoop.
  • Migrated RDBMS databases into various NoSQL databases.
  • Experience in designing and coding web applications using Core Java and web technologies: JSP, Servlets, and JDBC.
  • Excellent knowledge of Java and SQL in application development and deployment.
  • Hands-on experience creating database objects such as tables, views, functions, and triggers using SQL.
  • Excellent technical, communication, analytical, and problem-solving skills; works well with people from cross-cultural backgrounds; strong troubleshooting capabilities.
  • Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies, including Waterfall and Agile.
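
To make the MapReduce bullet concrete, below is a minimal sketch of a Java MapReduce job of the kind described above. It is illustrative only: the input format (tab-delimited records whose first column is a user ID), the class names, and the paths are assumptions, not taken from any engagement listed here.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Hypothetical job: counts records per key (e.g., events per user)
    // from tab-delimited input.
    public class EventCount {

        public static class EventMapper
                extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text userId = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length > 0) {
                    userId.set(fields[0]);  // first column assumed to be the key
                    context.write(userId, ONE);
                }
            }
        }

        public static class SumReducer
                extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values,
                    Context context) throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "event count");
            job.setJarByClass(EventCount.class);
            job.setMapperClass(EventMapper.class);
            job.setCombinerClass(SumReducer.class);  // combiner cuts shuffle volume
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Reusing the reducer as a combiner, as here, is the common optimization also reflected in the combiner-logic bullet later in this resume.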
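Likewise, a minimal sketch of the HBase querying mentioned above, scanning a row-key range with the standard HBase 1.x client API; the table name, column family, and key range are placeholders.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical HBase read: scan a row-key range and print one column.
    public class HBaseRangeQuery {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("user_events"))) {
                Scan scan = new Scan();
                scan.setStartRow(Bytes.toBytes("user100"));  // inclusive start key
                scan.setStopRow(Bytes.toBytes("user200"));   // exclusive stop key
                scan.addColumn(Bytes.toBytes("e"), Bytes.toBytes("count"));
                try (ResultScanner scanner = table.getScanner(scan)) {
                    for (Result r : scanner) {
                        System.out.println(Bytes.toString(r.getRow()) + " -> "
                                + Bytes.toString(r.getValue(
                                        Bytes.toBytes("e"), Bytes.toBytes("count"))));
                    }
                }
            }
        }
    }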

TECHNICAL SKILLS

Hadoop Technologies: Apache Hadoop, Cloudera Hadoop Distribution (HDFS and MapReduce)

Hadoop Ecosystem: Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, and Spark.

NoSQL Databases: HBase.

Programming Languages: Java, C, C++, Linux shell scripting.

Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlets, JSP, DOM, XML

Databases: MySQL, Oracle, SQL Server

Software Engineering: UML, Object Oriented Methodologies, Scrum, Agile methodologies

Operating Systems: Linux, Mac OS, Windows.

IDE Tools: Eclipse, Rational Rose.

PROFESSIONAL EXPERIENCE

Hadoop Developer

Confidential, Boston, MA

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in writing MapReduce jobs.
  • Ingested data using Sqoop and the HDFS put/copyFromLocal commands.
  • Used Pig to perform transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data in HDFS.
  • Developed Pig UDFs for functionality not available out of the box in Apache Pig (see the UDF sketch following this list).
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
  • Developed Hive DDLs to create, alter, and drop Hive tables.
  • Developed Hive UDFs for functionality not available out of the box in Apache Hive (a second sketch following this list).
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into HDFS and Pig to pre-process the data.
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data (a Java-API sketch of the same pattern follows this list).
  • Imported data from sources such as HDFS and HBase into Spark RDDs.
  • Used HCatalog to access Hive table metadata from MapReduce and Pig code.
  • Computed metrics that define user experience, revenue, etc., using Java MapReduce.
  • Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS. Designed and implemented various metrics that statistically signify the success of an experiment.
  • Used Eclipse and Ant to build the application.
  • Used Sqoop to import and export data into HDFS and Hive.
  • Processed ingested raw data using MapReduce, Apache Pig, and Hive.
  • Developed Pig scripts for change data capture and delta record processing between newly arrived data and data already in HDFS.
  • Pivoted HDFS data from rows to columns and from columns to rows.
  • Exported processed data from Hadoop to relational databases and external file systems using Sqoop and the HDFS get/copyToLocal commands.
  • Developed shell scripts to orchestrate the execution of all other scripts (Pig, Hive, MapReduce) and to move data files within and outside of HDFS.
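
The Pig UDF work above typically amounts to the pattern below: a small Java class extending EvalFunc. This one is a hypothetical field normalizer; the class name and logic are illustrative only.

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical Pig UDF: normalizes a raw string field (trim + lower-case).
    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;  // propagate nulls rather than failing the task
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }

In a Pig script, the jar would be registered with REGISTER and the function invoked inside a FOREACH ... GENERATE.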
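A comparable sketch for the Hive UDF bullet, using the classic org.apache.hadoop.hive.ql.exec.UDF base class from that Hive generation; the masking behavior is a made-up example.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical Hive UDF: masks all but the last four characters of an ID.
    public final class MaskId extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString();
            if (s.length() <= 4) {
                return input;  // too short to mask meaningfully
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - 4; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - 4));
            return new Text(masked.toString());
        }
    }

Once packaged in a jar, it would be exposed via CREATE TEMPORARY FUNCTION and called like a built-in inside HiveQL.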
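The Spark bullets cite Scala; purely for illustration (and to keep one language across these sketches), here is the same load-and-aggregate pattern in Spark's 1.6-era Java API. The HDFS paths and field positions are assumptions.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    import scala.Tuple2;

    // Hypothetical Spark 1.6-era job: load raw events from HDFS into an RDD
    // and count events per user, mirroring the Hive/Pig aggregations above.
    public class SparkEventCounts {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("event-counts");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Placeholder path for the ingested data set on HDFS.
            JavaRDD<String> lines = sc.textFile("hdfs:///data/events/raw");

            JavaPairRDD<String, Long> counts = lines
                    .mapToPair(line -> new Tuple2<>(line.split("\t")[0], 1L))
                    .reduceByKey((a, b) -> a + b);

            counts.saveAsTextFile("hdfs:///data/events/counts");
            sc.stop();
        }
    }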

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Cloudera HDFS, Eclipse, Spark Streaming/SQL.

Hadoop Developer

Confidential, Rochester, MN

Responsibilities:

  • Coordinated with business customers to gather business requirements.
  • Imported and exported data between databases and HDFS using Sqoop.
  • Responsible for managing data coming from different sources.
  • Worked on analyzing the Hadoop cluster and various big data analytic tools, including Pig, the HBase database, and Sqoop.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Analyzed data using the Hadoop components Hive and Pig.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Gained experience in managing and reviewing Hadoop log files.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Streamed the analyzed data to existing relational databases using Sqoop, making it available for visualization and report generation by the BI team.
  • Created workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Developed Pig Latin scripts for the analysis of semi-structured data.
  • Imported data from MySQL to HDFS on a regular basis using Sqoop (a sketch of this recurring import follows this list).
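
As a sketch of the recurring MySQL-to-HDFS import above: Sqoop 1.4.x exposes a Java entry point (org.apache.sqoop.Sqoop.runTool) that accepts the same arguments as the CLI, so a scheduled import can be wrapped as below. Every connection detail, table, and path here is a placeholder, and the API usage is an assumption about that Sqoop generation.

    import org.apache.sqoop.Sqoop;

    // Hypothetical wrapper around the Sqoop 1.4.x client entry point to run
    // the recurring MySQL-to-HDFS import; all arguments are placeholders.
    public class NightlyImport {
        public static void main(String[] args) {
            String[] importArgs = {
                "import",
                "--connect", "jdbc:mysql://db.example.com/sales",
                "--username", "etl_user",
                "--password-file", "/user/etl/.mysql.password",
                "--table", "orders",
                "--target-dir", "/data/sales/orders/incoming",
                "--num-mappers", "4"
            };
            int exitCode = Sqoop.runTool(importArgs);
            System.exit(exitCode);
        }
    }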

Environment: Hadoop, Hive, MapReduce, Pig, Sqoop.

Hadoop Developer

Confidential, San Diego, CA

Responsibilities:

  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Responsible for installing, configuring, and managing a Hadoop cluster spanning multiple racks.
  • Developed Pig Latin scripts to analyze large data sets in areas where extensive hand-written code needed to be reduced.
  • Used the Sqoop tool to extract data from relational databases into Hadoop.
  • Enhanced performance and optimized code by writing custom comparators and combiner logic.
  • Worked closely with the data warehouse architect and business intelligence analysts to develop solutions.
  • Good understanding of job schedulers such as the Fair Scheduler, which assigns resources so that all jobs get, on average, an equal share of resources over time, along with familiarity with the Capacity Scheduler.
  • Responsible for performing peer code reviews, troubleshooting issues, and maintaining status reports.
  • Created Hive tables, loaded them with data, and wrote Hive queries that invoke and run MapReduce jobs in the backend.
  • Identified possible ways to improve the efficiency of the system. Involved in requirement analysis, design, development, and unit testing using MRUnit and JUnit (see the test sketch following this list).
  • Prepared daily and weekly project status reports and shared them with the client.
  • Supported setting up the QA environment and updated configurations for implementing scripts with Pig, Hive, and Sqoop.
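
A minimal sketch of the MRUnit/JUnit testing mentioned above. It exercises the illustrative EventCount mapper and reducer from the sketch in the summary section, not actual project code; MRUnit's drivers feed a single input and assert the expected output.

    import java.util.Arrays;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
    import org.junit.Before;
    import org.junit.Test;

    // Hypothetical MRUnit tests for a mapper/reducer pair of the kind above.
    public class EventCountTest {
        private MapDriver<LongWritable, Text, Text, LongWritable> mapDriver;
        private ReduceDriver<Text, LongWritable, Text, LongWritable> reduceDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new EventCount.EventMapper());
            reduceDriver = ReduceDriver.newReduceDriver(new EventCount.SumReducer());
        }

        @Test
        public void mapperEmitsUserKey() throws Exception {
            mapDriver.withInput(new LongWritable(0),
                                new Text("user42\tclick\t2015-01-01"))
                     .withOutput(new Text("user42"), new LongWritable(1))
                     .runTest();
        }

        @Test
        public void reducerSumsCounts() throws Exception {
            reduceDriver.withInput(new Text("user42"),
                                   Arrays.asList(new LongWritable(1),
                                                 new LongWritable(1)))
                        .withOutput(new Text("user42"), new LongWritable(2))
                        .runTest();
        }
    }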

Environment: Apache Hadoop, Java (JDK 1.7), Oracle, MySQL, Hive, Pig, Sqoop, Linux, CentOS, JUnit, MRUnit.

Java Developer

Confidential

Responsibilities:

  • Involved in analysis, design, implementation, and testing of the project.
  • Implemented the presentation layer with HTML, XHTML, JavaScript, and CSS.
  • Developed web components using JSP, Servlets, and JDBC (a minimal servlet sketch follows this list).
  • Implemented the database using MySQL.
  • Fixed defects and performed unit testing with JUnit test cases.
  • Developed user and technical documentation.
  • Made extensive use of the Java Naming and Directory Interface (JNDI) for looking up enterprise beans.
  • Developed the presentation layer using HTML, CSS, and JavaScript.
  • Developed stored procedures and triggers in PL/SQL.
  • Designed the database; wrote stored procedures and triggers; wrote session and entity beans, a JMS client, and message-driven beans to receive and process JMS messages; and built JSPs and Servlets using MVC architecture.
  • Deployed the application on WebLogic Server.
  • Responsible for parsing XML data using an XML parser, testing, bug fixing, and code modifications.
  • Wrote JUnit test cases and suites using the Eclipse IDE.
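
To illustrate the Servlet/JDBC pattern above, a minimal hypothetical servlet that looks up one record over JDBC. The URL, credentials, and schema are placeholders; a production deployment would obtain a pooled DataSource via JNDI, consistent with the JNDI bullet above.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet: looks up a customer name by id over JDBC.
    public class CustomerLookupServlet extends HttpServlet {
        private static final String DB_URL = "jdbc:mysql://localhost:3306/appdb";
        private static final String DB_USER = "app";
        private static final String DB_PASS = "secret";

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            String id = req.getParameter("id");
            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            // Parameterized query avoids SQL injection.
            try (Connection conn = DriverManager.getConnection(DB_URL, DB_USER, DB_PASS);
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT name FROM customers WHERE id = ?")) {
                ps.setString(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    out.println(rs.next() ? rs.getString("name") : "not found");
                }
            } catch (SQLException e) {
                throw new ServletException("lookup failed", e);
            }
        }
    }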

Environment: Java, JSP, Servlets, JDBC, JavaScript, CSS, MySQL, JUnit, Eclipse, Apache Tomcat.
