
Hadoop Developer Resume


Portland, OR

SUMMARY

  • Hadoop Developer with 7+ years of overall IT experience across a variety of industries, including hands-on experience in Big Data technologies.
  • 5 years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, Flume, and HBase).
  • Experience in installing, configuring, and maintaining Hadoop clusters; knowledge of administrative tasks such as installing Hadoop (on Ubuntu) and ecosystem components such as Hive, Pig, and Sqoop.
  • Good knowledge of YARN configuration.
  • Expertise in writing Hadoop jobs to analyze data using HiveQL queries, Pig Latin (a data flow language), and custom MapReduce programs in Java.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Created Hive tables to store data in HDFS and processed the data using HiveQL.
  • Expert in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
  • Good knowledge of creating custom SerDes in Hive.
  • Developed Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT, UNION, and SPLIT to extract data from data files and load it into HDFS.
  • Extended Hive and Pig core functionality by writing custom UDFs (a Hive UDF sketch follows this list).
  • Experience in writing MapReduce programs with Apache Hadoop to process Big Data.
  • Good knowledge of Linux shell scripting and shell commands.
  • Hands-on experience with compression codecs such as Snappy and Gzip.
  • Good understanding of data mining and machine learning techniques.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Hands-on experience in configuring and working with Flume to load data from multiple sources directly into HDFS.
  • In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • Extensive experience with SQL, PL/SQL, and database concepts.
  • Used HBase alongside Pig and Hive where real-time, low-latency queries were required.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie (for Hive and Pig jobs) and ZooKeeper (for HBase).
  • Experience in developing solutions to analyze large data sets efficiently.
  • Good understanding of XML methodologies (XML, XSL, XSD), including Web Services and SOAP.
  • Strong experience as a senior Java Developer in Web/intranet and client/server technologies using Java, J2EE, Servlets, JSP, EJB, and JDBC.
  • Ability to work in high-pressure environments, delivering to and managing stakeholder expectations.
  • Applied structured methods to project scoping and planning, risks, issues, schedules, and deliverables.
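
A minimal sketch of the kind of custom Hive UDF described above, assuming hive-exec is on the classpath; the class name and normalization logic are illustrative, not taken from any specific project.

```java
// Hypothetical Hive UDF: trims and lower-cases a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizeText extends UDF {
    private final Text result = new Text();

    // Hive resolves this evaluate() signature by reflection at query time
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // propagate SQL NULL
        }
        result.set(input.toString().trim().toLowerCase());
        return result;
    }
}
```

Such a UDF would be registered in HiveQL with CREATE TEMPORARY FUNCTION normalize_text AS 'NormalizeText'; before being used in queries.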

TECHNICAL SKILLS

Hadoop Technologies: Apache Hadoop, Cloudera Hadoop Distribution (HDFS and MapReduce)

Hadoop Ecosystem: Hive, Pig, Sqoop, Flume, Zookeeper, Oozie

NoSQL Databases: HBase

Programming Languages: Java, C, C++, Linux shell scripting

Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlets, JSP, DOM, XML

Databases: MySQL, Oracle, SQL Server

Software Engineering: UML, Object Oriented Methodologies, Scrum, Agile methodologies

Operating Systems: Linux, Windows XP/7/8

IDE Tools: Eclipse, Rational Rose

PROFESSIONAL EXPERIENCE

Confidential, Portland, OR

Hadoop Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in writing MapReduce jobs.
  • Used Sqoop and the HDFS put/copyFromLocal commands to ingest data.
  • Used Pig to perform transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data in HDFS.
  • Developed Pig UDFs for functionality not available out of the box in Apache Pig (a Pig UDF sketch follows this list).
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
  • Developed Hive DDL to create, alter, and drop Hive tables.
  • Developed Hive UDFs for functionality not available out of the box in Apache Hive.
  • Used HCatalog to access Hive table metadata from MapReduce and Pig code.
  • Computed metrics that define user experience, revenue, and similar measures using Java MapReduce.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS; designed and implemented metrics that statistically signify the success of an experiment.
  • Used Eclipse and Ant to build the application.
  • Used Sqoop to import and export data between HDFS and Hive.
  • Processed ingested raw data using MapReduce, Apache Pig, and Hive.
  • Developed Pig scripts for change data capture and delta-record processing between newly arrived data and data already in HDFS.
  • Pivoted HDFS data from rows to columns and vice versa.
  • Emitted processed data from Hadoop to relational databases and external file systems using Sqoop and the HDFS get/copyToLocal commands.
  • Developed shell scripts to orchestrate execution of the other scripts (Pig, Hive, MapReduce) and move data files into and out of HDFS.
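
A minimal sketch of a custom Pig UDF of the kind mentioned above, assuming the Pig dependency is on the classpath; the class name and cleanup logic are illustrative.

```java
// Hypothetical Pig EvalFunc: trims and lower-cases a chararray field.
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class CleanField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // Pig treats null as missing data
        }
        return ((String) input.get(0)).trim().toLowerCase();
    }
}
```

After packaging into a jar, such a function would be invoked from Pig Latin via REGISTER and a FOREACH ... GENERATE CleanField(field) expression.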

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Cloudera HDFS, Eclipse.

Confidential, Minneapolis, MN

Hadoop Developer

Responsibilities:

  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Responsible for installing, configuring, and managing a Hadoop cluster spanning multiple racks.
  • Developed Pig Latin scripts to analyze large data sets while reducing the amount of custom code required.
  • Used Sqoop to extract data from relational databases into Hadoop.
  • Involved in performance enhancement and optimization of the code by writing custom comparators and combiner logic.
  • Worked closely with the data warehouse architect and business intelligence analysts to develop solutions.
  • Good understanding of job schedulers such as the Fair Scheduler, which assigns resources so that all jobs receive, on average, an equal share over time, and familiarity with the Capacity Scheduler.
  • Responsible for performing peer code reviews, troubleshooting issues, and maintaining status reports.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries, which run as MapReduce jobs behind the scenes.
  • Involved in identifying possible ways to improve the efficiency of the system, and in requirements analysis, design, development, and unit testing with MRUnit and JUnit (a test sketch follows this list).
  • Prepared daily and weekly project status reports and shared them with the client.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
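
A minimal sketch of unit-testing a mapper with MRUnit, as mentioned above; the WordCountMapper class is a hypothetical stand-in for the project's actual mappers.

```java
// Hypothetical MRUnit test for a simple tokenizing mapper.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class WordCountMapperTest {

    // Simple tokenizing mapper used here as the unit under test
    public static class WordCountMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }

    @Test
    public void emitsOneCountPerToken() throws IOException {
        MapDriver.newMapDriver(new WordCountMapper())
                .withInput(new LongWritable(0), new Text("hadoop hive"))
                .withOutput(new Text("hadoop"), new IntWritable(1))
                .withOutput(new Text("hive"), new IntWritable(1))
                .runTest(); // fails if actual output differs from expected
    }
}
```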

Environment: Apache Hadoop, Java (JDK 1.6), Oracle, MySQL, Hive, Pig, Sqoop, Linux, CentOS, JUnit, MRUnit

Confidential, Phoenix, AZ

Hadoop Developer

Responsibilities:

  • Experience in administering, installing, upgrading, and managing CDH3, Pig, Hive, and HBase.
  • Architected and implemented the product platform as well as all data transfer, storage, and processing between the data center and Hadoop file systems.
  • Experienced in defining job flows.
  • Implemented a CDH3 Hadoop cluster on CentOS.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Wrote custom MapReduce programs in Java for data processing.
  • Imported and exported data into HDFS and Hive using Sqoop, and used Flume to extract data from multiple sources.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system into HDFS.
  • Created Hive tables to store data in HDFS, loading data and writing Hive queries that run internally as MapReduce jobs.
  • Used Flume to channel data from different sources into HDFS.
  • Created HBase tables to store variable-format PII data coming from different portfolios (a table-creation sketch follows this list).
  • Implemented best-income logic using Pig scripts and wrote custom Pig UDFs to analyze data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
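
A minimal sketch of creating an HBase table from Java, assuming the classic (pre-1.0) HBase client API of the CDH3 era; the table and column-family names are illustrative, not from the actual project.

```java
// Hypothetical HBase table setup for variable-format PII data.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreatePortfolioTable {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor table = new HTableDescriptor("portfolio_events");
        table.addFamily(new HColumnDescriptor("pii"));  // variable-format PII columns
        table.addFamily(new HColumnDescriptor("meta")); // source/portfolio metadata

        if (!admin.tableExists("portfolio_events")) {
            admin.createTable(table);
        }
        admin.close();
    }
}
```

Because HBase column families are fixed but column qualifiers are not, each portfolio can write its own set of qualifiers under the "pii" family, which is what makes variable data formats workable.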

Environment: Hadoop, MapReduce, Hive, HBase, Flume, Pig, ZooKeeper, Java, ETL, SQL, CentOS, Eclipse.

Confidential, Quitman, TX

Java Developer

Responsibilities:

  • Used CVS for maintaining the source code; designed, developed, and deployed the application on Apache Tomcat Server.
  • Involved in analysis, design, and coding in a J2EE environment.
  • Implemented MVC architecture using Struts, JSP, and EJBs.
  • Worked on Hibernate object/relational mapping according to the database schema.
  • Designed and programmed the presentation layer with HTML, XML, XSL, JSP, JSTL, and Ajax.
  • Designed, developed, and implemented the business logic required for the security presentation controller.
  • Used JSP and Servlet coding in a J2EE environment.
  • Designed XML files to implement most of the wiring needed for Hibernate annotations and Struts configurations.
  • Responsible for developing the forms that contain employee details and for generating the reports and bills.
  • Involved in designing class and dataflow diagrams using UML in Rational Rose.
  • Created and modified stored procedures, functions, triggers, and complex SQL commands using PL/SQL.
  • Involved in designing ERDs (entity-relationship diagrams) for the relational database.
  • Developed shell scripts in UNIX and procedures in SQL and PL/SQL to process data from input files and load it into the database.
  • Used core Java concepts in the application, such as multithreaded programming and synchronization of threads with the wait, notify, and join methods (a threading sketch follows this list).
  • Created cross-browser-compatible and standards-compliant CSS-based page layouts.
  • Performed unit testing on the applications developed.
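
A minimal sketch of the wait/notify/join coordination mentioned above, written in JDK 1.6 style (anonymous inner class rather than a lambda); the hand-off scenario and names are illustrative.

```java
// Hypothetical hand-off between two threads using wait/notify/join.
public class HandoffDemo {
    private static final Object LOCK = new Object();
    private static String message; // shared state, guarded by LOCK

    public static void main(String[] args) throws InterruptedException {
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                synchronized (LOCK) {
                    while (message == null) { // guard against spurious wakeups
                        try {
                            LOCK.wait(); // release LOCK and block until notified
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                    }
                    System.out.println("Received: " + message);
                }
            }
        });
        consumer.start();

        synchronized (LOCK) {
            message = "bill generated";
            LOCK.notify(); // wake the waiting consumer
        }
        consumer.join(); // wait for the consumer thread to finish
    }
}
```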

Environment: Java (JDK 1.6), J2EE, JSP, Servlets, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose, WebLogic Server, Apache Ivy, JUnit, SQL, PL/SQL, CSS, HTML, XML, Eclipse

Confidential, New York, NY

Java Developer

Responsibilities:

  • Designed use cases for different scenarios.
  • Involved in gathering requirements from clients.
  • Designed and developed components for billing application.
  • Developed functional code and met expected requirements.
  • Wrote product technical documentation as necessary.
  • Designed the presentation layer with JSP (dynamic content) and HTML (static pages).
  • Designed the business logic in EJBs and business facades.
  • Used MDBs (JMS) and MQ Series to exchange account information between the current and legacy systems (an MDB sketch follows this list).
  • Attached an SMTP server to the system to handle dynamic e-mail dispatches.
  • Created connection pools and data sources.
  • Involved in enhancements of database tables and procedures.
  • Deployed the application, which uses the J2EE architecture model and the Struts framework, first on WebLogic, and helped migrate it to JBoss Application Server.
  • Participated in code reviews and optimization of code.
  • Followed the change control process using the CVS version manager.
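
A minimal sketch of a message-driven bean in the EJB 2.x style that matches the J2EE/MQ Series setup above; the class name, payload handling, and queue wiring (done in deployment descriptors, omitted here) are illustrative.

```java
// Hypothetical MDB consuming account updates from the legacy system's queue.
import javax.ejb.MessageDrivenBean;
import javax.ejb.MessageDrivenContext;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

public class AccountInfoMDB implements MessageDrivenBean, MessageListener {
    private MessageDrivenContext context;

    public void setMessageDrivenContext(MessageDrivenContext ctx) {
        this.context = ctx;
    }

    public void ejbCreate() {}

    public void ejbRemove() {}

    // Invoked by the container for each message arriving from the legacy system
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String payload = ((TextMessage) message).getText();
                // hand the account record off to the billing business logic
                System.out.println("Account update received: " + payload);
            }
        } catch (Exception e) {
            context.setRollbackOnly(); // let the container redeliver the message
        }
    }
}
```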

Environment: J2EE, JSP, Struts Framework, EJB, JMS, WebLogic Application Server, Tomcat Web Server, PL/SQL, CVS, MS PowerPoint, MS Outlook
