
Hadoop Developer Resume


New York, NY

SUMMARY:

  • Around 9 years of experience with an emphasis on relational databases and Big Data ecosystem technologies.
  • Strong working experience with the Big Data and Hadoop ecosystems and with Java.
  • Expertise with the tools in the Hadoop ecosystem, including MapReduce, HBase, Hive, Pig, Oozie, Sqoop, and Flume.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce programming.
  • Experience leveraging Hadoop ecosystem components, including Pig and Hive for data analysis, Sqoop for data migration, Oozie for workflow scheduling, and HBase as a NoSQL data store.
  • Proficient in writing MapReduce jobs (a minimal sketch appears after this summary).
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop.
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
  • Experience using Flume to load log data from multiple sources directly into HDFS.
  • Experience supporting data analysts in running Pig and Hive queries.
  • Experience fine-tuning MapReduce jobs for better scalability and performance.
  • Experience working with Cloudera Distributions of Hadoop.
  • Expertise in writing SQLs, PL/SQL procedures in MySQL.
  • Data Design and Development on Microsoft SQL Server.
  • Good Java Development skills using J2EE.
  • Self-motivated, ability to handle multiple tasks, learn and adapt quickly with new technologies.
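For illustration, a minimal word-count-style MapReduce job of the kind referenced above, written against the Hadoop Java API; the class name and input/output paths are placeholders, not taken from any specific project:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        // Emits (token, 1) for every whitespace-separated token in a line.
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (token.isEmpty()) continue;
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }

        // Sums the counts for each token; also reused as a combiner.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }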

TECHNICAL SKILLS:

Big Data Technologies: Hadoop, HDFS, Map Reduce, Pig, Hive, HBase, Sqoop, Flume, Oozie

Operating System: Linux (Ubuntu), Windows XP, UNIX

Languages: C, Java (JDK 1.6), Pig, UNIX shell scripting

Databases: MySQL, Microsoft SQL Server

IDE: Eclipse

PROFESSIONAL EXPERIENCE:

Confidential, New York, NY

Hadoop Developer

Responsibilities:

  • Created NDM jobs on the mainframe to copy the daily SOR files from the mainframe to the edge node (UNIX).
  • Developed DMX-h copy tasks to split the SOR file into header, detail, and trailer records as part of Data Quality processing.
  • Created shell scripts to execute the DMX jobs on UNIX via Autosys.
  • Created JIL scripts to define Autosys jobs that trigger the shell scripts.
  • Worked with Sqoop to fetch data from RDBMS sources into the Hadoop cluster (see the sketch at the end of this section).
  • Built complex DMX tasks using Sort, Aggregate, and Join combinations to achieve MapReduce functionality.
  • Analyzed the existing code and designed the approach for the required DMX tasks, specifying which parts execute on the mapper side and which on the reducer side.
  • Generated partitions when working with MapReduce jobs in DMX.
  • Created lookup tasks instead of joins for better performance when working with smaller files.
  • Parameterized components using shell variables for reusability.
  • Created custom tasks in DMX jobs to execute a shell script that runs SQL.
  • Loaded the final Hadoop file data from each application into Teradata tables using the DMX TPT utility.

Environment: Hadoop, Mainframe, Oracle, Linux, Hive, HDFS, DMX-h, Sqoop, Autosys, Shell
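The Sqoop pulls in this role were driven from shell scripts under Autosys; as a hedged sketch only, the same kind of import can also be launched from Java via Sqoop 1's runTool entry point. The JDBC URL, credentials file, table, and target directory below are hypothetical placeholders:

    import org.apache.sqoop.Sqoop;

    public class SqoopImportRunner {
        public static void main(String[] args) {
            // All connection details below are hypothetical placeholders.
            String[] importArgs = {
                "import",
                "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
                "--username", "etl_user",
                "--password-file", "/user/etl/.db.password", // password kept on HDFS, not inline
                "--table", "SOR_ACCOUNTS",
                "--target-dir", "/data/raw/sor_accounts",
                "--num-mappers", "4"
            };
            // Sqoop translates these arguments into a MapReduce import job.
            System.exit(Sqoop.runTool(importArgs));
        }
    }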

Confidential, Hartford, CT

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Wrote multiple MapReduce programs in Java for data analysis.
  • Wrote MapReduce jobs using Pig Latin and the Java API.
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Developed Pig scripts for analyzing large data sets in HDFS.
  • Collected logs from the physical machines and the OpenStack controller and integrated them into HDFS using Flume.
  • Designed and presented a plan for a POC on Impala.
  • Experienced in migrating HiveQL to Impala to minimize query response time.
  • Knowledge of handling Hive queries using Spark SQL integrated with the Spark environment.
  • Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
  • Responsible for creating Hive tables, loading the structured data produced by MapReduce jobs into those tables, and writing Hive queries to further analyze the logs for issues and behavioral patterns.
  • Worked on SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning for Hive performance and storage improvements.
  • Performed extensive data mining applications using Hive.
  • Implemented daily cron jobs that automate parallel data loads into HDFS using Autosys and Oozie coordinator jobs.
  • Experienced in analyzing and optimizing RDDs by controlling partitioning for the given data.
  • Good understanding of the DAG cycle for the entire Spark application flow via the Spark web UI.
  • Experienced in writing real-time processing jobs using Spark Streaming with Kafka (a sketch appears at the end of this section).
  • Developed custom mappers in Python, plus Hive UDFs and UDAFs, based on the given requirements (a sample UDF also appears at the end of this section).
  • Used HiveQL to analyze partitioned and bucketed data and to compute various metrics for reporting.
  • Experienced in querying data using Spark SQL on top of the Spark engine.
  • Experience managing and monitoring the Hadoop cluster using Cloudera Manager.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.

Environment: CDH, Java (JDK 1.7), Hadoop, MapReduce, HDFS, Hive, Sqoop, Flume, HBase, Cassandra, Pig, Oozie, Kerberos, Scala, Spark, Spark SQL, Spark Streaming, Kafka, Linux, AWS, shell scripting, MySQL, Oracle 11g, PL/SQL, SQL*Plus
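Two brief sketches of the custom code mentioned above. First, a Hive UDF using the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name and its normalization logic are hypothetical examples, not the actual business rules:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: trims and upper-cases a code column before aggregation.
    public final class NormalizeCode extends UDF {
        public Text evaluate(Text input) {
            if (input == null) return null; // Hive passes NULLs through
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Such a UDF is packaged into a JAR and registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.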
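Second, a minimal Spark Streaming consumer over Kafka using the Spark 1.x Java direct-stream API (0.8-style connector), written Java 7-style to match the JDK above; the broker address and topic name are hypothetical:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    import kafka.serializer.StringDecoder;
    import scala.Tuple2;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class KafkaStreamSketch {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("kafka-stream-sketch");
            // 10-second micro-batches
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<String, String>();
            kafkaParams.put("metadata.broker.list", "broker1:9092");           // hypothetical broker
            Set<String> topics = new HashSet<String>(Arrays.asList("events")); // hypothetical topic

            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class,
                    StringDecoder.class, StringDecoder.class, kafkaParams, topics);

            // Extract message payloads and print a sample of each batch.
            JavaDStream<String> messages = stream.map(new Function<Tuple2<String, String>, String>() {
                public String call(Tuple2<String, String> record) { return record._2(); }
            });
            messages.print();

            jssc.start();
            jssc.awaitTermination();
        }
    }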

Confidential

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
  • Set up and benchmarked Hadoop clusters for internal use.
  • Developed simple to complex MapReduce jobs in Java, along with equivalents implemented in Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (see the driver sketch at the end of this section).
  • Analyzed the data by running Hive queries (HiveQL) and Pig scripts (Pig Latin) to study customer behavior; used UDFs to implement business logic in Hadoop.
  • Implemented business logic by writing UDFs in Java and used various UDFs from other sources.
  • Experienced in loading and transforming large sets of structured and semi-structured data.
  • Managed and reviewed Hadoop log files; deployed and maintained the Hadoop cluster.
  • Involved in the implementation of JBoss Fuse ESB 6.1.
  • Consumed REST-based web services.

Environment: Hadoop, Hive, Impala, Java, J2EE, REST services, MapReduce, JBoss Fuse ESB 6.1.
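As a sketch of the compression tuning mentioned in this section, a MapReduce driver can enable codec-based compression for both the intermediate map output (shuffle) and the final HDFS output. SnappyCodec is shown as one common choice, the job name is a placeholder, and the mapper/reducer wiring is omitted:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CompressedJobConfig {
        public static Job newCompressedJob() throws Exception {
            Configuration conf = new Configuration();
            // Compress intermediate map output to cut shuffle I/O.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec",
                    SnappyCodec.class, CompressionCodec.class);

            Job job = Job.getInstance(conf, "compressed-output"); // name is a placeholder
            // Compress the final job output written to HDFS.
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
            return job;
        }
    }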

Confidential

Software Developer

Responsibilities:
  • Worked with the client and business management to gather requirements and understand the functional aspects of the application.
  • Understood the requirements and provided technical expertise to the project team covering database, middleware, and web technologies.
  • Designed complex applications based on business requirements and prepared Business Requirement Documents (BRDs).
  • Developed Technical Specification Documents (TSDs) after design discussions with the development team, including the technical lead, team lead, and project manager.
  • Managed offshore teams based on the project plan and geographic locations.
  • Developed Proofs of Concept (POCs) to evaluate candidate technologies for projects.
  • Used multiple frameworks (MVC, Struts, Spring, Hibernate) in project design.
  • Used multiple middleware technologies (Spring ORM, Hibernate, JPA) to implement security, transaction management, and related processing.
  • Used various design patterns (DAO, DTO) depending on the project design (see the sketch at the end of this section).
  • Worked with multiple databases (MySQL, DB2) to manage project data.
  • Designed databases for complex projects, including triggers, procedures, normalization, constraints, and job schedulers.

Environment: Java, J2EE, MySQL, JDBC, Struts, Hibernate
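A hedged illustration of the DAO/DTO split referenced above; the table, columns, and class names are hypothetical, and try-with-resources is used purely for brevity:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // DTO: a plain carrier for one row of data, with no persistence logic.
    class CustomerDto {
        final long id;
        final String name;
        CustomerDto(long id, String name) { this.id = id; this.name = name; }
    }

    // DAO: hides the JDBC plumbing behind a typed lookup method.
    class CustomerDao {
        private final DataSource dataSource;
        CustomerDao(DataSource dataSource) { this.dataSource = dataSource; }

        CustomerDto findById(long id) throws SQLException {
            String sql = "SELECT id, name FROM customer WHERE id = ?";
            try (Connection c = dataSource.getConnection();
                 PreparedStatement ps = c.prepareStatement(sql)) {
                ps.setLong(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next()
                        ? new CustomerDto(rs.getLong("id"), rs.getString("name"))
                        : null;
                }
            }
        }
    }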

Confidential

Software Developer

Responsibilities:
  • Involved in the design, development, and support phases of the Software Development Life Cycle (SDLC).
  • Reviewed the functional, design, source code, and test specifications.
  • Developed the complete front end using JavaScript and CSS.
  • Authored functional, design, and test specifications.
  • Developed web components using JSP, Servlets, and JDBC.
  • Designed tables and indexes.
  • Designed, implemented, tested, and deployed Enterprise JavaBeans (both session and entity beans) using WebLogic as the application server.
  • Developed stored procedures, packages, and database triggers to enforce data integrity; performed data analysis and created Crystal Reports for user requirements.
  • Implemented the presentation layer with HTML, XHTML, and JavaScript.
  • Implemented the backend, configuration DAO, and XML-generation modules of DIS.
  • Analyzed, designed, and developed the components.
  • Used JDBC for database access (see the sketch at the end of this section).
  • Used the Spring Framework for developing the application and JDBC to map to the Oracle database.
  • Used the Data Transfer Object (DTO) design pattern.
  • Performed unit testing and rigorous integration testing of the whole application.
  • Wrote and executed test scripts using JUnit.
  • Actively involved in system testing.
  • Developed an XML parsing tool for regression testing.
  • Prepared the installation, customer guide, and configuration documents delivered to the customer along with the product.

Environment: Java, JavaScript, HTML, CSS, JDK 1.5.1, JDBC, Oracle 10g, XML, XSL, Solaris, and UML
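A minimal plain-JDBC sketch in the style of the data access described above, kept compatible with the JDK 1.5 toolchain listed; the driver class, connection details, and query are hypothetical placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class OrderLookup {
        public static void main(String[] args) throws Exception {
            // Pre-JDBC 4, the driver must be registered explicitly.
            Class.forName("oracle.jdbc.OracleDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "secret");
            try {
                PreparedStatement ps = conn.prepareStatement(
                        "SELECT order_id, status FROM orders WHERE customer_id = ?");
                ps.setLong(1, 42L);
                ResultSet rs = ps.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getLong(1) + " -> " + rs.getString(2));
                }
                rs.close();
                ps.close();
            } finally {
                conn.close();
            }
        }
    }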
