
Hadoop Developer Resume


Minneapolis, MN

PROFESSIONAL SUMMARY:

  • 7 years of professional IT experience in software development and requirement analysis in Agile environments, including 4+ years of Big Data ecosystem experience in the ingestion, storage, querying, processing, and analysis of Big Data.
  • Experience with Apache Hadoop components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Oozie, Mahout, Spark, Cassandra, and MongoDB, as well as Python.
  • Good understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
  • Experienced in managing NoSQL databases on large Hadoop distributions such as Cloudera, Hortonworks HDP, and MapR M series.
  • Experienced in developing Hadoop integrations for data ingestion, data mapping, and data processing.
  • Worked with various data sources such as flat files and RDBMSs (Teradata, SQL Server 2005, Netezza, and Oracle). Extensive ETL work consisting of data sourcing, mapping, transformation, and conversion.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, Storm, Spark, Kafka, and Flume.
  • Strong understanding of data modeling and experience with data cleansing, data profiling, and data analysis.
  • Designed and implemented Apache Spark Streaming applications using Python (PySpark) and Scala (see the streaming sketch after this list).
  • Experience in ETL (DataStage) analysis, design, development, testing, and implementation, including performance tuning and query optimization of databases.
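As an illustration of the streaming pattern mentioned above, here is a minimal sketch using the Spark 2.x Java API (the original applications were written in PySpark and Scala; the socket source, 10-second batch interval, and word-count logic are placeholder assumptions, not the actual pipeline):

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    import scala.Tuple2;

    // Sketch of a Spark Streaming word count over 10-second micro-batches.
    // The socket source and counting logic are placeholders for a real pipeline.
    public class StreamingWordCount {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("StreamingWordCount");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            JavaDStream<String> lines = jssc.socketTextStream("localhost", 9999);
            JavaPairDStream<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey(Integer::sum);

            counts.print();
            jssc.start();
            jssc.awaitTermination();
        }
    }

A production pipeline would typically substitute a durable source such as Kafka or Flume for the socket stream.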

TECHNICAL SKILLS:

  • Programming Languages: SQL, Java (Core), Python, C++, C
  • Operating Systems: Windows (NT/2000/XP/7/8), Linux, UNIX
  • Databases: Oracle 10g/11g, MS SQL Server 2008, MySQL, HBase (NoSQL), MongoDB (NoSQL)
  • Big Data Ecosystem: Hadoop - HDFS, MapReduce, Apache Pig, Hive, Apache Spark, HBase, Flume, Oozie, MongoDB
  • Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5), Hortonworks, MapR and Apache
  • IDE Tools: Eclipse, NetBeans
  • Web Technologies: ASP.NET, HTML, XML
  • OLAP concepts: Data warehousing
  • Other Technologies: SQL Developer, TOAD

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential, Minneapolis, MN

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with major Hadoop ecosystem components: Hive, Pig, HBase, Sqoop, Flume, Oozie, Spark, and Zookeeper.
  • Used Sqoop to transfer data between RDBMS and HDFS.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume, defining channel selectors to multiplex data into different sinks.
  • Implemented complex MapReduce programs to perform map-side joins using the distributed cache (see the join sketch after this list).
  • Designed and implemented custom Writables, custom InputFormats, custom partitioners, and custom comparators in MapReduce.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Converted existing SQL queries into HiveQL queries.
  • Implemented UDFs, UDAFs, and UDTFs in Java for Hive to handle processing that cannot be done with Hive's built-in functions (see the UDF sketch after this list).
  • Used Oozie to develop automated workflows of Sqoop, MapReduce, and Hive jobs.
  • Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Utilized Agile Scrum methodology to help manage and organize a team of four developers, with regular code review sessions.
  • Held weekly meetings with technical collaborators and actively participated in code reviews with senior and junior developers.
  • Loaded and analyzed Omniture logs generated by different web applications.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data in various formats such as text, zip, XML, and JSON.
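A minimal sketch of the map-side join mentioned above, assuming the small lookup table was attached to the job with job.addCacheFile(new URI("/ref/lookup.txt#lookup.txt")); the file name, tab delimiter, and join key are hypothetical:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Sketch of a map-side join: the lookup table shipped via the distributed
    // cache is loaded into memory in setup() and joined against each input
    // record in map(), so no reduce phase is needed.
    public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> lookup = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            // The cache file is symlinked into the task's working directory.
            try (BufferedReader reader = new BufferedReader(new FileReader("lookup.txt"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split("\t", 2);
                    if (parts.length == 2) {
                        lookup.put(parts[0], parts[1]);
                    }
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", 2);
            if (fields.length < 2) {
                return;
            }
            String matched = lookup.get(fields[0]); // join on the first column
            if (matched != null) {
                context.write(new Text(fields[0]), new Text(fields[1] + "\t" + matched));
            }
        }
    }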
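And a minimal sketch of a Hive UDF of the kind described above; the masking logic and class name are illustrative, not the original business rules:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Sketch of a simple Hive UDF: masks all but the last four characters
    // of a string (e.g., an account number).
    public final class MaskUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString();
            if (s.length() <= 4) {
                return input;
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - 4; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - 4));
            return new Text(masked.toString());
        }
    }

Such a UDF would be registered along the lines of ADD JAR udfs.jar; CREATE TEMPORARY FUNCTION mask AS 'MaskUDF'; and then invoked as SELECT mask(account_no) FROM accounts;.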

Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Java, Spark, Kafka, Flume, Storm, Knox, Linux, Scala, Maven, JavaScript, Oracle 11g/10g, SVN

Big Data Developer

Confidential, Baltimore, MD

Responsibilities:

  • Analyzed requirements and laid out plans to execute tasks.
  • Analyzed files obtained from claims submissions and third-party vendor data dumps using Pig Latin and Hive.
  • Developed Hadoop Streaming MapReduce jobs using Python.
  • Modified DB2 databases per client requests using SQL queries.
  • Wrote optimized SQL queries to extract data from the data warehouse per business user requirements.
  • Used Sqoop to transfer data between MySQL and HDFS in both directions on a regular basis.
  • Developed Pig scripts to implement ETL transformations.
  • Developed dataset-join scripts using Pig Latin join operations.
  • Wrote Pig user-defined functions (UDFs) as needed to carry out ETL tasks (see the sketch after this list).
  • Developed Hive UDFs to incorporate external business logic.
  • Imported bulk data into HBase using MapReduce programs.
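A minimal sketch of a Pig UDF of the kind described above, assuming a chararray field that needs normalizing before joins; the class name and logic are illustrative only:

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Sketch of a Pig UDF: trims and upper-cases a chararray field so that
    // downstream joins and groupings match consistently.
    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

In a Pig script it would be registered with REGISTER udfs.jar; and invoked as B = FOREACH A GENERATE NormalizeField(name);.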

Environment: Hadoop - Pig Latin, Hive, HBase, Apache Spark, MapReduce, SQL Server 2008, SQL Server Management Studio, Tableau 7/8.1, Tableau Server, COBOL, JCL

Hadoop Developer

Confidential, Denver, CO

Responsibilities:

  • Developed MapReduce programs to parse the raw data, populate staging tables, and store the refined data in partitioned tables.
  • Installed and maintained the Cloudera Hadoop distribution.
  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Involved in loading the data from Linux file system to HDFS.
  • Implemented MapReduce programs on log data to transform it into a structured format and extract user information (see the sketch after this list).
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Installed Oozie workflow engine to run multiple MapReduce, Hive and Pig jobs.
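A minimal sketch of the kind of log-parsing mapper described above; the common-log-format regex and the choice of output fields are assumptions, not the original job's actual layout:

    import java.io.IOException;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Sketch of a log-parsing mapper: pulls the host, user, and timestamp out
    // of a common-log-format line and emits user -> (host, timestamp).
    public class LogParseMapper extends Mapper<LongWritable, Text, Text, Text> {
        private static final Pattern LOG_LINE =
                Pattern.compile("^(\\S+) \\S+ (\\S+) \\[([^\\]]+)\\]");

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            Matcher m = LOG_LINE.matcher(value.toString());
            if (m.find()) {
                // key: user field; value: host and timestamp, tab-separated
                context.write(new Text(m.group(2)),
                        new Text(m.group(1) + "\t" + m.group(3)));
            }
        }
    }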

Environment: Hadoop, Cloudera, MapReduce, Hive, Sqoop, Spark, Flume, Talend, Python, MS-SQL Server, Tableau, ETL, NoSQL.

Hadoop Developer

Confidential, Cincinnati, OH

Responsibilities:

  • Worked with the business users to gather, define business requirements and analyze the possible technical solutions.
  • Installed the NameNode, Secondary NameNode, YARN components (ResourceManager, NodeManager, ApplicationMaster), and DataNodes.
  • Installed and configured HDP 2.2.
  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
  • Migrated complex MapReduce programs to in-memory Spark processing using transformations and actions (see the sketch after this list).
  • Good knowledge of Teradata Manager, TDWM, PMON, DBQL, SQL Assistant, and BTEQ.
  • Gathered system design requirements and designed and wrote system specifications.
  • Designed and developed UNIX shell scripts as part of the ETL process to compare control totals and to automate loading, pulling, and pushing data between servers.
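A minimal sketch of the MapReduce-to-Spark migration pattern mentioned above, using the Spark Java API; the input path, filter condition, and output location are placeholders:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    import scala.Tuple2;

    // Sketch of a MapReduce job rewritten as chained Spark transformations
    // plus one action: count ERROR lines per host.
    public class ErrorCountSpark {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("ErrorCountSpark");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                JavaRDD<String> lines = sc.textFile("hdfs:///data/logs");
                JavaPairRDD<String, Integer> errorsByHost = lines
                        .filter(line -> line.contains("ERROR"))                    // transformation
                        .mapToPair(line -> new Tuple2<>(line.split("\\s+")[0], 1)) // transformation
                        .reduceByKey(Integer::sum);                                // transformation
                errorsByHost.saveAsTextFile("hdfs:///data/errors-by-host");        // action
            }
        }
    }

Because transformations are lazy, nothing executes until the saveAsTextFile action runs, which is what lets Spark keep intermediate results in memory instead of writing them to disk between MapReduce stages.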

Environment: Hadoop, HDFS, Hive, Pig, Flume, Sqoop, Spark, MapReduce, Cloudera, Avro, Snappy, Zookeeper, CDH, NoSQL, HBase, Java (JDK 1.6), Eclipse, Python, MySQL.

Java/J2EE Developer

Confidential

Responsibilities:

  • Analyzed project requirements for the product and was involved in design using UML.
  • Interacted with system analysts and business users for design and requirement clarification.
  • Made extensive use of HTML5 with AngularJS, JSTL, JSP, jQuery, and Bootstrap for the presentation layer, along with JavaScript for client-side validation.
  • Implemented Java multithreading in back-end components.
  • Developed HTML reports for various modules as per the requirement.
  • Developed web services using SOAP, WSDL, and Spring MVC within an SOA, and developed DTDs and XSD schemas for XML parsing, processing, and design to communicate with an Active Directory application via a RESTful API.
  • Created multiple RESTful web services using the Jersey 2 framework (see the sketch after this list).
  • Used AquaLogic BPM (Business Process Management) for workflow management.
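A minimal sketch of a JAX-RS resource of the kind described above, as deployed on Jersey 2; the path and the stubbed JSON payload are hypothetical, not the project's actual endpoints:

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Sketch of a JAX-RS resource exposing a single GET endpoint.
    @Path("/accounts")
    public class AccountResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public String getAccount(@PathParam("id") String id) {
            // Real code would delegate to a service/DAO layer; this returns a stub.
            return "{\"id\": \"" + id + "\", \"status\": \"active\"}";
        }
    }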

Environment: Java, J2EE, HTML, CSS, JSP, JavaScript, Bootstrap, AngularJS, Servlets, JDBC, EJB, Java Beans, Hibernate, Spring MVC, Restful, JMS, MQ Series, AJAX, WebSphere Application Server, SOAP, XML, MongoDB, JUnit, Rational Suite, CVS Repository.
