
Hadoop Developer Resume

Chicago, IL

SUMMARY:

  • 7+ years of IT experience with multinational clients, including 4+ years of recent experience in the Big Data/Hadoop ecosystem.
  • Hands-on experience with Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, Spark, Flume, HBase, Kafka, Oozie and Zookeeper.
  • Excellent knowledge on Hadoop Components such as HDFS, MapReduce and YARN programming paradigm.
  • Experience with installation, configuration, support and management of Big Data workloads and the underlying infrastructure of a Hadoop cluster.
  • Experience in analyzing data using HiveQL, Pig Latin and extending HIVE and PIG core functionality by using custom UDFs.
  • Proficient in Relational Database Management Systems (RDBMS).
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning and compression-related properties in Hive.
  • Good understanding of NoSQL databases and hands on experience in writing applications on NoSQL databases like HBase.
  • Hands on experience in using Amazon Web Services like EC2, EMR, RedShift, DynamoDB and S3.
  • Hands-on experience using Apache Kafka to track data ingestion into the Hadoop cluster and implementing custom Kafka encoders to load custom input formats into Kafka partitions.
  • Experience in Spark Streaming to ingest data from multiple data sources into HDFS.
  • Hands-on experience with stream processing, including Storm and Spark Streaming.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie.
  • Experience in analyzing data using HBase and custom MapReduce programs in Java.
  • Proficient in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Excellent knowledge in data transformations using MapReduce, HIVE and Pig scripts for different file formats.
  • Experience with various scripting languages like Linux/Unix shell scripts, Python.
  • Imported streaming data into HDFS using Flume and analyzed it using Pig and Hive.
  • Experience in using Flume to aggregate log data from web servers and load it into HDFS.
  • Experience in scheduling and monitoring Oozie workflows for parallel execution of jobs.
  • Proficient in Core Java, Servlets, Hibernate, JDBC and Web Services.
  • Experience in all Phases of Software Development Life Cycle (Analysis, Design, Development, Testing and Maintenance) using Waterfall and Agile methodologies.
  • Experience with SequenceFile, Avro and Parquet file formats; managing and reviewing Hadoop log files.
  • Experience in Developing and maintaining applications on the AWS platform.
  • Hands on experience in working with RESTful web services using JAX-RS and SOAP web services using JAX-WS.
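For illustration, the summary's Kafka bullet (custom encoders and loading data into Kafka partitions) boils down to serializing records to bytes and routing records with the same key to the same partition. The sketch below is a minimal pure-Python stand-in, not the actual Kafka client API; the topic's partition count and record fields are hypothetical.

```python
import json
import zlib

NUM_PARTITIONS = 3  # hypothetical topic configuration


def encode(record: dict) -> bytes:
    """Custom encoder: serialize a record to compact JSON bytes,
    the role a Kafka custom encoder/serializer plays for a producer."""
    return json.dumps(record, separators=(",", ":"), sort_keys=True).encode()


def partition_for(key: str) -> int:
    """Route records with the same key to the same partition via a stable hash."""
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


payload = encode({"user": "u1", "event": "login"})
part = partition_for("u1")
```

Because the hash is stable, every event for a given key lands on one partition, which preserves per-key ordering downstream.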

TECHNICAL SKILLS:

Hadoop Ecosystem: Pig, Hive, Sqoop, Flume, HBase, Kafka-Storm, Spark with Scala, Oozie, Zookeeper, Impala, Hadoop Distributions (Cloudera, Hortonworks)

Web Technologies: Ajax, jQuery, HTML, CSS, XML

Programming Languages: Java, Scala, C/C++, Python

Databases: MySQL, MS SQL Server, Oracle 11g, NoSQL (HBase, Cassandra)

Web Services: REST, SOAP, AWS, Microservices

Tools: Ant, Maven, Junit, Apache NiFi, Talend, Airflow

Servers: Apache Tomcat, WebSphere, JBoss

IDEs: MyEclipse, Eclipse, IntelliJ IDEA, NetBeans

Cloud: AWS (EC2, EMR)

ETL/BI Tools: Talend, Tableau, Pig

PROFESSIONAL EXPERIENCE:

Confidential, Chicago, IL

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources and performed transformations using Hive and MapReduce.
  • Involved in setting up Hadoop along with MapReduce, Hive and Pig.
  • Loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Wrote MapReduce programs for refined queries on big data.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Developed simple to complex MapReduce jobs using Hive.
  • Implemented Partitioning and bucketing in Hive.
  • Mentored the analyst and test teams in writing Hive queries.
  • Worked with HiveQL on big data logs to perform trend analysis of user behavior across various online modules.
  • Managed and reviewed Hadoop log files.
  • Extensively used Pig for data cleansing.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
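For illustration, the MapReduce jobs in these bullets follow the classic map/shuffle/reduce pattern. Below is a minimal Hadoop Streaming-style sketch in pure Python; on a real cluster the mapper and reducer would read stdin on separate nodes, and the sample log lines are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit (word, 1) pairs, as a Hadoop Streaming mapper writes to stdout."""
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1


def reducer(pairs):
    """Sum counts per key; Hadoop's shuffle delivers pairs grouped by key,
    which sorting simulates here."""
    for key, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield key, sum(count for _, count in group)


log_lines = ["login Search login", "search checkout login"]  # hypothetical logs
counts = dict(reducer(mapper(log_lines)))
```

The same split lets each stage scale independently: mappers run per input split, reducers run per key range.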

Environment: Hadoop, MapReduce, HDFS, Hive, Java, Scala 2.12.8, Spark 2.1.0, Kafka 1.0.0, SQL, Pig, Sqoop, HBase, Zookeeper, MySQL, DB2, Teradata, AWS, Git, Agile.

Confidential, San Jose, CA

Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop stack and various big data analytics tools, including Pig, Hive, the HBase database and Sqoop.
  • Developed an in-depth understanding of classic MapReduce and YARN architectures.
  • Developed MapReduce programs for refined queries on big data.
  • Created an Azure HDInsight cluster and deployed Hadoop in the cloud platform.
  • Used Hive queries to import data into the Microsoft Azure cloud and analyzed the data using Hive scripts.
  • Used Ambari on the Azure HDInsight cluster to record and manage the NameNode and DataNode logs.
  • Created Hive tables and worked on them for data analysis to meet the requirements.
  • Developed a framework to load and transform large sets of unstructured data from UNIX systems into Hive tables.
  • Worked with business team in creating Hive queries for ad hoc access.
  • Implemented Hive generic UDFs to encapsulate business logic.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Deployed a Cloudera Hadoop cluster on Azure for big data analytics.
  • Analyzed the data by performing Hive queries and running Pig scripts, Spark SQL and Spark Streaming.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Developed a Spark Streaming script that consumes topics from the distributed messaging source Kafka and periodically pushes batches of data to Spark for real-time processing.
  • Involved in creating generic Sqoop import script for loading data into Hive tables from RDBMS.
  • Involved in continuous monitoring of operations using Storm.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Implemented indexing for logs from Oozie to Elastic Search.
  • Implemented MapReduce logic on the Hortonworks distribution (HDP 2.1, HDP 2.2 and HDP 2.3).
  • Designed, developed, unit tested and supported ETL mappings and scripts for data marts using Talend.
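For illustration, the Spark Streaming work described above rests on the micro-batch idea: an unbounded stream is cut into small batches and each batch is aggregated like a per-batch reduceByKey. The sketch below is a simplified pure-Python stand-in, not Spark's API; real Spark slices by time interval rather than record count, and the event names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def micro_batches(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Cut a record stream into fixed-size batches, analogous to
    Spark Streaming's time-sliced micro-batches."""
    batch: List[str] = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch


def process(stream: Iterable[str], batch_size: int) -> List[List[Tuple[str, int]]]:
    """Aggregate event counts within each batch, like a per-batch reduceByKey."""
    return [sorted(Counter(batch).items())
            for batch in micro_batches(stream, batch_size)]


events = ["click", "view", "click", "view", "view"]  # hypothetical Kafka topic
batched = process(events, batch_size=2)
```

Keeping state per batch rather than per record is what makes the throughput of batch processing available at near-real-time latency.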

Environment: Hortonworks, Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Apache Kafka 0.10.0.1, Azure, Apache Storm, Oozie, SQL, Flume, Spark 1.6.1, HBase and GitHub.

Confidential, Omaha, Nebraska

Hadoop Developer

Responsibilities:

  • Developed simple to complex MapReduce jobs using Java language for processing and validating the data.
  • Developed a data pipeline using Sqoop, Spark, MapReduce and Hive to ingest, transform and analyze customer behavioral data.
  • As a Developer, worked directly with business partners discussing the requirements for new projects and enhancements to the existing applications.
  • Wrote extensive shell scripts to run appropriate programs.
  • Wrote multiple queries to pull data from HBase.
  • Reporting on the project based on Agile-Scrum Method. Conducted daily Scrum meetings and updated JIRA with new details.
  • Developed a custom file system plugin for Hadoop so it can access files on the Data Platform; the plugin allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Involved in review of functional and non-functional requirements.
  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Wrote Pig scripts to perform ETL procedures on the data in HDFS.
  • Analyzed the data by performing Hive queries and running Pig scripts and Python scripts.
  • Used Hive to partition and bucket data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Gained hands-on experience with NoSQL databases.
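For illustration, the Hive partitioning and bucketing used in this role maps rows to HDFS directories (one per partition value) and bucket files (by a hash of the clustering key). The sketch below mirrors that layout in pure Python; Hive uses its own hash function and file naming, so the MD5 hash, table name and paths here are illustrative assumptions.

```python
import hashlib


def bucket_for(key: str, num_buckets: int) -> int:
    """Assign a row to a bucket by hashing its clustering key.
    (Hive's actual hash differs; MD5 keeps this sketch deterministic.)"""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_buckets


def hdfs_path(table: str, dt: str, key: str, num_buckets: int = 4) -> str:
    """Build a partition-directory / bucket-file layout like the one
    Hive writes under its warehouse directory on HDFS."""
    return f"/warehouse/{table}/dt={dt}/bucket_{bucket_for(key, num_buckets):05d}"


path = hdfs_path("user_events", "2015-06-01", "user_42")
```

Partition pruning then lets a query on one `dt` value skip every other directory, and bucketing makes joins on the clustering key cheaper because matching rows share a bucket.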

Environment: Java 1.6, Hadoop 2.2.0 (YARN), MapReduce, Hive, Pig, Sqoop, HBase 0.94, Storm 0.9.1, Linux CentOS 6.4, Agile, Maven, Jira, Hortonworks Data Platform (HDP).

Confidential

Java Developer

Responsibilities:

  • Developed JSP, JSF and Servlets to dynamically generate HTML and display the data to the client side.
  • Used the Hibernate framework for persistence to an Oracle database.
  • Wrote and debugged the ANT scripts for building the entire web application.
  • Developed web services in Java using SOAP and WSDL, and used WSDL to publish the services to other applications.
  • Implemented Java Message Services (JMS) using JMS API.
  • Coded Servlets, a SOAP client and Apache CXF REST APIs to deliver data from our application to external and internal consumers.
  • Created a SOAP web service using JAX-WS to enable clients to consume it.
  • Experienced in designing and developing multi-tier scalable applications using Java and J2EE Design Patterns.

Environment: Java, HTML, JavaScript, SQL Server, PL/SQL, JSP, Web Services, SOAP, SOA, JSF, JMS, Oracle, Eclipse, XML, Apache Tomcat.

Confidential

Java Developer

Responsibilities:

  • Involved in the coding of JSP pages for the presentation of data on the View layer in MVC architecture.
  • Used J2EE design patterns like Factory Methods, MVC, and Singleton Pattern that made modules and code more organized, flexible and readable for future upgrades.
  • Worked with JavaScript to perform client-side form validations.
  • Used Struts tag libraries as well as Struts tile framework.
  • Used JDBC with the Oracle thin (Type 4) driver to access the database for application optimization and efficiency.
  • Actively involved in tuning SQL queries for better performance.
  • Worked with XML to store and read exception messages through DOM.
  • Wrote generic functions to call Oracle stored procedures, triggers, functions.

Environment: Core Java, Maven, Oracle, AJAX, JDK, JSP, Eclipse, JavaScript.
