Sr. Hadoop Developer Resume
PROFESSIONAL SUMMARY:
- 12+ years of IT experience with multinational clients, including 4+ years of recent experience in the Big Data/Hadoop ecosystem.
- Hands-on experience working with Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, Spark, Flume, HBase, Kafka, Oozie and ZooKeeper.
- Excellent knowledge of core Hadoop components such as HDFS and YARN and of the MapReduce programming paradigm.
- Experience in installing, configuring, supporting and managing Hadoop clusters and their underlying infrastructure.
- Experience in analyzing data using HiveQL and Pig Latin and in extending Hive and Pig core functionality with custom UDFs.
- Proficient in Relational Database Management Systems (RDBMS).
- Extensive working knowledge of partitioned tables, UDFs, performance tuning and compression-related properties in Hive.
- Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase.
- Hands-on experience with Amazon Web Services such as EC2, EMR, Redshift, DynamoDB and S3.
- Hands-on experience using Apache Kafka to track data ingestion into the Hadoop cluster, including custom Kafka encoders and partitioners that load custom input formats into Kafka partitions (see the sketch at the end of this list).
- Experience in Spark Streaming to ingest data from multiple data sources into HDFS.
- Hands-on experience with stream processing, including Storm and Spark Streaming.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie.
- Experience in analyzing data using HBase and custom MapReduce programs in Java.
- Proficient in importing and exporting data between HDFS and relational database systems using Sqoop.
- Excellent knowledge of data transformations using MapReduce, Hive and Pig scripts across different file formats.
- Experience with scripting languages such as Linux/Unix shell and Python.
- Involved in importing streaming data into HDFS using Flume and analyzing it with Pig and Hive.
- Experience in using Flume to aggregate log data from web servers and land it in HDFS.
- Experience in scheduling and monitoring Oozie workflows for parallel execution of jobs.
- Proficient in Core Java, Servlets, Hibernate, JDBC and Web Services.
- Experience in all Phases of Software Development Life Cycle (Analysis, Design, Development, Testing and Maintenance) using Waterfall and Agile methodologies.
- Experience in using Sequence, Avro and Parquet file formats; managing and reviewing Hadoop log files.
- Experience in developing and maintaining applications on the AWS platform.
- Hands on experience in working with RESTful web services using JAX-RS and SOAP web services using JAX-WS.
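The custom Kafka partitioning mentioned above can be illustrated with a minimal Java sketch; the key format ("deviceId|timestamp"), the class name and the routing rule are assumptions for illustration, not actual project code:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

/** Routes records by the device-id prefix of the key so that all events
 *  for one device land in the same partition and stay ordered. */
public class DeviceIdPartitioner implements Partitioner {

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        // Key format "deviceId|timestamp" is an assumption for this sketch.
        String deviceId = key.toString().split("\\|", 2)[0];
        return Utils.toPositive(Utils.murmur2(
                deviceId.getBytes(StandardCharsets.UTF_8))) % numPartitions;
    }

    @Override public void close() {}

    @Override public void configure(Map<String, ?> configs) {}
}
```

A producer opts in via the standard partitioner.class producer property, which keeps per-device ordering without any routing code in the application itself.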
TECHNICAL SKILLS:
Hadoop Ecosystem: Pig, Hive, Sqoop, Flume, HBase, Kafka, Storm, Spark with Scala, Oozie, ZooKeeper, Impala, Hadoop distributions (Cloudera, Hortonworks)
Web Technologies: Ajax, jQuery, HTML, CSS, XML
Programming Languages: Java, Scala, C/C++, Python
Databases: MySQL, MS SQL Server, Oracle 11g, NoSQL (HBase, Cassandra)
Web Services: REST, SOAP, Microservices
Tools: Ant, Maven, Junit, Apache NiFi, Talend, Airflow
Servers: Apache Tomcat, WebSphere, JBoss
IDEs: MyEclipse, Eclipse, IntelliJ IDEA, NetBeans
AWS: EC2, EMR, S3, Redshift, DynamoDB
ETL/BI Tools: Talend, Tableau, Pig
PROFESSIONAL EXPERIENCE:
Confidential
Sr. Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources and performed transformations using Hive and MapReduce.
- Involved in setting up Hadoop along with MapReduce, Hive and Pig.
- Loaded data into HDFS, extracting it from MySQL using Sqoop.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Wrote MapReduce programs for refined queries on big data.
- Managed and scheduled jobs on a Hadoop cluster.
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Developed simple to complex MapReduce jobs using Hive.
- Implemented Partitioning and bucketing in Hive.
- Mentored the analyst and test teams in writing Hive queries.
- Worked with HiveQL on big data logs to perform trend analysis of user behavior across various online modules.
- Managed and reviewed Hadoop log files.
- Extensively used Pig for data cleansing.
- Developed Pig UDFs to pre-process the data for analysis (see the sketch after this list).
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
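A minimal sketch of the kind of pre-processing Pig UDF referred to above, assuming a single chararray input field; the class name and normalization rule are illustrative:

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

/** Illustrative Pig UDF: trims and lower-cases a chararray field so
 *  records can be grouped and joined consistently downstream. */
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass nulls through untouched
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}
```

Once the jar is registered in the Pig script, the function is applied per field inside a FOREACH ... GENERATE statement.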
Environment: Hadoop, MapReduce, HDFS, Hive, Java, Scala 2.12.8, Spark 2.1.0, Kafka, SQL, Pig, Sqoop, HBase, ZooKeeper, MySQL, DB2, Teradata, AWS, Git, Agile.
Confidential, NYC
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop stack and different big data analytic tools including Pig, Hive, the HBase database and Sqoop.
- In-depth understanding of Classic MapReduce and YARN architectures.
- Developed MapReduce programs for refined queries on big data.
- Created Azure HDInsight and deployed a Hadoop cluster on the cloud platform.
- Used Hive queries to import data into the Microsoft Azure cloud and analyzed the data using Hive scripts.
- Used Ambari in the Azure HDInsight cluster to record and manage the data logs of the NameNode and DataNodes.
- Created Hive tables and worked on them for data analysis to meet the requirements.
- Developed a framework to handle loading and transforming large sets of unstructured data from UNIX systems into Hive tables.
- Worked with business team in creating Hive queries for ad hoc access.
- Implemented Hive generic UDFs to encapsulate business logic.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed Pig UDFs to pre-process the data for analysis.
- Deployed a Cloudera Hadoop cluster on Azure for big data analytics.
- Analyzed the data by performing Hive queries and running Pig scripts, Spark SQL and Spark Streaming.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Developed a Spark Streaming script that consumes topics from the distributed messaging source Kafka and periodically pushes batches of data to Spark for real-time processing (see the sketch after this list).
- Extracted files from Cassandra through Sqoop and placed in HDFS for further processing.
- Involved in creating a generic Sqoop import script for loading data into Hive tables from RDBMS.
- Involved in continuous monitoring of operations using Storm.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Implemented indexing of logs from Oozie into Elasticsearch.
- Implemented MapReduce logic on the Hortonworks distribution (HDP 2.1, HDP 2.2 and HDP 2.3).
- Designed, developed, unit tested and supported ETL mappings and scripts for data marts using Talend.
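A minimal Java sketch of the Kafka-to-Spark pattern referred to above, written against the Spark 1.6 direct-stream API listed in the environment; the broker address, topic name and HDFS path are illustrative assumptions:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

/** Consumes a Kafka topic in micro-batches and lands each batch in HDFS. */
public class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "broker1:9092"); // assumed broker
        Set<String> topics = Collections.singleton("user-events"); // assumed topic

        JavaPairInputDStream<String, String> stream =
                KafkaUtils.createDirectStream(jssc, String.class, String.class,
                        StringDecoder.class, StringDecoder.class,
                        kafkaParams, topics);

        // Write each non-empty micro-batch to HDFS for downstream Hive/Pig analysis.
        stream.foreachRDD(rdd -> {
            if (!rdd.isEmpty()) {
                rdd.values().saveAsTextFile(
                        "hdfs:///data/events/batch-" + System.currentTimeMillis());
            }
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```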
Environment: Hortonworks, Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Apache Kafka, Azure, Apache Storm, Oozie, SQL, Flume, Spark 1.6.1, HBase, Cassandra and GitHub.
Confidential, NYC
Hadoop Developer
Responsibilities:
- Developed simple to complex MapReduce jobs in Java for processing and validating data.
- Developed data pipelines using Sqoop, Spark, MapReduce and Hive to ingest, transform and analyze customer behavioral data.
- As a Developer, worked directly with business partners discussing the requirements for new projects and enhancements to the existing applications.
- Wrote extensive shell scripts to run appropriate programs.
- Wrote multiple queries to pull data from HBase.
- Reporting on the project based on Agile-Scrum Method. Conducted daily Scrum meetings and updated JIRA with new details.
- Developed a custom file system plugin that lets Hadoop MapReduce programs, HBase, Pig and Hive access files on the data platform directly and work unmodified.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Involved in review of functional and non-functional requirements.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing (see the sketch after this list).
- Imported and exported data into HDFS and Hive using Sqoop.
- Wrote Pig Scripts to perform ETL procedures on the data in HDFS.
- Analyzed the data by performing Hive queries and running Pig and Python scripts.
- Used Hive to partition and bucket data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Gained good experience with NoSQL databases.
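The data-cleaning MapReduce step referred to above can be sketched as a map-only Java job; the tab delimiter and expected field count are assumptions for illustration:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Map-only cleaning step: drops malformed rows and counts what was dropped. */
public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 8; // assumed schema width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t", -1);
        if (fields.length != EXPECTED_FIELDS || fields[0].isEmpty()) {
            context.getCounter("clean", "dropped").increment(1);
            return; // skip malformed rows
        }
        context.write(NullWritable.get(), value);
    }
}
```

Setting the number of reduce tasks to zero makes this run as a map-only job, so cleaned records stream straight back to HDFS.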
Environment: Java 1.6, Hadoop 2.2.0 (YARN), MapReduce, Hive, Pig, Sqoop, HBase 0.94, Storm 0.9.1, Linux CentOS 6.4, Agile, Maven, Jira, Hortonworks Data Platform (HDP).
Confidential, NYC
Java Developer
Responsibilities:
- Developed JSPs, JSF and Servlets to dynamically generate HTML and display data on the client side.
- Used the Hibernate framework for persistence to an Oracle database.
- Wrote and debugged ANT scripts for building the entire web application.
- Developed web services in Java with SOAP and WSDL, and used WSDL to publish the services to other applications.
- Implemented Java Message Services (JMS) using JMS API.
- Coded Servlets, SOAP clients and Apache CXF REST APIs to deliver data from our application to external and internal consumers.
- Created SOAP web services using JAX-WS, enabling clients to consume a SOAP web service (see the sketch after this list).
- Experienced in designing and developing multi-tier scalable applications using Java and J2EE Design Patterns.
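A minimal JAX-WS sketch of the kind of SOAP service described above; the service name, operation and endpoint URL are illustrative assumptions:

```java
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

/** Illustrative JAX-WS endpoint; a WSDL is generated automatically
 *  and served at the publish URL with "?wsdl" appended. */
@WebService
public class AccountLookupService {

    @WebMethod
    public String getAccountStatus(String accountId) {
        // A real implementation would consult the persistence layer (Hibernate/Oracle).
        return "ACTIVE";
    }

    public static void main(String[] args) {
        Endpoint.publish("http://localhost:8080/ws/account",
                         new AccountLookupService());
    }
}
```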
Environment: Java, HTML, JavaScript, SQL Server, PL/SQL, JSP, Web Services, SOAP, SOA, JSF, JMS, Oracle, Eclipse, XML, Apache Tomcat.
Confidential, Greenwich, CT
Java Developer
Responsibilities:
- Involved in the coding of JSP pages for the presentation of data on the View layer in MVC architecture.
- Used J2EE design patterns such as Factory Method, MVC and Singleton, which made modules and code more organized, flexible and readable for future upgrades.
- Worked with JavaScript to perform client-side form validations.
- Used Struts tag libraries as well as the Struts Tiles framework.
- Used JDBC to access the database with the Oracle thin driver (a Type 4 driver) for application optimization and efficiency.
- Actively involved in tuning SQL queries for better performance.
- Worked with XML to store and read exception messages through DOM.
- Wrote generic functions to call Oracle stored procedures, triggers and functions (see the sketch below).
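The generic stored-procedure wrappers can be sketched with standard JDBC escape syntax; the procedure name and parameter list are illustrative assumptions:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

/** Illustrative JDBC helper for invoking an Oracle stored procedure. */
public final class OracleProcCaller {

    /** Calls a procedure with one IN and one OUT parameter and returns the OUT value. */
    public static String callGetCustomerName(Connection conn, long customerId)
            throws SQLException {
        CallableStatement cs = null;
        try {
            // {call ...} is the standard JDBC escape for stored procedures.
            cs = conn.prepareCall("{call get_customer_name(?, ?)}");
            cs.setLong(1, customerId);
            cs.registerOutParameter(2, Types.VARCHAR);
            cs.execute();
            return cs.getString(2);
        } finally {
            if (cs != null) cs.close();
        }
    }
}
```

The try/finally form (rather than try-with-resources) matches the pre-Java 7 era of this work.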
Environment: Core Java, Maven, Oracle, AJAX, JDK, JSP, Eclipse, JavaScript.