Hadoop Developer Resume
Foster City, CA
PROFESSIONAL SUMMARY:
- 7+ years of experience across the full life cycle of Hadoop and Java application development, including requirements analysis and design, development, implementation, support, maintenance, and enhancements.
- 5+ years of comprehensive experience as a Hadoop Developer.
- Hands-on experience in installing, configuring, and using ecosystem components such as Hadoop MapReduce, YARN, HDFS, HBase, ZooKeeper, Oozie, Hive, Cassandra, Sqoop, Pig, and Flume.
- In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, NodeManager, ResourceManager, and MapReduce.
- Experience writing MapReduce programs on Apache Hadoop to work with Big Data.
- Experience in working with stream processing systems like Apache Spark.
- Organizing data into tables, performing transformations, and simplifying complex queries with Hive.
- Performing real-time interactive analyses on massive data sets stored in HDFS or HBase using SQL with Impala.
- Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig.
- Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Worked on messaging systems such as Kafka along with Spark stream processing to perform real-time analysis.
- Integrated Apache Ignite with Kafka for highly scalable and reliable data processing.
- Strong understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
- Developed Hive scripts to meet end-user/analyst requirements for ad hoc analysis.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Extracted the data from Teradata into HDFS using Sqoop.
- Created and worked on Sqoop jobs with incremental load to populate Hive External tables.
- Developed UDFs in Java as needed for use in Pig and Hive queries (a sample Hive UDF sketch follows this summary).
- Developed Oozie workflow for scheduling and orchestrating the ETL process.
- Experience with job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience with Jakarta Struts Framework, MVC and J2EE Framework.
- Experience in implementing J2EE Design Patterns.
- Experienced in processing, validating, parsing, and extracting data from XML files using DOM and SAX parsers.
- Hands on experience in IDE tools like Eclipse, Visual Studio.
- Worked extensively in Java, XML, XSL, EJB, JSP, JDBC, MVC, JSTL, Design Patterns and UML.
- Strong object-oriented design experience.
- Hands-on experience in web application development using client-side technologies such as AngularJS and jQuery, as well as HTML, CSS, XML, and JavaScript.
- Experienced in developing web services on Tomcat and JBoss.
- Experience with build and deployment tools such as Ant and Maven 2, along with bug fixing and maintenance.
- Experience in database design using stored procedures, functions, and triggers, and strong experience writing complex queries for DB2 and SQL Server.
- Worked with business users to extract clear requirements to create business value.
- Ability to perform at a high level, meet deadlines, and adapt to ever-changing priorities.
- Excellent problem-solving, analytical, communication, and interpersonal skills.
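Sample code (illustrative Hive UDF): a minimal sketch of the kind of Java UDF referenced above, using the simple Hive UDF API; the class name and normalization rule are assumptions for illustration, not taken from any project below.
```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: normalizes free-text codes to a trimmed, upper-case form
// so that joins and GROUP BY clauses see a consistent key.
public final class NormalizeCode extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null;                // preserve SQL NULL semantics
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```
Packaged into a JAR, a UDF like this would typically be added with ADD JAR and registered via CREATE TEMPORARY FUNCTION before being called from HiveQL.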
TECHNICAL SKILLS:
Hadoop/Big Data Ecosystem: HDFS, MapReduce, YARN, HBase, Pig, Hive, Sqoop, Oozie, Spark, Scala, Ignite, Kafka, ZooKeeper, Impala, Cassandra, Flume
Java Technologies: Core Java, Servlets, JSP, JDBC, Collections
Web Technologies: JSP, JavaScript, AJAX, XML, DHTML, HTML, CSS, SOAP, WSDL, Web Services
Frameworks: Struts 2.x, Hibernate, Spring
Databases: MySQL, Oracle, SQL Server, MS Access, Elasticsearch
IDE: Eclipse, Visual Studio
Logging Tool: Log4j
Build & SCM Tools: Ant, Maven, Gradle, GitHub
Web Application Servers: Oracle Application Server, Apache Tomcat, WebLogic
Version Control Systems: ClearCase, Confidential
Operating Systems: Windows XP/7/8/10, UNIX, Red Hat Linux
Concepts: OOAD, UML, Design Patterns, Waterfall & Agile methodologies
PROFESSIONAL EXPERIENCE:
Confidential, Foster City, CA
Hadoop Developer
Responsibilities:
- Analyzed data on the Hadoop stack using big data tools including Pig, Hive, HBase, and Sqoop.
- Developed MapReduce programs for the extraction, transformation, and aggregation of data from more than 20 sources with multiple file formats, including XML, JSON, CSV, and other compressed formats.
- Imported data daily into HDFS using Sqoop from sources such as CyberSource, Informatica, Salesforce, Teradata, Authorize.net, and Genesys Info Mart.
- Created Oozie workflows for Hadoop based jobs including Sqoop, Hive and Pig.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Installed and configured Hive, Pig, Oozie, and Sqoop on Hadoop cluster.
- Developed simple to complex MapReduce jobs in Java, along with equivalent processing implemented in Hive and Pig.
- Worked as a support team member to improve the fraud-detection system built on Kafka and Spark (a streaming sketch follows this job entry).
- Supported MapReduce programs through cluster monitoring, maintenance, and troubleshooting.
- Worked on the backend using Scala and Spark to implement several aggregation routines.
- Worked on real-time analytics and transactional processing using Ignite integrated with Kafka streams.
- Worked hands-on with the ETL process and was involved in developing the Hive/Impala queries.
- Experience using SequenceFile, RCFile, Avro, and Parquet file formats.
- Worked on implementing cluster coordination services through Zookeeper.
- Designed, developed, unit tested, and supported ETL mappings and scripts for data marts using Talend; checked and fixed delimiters in ASCII files.
- Performed data conversion and data integration, and created source-to-target mappings using Talend.
- Worked with Flume to load log data from multiple sources directly into HDFS.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, and pushed it to HDFS.
- Efficiently handled periodic exporting of SQL data into Elasticsearch.
- Efficiently handled collection of data from various sources and profiled it using analyst tools such as Informatica Data Quality.
- Involved in the design and implementation of the near real-time indexing pipeline, including index management, cluster maintenance, and interaction with Elasticsearch.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Used GitHub as the code repository and Gradle as the build tool.
- Used Hive for data processing and batch data filtering, and Spark for other value-centric data filtering.
Environment: Hadoop, Spark, HDFS, Hive, Pig, HBase, Java, Oozie, Sqoop, Scala, Flume, Impala, ZooKeeper, Ignite, Kafka, MapReduce, Cloudera Manager, Cassandra, Elasticsearch, Talend Big Data Studio, Avro, Parquet, Eclipse, MySQL, Gradle, Teradata, CyberSource, Informatica (IDQ), Salesforce, Authorize.net and Genesys Info Mart.
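Sample code (illustrative Kafka/Spark Streaming job): a condensed sketch of how the Kafka-to-Spark streaming path described above might look in Java with the spark-streaming-kafka-0-10 integration; the broker address, topic name, consumer group, and per-account count are placeholders rather than details of the actual fraud-detection system.
```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import scala.Tuple2;

public final class TransactionStreamJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("TransactionStreamJob");
        // 10-second micro-batches; the interval is arbitrary for this sketch.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");      // placeholder broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "txn-stream");                  // hypothetical consumer group
        kafkaParams.put("auto.offset.reset", "latest");

        Collection<String> topics = Arrays.asList("transactions");  // hypothetical topic name

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

        // Assume comma-separated records with an account id in the first field,
        // and count events per account within each micro-batch.
        stream.mapToPair(record -> new Tuple2<>(record.value().split(",")[0], 1L))
              .reduceByKey((a, b) -> a + b)
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```
In practice the per-batch counts would feed the scoring or alerting step of the fraud-detection pipeline; the print() call here simply stands in for that downstream logic.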
Confidential, RI
Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Involved in installing Hadoop Ecosystem components.
- Managed and reviewed Hadoop log files.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in HDFS maintenance and monitored it through the web UI and the Hadoop Java API.
- Implemented MapReduce jobs and wrote UDFs using the Java API and Pig Latin (a sample Pig UDF follows this job entry).
- Pushed data as delimited files into HDFS using Talend Big Data Studio.
- Worked with different Talend Hadoop components such as Hive and Pig.
- Imported data from MySQL into HDFS using Sqoop on a regular basis.
- Developed ETL jobs in Talend to load data from ASCII and flat files.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
- Created Hive tables and worked on them using HiveQL.
- Worked on filtering raw data using tools like Tableau.
- Responsible for running Hadoop streaming jobs to process CSV data.
- Experience with the Gradle build tool and an understanding of Artifactory and the repository structure.
- Utilized Agile Scrum methodology to help manage and organize a team of four developers, with regular code review sessions.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
Environment: Hadoop, MapReduce, HDFS, Hive, Java, HBase, Pig, Linux, Sqoop, Oozie, Flume, Talend, Tableau, Maven, GitHub, Gradle, XML, MySQL, MySQL Workbench.
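Sample code (illustrative Pig UDF): a small sketch of the kind of Java UDF for Pig mentioned above; the class name and cleansing rule are hypothetical.
```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF: trims and upper-cases a chararray field so that
// downstream joins and group-bys see a consistent key.
public class CleanKey extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;                 // pass nulls through unchanged
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```
In a Pig script the packaged JAR would be added with REGISTER and the function invoked like any built-in on the field being cleansed.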
Confidential, TN
Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Created Hive tables populated from the relevant EDW tables.
- Responsible for managing data coming from different sources.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Supported MapReduce programs running on the cluster.
- Worked extensively on Hive and Pig.
- Wrote Hive queries for data analysis to meet business requirements.
- Involved in creating UDFs where custom functionality was required.
- Wrote MapReduce jobs using the Java API (a minimal example follows this job entry).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Imported data from MySQL into HDFS using Sqoop on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
Environment: Hadoop, MapReduce, HDFS, Eclipse, Omniture, Hive, PIG, HBase, Sqoop, Oozie and SQL.
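Sample code (illustrative MapReduce job): a minimal job in the spirit of those listed above, written against the org.apache.hadoop.mapreduce API; the job name and input assumptions (one record per line, key in the first tab-separated field) are illustrative.
```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCount {

    // Emits (first tab-separated field, 1) for every input line.
    public static class KeyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\t");
            outKey.set(fields[0]);
            context.write(outKey, ONE);
        }
    }

    // Sums the counts emitted for each key.
    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable count : counts) {
                total += count.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-count");
        job.setJarByClass(RecordCount.class);
        job.setMapperClass(KeyMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```
The job would be packaged and submitted with hadoop jar, with the input and output HDFS paths passed as arguments.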
Confidential, CA
Hadoop Developer
Responsibilities:
- Analyzed data on the Hadoop stack using big data tools including Pig, Hive, HBase, and Sqoop.
- Developed MapReduce programs for the extraction, transformation, and aggregation of data from more than 20 sources with multiple file formats, including XML, JSON, CSV, and other compressed formats.
- Imported data daily into HDFS using Sqoop from sources such as CyberSource, Informatica, Salesforce, Teradata, Authorize.net, and Genesys Info Mart.
- Created Oozie workflows for Hadoop based jobs including Sqoop, Hive and Pig.
- Created Hive external tables, loaded data into them, and queried the data using HQL (a Hive JDBC sketch follows this job entry).
- Used Hive to analyze the partitioned and bucketed data and compute various metrics.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Installed and configured Hive, Pig, Oozie, and Sqoop on Hadoop cluster.
- Developed simple to complex MapReduce jobs in Java, along with equivalent processing implemented in Hive and Pig.
- Worked as a support team member to improve the fraud-detection system built on Kafka and Spark.
- Supported MapReduce programs through cluster monitoring, maintenance, and troubleshooting.
- Worked on the backend using Scala and Spark to implement several aggregation routines.
- Worked on real-time analytics and transactional processing using Ignite integrated with Kafka streams.
- Worked hands-on with the ETL process and was involved in developing the Hive/Impala queries.
- Experience using SequenceFile, RCFile, Avro, and Parquet file formats.
- Worked on implementing cluster coordination services through Zookeeper.
- Designed, developed, unit tested, and supported ETL mappings and scripts for data marts using Talend; checked and fixed delimiters in ASCII files.
- Performed data conversion and data integration, and created source-to-target mappings using Talend.
- Worked with Flume to load log data from multiple sources directly into HDFS.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, and pushed it to HDFS.
- Efficiently handled periodic exporting of SQL data into Elasticsearch.
- Efficiently handled collection of data from various sources and profiled it using analyst tools such as Informatica Data Quality.
- Involved in the design and implementation of the near real-time indexing pipeline, including index management, cluster maintenance, and interaction with Elasticsearch.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Used GitHub as the code repository and Gradle as the build tool.
- Used Hive for data processing and batch data filtering, and Spark for other value-centric data filtering.
Environment: Hadoop, Spark, HDFS, Hive, Pig, HBase, Java, Oozie, Sqoop, Scala, Flume, Impala, ZooKeeper, Ignite, Kafka, MapReduce, Cloudera Manager, Cassandra, Elasticsearch, Talend Big Data Studio, Avro, Parquet, Eclipse, MySQL, Gradle, Teradata, CyberSource, Informatica (IDQ), Salesforce, Authorize.net and Genesys Info Mart.
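Sample code (illustrative Hive JDBC query): a small sketch of querying Hive external tables over JDBC from Java, as the HQL bullets above describe; the connection URL, credentials, table, and column names are placeholders.
```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public final class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver; requires hive-jdbc on the classpath.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Host, port, database, and user are placeholders for this sketch.
        String url = "jdbc:hive2://hiveserver2-host:10000/default";

        try (Connection conn = DriverManager.getConnection(url, "hadoop", "");
             Statement stmt = conn.createStatement()) {

            // Hypothetical partitioned external table: daily transaction counts per source system.
            ResultSet rs = stmt.executeQuery(
                    "SELECT source_system, COUNT(*) AS txn_count "
                  + "FROM ext_transactions "
                  + "WHERE load_date = '2016-05-01' "
                  + "GROUP BY source_system");

            while (rs.next()) {
                System.out.println(rs.getString("source_system") + "\t" + rs.getLong("txn_count"));
            }
        }
    }
}
```
This assumes HiveServer2 is running and reachable; the partition filter on load_date mirrors the partitioned-table design described above.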
Confidential, CT
Software Developer
Responsibilities:
- Analyzed data on the Hadoop stack using big data tools including Pig, Hive, HBase, and Sqoop.
- Developed MapReduce programs for the extraction, transformation, and aggregation of data from more than 20 sources with multiple file formats, including XML, JSON, CSV, and other compressed formats.
- Imported data daily into HDFS using Sqoop from sources such as CyberSource, Informatica, Salesforce, Teradata, Authorize.net, and Genesys Info Mart.
- Created Oozie workflows for Hadoop based jobs including Sqoop, Hive and Pig.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Installed and configured Hive, Pig, Oozie, and Sqoop on Hadoop cluster.
- Developed simple to complex MapReduce jobs in Java, along with equivalent processing implemented in Hive and Pig.
- Worked as a support team member to improve the fraud-detection system built on Kafka and Spark.
- Supported MapReduce programs through cluster monitoring, maintenance, and troubleshooting.
- Worked on the backend using Scala and Spark to implement several aggregation routines (a Spark aggregation sketch follows this job entry).
- Worked on real-time analytics and transactional processing using Ignite integrated with Kafka streams.
- Worked hands-on with the ETL process and was involved in developing the Hive/Impala queries.
- Experience using SequenceFile, RCFile, Avro, and Parquet file formats.
- Worked on implementing cluster coordination services through Zookeeper.
- Designed, developed, unit tested, and supported ETL mappings and scripts for data marts using Talend; checked and fixed delimiters in ASCII files.
- Performed data conversion and data integration, and created source-to-target mappings using Talend.
- Worked with Flume to load log data from multiple sources directly into HDFS.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, and pushed it to HDFS.
- Efficiently handled periodic exporting of SQL data into Elasticsearch.
- Efficiently handled collection of data from various sources and profiled it using analyst tools such as Informatica Data Quality.
- Involved in the design and implementation of the near real-time indexing pipeline, including index management, cluster maintenance, and interaction with Elasticsearch.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Used GitHub as the code repository and Gradle as the build tool.
- Used Hive for data processing and batch data filtering, and Spark for other value-centric data filtering.
Environment: Hadoop, Spark, HDFS, Hive, Pig, HBase, Java, Oozie, Sqoop, Scala, Flume, Impala, ZooKeeper, Ignite, Kafka, MapReduce, Cloudera Manager, Cassandra, Elasticsearch, Talend Big Data Studio, Avro, Parquet, Eclipse, MySQL, Gradle, Teradata, CyberSource, Informatica (IDQ), Salesforce, Authorize.net and Genesys Info Mart.
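Sample code (illustrative Spark aggregation): a brief sketch of the kind of Spark aggregation mentioned above, written against the Spark SQL Java API; the HDFS paths and column names are assumptions.
```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public final class OrderAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("OrderAggregation")
                .getOrCreate();

        // Hypothetical input: Parquet files in HDFS with customer_id and amount columns.
        Dataset<Row> orders = spark.read().parquet("hdfs:///data/orders");

        // Filter out non-positive amounts and total the order value per customer.
        Dataset<Row> totals = orders
                .filter(col("amount").gt(0))
                .groupBy("customer_id")
                .agg(sum("amount").alias("total_amount"));

        totals.write().mode("overwrite").parquet("hdfs:///data/order_totals");

        spark.stop();
    }
}
```
Launched with spark-submit, a job of this shape covers the value-centric filtering and aggregation work described in the bullets above; the actual column names and paths would come from the real data layout.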