Hadoop Developer Resume Sacramento, CA - Hire IT People

SUMMARY

7+ years of experience in all phases of software application work (requirement analysis, design, development, and maintenance) on Hadoop/Big Data technologies such as Spark, Kafka, EMR, Hive, and Sqoop, and on applications written in Java and Scala tailored to industry needs.
Hands-on experience with Spark Core, Spark SQL, and Spark Streaming.
Used Spark SQL to perform transformations and actions on data residing in Hive.
Used Kafka & Spark Streaming for real-time processing.
Experience with migrating data to and from RDBMS and unstructured sources into HDFS using Sqoop.
Good knowledge of Apache Spark data processing for handling data from RDBMS and streaming sources with Spark Streaming.
Experience in data warehousing and ETL processes, with strong database, SQL, ETL, and data analysis skills.
Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
Skilled in writing Spark jobs in Scala that process large sets of structured and semi-structured data and store the results in HDFS.
Good knowledge of Spark SQL queries to load tables into HDFS and run select queries on top of them.
Experience in writing Hive Queries for processing and analyzing large volumes of data.
Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
Implemented several optimization mechanisms, such as combiners, distributed cache, data compression, and custom partitioners, to speed up jobs.

TECHNICAL SKILLS

Big Data/Hadoop: HDFS, Hive, Sqoop, Impala, Kafka, MapReduce, Cloudera, Amazon EMR.

Spark Components: Spark Core, Spark SQL, Spark Streaming.

Programming Languages: SQL, Scala and Java

Databases: MySQL, HiveQL, RDBMS.

Cloud: Amazon EMR, EC2, S3.

Operating Systems: Windows, Unix, Red Hat Linux.

PROFESSIONAL EXPERIENCE

Confidential - Sacramento, CA

Hadoop Developer

Responsibilities:

Interacted with multiple teams to understand their business requirements and design flexible, common components.
Validated source files for data integrity and data quality by reading header and trailer information and performing column validations.
Implemented Spark SQL to read Hive tables into Spark for faster data processing.
Used Hive for transformations, joins, filters, and some pre-aggregations before storing the data.
Validated and visualized the data in Tableau.
Used Hive extensively to create views for the feature data.
Worked closely with the platform and Hadoop teams to meet the team's needs.
Used Kafka for data ingestion of different data sets.
Imported and exported data to and from HDFS and assisted in exporting analyzed data to RDBMS using Sqoop.
Developed Sqoop jobs to import data from RDBMS and file servers into Hadoop.
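
The header, trailer, and column validation listed above can be illustrated with a minimal sketch. The resume does not describe the actual file layout or tooling, so everything here is an assumption made for illustration: a pipe-delimited file, a header record starting with "H", a trailer record of the form "T|<record count>", and the hypothetical function name validate_feed.

```python
from typing import List


def validate_feed(lines: List[str], expected_columns: int, delimiter: str = "|") -> List[str]:
    """Return a list of problems found; an empty list means the file passed.

    Assumed (hypothetical) layout: first line is a header starting with "H",
    last line is a trailer "T|<record count>", everything between is data.
    """
    if len(lines) < 2:
        return ["file must contain at least a header and a trailer"]

    header, *records, trailer = lines
    problems = []

    # Header check: the first record must be marked as a header.
    if not header.startswith("H" + delimiter):
        problems.append("missing header record")

    # Trailer check: the declared count must match the number of data records.
    trailer_fields = trailer.split(delimiter)
    if trailer_fields[0] != "T" or len(trailer_fields) < 2 or not trailer_fields[1].isdigit():
        problems.append("missing or malformed trailer record")
    elif int(trailer_fields[1]) != len(records):
        problems.append(
            f"trailer count {trailer_fields[1]} does not match {len(records)} data records"
        )

    # Column check: every data record must have the expected number of fields.
    for number, record in enumerate(records, start=1):
        width = len(record.split(delimiter))
        if width != expected_columns:
            problems.append(
                f"record {number}: expected {expected_columns} columns, found {width}"
            )

    return problems
```

For example, validate_feed(["H|sales|20240101", "1|a|10", "2|b|20", "T|2"], 3) returns an empty list, while a file whose trailer count or column widths disagree with the data returns one message per problem.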

Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Spark, Kafka, S3, Sqoop.

Confidential - Portland, Oregon

Spark/Hadoop Developer

Responsibilities:

Interacted with multiple teams to understand their business requirements and design flexible, common components.
Validated source files for data integrity and data quality by reading header and trailer information and performing column validations.
Used Spark SQL to create DataFrames and performed transformations on them, such as manually adding a schema, casting, and joining DataFrames, before storing them.
Implemented Spark SQL to read Hive tables into Spark for faster data processing.
Worked on Spark Streaming with Apache Kafka for real-time data processing.
Created Kafka producers and Kafka consumers for Spark Streaming.
Used Hive for transformations, joins, filters, and some pre-aggregations before storing the data in HDFS.
Worked with three data storage layers: raw, intermediate, and publish.
Created external Hive tables to store and query the loaded data.
Applied optimization techniques including partitioning and bucketing.
Used the Avro file format compressed with Snappy in intermediate tables for faster data processing.
Used the Parquet file format for published tables and created views on the tables.
Created Sentry policy files to give business users access from Impala to the required databases and tables in the dev, UAT, and prod environments.
Automated the jobs with Oozie and scheduled them with Autosys.
Used AWS to spin up EMR clusters to process large volumes of data stored in S3 and push it to HDFS.
Participated in evaluation and selection of new technologies to support system efficiency.
Participated in development and execution of system and disaster recovery processes.

Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Spark, Kafka, S3, Sqoop, Java, Scala, Eclipse, Tableau, Maven, SBT.

Confidential - Richmond, VA

Spark/Hadoop Developer

Responsibilities:

Interacted with multiple teams to understand their business requirements and design flexible, common components.
Validated source files for data integrity and data quality by reading header and trailer information and performing column validations.
Used Spark SQL to create DataFrames and performed transformations on them, such as manually adding a schema, casting, and joining DataFrames, before storing them.
Implemented Spark SQL to read Hive tables into Spark for faster data processing.
Worked on Spark Streaming with Apache Kafka for real-time data processing.
Created Kafka producers and Kafka consumers for Spark Streaming.
Used Hive for transformations, joins, filters, and some pre-aggregations before storing the data in HDFS.
Used Sqoop to import and export data from Netezza and Teradata into HDFS and Hive.
Worked with three data storage layers: raw, intermediate, and publish.
Created external Hive tables to store and query the loaded data.
Applied optimization techniques including partitioning and bucketing.
Used the Avro file format compressed with Snappy in intermediate tables for faster data processing.
Used the Parquet file format for published tables and created views on the tables.
Created Sentry policy files to give business users access from Impala to the required databases and tables in the dev, UAT, and prod environments.
Automated the jobs with Oozie and scheduled them with Autosys.
Used AWS to spin up EMR clusters to process large volumes of data stored in S3 and push it to HDFS.
Participated in evaluation and selection of new technologies to support system efficiency.
Participated in development and execution of system and disaster recovery processes.

Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Spark, Kafka, S3, Sqoop, Java, Scala, Eclipse, Tableau, Maven, SBT.

Confidential

Java Developer

Responsibilities:

Involved in the complete software development life cycle (SDLC) of the application, from requirement gathering and analysis to testing and maintenance.
Developed the modules based on MVC Architecture.
Developed the UI using JavaScript, JSP, HTML, and CSS for interactive cross-browser functionality and a complex user interface.
Created business logic using servlets and session beans and deployed them on Apache Tomcat server.
Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
Prepared the functional, design and test case specifications.
Performed unit testing, system testing and integration testing.
Developed unit test cases. Used JUnit for unit testing of the application.
Provided technical support for production environments by resolving issues, analyzing defects, and providing and implementing fixes. Resolved higher-priority defects as per the schedule.

Environment: Java, JSP, Servlets, Apache Tomcat, Oracle, SQL

Confidential

Java Developer

Responsibilities:

Involved in preparing design, development, and analysis documents and sharing them with clients.
Developed web pages using the Struts framework, JSP, XML, JavaScript, Hibernate, Spring, HTML/DHTML, and CSS; configured the Struts application and used its tag library.
Developed the application using Spring, Hibernate, and Spring Batch, along with SOAP and RESTful web services.
Used the Spring Framework at the business tier, along with Spring’s BeanFactory for initializing services.
Used AJAX and JavaScript to create an interactive user interface.
Implemented client-side validations using JavaScript, along with server-side validations.
Developed a single-page application using AngularJS and Backbone.js.
Implemented Hibernate to persist data to the database and wrote HQL queries to implement CRUD operations on the data.
Developed an API to write XML documents from a database. Utilized XML and XSL Transformation for dynamic web content and database connectivity.
Performed database modeling, administration, and development using SQL and PL/SQL in Oracle 11g.
Coded different deployment descriptors using XML. Deployed the generated JAR files on Apache Tomcat Server.
Involved in the development of the presentation layer and GUI framework in JSP. Client-side validations were done using JavaScript.
Involved in configuring and deploying the application using WebSphere.
Involved in code reviews and mentored the team in resolving issues.
Undertook the integration and testing of the various parts of the application.
Developed automated Build files using ANT.
Used Subversion for version control and log4j for logging errors.
Took part in code walkthroughs and prepared test cases and test plans.

Environment: HTML5, JSP, Servlets, JDBC, JavaScript, JSON, Spring, SQL, Oracle 11g, Tomcat 5, Eclipse IDE, XML, XSL, ANT.
