Hadoop Developer Resume

Sacramento, CA

SUMMARY

  • 7+ years of experience across all phases of software development (requirements analysis, design, development, and maintenance), building Hadoop/Big Data applications with Spark, Kafka, EMR, Hive, and Sqoop, and applications in Java and Scala tailored to industry needs.
  • Hands-on experience with Spark Core, Spark SQL, and Spark Streaming.
  • Used Spark SQL to perform transformations and actions on data residing in Hive.
  • Used Kafka & Spark Streaming for real-time processing.
  • Experience using Sqoop to migrate data to and from RDBMS and to load data from unstructured sources into HDFS.
  • Good knowledge of Apache Spark for processing data from RDBMS and from streaming sources with Spark Streaming.
  • Experience in data warehousing and ETL processes, with strong database, SQL, ETL, and data analysis skills.
  • Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the MapReduce programming paradigm.
  • Skilled in writing Spark jobs in Scala to process large sets of structured and semi-structured data and store the results in HDFS.
  • Good knowledge of Spark SQL for loading tables into HDFS and running select queries on them.
  • Experience in writing Hive queries for processing and analyzing large volumes of data.
  • Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
  • Implemented several optimization mechanisms, such as Combiners, Distributed Cache, data compression, and a custom Partitioner, to speed up jobs.

TECHNICAL SKILLS

Big Data/Hadoop: HDFS, Hive, Sqoop, Impala, Kafka, MapReduce, Cloudera, Amazon EMR.

Spark Components: Spark Core, Spark SQL, Spark Streaming.

Programming Languages: SQL, Scala and Java

Databases: MySQL, HiveQL, RDBMS.

Cloud: Amazon EMR, EC2, S3.

Operating Systems: Windows, Unix, Red Hat Linux.

PROFESSIONAL EXPERIENCE

Confidential - Sacramento, CA

Hadoop Developer

Responsibilities:

  • Interacted with multiple teams to understand their business requirements and design flexible, common components.
  • Validated source files for data integrity and data quality by reading header and trailer information and performing column validations.
  • Used Spark SQL to access Hive tables from Spark for faster data processing.
  • Used Hive for transformations, joins, filters, and some pre-aggregations before storing the data.
  • Validated and visualized the data in Tableau.
  • Used Hive extensively to create views for the feature data.
  • Worked closely with the platform and Hadoop teams to meet the team's needs.
  • Used Kafka for data ingestion of different data sets.
  • Imported and exported data into HDFS and assisted in exporting analyzed data to RDBMS using Sqoop.
  • Developed Sqoop jobs to import data from RDBMS and file servers into Hadoop.

Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Spark, Kafka, S3, Sqoop.
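
Illustrative sketch of the header/trailer and column validation mentioned in the responsibilities above. This is not the project's actual code. It assumes a hypothetical pipe-delimited layout in which the first line starts with "H|", the last line is "T|<record count>", and every data row has a fixed number of columns. The class name, method name, and file layout are all invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class HeaderTrailerValidator {

    /**
     * Checks a file that has already been read into lines.
     * Returns an error message, or Optional.empty() if every check passes.
     * The "H|" / "T|<count>" layout is an assumption made for this sketch.
     */
    public static Optional<String> validate(List<String> lines, int expectedColumns) {
        if (lines.size() < 2) {
            return Optional.of("file must contain a header and a trailer");
        }
        String header = lines.get(0);
        String trailer = lines.get(lines.size() - 1);
        if (!header.startsWith("H|")) {
            return Optional.of("missing header record");
        }
        if (!trailer.startsWith("T|")) {
            return Optional.of("missing trailer record");
        }

        // The trailer declares how many data rows the file should contain.
        int declared;
        try {
            declared = Integer.parseInt(trailer.substring(2).trim());
        } catch (NumberFormatException e) {
            return Optional.of("trailer record count is not a number");
        }
        int actual = lines.size() - 2; // everything except header and trailer
        if (declared != actual) {
            return Optional.of("trailer declares " + declared + " rows but file has " + actual);
        }

        // Column validation: every data row must have the expected width.
        for (int i = 1; i < lines.size() - 1; i++) {
            // A limit of -1 keeps trailing empty columns in the count.
            int columns = lines.get(i).split("\\|", -1).length;
            if (columns != expectedColumns) {
                return Optional.of("line " + (i + 1) + " has " + columns
                        + " columns, expected " + expectedColumns);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("H|20240101", "1|alice|CA", "2|bob|OR", "T|2");
        System.out.println(validate(sample, 3).orElse("valid")); // prints "valid"
    }
}
```

Returning the first failure as a message keeps the sketch short. A production job would more likely collect all failures and route rejected files to a quarantine location.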

Confidential - Portland, Oregon

Spark/Hadoop Developer

Responsibilities:

  • Interacted with multiple teams to understand their business requirements and design flexible, common components.
  • Validated source files for data integrity and data quality by reading header and trailer information and performing column validations.
  • Used Spark SQL to create DataFrames and performed transformations such as manually adding a schema, casting, and joining DataFrames before storing them.
  • Used Spark SQL to access Hive tables from Spark for faster data processing.
  • Worked on Spark Streaming with Apache Kafka for real-time data processing.
  • Created Kafka producers and consumers for Spark Streaming.
  • Used Hive for transformations, joins, filters, and some pre-aggregations before storing the data in HDFS.
  • Stored data in three layers: raw, intermediate, and publish.
  • Created external Hive tables to store and query the loaded data.
  • Applied optimization techniques including partitioning and bucketing.
  • Used the Avro file format compressed with Snappy in intermediate tables for faster data processing.
  • Used the Parquet file format for published tables and created views on the tables.
  • Created Sentry policy files to give business users access, through Impala, to the required databases and tables in the dev, UAT, and prod environments.
  • Automated the jobs with Oozie and scheduled them with Autosys.
  • Used AWS to spin up EMR clusters to process large volumes of data stored in S3 and push it to HDFS.
  • Participated in evaluation and selection of new technologies to support system efficiency.
  • Participated in development and execution of system and disaster recovery processes.

Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Spark, Kafka, S3, Sqoop, Java, Scala, Eclipse, Tableau, Maven, SBT.

Confidential - Richmond, VA

Spark/Hadoop Developer

Responsibilities:

  • Interacted with multiple teams to understand their business requirements and design flexible, common components.
  • Validated source files for data integrity and data quality by reading header and trailer information and performing column validations.
  • Used Spark SQL to create DataFrames and performed transformations such as manually adding a schema, casting, and joining DataFrames before storing them.
  • Used Spark SQL to access Hive tables from Spark for faster data processing.
  • Worked on Spark Streaming with Apache Kafka for real-time data processing.
  • Created Kafka producers and consumers for Spark Streaming.
  • Used Hive for transformations, joins, filters, and some pre-aggregations before storing the data in HDFS.
  • Used Sqoop to import and export data between Netezza, Teradata, and HDFS/Hive.
  • Stored data in three layers: raw, intermediate, and publish.
  • Created external Hive tables to store and query the loaded data.
  • Applied optimization techniques including partitioning and bucketing.
  • Used the Avro file format compressed with Snappy in intermediate tables for faster data processing.
  • Used the Parquet file format for published tables and created views on the tables.
  • Created Sentry policy files to give business users access, through Impala, to the required databases and tables in the dev, UAT, and prod environments.
  • Automated the jobs with Oozie and scheduled them with Autosys.
  • Used AWS to spin up EMR clusters to process large volumes of data stored in S3 and push it to HDFS.
  • Participated in evaluation and selection of new technologies to support system efficiency.
  • Participated in development and execution of system and disaster recovery processes.

Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Spark, Kafka, S3, Sqoop, Java, Scala, Eclipse, Tableau, Maven, SBT.

Confidential

Java Developer

Responsibilities:

  • Involved in the complete software development life cycle (SDLC) of the application, from requirement gathering and analysis to testing and maintenance.
  • Developed the modules based on MVC Architecture.
  • Developed the UI using JavaScript, JSP, HTML, and CSS for interactive, cross-browser functionality and a complex user interface.
  • Created business logic using servlets and session beans and deployed them on Apache Tomcat server.
  • Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
  • Prepared the functional, design and test case specifications.
  • Performed unit testing, system testing and integration testing.
  • Developed unit test cases. Used JUnit for unit testing of the application.
  • Provided technical support for production environments: resolved issues, analyzed defects, and proposed and implemented fixes. Resolved high-priority defects on schedule.

Environment: Java, JSP, Servlets, Apache Tomcat, Oracle, SQL

Confidential

Java Developer

Responsibilities:

  • Involved in preparing design, development, and analysis documents and sharing them with clients.
  • Developed web pages using the Struts framework, JSP, XML, JavaScript, Hibernate, Spring, HTML/DHTML, and CSS; configured the Struts application and used its tag library.
  • Developed the application using Spring, Hibernate, and Spring Batch, with SOAP and RESTful web services.
  • Used the Spring Framework at the business tier, along with Spring’s BeanFactory for initializing services.
  • Used AJAX and JavaScript to create an interactive user interface.
  • Implemented client-side validations using JavaScript, along with server-side validations.
  • Developed a single-page application using AngularJS and Backbone.js.
  • Implemented Hibernate to persist data to the database and wrote HQL queries to implement CRUD operations on the data.
  • Developed an API to write XML documents from a database. Utilized XML and XSL Transformation for dynamic web content and database connectivity.
  • Performed database modeling, administration, and development using SQL and PL/SQL in Oracle 11g.
  • Coded deployment descriptors in XML. Deployed the generated JAR files on Apache Tomcat Server.
  • Involved in the development of the presentation layer and GUI framework in JSP. Client-side validations were done using JavaScript.
  • Involved in configuring and deploying the application using WebSphere.
  • Involved in code reviews and mentored the team in resolving issues.
  • Undertook the integration and testing of the various parts of the application.
  • Developed automated Build files using ANT.
  • Used Subversion for version control and log4j for logging errors.
  • Participated in code walkthroughs and worked on test cases and test plans.

Environment: HTML5, JSP, Servlets, JDBC, JavaScript, JSON, Spring, SQL, Oracle 11g, Tomcat 5, Eclipse IDE, XML, XSL, ANT.
