Hadoop Developer Resume

Fremont, CA

SUMMARY

  • Hadoop Developer with around 8 years of experience in information technology and the Hadoop ecosystem.
  • Expertise in Hadoop ecosystem components - HDFS, MapReduce, Hive, Pig, Sqoop, HBase, Kafka and Samza - for data analytics.
  • Good knowledge of Apache Spark and Spark SQL.
  • Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka; experience streaming data using Apache Flume.
  • Worked with key-value pairs using RDD transformations and actions for sorting, filtering and analyzing big data in PySpark (a representative sketch follows this summary).
  • Experience designing and developing tables in HBase and storing aggregated data from Hive tables. Good knowledge of NoSQL databases - Cassandra, MongoDB and HBase.
  • Experience supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud, including exporting and importing data to S3 and Redshift.
  • Knowledge of the Scala programming language for developing Spark applications.
  • Worked with file formats such as Avro, SequenceFile, Parquet and plain text for both importing into and exporting from HDFS.
  • Deep knowledge of the core concepts of the MapReduce framework and the Hadoop ecosystem.
  • Hands-on experience cleansing semi-structured and unstructured data using Pig Latin scripts.
  • Experience working with BI and data integration tools such as Tableau, QlikView and Informatica.
  • Worked on predictive modeling techniques such as neural networks, decision trees and regression analysis.
  • Experience in handling multiple relational databases: MySQL, SQL Server, PostgreSQL and Oracle
  • Extensive experience developing web applications using the MVC (Model View Controller) architecture with Spring MVC and Struts, along with Java/J2EE technologies such as Servlets, JSP, JDBC, JSTL and Hibernate.
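
The snippet below is a minimal, hypothetical sketch of the kind of key-value RDD work mentioned above, written against Spark's Java API for consistency with the rest of this resume (the project work itself used PySpark); the input layout, comma delimiter and paths are illustrative assumptions only.

    // Hypothetical sketch: key-value RDD transformations (filter, count per key, sort by count).
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class EventCountsByKey {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("EventCountsByKey");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                JavaRDD<String> lines = sc.textFile(args[0]);                     // input path is a placeholder
                JavaPairRDD<String, Long> counts = lines
                        .filter(line -> !line.trim().isEmpty())                   // drop blank records
                        .mapToPair(line -> new Tuple2<>(line.split(",")[0], 1L))  // key on the first column (assumed layout)
                        .reduceByKey(Long::sum);
                counts.mapToPair(Tuple2::swap)                                    // sort by count, descending
                      .sortByKey(false)
                      .mapToPair(Tuple2::swap)
                      .saveAsTextFile(args[1]);                                   // output path is a placeholder
            }
        }
    }

The same filter / mapToPair / reduceByKey / sortByKey chain maps directly onto the equivalent PySpark calls.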

TECHNICAL SKILLS

Big Data Technologies: Hadoop architecture, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, ZooKeeper, Flume, Kafka, Samza, Apache Spark, Spark Streaming, Spark SQL, Spark MLlib

Databases: MySQL, SQL Server, Oracle (PL/SQL), Cassandra, Teradata

Hadoop Distributions: Cloudera, Hortonworks, MapR

BI Tools: Tableau, Informatica

PROFESSIONAL EXPERIENCE

Confidential, Fremont, CA

Hadoop Developer

Responsibilities:

  • Developed efficient MapReduce programs in Java for filtering out unstructured data (a representative sketch follows this list).
  • Imported data from various relational data stores to HDFS using Sqoop
  • Exported business-required information to an RDBMS using Sqoop, making the data available for the BI team to generate reports
  • Responsible for installing and configuring Hadoop MapReduce and HDFS; also developed various MapReduce jobs for data cleaning
  • Installed and configured Hive to create tables for the unstructured data in HDFS
  • Hold strong expertise in major Hadoop ecosystem components including Hive, Pig, HBase, HBase-Hive integration, Sqoop and Flume.
  • Involved in loading data from UNIX file system to HDFS
  • Responsible for managing and scheduling jobs on Hadoop Cluster
  • Responsible for importing and exporting data into HDFS and Hive using Sqoop
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data
  • Experienced in managing Hadoop log files
  • Worked on managing data coming from different sources
  • Wrote HiveQL queries to create tables and loaded data from HDFS to give it structure
  • Loaded and transformed large sets of structured, semi-structured and unstructured data
  • Worked extensively with Hive to transform files from different analytical formats to .txt (text) files, enabling the data to be viewed for further analysis
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
  • Wrote and modified stored procedures to load and modify data as per project requirements
  • Responsible for developing Pig Latin scripts to extract data from web server output files and load it into HDFS
  • Extensively used Flume to collect the log files from the web servers and then integrated these files into HDFS
  • Responsible for configuring schedulers on the JobTracker so that MapReduce jobs make effective use of the resources available in the cluster.
  • Continuously tuned the performance of Hive and Pig queries to make data processing and retrieval more efficient
  • Supported MapReduce programs running on the cluster
  • Created external tables in Hive and loaded the data into these tables
  • Hands on experience in database performance tuning and data modeling
  • Monitored the cluster coordination using ZooKeeper
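
Below is a minimal, hypothetical sketch of the kind of filtering mapper referenced in the first bullet of this list; the pipe delimiter, expected field count and counter names are assumptions for illustration, not the actual project code.

    // Hypothetical sketch: mapper that keeps well-formed records and counts the rest as noise.
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class FilterMalformedRecordsMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final int EXPECTED_FIELDS = 5;   // assumed schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|");  // assumed pipe-delimited input
            // Emit only records with the expected number of fields and a non-empty first field;
            // everything else is treated as unstructured noise and dropped.
            if (fields.length == EXPECTED_FIELDS && !fields[0].trim().isEmpty()) {
                context.write(value, NullWritable.get());
            } else {
                context.getCounter("DataQuality", "MALFORMED").increment(1);
            }
        }
    }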

Environment: Hadoop v1.2.1, HDFS, MapReduce, Hive, Sqoop, Pig, DB2, Oracle, XML, CDH4.x

Confidential, Phoenix, AZ

Hadoop Developer

Responsibilities:

  • Imported large data sets from DB2 into Hive tables using Sqoop.
  • Created Hive managed and external tables as per the requirements
  • Designed and developed tables in HBase and stored aggregated data from Hive
  • Developed Hive scripts for data aggregation and processing as per the use case.
  • Wrote custom Java UDFs for processing data in Hive (a representative sketch follows this list).
  • Developed and maintained workflow scheduling jobs in Oozie for importing data from RDBMS to Hive.
  • The Hive tables were created as managed or external tables per requirement, defined with appropriate static and dynamic partitions for efficiency.
  • Implemented Partitioning, Bucketing in Hive for better organization of the data
  • Optimized Hive queries for performance tuning.
  • Worked with the team on fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
  • Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, tuple stores, NoSQL, Hadoop, Pig, MySQL and Oracle.
  • Experience using Avro, Parquet, RCFile and JSON file formats; developed UDFs for Hive and Pig.
  • Installed the Oozie workflow engine to run several MapReduce jobs.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties and the Thrift server in Hive.
  • Worked with different file formats such as XML, SequenceFile, JSON, CSV and MapFile using MapReduce programs.
  • Continuously monitored and managed Hadoop cluster using Cloudera Manager.
  • Performed POCs using newer technologies such as Spark, Kafka and Scala
  • Worked on the conversion of existing MapReduce batch applications to Spark for better performance.
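
As an illustration of the custom Java UDF work mentioned in this list, here is a minimal, hypothetical Hive UDF; the function name and normalization rule are assumptions for the sketch.

    // Hypothetical sketch: simple Hive UDF that normalizes an id column.
    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    @Description(name = "normalize_id",
                 value = "_FUNC_(str) - trims, upper-cases and strips dashes from an id column")
    public class NormalizeIdUDF extends UDF {
        private final Text result = new Text();

        public Text evaluate(Text input) {
            if (input == null) {
                return null;                       // preserve SQL NULL semantics
            }
            result.set(input.toString().trim().toUpperCase().replace("-", ""));
            return result;
        }
    }

Such a UDF is typically packaged into a jar, added to the session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in HiveQL.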

Environment: Hadoop v2.4.0, HDFS, MapReduce, Core Java, Oozie, Hive, Sqoop, CDH 4.x.x

Confidential, Tampa, Florida

Senior Hadoop Developer

Responsibilities:

  • As this was a ground-up project, we developed the entire application from scratch; I worked mainly on writing the Kafka producer and consumer code as per our requirements (a representative sketch follows this list).
  • After the data is successfully persisted to the Kafka brokers, it is written to a flat file from which we load it into a Hive table.
  • Defined and created the structure of the Hive table on one side and the HBase table on the other.
  • Developed a Spark pipeline to transfer data from the data lake to Cassandra in the cloud, making the data available for the decision engine to publish customized offers in real time.
  • Worked on big data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm and webMethods technologies.
  • Performed complex mathematical, statistical and machine learning analysis using Spark MLlib, Spark Streaming and GraphX. Worked with the Amazon Web Services EC2 console.
  • Developed data pipelines using Flume, Sqoop, Pig, Java MapReduce and Spark to ingest customer behavioral data and purchase histories into HDFS for analysis.
  • Used Storm to consume events coming through Kafka, generate sessions, and publish them back to Kafka.
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark in Scala.
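
A minimal, hypothetical sketch of the Kafka producer/consumer pairing described in the first bullet of this list, written against the current org.apache.kafka.clients API rather than the exact client version used on the project; the topic, broker and group names are placeholders.

    // Hypothetical sketch: produce customer events to a topic and consume them in a polling loop.
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class CustomerEventPipeline {

        static KafkaProducer<String, String> buildProducer() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");          // placeholder broker list
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            return new KafkaProducer<>(props);
        }

        static void produce(KafkaProducer<String, String> producer, String key, String json) {
            // Fire-and-forget send; real code would add callbacks and retries as required.
            producer.send(new ProducerRecord<>("customer-events", key, json));
        }

        static void consumeLoop() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("group.id", "offer-engine");                    // placeholder consumer group
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("customer-events"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // In the pipeline described above, this is where each event would be
                        // written to the flat file / Hive staging area.
                        System.out.printf("%s -> %s%n", record.key(), record.value());
                    }
                }
            }
        }
    }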

Environment: Hadoop v2.6.0, HDFS, CDH 5.3.x, MapReduce, HBase, Sqoop, Core Java, Hive, Oozie DB, Spark Streaming and Apache Kafka

Confidential, Dublin, Ohio

Java Developer

Responsibilities:

  • Involved in the development, testing and maintenance of the application
  • Used the Spring MVC framework to implement the MVC architecture.
  • Developed Stored Procedures, Triggers and Functions in Oracle.
  • Developed Spring services and DAOs, and performed object-relational mapping using Hibernate (a representative sketch follows this list).
  • Involved in understanding the business processes and defining the requirements.
  • Built test cases and performed unit testing.
  • Implemented logging using Log4j.
  • Used CVS for version control.
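
A minimal, hypothetical sketch of the Spring MVC controller and Hibernate-backed DAO layering described in this list; the entity, URL mapping and view names are assumptions, and in practice each class would live in its own file.

    // Hypothetical sketch: @Controller delegating to a Hibernate-backed @Repository DAO.
    import java.util.List;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import org.hibernate.SessionFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.stereotype.Repository;
    import org.springframework.transaction.annotation.Transactional;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    @Entity
    class Customer {                           // assumed domain object
        @Id
        private Long id;
        private String name;
        // getters and setters omitted for brevity
    }

    @Repository
    @Transactional
    class CustomerDao {
        @Autowired
        private SessionFactory sessionFactory;

        @SuppressWarnings("unchecked")
        public List<Customer> findAll() {
            // HQL over the mapped entity; Hibernate handles the object-relational mapping.
            return sessionFactory.getCurrentSession().createQuery("from Customer").list();
        }
    }

    @Controller
    class CustomerController {
        @Autowired
        private CustomerDao customerDao;

        @RequestMapping(value = "/customers", method = RequestMethod.GET)
        public String listCustomers(Model model) {
            model.addAttribute("customers", customerDao.findAll());
            return "customers/list";           // resolved to a JSP view by the configured view resolver
        }
    }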

Environment: Java 7, IntelliJ, Maven, Spring Framework, JavaScript, Oracle SQL Developer

Confidential, Cleveland, Ohio

Java Developer

Responsibilities:

  • Participated in implementation efforts such as coding and unit testing.
  • Implemented a web-based application using Servlets and JSP.
  • Developed custom tags to display dynamic content and to avoid large amounts of Java code in JSP pages (a representative sketch follows this list).
  • Developed code for handling errors using exception handling.
  • Wrote PL/SQL queries, stored procedures and triggers to perform back-end database operations
  • Prepared test case document and performed unit testing and system testing.
  • Followed the algorithms provided by senior database programmers while developing tables and database queries.
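
A minimal, hypothetical example of the kind of custom JSP tag described in this list; the tag class, attribute and date format are assumptions for illustration.

    // Hypothetical sketch: simple tag handler that keeps date formatting out of the JSP page.
    import java.io.IOException;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import javax.servlet.jsp.JspException;
    import javax.servlet.jsp.tagext.SimpleTagSupport;

    public class FormattedDateTag extends SimpleTagSupport {

        private Date value;                      // set from the JSP attribute

        public void setValue(Date value) {
            this.value = value;
        }

        @Override
        public void doTag() throws JspException, IOException {
            String text = (value == null) ? "" : new SimpleDateFormat("MM/dd/yyyy").format(value);
            getJspContext().getOut().write(text);
        }
    }

The tag would be declared in a TLD file and referenced from JSP pages through its taglib prefix.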

Environment: Java/J2EE, Spring, Hibernate, Maven, Jenkins, Excel, Eclipse IDE, Windows
