Hadoop Developer Resume
Fremont, CA
SUMMARY
- Hadoop Developer with around 8 years of experience in Information Technology and the Hadoop ecosystem.
- Expertise in Hadoop ecosystem components such as HDFS, MapReduce, Hive, Pig, Sqoop, HBase, Kafka, and Samza for data analytics.
- Good knowledge of Apache Spark and Spark SQL.
- Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka; experience streaming data with Apache Flume.
- Worked with key-value pair RDD transformations and actions for sorting, filtering, and analyzing big data in PySpark (see the sketch at the end of this summary).
- Experience in designing and developing HBase tables and storing aggregated data from Hive tables. Good knowledge of NoSQL databases: Cassandra, MongoDB, and HBase.
- Experience supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud, including exporting and importing data to S3 and Redshift.
- Knowledge of the Scala programming language for developing Spark applications.
- Worked with a variety of file formats such as Avro, SequenceFile, Parquet, and plain text for both importing into and exporting from HDFS.
- Deep knowledge of the core concepts of the MapReduce framework and the Hadoop ecosystem.
- Hands-on experience cleansing semi-structured and unstructured data using Pig Latin scripts.
- Experience working with BI and visualization tools such as Tableau, QlikView, and Informatica.
- Worked with predictive modeling techniques such as neural networks, decision trees, and regression analysis.
- Experience handling multiple relational databases: MySQL, SQL Server, PostgreSQL, and Oracle.
- Extensive experience developing web applications with the MVC (Model-View-Controller) architecture using Spring MVC and Struts, along with Java/J2EE technologies such as Servlets, JSP, JDBC, JSTL, and Hibernate.
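A minimal, hypothetical sketch of the key-value RDD work mentioned above, shown with Spark's Java API (the PySpark version uses the same transformations and actions). The input path, record layout, class name, and app name are illustrative assumptions, not project code.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class KeyValueRddSketch {
    public static void main(String[] args) {
        // Local master for a self-contained run; on a cluster this comes from spark-submit
        SparkConf conf = new SparkConf().setAppName("key-value-rdd-sketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Assumed input: "userId,amount" records, one per line (hypothetical path)
        JavaRDD<String> lines = sc.textFile("hdfs:///data/transactions.csv");

        // Transformations: filter malformed lines, then build (userId, amount) pairs
        JavaPairRDD<String, Double> pairs = lines
                .filter(line -> line.split(",").length == 2)
                .mapToPair(line -> {
                    String[] parts = line.split(",");
                    return new Tuple2<>(parts[0], Double.parseDouble(parts[1]));
                });

        // More transformations: aggregate per key and sort by key
        JavaPairRDD<String, Double> totalsByUser = pairs.reduceByKey(Double::sum).sortByKey();

        // Action: materialize the result on the driver
        totalsByUser.collect().forEach(t -> System.out.println(t._1() + " -> " + t._2()));

        sc.stop();
    }
}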
TECHNICAL SKILLS
Big Data Technologies: Hadoop Architecture, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, ZooKeeper, Flume, Kafka, Samza, Apache Spark, Spark Streaming, Spark SQL, Spark MLlib
Databases: MySQL, SQL Server, PL/SQL, Cassandra, Teradata
Hadoop Distributions: Cloudera, Hortonworks, MapR
BI Tools: Tableau, Informatica
PROFESSIONAL EXPERIENCE
Confidential, Fremont, CA
Hadoop Developer
Responsibilities:
- Developed efficient MapReduce programs in Java for filtering unstructured data (a map-only filter job of this kind is sketched at the end of this section).
- Imported data from various relational data stores to HDFS using Sqoop
- Exported business-required information to the RDBMS using Sqoop so that the BI team could generate reports from the data
- Responsible for installing and configuring Hadoop MapReduce and HDFS; also developed various MapReduce jobs for data cleaning
- Installed and configured Hive to create tables for the unstructured data in HDFS
- Expertise in major Hadoop ecosystem components including Hive, Pig, HBase, HBase-Hive integration, Sqoop, and Flume.
- Involved in loading data from the UNIX file system to HDFS
- Responsible for managing and scheduling jobs on the Hadoop cluster
- Responsible for importing and exporting data into HDFS and Hive using Sqoop
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data
- Experienced in managing Hadoop log files
- Worked on managing data coming from different sources
- Wrote HiveQL queries to create tables and loaded data from HDFS to give it structure
- Loaded and transformed large sets of structured, semi-structured, and unstructured data
- Worked extensively with Hive to transform files from various analytical formats into plain text (.txt) so the data could be viewed for further analysis
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
- Wrote and modified stored procedures to load and modify data as per project requirements
- Responsible for developing Pig Latin scripts to extract data from web server output files and load it into HDFS
- Used Flume extensively to collect log files from the web servers and ingest them into HDFS
- Responsible for implementing schedulers on the JobTracker, enabling MapReduce jobs to make effective use of the resources available in the cluster
- Continuously tuned Hive and Pig queries to make data processing and retrieval more efficient
- Supported MapReduce programs running on the cluster
- Created external tables in Hive and loaded the data into these tables
- Hands-on experience in database performance tuning and data modeling
- Monitored cluster coordination using ZooKeeper
Environment: Hadoop v1.2.1, HDFS, MapReduce, Hive, Sqoop, Pig, DB2, Oracle, XML, CDH4.x
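A minimal sketch of the kind of map-only MapReduce filter job referenced in this section (the "filtering unstructured data" bullet). The delimiter, expected field count, and class names are illustrative assumptions rather than the original project code.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordFilterJob {

    // Map-only job: keep lines that have the expected number of fields and drop the rest
    public static class FilterMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final int EXPECTED_FIELDS = 5; // assumption: pipe-delimited, 5 fields

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|");
            if (fields.length == EXPECTED_FIELDS && !fields[0].isEmpty()) {
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-filter");
        job.setJarByClass(RecordFilterJob.class);
        job.setMapperClass(FilterMapper.class);
        job.setNumReduceTasks(0);               // map-only: filtered records go straight to HDFS
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}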
Confidential, Phoenix, AZ
Hadoop Developer
Responsibilities:
- Imported large data sets from DB2 into Hive tables using Sqoop.
- Created Hive managed and external tables as per the requirements.
- Designed and developed tables in HBase for storing aggregated data from Hive.
- Developed Hive scripts for data aggregation and processing as per the use case.
- Wrote custom Java UDFs for processing data in Hive (see the sketch at the end of this section).
- Developed and maintained workflow scheduling jobs in Oozie for importing data from RDBMS into Hive.
- Defined the Hive tables as managed or external, as required, with appropriate static and dynamic partitions for efficiency.
- Implemented partitioning and bucketing in Hive for better organization of the data.
- Optimized Hive queries for better performance.
- Worked with the team on fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
- Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, tuple stores, NoSQL stores, Hadoop, Pig, MySQL, and Oracle.
- Experience using Avro, Parquet, RCFile, and JSON file formats; developed UDFs for Hive and Pig.
- Installed the Oozie workflow engine to run several MapReduce jobs.
- Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive.
- Worked with different file formats such as XML, SequenceFile, JSON, CSV, and MapFile using MapReduce programs.
- Continuously monitored and managed Hadoop cluster using Cloudera Manager.
- Performed POCs using newer technologies such as Spark, Kafka, and Scala.
- Worked on the conversion of existing MapReduce batch applications to Spark for better performance.
Environment: Hadoop v2.4.0, HDFS, MapReduce, Core Java, Oozie, Hive, Sqoop, CDH 4.x
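A minimal sketch of a custom Java Hive UDF of the kind referenced above. The function's behavior (trim and lower-case a string) and the class name are illustrative assumptions. Packaged in a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: trims and lower-cases a string column, returning null for null input
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}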
Confidential, Tampa, Florida
Senior Hadoop Developer
Responsibilities:
- As a ground-up project, the entire application was developed from scratch; I worked mainly on writing the Kafka producer and Kafka consumer code as per our requirements (see the sketch at the end of this section).
- After the data is successfully persisted to the Kafka brokers, it is written to a flat file from which we load it into a Hive table.
- Defined and created the structure of the Hive table on one side and the HBase table on the other.
- Developed a Spark pipeline to transfer data from the data lake to Cassandra in the cloud, making the data available to the decision engine for publishing customized offers in real time.
- Worked on big data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods technologies.
- Performed complex mathematical, statistical, and machine learning analysis using Spark MLlib, Spark Streaming, and GraphX. Worked with the Amazon Web Services EC2 console.
- Developed data pipelines using Flume, Sqoop, Pig, Java MapReduce, and Spark to ingest customer behavioral data and purchase histories into HDFS for analysis.
- Used Storm to consume events coming through Kafka, generate sessions, and publish them back to Kafka.
- Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
Environment: Hadoop v2.6.0, HDFS, CDH 5.3.x, MapReduce, HBase, Sqoop, Core Java, Hive, Oozie DB, Spark Streaming, and Apache Kafka
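A minimal sketch of the Kafka producer and consumer pattern described in this section, written against a recent kafka-clients API. The topic name, consumer group, and broker address are illustrative assumptions; the flat-file and Hive loading steps are omitted.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClickstreamKafkaSketch {

    // Producer side: publish one event to a hypothetical "clickstream-events" topic
    public static void produce(String event) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("clickstream-events", event));
        }
    }

    // Consumer side: poll the topic and hand each record to downstream processing
    public static void consume() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "clickstream-loader");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("clickstream-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}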
Confidential, Dublin, Ohio
Java Developer
Responsibilities:
- Involved in the development, testing, and maintenance of the application.
- Used the Spring MVC framework to implement the MVC architecture (see the sketch at the end of this section).
- Developed Stored Procedures, Triggers and Functions in Oracle.
- Developed Spring services and DAOs, and performed object-relational mapping using Hibernate.
- Involved in understanding the business processes and defining the requirements.
- Built test cases and performed unit testing.
- Implemented logging using Log4j.
- Used CVS for version control.
Environment: Java 7, IntelliJ, Maven, Spring Framework, JavaScript, Oracle SQL Developer
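A minimal sketch of a Spring MVC controller illustrating the MVC usage described in this section. The request mapping, parameter, and view name are illustrative assumptions; the view resolver and JSP are assumed to be configured elsewhere.

import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;

// Hypothetical controller: the "C" in MVC, mapping a GET request to a JSP view
@Controller
public class GreetingController {

    @RequestMapping(value = "/greeting", method = RequestMethod.GET)
    public String greeting(@RequestParam(value = "name", required = false) String name, Model model) {
        // Populate the model consumed by the view layer
        model.addAttribute("name", name != null ? name : "guest");
        // Logical view name, resolved by the configured view resolver to a JSP page
        return "greeting";
    }
}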
Confidential, Cleveland, Ohio
Java Developer
Responsibilities:
- Participated in implementation efforts such as coding and unit testing.
- Implemented a web-based application using Servlets and JSP (see the sketch at the end of this section).
- Developed custom tags to display dynamic content and avoid large amounts of Java code in JSP pages.
- Developed exception-handling code to handle error conditions.
- Wrote PL/SQL queries, stored procedures, and triggers to perform back-end database operations.
- Prepared test case document and performed unit testing and system testing.
- Followed the algorithms provided by senior database programmers while developing tables and database queries.
Environment: Java/J2EE, Spring, Hibernate, Maven, Jenkins, Excel, Eclipse IDE, Windows
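A minimal sketch of the Servlet/JSP request flow described in this section. The URL pattern, request parameter, and JSP path are illustrative assumptions, and the annotation-based mapping is shown for brevity (the original project may have used web.xml mappings).

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet: reads a request parameter, sets a request attribute, and forwards to a JSP view
@WebServlet("/orders")
public class OrderServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String orderId = request.getParameter("orderId");
        request.setAttribute("orderId", orderId != null ? orderId : "unknown");
        // Forward to a JSP under WEB-INF so it is only reachable through the servlet
        request.getRequestDispatcher("/WEB-INF/views/order.jsp").forward(request, response);
    }
}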