Senior Hadoop Developer Resume

Los Angeles, CA

SUMMARY:

  • 7 years of professional experience in implementation and application support projects, including 4+ years with Big Data technologies such as Hadoop, Kafka, NiFi, and Couchbase.
  • Expertise in Big Data technologies using Hortonworks distribution and its ecosystem like HDFS, MapReduce (MRV1, MRV2/YARN), Apache PIG, Apache Spark, Apache HBase, Apache Hive, Apache Sqoop, Apache Zookeeper, Apache Flume, Apache Oozie, Apache Cassandra, Cloudera Hue.
  • In-depth understanding of the Hadoop architecture and its components, such as HDFS, Resource Manager, Node Manager, Job Tracker, Task Tracker, Name Node, Data Node, Secondary Name Node, and MapReduce concepts.
  • Experienced in importing data from relational databases into HDFS and exporting it back using Sqoop.
  • Experienced in HDFS-level encryption using Voltage encryption to protect customers' personal information.
  • Extensively worked on creating complex MapReduce (MR) batch programs to perform Big Data processing and analysis using Pig Latin and customized core Java UDFs.
  • Developed Pig Latin scripts to perform complex Big Data processing and analysis on HDFS.
  • Experience in implementing partitioning and bucketing techniques in HIVE.
  • Experience in writing Hive QL queries to store processed data into Hive tables for Big Data oriented analysis.
  • Developed projects using Apache Spark with in-memory processing features.
  • Experience in working with NoSQL Column-Oriented Databases like HBase and their Integration with HDFS.
  • Experience in tuning and debugging Spark applications and applying Spark optimization techniques.
  • Experience in loading log files from multiple sources directly into HDFS using Flume.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java, and in extending Hive and Pig core functionality by writing custom UDFs.
  • Experience in tuning performance using partitioning, bucketing, and indexing in Hive.
  • Experienced in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Extensively worked in core Java and developed various customized Java UDFs.
  • Passionate about working in Big Data and analytics environments.
  • Excellent knowledge of Spark architecture and of Hadoop architecture and its ecosystem, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Analytical thinker and fast learner who consistently resolves ongoing issues and defects and is often called upon to consult on problems.
  • An individual with excellent interpersonal and communication skills, strong business acumen and work ethic, technical competency, team-player spirit, and leadership skills.
  • Highly motivated self-starter with creative problem-solving skills, a positive attitude, and a willingness to learn new concepts and accept challenges.
  • Experienced in Team management and Project management.

PROFESSIONAL EXPERIENCE:

Senior Hadoop Developer

Confidential, Los Angeles, CA

Environment: Hortonworks, HDFS, MapReduce (MR), Apache PIG, Apache Sqoop, Apache Zookeeper, Apache Flume, Hive, Core JAVA, Teradata, DB2 UDB, Apache HBase, Apache Cassandra, UNIX Shell Scripts, Agile SCRUM.

Responsibilities:

  • Working on a live Big Data Hadoop production environment
  • Understanding the client requirements and creating formal Business Requirement specifications.
  • Transforming the data according to business logic in HIVE & PIG.
  • Worked with Big Data Policy and Security teams in order to create data policy and develop interfaces to anonymize the data
  • Good experience in writing MapReduce programs in Java on MRv2 / YARN environment.
  • Worked on NiFi: created workflows and supported performance testing of the NiFi cluster.
  • Developed Spark Streaming jobs to consume data from different sources such as Kafka and S3 and ingest data into Couchbase and Kafka.
  • Developed Spark streaming jobs to ingest data into HDFS/HBase and validated the data.
  • Extracted structured data from Teradata, DB2 UDB and DB2 z/OS Relational Database onto HDFS using Sqoop
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Generating the required reports using Oozie workflow and Hive queries for operations team from the ingested data.
  • Created complex Pig Latin scripts to process the extracted data as per the Business Requirement specifications and developed Pig Scripts to store unstructured data in HDFS.
  • Experience in writing customized Hive UDFs for complex processing
  • Created Managed tables and External tables in Hive and loaded data from HDFS
  • Designed and implemented complex Map Reduce (MR) jobs to support distributed processing using Pig Latin
  • Worked with structured/semi-structured/unstructured data
  • Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce (MR) jobs.
  • Created HBase tables with Hive integration to store data.
  • Unit testing and Integration testing
  • Production Deployment and maintenance.

Hadoop Engineer

Confidential

Environment: HDFS, Map Reduce, Apache Hive, Apache Pig, Apache Spark, Sqoop, Oozie, HBase, EDW

Responsibilities:

  • Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
  • Migrated the existing data to Hadoop from RDBMS (SQL Server and Oracle) using Sqoop for processing the data.
  • Created Hive queries that helped data analysis on Customer purchase trends by comparing fresh data with EDW reference data and historical metrics.
  • Developed shell scripts to automate data loads and perform analysis.
  • Responsible for loading unstructured and semi-structured data into Hadoop cluster coming from different sources using Flume.
  • Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform joins on the Map side using distributed cache.
  • Developed custom MapReduce programs and custom User Defined Functions (UDF's) in Hive to transform the large volumes of data with respect to business requirement.
  • Used Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
  • Created internal and external tables with properly defined static and dynamic partitions for efficiency.
  • Implemented Hive custom UDF's to achieve comprehensive data analysis.
  • Implemented Sqoop jobs for incremental imports of a few hundred TB of data.
  • Created HBase tables on Hive for handling updates in Hadoop.
  • Responsible for landing multi source data to HDFS using Spark streaming.

Hadoop Developer

Confidential - Chicago, IL

Environment: Cloudera Apache Hadoop (CDH 4), HDFS, MapReduce (MR), Apache PIG, Apache Sqoop, Apache Zookeeper, Core JAVA, Teradata, DB2 UDB (LUW), HBase, UNIX Shell Scripts, Tortoise SVN, Agile SCRUM

Responsibilities:

  • Worked on Distributed/Cloud Computing (Map Reduce/Hadoop, Hive, Pig, HBase, Sqoop, Spark, Zookeeper etc.), Hortonworks for Hadoop
  • Imported and exported data into HDFS, Hive and HBase using SQOOP.
  • Worked on loading and transforming large sets of structured and semi-structured data into the Hadoop system.
  • Loaded data from various data sources into HDFS using Flume.
  • Developed the Pig UDFs to pre-process the data for analysis.
  • Implemented Partitioning, Dynamic Partitions, Buckets in HIVE.
  • Stored parsed data into HBase and Hive using HBase-Hive Integration.
  • Involved in loading data into Cassandra NoSQL Database and Cassandra integration and merging with SQL data.
  • Protected customers' personal information through HDFS-level encryption using Voltage encryption.
  • Performed Hive Query Optimization for better performance.
  • Built reusable Hive UDF libraries for business requirements which enabled users to use these UDFs in Hive Querying.
  • Used SQL scripting, and Stored Procedures for managing data in databases.
  • Worked on performance tuning of HIVE queries with partitioning and bucketing process.
  • Worked on huge datasets from Hive to understand and visualize the data for analysis
  • Experienced in the design, development, and creation of HBase schemas.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Hive.

Hadoop Developer

Confidential

Description: This was a pilot project intended to train employees on the latest advancements in Hadoop and its ecosystem components and to set up a Hadoop environment for big data analysis.

Environment: Hadoop MapReduce, HDFS, Hive, Sqoop, Pig, Linux and MySQL

Responsibilities:

  • Involved in understanding business requirements and designing an implementation plan.
  • Extracted files from DB2 through Sqoop and placed them in HDFS for processing.
  • Analyzed large data sets by running Hive queries and Pig scripts
  • Involved in creating Hive tables, and loading and analyzing data using hive queries
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Involved in running Hadoop jobs for processing millions of records of text data
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Involved in unit testing MapReduce jobs using MRUnit.
  • Involved in loading data from LINUX file system to HDFS
  • Responsible for managing data from multiple sources
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Assisted in exporting analyzed data to relational databases using Sqoop
  • Created and maintained Technical documentation for launching HADOOP Clusters and for executing Hive queries and Pig Scripts

Software Developer

Confidential

Environment: Java, J2EE, JSP, HTML, CSS, JavaScript, Tomcat, Servlets, JDBC, Oracle, SQL, DB2 UDB (LUW), DB2 z/OS (Mainframe), UNIX Shell Scripts and Perl Scripts, Tortoise SVN, Waterfall, MS SQL Server 2008, T-SQL, SQL Server Integration Services (SSIS)

Responsibilities:

  • Responsible for understanding requirements from business users.
  • Participated in discussions with teammates and in designing the system.
  • Developed web pages using JSP and handled requests using Java and Servlets.
  • Developed client-side validations using JavaScript.
  • Performed server-side validation based on supported file formats.
  • Developed Java code according to the MVC architecture.
  • Created database objects such as tables, complex stored procedures, views, UDFs, cursors, and DML and DDL triggers as required, using T-SQL programming.
  • Applied query optimization techniques and tuned queries by creating indexes to improve query and system performance.
  • Used SQL Database in the project and developed the Admin screens using JSP, JavaScript.
  • Worked on SQL Server Integration Services (SSIS) to integrate and analyze data from multiple homogeneous and heterogeneous information sources
  • Bug fixing for priority one issues.