
Senior Hadoop/Big Data Developer Resume


California

SUMMARY

  • Over 17 years of experience in application development, enhancement, maintenance, support, and testing.
  • 7+ years of Big Data processing experience using Hadoop/Spark ecosystem components (HDFS, MapReduce (MR1/YARN), Pig, HBase, Hive, Cassandra, Sqoop, Flume, Oozie, Spark, and Spark SQL).
  • Hands-on experience using Hive and Pig scripts for data analysis, data cleaning, and data transformation.
  • Hands-on experience capturing and importing data from existing relational databases (SQL Server, MySQL, DB2) using Sqoop, with the help of connectors and fast loaders.
  • Experience developing REST API services using Spring Boot and Java.
  • Solid experience in the storage, querying, processing, and analysis of Big Data using the Hadoop framework.
  • Good experience managing and reviewing Hadoop log files.
  • Involved in analysis, design, application development, and testing using the Hadoop framework.
  • Experience with the Oozie workflow engine, running workflows whose actions launch Hadoop jobs.
  • Experience with file formats such as Parquet, Avro, ORC, and CSV, combined with various compression techniques.
  • Excellent understanding of Hadoop architecture and the Hadoop cluster daemons, including JobTracker, TaskTracker, NameNode, and DataNode, as well as the YARN daemons ResourceManager, NodeManager, and JobHistoryServer.
  • Good understanding of Hadoop/Spark design principles, cluster connectivity, and performance tuning.
  • Good experience with Spark RDDs, Spark SQL, and Spark Streaming (see the sketch after this list).
  • Worked with version control systems such as SVN, SCCS, and Git.
  • Proficient in parallel processing using MapReduce and Spark.
  • Strong in writing Unix shell scripts and Python.
  • Good working knowledge of the AWS environment.
  • Experience in mainframe technologies using COBOL, PL/1, JCL, VSAM, and DB2.
  • Quick learner and good team player with strong analytical and communication skills.
  • Good understanding of and experience with software development methodologies such as Agile and Waterfall.
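As a minimal illustration of the Spark SQL work mentioned above, the sketch below registers a CSV file as a temporary view and queries it from Java; the file path, view name, and column names are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class SparkSqlSketch {
        public static void main(String[] args) {
            // Local session for illustration; a production job would run on YARN.
            SparkSession spark = SparkSession.builder()
                    .appName("spark-sql-sketch")
                    .master("local[*]")
                    .getOrCreate();

            // Register a CSV file (hypothetical path and columns) as a temporary view.
            Dataset<Row> orders = spark.read()
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .csv("/data/orders.csv");
            orders.createOrReplaceTempView("orders");

            // Aggregate with plain SQL, the core Spark SQL workflow.
            spark.sql("SELECT customer_id, SUM(amount) AS total "
                    + "FROM orders GROUP BY customer_id").show();

            spark.stop();
        }
    }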

TECHNICAL SKILLS

Programming Languages: Java, Scala, Python, Shell Scripting, COBOL, PL/1.

Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Spark, Oozie, Sqoop, Kafka

Databases: DB2, MySQL, HBase, Cassandra, SQL Server, AWS Athena, Apache Solr

Operating Systems: Linux, UNIX, Windows, IBM Mainframe z/OS

IDE/Testing Tools: Eclipse, Jenkins, JUnit

Tools: Maven, Git, SVN, SCCS, Event Engine

Distributions: Apache Hadoop, MapR, AWS EMR, Hortonworks (HDP), and Cloudera Distribution of Hadoop (CDH)

PROFESSIONAL EXPERIENCE

Confidential, California

Senior Hadoop/Big Data Developer

Responsibilities:

  • Designing, developing, and maintaining data pipelines using the Hadoop distribution.
  • Developing APIs to exchange data with other systems and consuming external APIs to bring data into the Big Data platform.
  • Tuning queries over billions of rows by setting appropriate keys on large tables, adding indexes, and applying in-depth knowledge of how Hadoop differs from conventional databases.
  • Designing data pipelines using the Hadoop framework, Control-M, Java, and shell scripting.
  • Designing and implementing cost-effective solutions for managing high-volume data by ingesting it into a Data Lake.
  • Automating workflows to increase productivity and meet deadlines, with bug fixes and performance improvements.
  • Developing stable, robust ETL pipelines, a critical component of modern enterprise data infrastructure.
  • Developing Sqoop scripts to ingest data into the Data Lake.
  • Developing REST API services using Spring Boot and Java to fetch data from Apache Solr collections (see the sketch after this list).
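A minimal sketch of such a service, assuming a standard Spring Boot application (the @SpringBootApplication bootstrap class is omitted); the endpoint path, Solr base URL, and collection name are hypothetical.

    import java.io.IOException;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    @RequestMapping("/api/search")          // hypothetical endpoint
    public class SolrSearchController {

        // Solr base URL and collection name are placeholders.
        private final HttpSolrClient solr =
                new HttpSolrClient.Builder("http://solr-host:8983/solr/products").build();

        @GetMapping
        public List<SolrDocument> search(@RequestParam("q") String q)
                throws SolrServerException, IOException {
            SolrQuery query = new SolrQuery(q);
            query.setRows(20);              // cap the page size
            QueryResponse response = solr.query(query);
            return response.getResults();   // SolrDocumentList is a List<SolrDocument>
        }
    }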

Environment: Spark 2.4, Hive 2.1, Scala 2.11, Maven, SVN, Unix, Shell script, Control-M, CDH 6.1, Python, Apache Solr, HBase 2.1, Spring Boot 2.

Confidential, California

Senior Hadoop/Big Data Developer

Responsibilities:

  • Understanding business requirements and functional specifications of the Confidential iTunes domain.
  • Application development and unit testing.
  • Integration and system testing and UAT support.
  • Implementing code changes in the production environment.
  • Involved in the HDP upgrade from HDP 2.2 to HDP 2.6.
  • Involved in decommissioning Autosys jobs (MapReduce jobs).
  • Migrated the Royalty, Obligation, and Affiliate processes from Teradata to Hadoop (see the sketch after this list).
  • Analyzing production issues and finding root causes.
  • Providing knowledge-transfer (KT) sessions (GDT, Pie Spark) to the offshore team.
  • Coordinating with the offshore team and business users at the client location.
  • Customer support, research and analysis, process optimization, and reporting enhancements.
  • Identifying the customer's business flow and providing recommendations in key strategic areas.
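One plausible shape for such a migration step, sketched in Java with Spark's JDBC reader; the connection URL, credentials, and table names are placeholders, and the actual migration may have used different tooling.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class TeradataToHadoopSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("teradata-migration-sketch")
                    .enableHiveSupport()
                    .getOrCreate();

            // Pull a Teradata table over JDBC; URL, credentials, and table are placeholders.
            Dataset<Row> royalty = spark.read()
                    .format("jdbc")
                    .option("url", "jdbc:teradata://td-host/DATABASE=finance")
                    .option("driver", "com.teradata.jdbc.TeraDriver")
                    .option("dbtable", "royalty_payments")
                    .option("user", "etl_user")
                    .option("password", "****")
                    .load();

            // Land the data in a Hive table as ORC for downstream HDP processing.
            royalty.write()
                   .mode(SaveMode.Overwrite)
                   .format("orc")
                   .saveAsTable("finance.royalty_payments");

            spark.stop();
        }
    }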

Environment: Spark 2.2, Hive 2.3, Scala 2.11, Maven, GitHub/Bitbucket, Unix, Shell script, Autosys, HDP 2.6, Python, Teradata.

Confidential, California

Hadoop/Big Data Developer

Responsibilities:

  • Involved in preparing design specifications.
  • Developed a microservice REST API using Spring Boot and Java/J2EE to update metadata.
  • Developed unit test cases using JUnit.
  • Developed a Spark job to convert compressed CSV files into Parquet format for compatibility with AWS Athena (see the first sketch after this list).
  • Developed Hive scripts to migrate data from Vertica to AWS Athena.
  • Developed Spark scripts for various column-level transformations.
  • Automated the build process using Jenkins.
  • Configured the Capacity Scheduler for the YARN ResourceManager.
  • Developed Spark code to read data from Kafka (see the second sketch after this list).
  • Performed support activities to establish the data pipeline.
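A minimal sketch of the CSV-to-Parquet conversion in Java; the S3 bucket and paths are hypothetical, and Snappy compression is assumed as the usual Athena-friendly choice.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class CsvToParquetSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("csv-to-parquet-sketch")
                    .getOrCreate();

            // Read gzip-compressed CSV from S3 (hypothetical bucket); Spark
            // decompresses .gz input transparently.
            Dataset<Row> events = spark.read()
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .csv("s3://example-bucket/raw/events/*.csv.gz");

            // Write Snappy-compressed Parquet, a columnar layout Athena scans efficiently.
            events.write()
                  .mode(SaveMode.Overwrite)
                  .option("compression", "snappy")
                  .parquet("s3://example-bucket/curated/events/");

            spark.stop();
        }
    }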
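A second minimal sketch, reading from Kafka with Spark Structured Streaming in Java; the broker list, topic name, and sink paths are placeholders.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    public class KafkaReadSketch {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("kafka-read-sketch")
                    .getOrCreate();

            // Subscribe to a Kafka topic; broker and topic names are placeholders.
            Dataset<Row> stream = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker1:9092")
                    .option("subscribe", "events")
                    .load();

            // Kafka delivers binary key/value columns; cast the payload to a string.
            Dataset<Row> payload = stream.selectExpr("CAST(value AS STRING) AS value");

            // Persist the stream as Parquet, with checkpointing for fault tolerance.
            StreamingQuery query = payload.writeStream()
                    .format("parquet")
                    .option("path", "s3://example-bucket/stream/events/")
                    .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
                    .start();

            query.awaitTermination();
        }
    }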

Environment: AWS EMR, Spark 2.1, Hive 2.3, AWS Athena, Scala, Java/J2EE (Spring Boot), MySQL, Maven, GitHub/Bitbucket, Kinesis, Jenkins, Cassandra, Kafka

Confidential

Technical Lead

Responsibilities:

  • Monitoring and maintaining day-to-day batch jobs using Event Engine.
  • Interacting with the SOR team to fix production data issues.
  • Performing data transformations using Pig and Hive.
  • Performing snapshot, full-refresh, and load-append data refreshes.
  • Providing support for CMDL raw data and ODL ingestion, and for EFS Big Data applications.
  • Providing weekly/monthly status reports to the customer.
  • Onsite technical lead: delegating and reviewing tasks for the offshore team.
  • Involved in knowledge-transition activities between the application owner and the offshore team.
  • Deploying code into the various environments and monitoring it.
  • Monitoring workflows in production and performing root-cause analysis (RCA) for failed workflows.
  • Performing ETL processes using Datameer.
  • Performing code change management (Unix shell scripts, Pig, and Hive).
  • Implementing code changes in the production environment.
  • Loading data into Parquet-backed Hive tables (see the sketch after this list).
  • Analyzing code and fixing production issues.
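The loads themselves were Hive scripts; to keep these examples in Java, the sketch below drives equivalent HiveQL through the standard HiveServer2 JDBC driver. The host, database, table, and columns are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class ParquetHiveLoadSketch {
        public static void main(String[] args) throws Exception {
            // Hive JDBC driver (hive-jdbc on the classpath); endpoint is a placeholder.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                        "jdbc:hive2://hive-host:10000/odl", "etl_user", "");
                 Statement stmt = conn.createStatement()) {

                // Parquet-backed Hive table; the schema is illustrative.
                stmt.execute("CREATE TABLE IF NOT EXISTS customer_snapshot ("
                        + "customer_id BIGINT, name STRING, updated_ts TIMESTAMP) "
                        + "STORED AS PARQUET");

                // Full-refresh (snapshot) load from a raw staging table.
                stmt.execute("INSERT OVERWRITE TABLE customer_snapshot "
                        + "SELECT customer_id, name, updated_ts FROM raw.customer_staging");
            }
        }
    }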

Environment: Pig, Hive, Apache Sqoop, Oozie, MapR, Datameer, SVN, Event Engine, HDFS, Shell script, ServiceNow.

Confidential

Project Lead

Responsibilities:

  • Understood the SRS and prepared the technical specification document.
  • Prepared test data per the requirements and loaded it into the development/test environments.
  • Developed data ingestion and data cleansing scripts using shell scripting.
  • Designed and developed Hive DDL according to the data sources and DW schema.
  • Designed and developed Sqoop and file-import jobs into HDFS.
  • Developed Pig scripts to process unstructured data and load it into Hive tables using HCatalog.
  • Performed incremental and full-refresh table loads.
  • Loaded data into Hive tables stored as RC, ORC, and Parquet file formats.
  • Set up Hive and Oozie metadata using an Oracle database.
  • Developed Hive UDFs based on requirements (see the sketch after this list).
  • Developed a real-time analytical application using Spark and Scala.
  • Designed and developed ETL workflows in Hive using Oozie.
  • Developed Pig UDFs using Piggybank and deployed them in Pig scripts.
  • Understood the application architecture and explained the current platform architecture to stakeholders.
  • Deployed code into the various environments and monitored them.
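A minimal sketch of a Hive UDF in Java; the masking behavior, class name, and function name are hypothetical, chosen only to show the evaluate() convention.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Register in Hive with, for example:
    //   CREATE TEMPORARY FUNCTION mask_id AS 'com.example.udf.MaskUdf';
    public class MaskUdf extends UDF {
        // Masks all but the last four characters of the input value.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;                      // Hive passes NULLs through
            }
            String s = input.toString();
            if (s.length() <= 4) {
                return input;
            }
            String masked = s.substring(0, s.length() - 4).replaceAll(".", "*")
                    + s.substring(s.length() - 4);
            return new Text(masked);
        }
    }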

Environment: HDFS, Pig, MapReduce, Hive, Apache Spark, Oozie, Sqoop, Cloudera Manager, Flume, Scala.

Confidential

Responsibilities:

  • Analyzing the existing system, including interfaces with other mainframe systems, functionality, critical logic, report details, and documentation.
  • Preparing weekly status reports; attending weekly status calls with the client to update on progress, issues, and concerns.
  • Involved in release activities, such as coordinating with release/deployment teams during production installation, and performing post-implementation validations.
  • Performing code changes to existing programs.
  • Preparing quality deliverables.
  • Providing status reports to the client and senior management.
  • Onsite-offshore coordination.
  • Monitoring and supporting Control-M batch processing.
  • Providing technical training and system knowledge to team members.
  • Involved in analyzing production issues.
  • Preparing technical understanding documents.
  • Performing unit and system integration testing.
  • Providing knowledge transfer to offshore team members.

Environment: COBOL, z/OS, PL/1, JCL, VSAM, DB2
