Hadoop Tech Lead Resume
Sunnyvale, California
SUMMARY:
- 13 years of experience in application development, enhancement, maintenance, support, and testing.
- 5 years of experience with Big Data using Hadoop ecosystem components (HDFS, MapReduce (MR1/YARN), Pig, HBase, Hive, Cassandra, Sqoop, Flume, Oozie, Spark, and Spark SQL).
- Hands-on experience using Hive and Pig scripts for data analysis, data cleaning, and data transformation.
- Hands-on experience importing data with Sqoop from relational databases (Oracle, MySQL, DB2) using connectors and fast loaders.
- Solid experience in storage, querying, processing, and analysis of Big Data using the Hadoop framework.
- Good experience managing and reviewing Hadoop log files.
- Involved in analysis, design, application development, and testing using the Hadoop framework.
- Experience with the Oozie workflow engine, running workflows whose actions execute Hadoop jobs.
- Experience with file formats such as Parquet, Avro, ORC, and CSV, using various compression techniques (see the sketch after this list).
- Excellent understanding of Hadoop architecture and cluster daemons, including the JobTracker, TaskTracker, NameNode, and DataNode, as well as the YARN daemons: ResourceManager, NodeManager, and JobHistoryServer.
- Good understanding of Hadoop/Spark design principles, cluster connectivity, and performance tuning.
- Good experience with Spark and Spark SQL using Scala and Python.
- Worked with version control systems such as SVN, SCCS, and Git.
- Proficient in parallel processing using MapReduce and Spark.
- Strong in writing Unix shell scripts and Python scripts.
- Good working knowledge of the AWS environment.
- Experience in mainframe technologies: COBOL, PL/1, JCL, VSAM, and DB2.
- Quick learner and good team player with strong analytical and communication skills.
- Good understanding of and experience with software development methodologies such as Agile and Waterfall.
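A minimal Spark (Scala) sketch of the format-and-compression work described above. The input path, schema options, and output locations are illustrative assumptions, not taken from any specific project:

```scala
import org.apache.spark.sql.SparkSession

object FormatConversionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("format-conversion-sketch")
      .getOrCreate()

    // Read gzip-compressed CSV; Spark infers the codec from the .gz extension.
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///data/raw/events/*.csv.gz") // illustrative path

    // Write the same data as Snappy-compressed Parquet and zlib-compressed ORC.
    df.write.option("compression", "snappy").parquet("hdfs:///data/curated/events_parquet")
    df.write.option("compression", "zlib").orc("hdfs:///data/curated/events_orc")

    spark.stop()
  }
}
```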
TECHNICAL SKILLS:
Programming Languages: Java, Scala, Python, Shell scripting, COBOL, JCL, JavaScript
Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Spark, PySpark, Oozie, Sqoop
Databases: DB2, MySQL, HBase, Cassandra, Oracle, AWS Athena
Operating Systems: Linux, UNIX, Windows, IBM Mainframe z/OS
IDE/Testing Tools: Eclipse, Jenkins, JUnit
Tools: Maven, SVN, Change, SCCS, Event Engine
Distributions: Apache Hadoop, MapR, AWS EMR, and Hortonworks
PROFESSIONAL EXPERIENCE:
Confidential, Sunnyvale, California
Hadoop Tech Lead
Responsibilities:
- Involved in preparing design specifications.
- Developed an API to modify metadata using Spring Boot.
- Developed unit test cases.
- Developed a Spark job to convert compressed CSV files into Parquet format compatible with AWS Athena (see the sketch after this section).
- Developed Hive scripts to migrate data from Vertica to AWS Athena.
- Developed Spark scripts for various column-level transformations.
- Automated the build process using Jenkins.
Environment: AWS EMR, Spark 2.1, Hive 2.3, AWS Athena, Scala, MySQL, Maven, GitHub, Jenkins.
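A hedged sketch of the CSV-to-Parquet conversion for Athena mentioned above. The S3 bucket, partition column, and paths are assumptions made for illustration:

```scala
import org.apache.spark.sql.SparkSession

object CsvToAthenaParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-to-athena-parquet")
      .getOrCreate()

    // Hypothetical landing zone of gzip-compressed CSVs.
    val raw = spark.read
      .option("header", "true")
      .csv("s3://example-bucket/landing/*.csv.gz")

    // Athena treats column names case-insensitively, so normalize
    // to lowercase before writing Snappy Parquet partitioned by date.
    val normalized = raw.toDF(raw.columns.map(_.toLowerCase): _*)

    normalized.write
      .option("compression", "snappy")
      .partitionBy("event_date") // assumed partition column
      .parquet("s3://example-bucket/curated/events/")

    spark.stop()
  }
}
```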
Confidential, Bethpage, New York
Hadoop Lead
Responsibilities:
- Involved in architecture design, technical landscape design, and validation.
- Loaded data into ORC-format Hive tables (see the sketch after this section).
- Developed programs to perform data transformations using Pig, Spark, and Python.
- Involved in environment setup for the offshore team.
- Involved in historical data transfer from SAP HANA to the Hadoop platform using Sqoop.
- Involved in analyzing existing workflows.
- Involved in data-flow design in Hadoop for the migration from SAP HANA.
- Involved in historical and incremental data migration.
- Developed programs and scripts using Python and Unix shell script.
- Implemented Sqoop changes to retrieve data from SAP HANA into the Hadoop environment.
- Involved in creating Hive databases and tables.
- Performed coding and unit & integration testing.
Environment: Hortonworks 2.5, Spark 2.1, Hive 1.2, Sqoop 1.6, Scala 2.1, Oozie 4.2, Git, HDFS, shell scripts.
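A minimal Spark-with-Hive sketch of the ORC table loading described above. Database, table, and column names are hypothetical, and the staging table is assumed to hold data landed by Sqoop:

```scala
import org.apache.spark.sql.SparkSession

object OrcHiveLoadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orc-hive-load-sketch")
      .enableHiveSupport() // connect to the Hive metastore
      .getOrCreate()

    // Target table stored as ORC; names are illustrative.
    spark.sql(
      """CREATE TABLE IF NOT EXISTS sales.orders_orc (
        |  order_id BIGINT,
        |  customer_id BIGINT,
        |  amount DECIMAL(12,2)
        |) PARTITIONED BY (order_date STRING)
        |STORED AS ORC""".stripMargin)

    // Load from a staging table previously populated by Sqoop,
    // letting Hive resolve the partitions dynamically.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT OVERWRITE TABLE sales.orders_orc PARTITION (order_date)
        |SELECT order_id, customer_id, amount, order_date
        |FROM sales.orders_staging""".stripMargin)

    spark.stop()
  }
}
```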
Confidential, New York
Sr. Hadoop Developer
Responsibilities:
- Monitored and maintained day-to-day batch jobs using Event Engine.
- Interacted with the SOR team to fix production data issues.
- Performed data transformation using Pig and Hive.
- Performed snapshot, full-refresh, and load-append data refreshes (see the sketch after this section).
- Provided support for CMDL raw-data and ODL ingestion and for EFS Big Data applications.
- Provided weekly/monthly status reports to the customer.
- Served as onsite technical lead: delegated and reviewed tasks for the offshore team.
- Involved in knowledge-transition activities between the application owner and the offshore team.
- Deployed code into the various environments and monitored the deployments.
- Monitored production workflows and performed root-cause analysis (RCA) for failed workflows.
- Performed ETL processing using Datameer.
- Performed code change management (Unix shell scripts, Pig, and Hive).
- Made code changes in the production environment.
- Loaded data into Parquet-format Hive tables.
- Analyzed code and fixed production issues.
Environment: Pig, Hive, Apache Sqoop, Oozie, MapR, Datameer, SVN, Event Engine, HDFS, shell scripts, ServiceNow.
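A brief Scala sketch of the refresh patterns mentioned above, expressed with Spark save modes for illustration; the actual work used Pig/Hive and Datameer, and all paths here are assumptions:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object RefreshPatternsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("refresh-patterns-sketch")
      .getOrCreate()

    val incoming = spark.read.parquet("hdfs:///data/incoming/daily/") // illustrative

    // Snapshot / full refresh: replace the target entirely.
    incoming.write.mode(SaveMode.Overwrite).parquet("hdfs:///data/target/snapshot/")

    // Load append: add the new records to the existing data.
    incoming.write.mode(SaveMode.Append).parquet("hdfs:///data/target/history/")

    spark.stop()
  }
}
```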
Confidential
Hadoop Developer
Responsibilities:
- Understood the SRS and prepared the technical specification document.
- Prepared test data per requirements and loaded it into the development/test environments.
- Developed data ingestion and data-cleaning scripts using shell script.
- Designed and developed Hive DDL according to the data sources and the DW schema.
- Designed and developed Sqoop and file-import jobs into HDFS.
- Developed Pig scripts to process unstructured data and loaded the results into Hive tables using HCatalog.
- Performed incremental and full refreshes of tables.
- Loaded data into RC-, ORC-, and Parquet-format Hive tables.
- Set up Hive and Oozie metadata using an Oracle database.
- Developed Hive UDFs based on requirements (see the sketch after this section).
- Developed a real-time analytical application using Spark and Scala.
- Designed and developed Hive ETL workflows orchestrated by Oozie.
- Developed Pig UDFs based on Piggybank and used them in Pig scripts.
- Modified Oozie workflows as business flows changed.
- Understood the application architecture and explained the current platform architecture to stakeholders.
- Deployed code into the various environments and monitored the deployments.
Environment: HDFS, Pig, MapReduce, Hive, Apache Spark, Oozie, Sqoop, Cloudera Manager, Flume, Scala.
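A minimal Hive UDF sketch in Scala, illustrating the custom UDFs mentioned above. The class name, package, and logic are hypothetical; the simple org.apache.hadoop.hive.ql.exec.UDF API shown here matches the Hive 1.x era:

```scala
package com.example.udf

import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF: trims and upper-cases a string column.
// It would be registered in Hive with something like:
//   ADD JAR hdfs:///jars/udfs.jar;
//   CREATE TEMPORARY FUNCTION clean_upper AS 'com.example.udf.CleanUpper';
class CleanUpper extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase)
  }
}
```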
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Analyzed the existing system: its interfaces with different mainframe systems, functionality, critical logic, report details, and documentation.
- Prepared weekly status reports; attended weekly status calls with the client and provided updates on progress, issues, and concerns.
- Involved in release activities, such as coordinating with release/deployment teams during production installation, and performed post-implementation validations.
- Performed code changes to existing programs.
- Prepared quality deliverables.
- Provided status reports to the client and senior management.
- Coordinated between onsite and offshore teams.
- Monitored and supported Control-M batch processing.
- Provided technical training and system knowledge to team members.
- Involved in analyzing production issues.
- Prepared technical understanding documents.
- Performed unit and system integration testing.
- Provided knowledge transfer to offshore team members.
Environment: COBOL, z/OS, PL/1, JCL, VSAM, DB2.