
Sr. Hadoop Developer Resume


Malvern, PA

SUMMARY

  • IT professional with more than thirteen (13) years of experience in the Hadoop/Big Data ecosystem and related technologies
  • Excellent experience in Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm
  • Expertise in Spark using the Scala and Python (PySpark) programming interfaces
  • Extensive experience in ETL processes and data architecture, including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining and advanced data processing
  • Experienced in optimizing ETL workflows
  • Proficient in designing and coding Oozie workflows in the ETL and conformance layers in Hadoop ecosystems
  • Expertise in Hive, Impala and Hive UDFs
  • Exported and imported large volumes of data between the Hadoop ecosystem and various relational databases using Sqoop
  • Proficient in AWS EMR/EC2, S3, CloudFormation, CloudWatch and AWS Lambda
  • Well versed in Atlassian tools such as Bamboo, Bitbucket and JIRA, as well as GitHub
  • Expertise in IBM Mainframe, with deep knowledge of Mainframe-based applications and tools
  • Expertise in troubleshooting and in leading teams to fix production issues
  • Proficient in project management, production support, application development, programming, system analysis, software quality assurance and change management process with various clients
  • Conversant with all phases of project Life Cycle including requirement gathering, analysis, design, development, testing, implementation, software quality standards, configuration management, change management and quality procedures
  • Expertise in handling support and maintenance projects, with hands-on experience in ticket tracking tools such as HP SMPO, ITSM, Remedy and JIRA
  • Hands-on experience in migrating mainframe applications to other technologies such as SAP and UNIX, and in re-hosting and decommissioning mainframes to Micro Focus Enterprise Server
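The Sqoop export/import work described above typically comes down to assembling a `sqoop import` invocation per table. A minimal sketch of building such a command, where the JDBC URL, table name and target directory are placeholder assumptions rather than values from the projects below:

```python
# Sketch of assembling a Sqoop import of a relational table into HDFS in
# Parquet format. All connection details here are hypothetical placeholders.
def sqoop_import_cmd(jdbc_url, table, target_dir, num_mappers=4):
    """Build the argument list for a `sqoop import` invocation."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),  # parallel map tasks for the pull
        "--as-parquetfile",                 # land the data as Parquet
    ]

cmd = sqoop_import_cmd(
    "jdbc:oracle:thin:@//db-host:1521/ORCL",  # hypothetical Oracle source
    "LOAN_DETAILS",
    "/data/hub/loan_details",
)
```

The flags shown (`--connect`, `--table`, `--target-dir`, `--num-mappers`, `--as-parquetfile`) are standard Sqoop import options; a real pipeline would add credentials and incremental-import settings.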

TECHNICAL SKILLS

Platforms/frameworks: Hadoop 2.7, IBM S/390, IBM PC Compatibles

Operating Systems: Linux, OS/390, Windows 10/7/XP/2000/Server, MS-DOS

API: Spark 1.6/2.x, MapReduce

Programming Languages: Python, Scala, Java, VS COBOL, JCL, Easytrieve, SAS

Scripting Language: Korn shell/UNIX shell scripting, XML, SQL

Workflow: Oozie

Databases: Hive, Impala, DB2, Oracle, IMS DB

ETL Tool: Sqoop, Flume, Kafka

Web Interface: Hue

File systems: Avro files, Parquet files, HDFS, VSAM

OLTP: CICS, IMS DC/TM

Middleware: MQ Series

Tools/Technologies: Spring Tool Suite, Eclipse, Crucible, ChangeMan, Endevor, Panvalet, Xpediter, DB2/VSAM, File-AID, Platinum StarTool, SAR, Jobtrac, SPUFI, QMF, Tape Management System (TMS), OPC scheduler, Abend-Aid, DADS, IBM Debugger, Mainframe Express

Tracking Tools: Atlassian tools (Bitbucket, Bamboo, JIRA), Remedy, ITSM, HP SMPO, VersionOne

PROFESSIONAL EXPERIENCE

Sr. Hadoop Developer

Confidential, Malvern, PA

Responsibilities:

  • Migrated various client score models developed on on-premises Hadoop to AWS EMR
  • Refactored the existing score model logic from warehouse tables to enterprise tables and mapped the warehouse logic to the enterprise logic
  • Coded the new score model programs in PySpark and Spark-Scala
  • Converted the existing PySpark and Spark-Scala programs from Spark 1.6 to Spark 2.2
  • Migrated the existing Sqoop tables from on-premises storage to an S3 bucket and built a new Sqoop pipeline to S3 for the newly added enterprise tables
  • Converted the existing integration suites running on Impala to Hive/S3
  • Created and customized CloudFormation templates using troposphere and spun up the AWS EMR cluster
  • Integrated, built and deployed the CloudFormation create/delete templates (e.g., S3 copy), creating and deleting stacks through Bamboo
  • Created CloudWatch events for the AWS EMR logs and integrated the CloudWatch logs with Splunk
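The templates behind the EMR spin-up work above boil down to an `AWS::EMR::Cluster` resource in CloudFormation. A minimal sketch, using plain dictionaries in place of troposphere objects; the resource name, instance types and S3 log bucket are illustrative assumptions:

```python
import json

# Minimal sketch of a CloudFormation template for an EMR cluster of the kind
# spun up above. Names, counts and the log bucket are placeholder assumptions.
def build_emr_template(cluster_name, release_label="emr-5.20.0",
                       core_count=4, instance_type="m5.xlarge"):
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "ScoreModelCluster": {
                "Type": "AWS::EMR::Cluster",
                "Properties": {
                    "Name": cluster_name,
                    "ReleaseLabel": release_label,
                    "Applications": [{"Name": "Spark"}, {"Name": "Hive"}],
                    "Instances": {
                        "MasterInstanceGroup": {
                            "InstanceCount": 1,
                            "InstanceType": instance_type,
                            "Market": "ON_DEMAND",
                        },
                        "CoreInstanceGroup": {
                            "InstanceCount": core_count,
                            "InstanceType": instance_type,
                            "Market": "ON_DEMAND",
                        },
                    },
                    "JobFlowRole": "EMR_EC2_DefaultRole",
                    "ServiceRole": "EMR_DefaultRole",
                    "LogUri": "s3://example-bucket/emr-logs/",  # placeholder
                },
            }
        },
    }

template_json = json.dumps(build_emr_template("score-model-cluster"), indent=2)
```

In practice troposphere produces the same JSON from typed Python objects, and a CI job (Bamboo in this case) feeds the rendered template to `CreateStack`/`DeleteStack`.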

Environment: AWS EMR/EC2, AWS S3, Splunk, CloudWatch, AWS Lambda, Hadoop 2.7, Spark 2.2, Scala, Python 3.7, Oozie, Sqoop, Hive, Presto, UNIX shell scripting, Bamboo, Bitbucket, JIRA, Control-M

Sr. Hadoop Developer

Confidential, Pennington, NJ

Responsibilities:

  • Designed the data lake to pull HMDA loan details of various clients from upstream systems such as Peaks and nCino
  • Designed and implemented the Sqoop process to pull client data from various Oracle databases into Confidential's Hadoop environment
  • Implemented the ETL process and conformance code for the HMDA data lake
  • Designed and implemented the Oozie workflow to import and export client loan information to various loan processing and data analytics systems at Confidential
  • Created and worked with Hive tables in the Hadoop data hub region and stored the Sqoop data in Parquet format
  • Designed and coded the conformance logic in Spark-Scala for use by target and consuming systems
  • Optimized the Spark-Scala and Spark SQL code in the conformance layers for process improvement
  • Implemented the Oozie coordinator and scheduled the daily/weekly/monthly jobs
  • Created test suites using JUnit and performed unit, integration and end-to-end testing in the QA and SIT regions
  • Optimized Hive queries using partitioning and bucketing techniques to control data distribution
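The bucketing optimization above relies on Hive's rule that a row's clustering key is hashed and taken modulo the bucket count, so equal keys always land in the same bucket file. A small sketch of that assignment logic (for string keys Hive uses the Java `String.hashCode` polynomial); the column name and bucket count are illustrative assumptions:

```python
# Sketch of Hive-style bucket assignment for a CLUSTERED BY (loan_id) table.
# The key and bucket count are hypothetical, not from the actual project.
NUM_BUCKETS = 8

def java_string_hash(s: str) -> int:
    """Reimplement Java's String.hashCode with 32-bit signed wraparound."""
    h = 0
    for c in s:
        h = (31 * h + ord(c)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def bucket_for(loan_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    # Hive computes (hash & Integer.MAX_VALUE) % numBuckets to avoid negatives
    return (java_string_hash(loan_id) & 0x7FFFFFFF) % num_buckets

rows = ["LN1001", "LN1002", "LN1001", "LN2077"]
buckets = [bucket_for(r) for r in rows]
```

Because identical keys map to the same bucket, joins and sampling on the clustering column can read a subset of bucket files instead of scanning the whole table.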

Environment: Hadoop 2.7, Spark, Scala, Oozie, Sqoop, Hive, Impala, Oracle, Hue, UNIX shell scripting

Sr. Hadoop Developer

Confidential, Malvern, PA

Responsibilities:

  • Designed the ETL process to bring client score details from Tealeaf, the data warehouse and the enterprise system into the Confidential Hadoop ecosystem
  • Worked with business users to understand and clarify business requirements and prepared the design documents
  • Designed, coded and implemented the Sqoop process and imported the score details into the Hadoop data hub
  • Cleansed and validated the imported data and converted it to Avro format, making it accessible to the Hadoop data mart environments
  • Made the necessary changes to the cleanse and validate programs using Spark-Scala
  • Designed and coded the score calculation logic for Confidential clients in PySpark and executed the PySpark programs in the Hadoop data mart environment
  • Designed and implemented the Oozie workflow for the daily/weekly/monthly client score calculations and web interaction reports
  • Implemented the Oozie coordinator and scheduled the daily/weekly/monthly jobs
  • Created test suites for the PySpark code and performed unit, integration and end-to-end testing using PyUnit
  • Converted the Avro files in the Hadoop data hub to Parquet format using Hive scripts
  • Imported data from Oracle and DB2 databases into the Hadoop ecosystem using Sqoop
  • Created Hive tables in the Hadoop data mart environment and validated the performance of Hive and Impala queries against the master tables
  • Optimized Hive queries using partitioning and bucketing techniques to control data distribution
  • Fine-tuned the PySpark code for optimized utilization of Hadoop resources in the production run
  • Executed comparison tests in the production region and fine-tuned the end results to ensure accuracy
  • Troubleshot daily Oozie workflow failures and implemented permanent fixes
  • Analyzed the Java MapReduce programs, prepared analysis documents and performed a feasibility study for converting the Java MapReduce programs to Spark-Python (PySpark)
  • Prepared high-level/low-level design documents for converting the Java MapReduce code to PySpark
  • Coded PySpark replacements for the Java programs and performed unit, integration, regression and comparison testing to ensure the converted code matched the Java originals in functionality and performance
  • Mentored team members and provided application training for new hires
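The comparison testing above amounts to diffing the legacy Java MapReduce output against the converted PySpark output key by key within a tolerance. A minimal sketch; the client IDs, score values and field layout are illustrative assumptions:

```python
# Sketch of a comparison test between a legacy run and a converted run.
# Sample IDs and scores below are hypothetical placeholders.
def compare_scores(legacy, converted, tol=1e-6):
    """Return (mismatched client IDs, IDs present only in the new output)."""
    mismatches = {}
    for client_id, old_score in legacy.items():
        new_score = converted.get(client_id)
        if new_score is None or abs(old_score - new_score) > tol:
            mismatches[client_id] = (old_score, new_score)
    extra_ids = set(converted) - set(legacy)  # rows the old job never produced
    return mismatches, extra_ids

legacy_scores = {"C001": 0.82, "C002": 0.45, "C003": 0.91}
converted_scores = {"C001": 0.82, "C002": 0.46, "C003": 0.91}
mismatches, extra_ids = compare_scores(legacy_scores, converted_scores, tol=1e-3)
```

At production scale the same diff would be expressed as a join between the two output tables, but the per-key tolerance check is the core of the accuracy validation.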

Environment: Hadoop 2.7, Spark, Python, Scala, Oozie, Sqoop, Hive, Impala, Oracle, DB2, Hue, UNIX shell scripting, SAS
