
Big Data Engineer Resume


San Francisco

SUMMARY:

  • Over 8 years of experience in the software industry across a wide range of legacy and Big Data technologies
  • Experience in Big Data Hadoop development and administration at Confidential, San Francisco
  • 5 years and 5 months of experience with Mainframe and Big Data technologies in the Wealth Management vertical at Confidential
  • Cloudera Certified Developer for Apache Hadoop
  • Strong work exposure to development, maintenance, reverse engineering, and production stability across all phases of the SDLC
  • Personal strengths include systematic problem analysis, quick learning, good communication, dedication to work, teamwork, and effective time management

TECHNICAL SKILLS:

Mainframe Skills: COBOL, S-COBOL, VSAM, JCL, PL/1, REXX, Easytrieve

Hadoop Skills: Spark, Core Java, Python, Scala, Hadoop, Pig, Hive, Sqoop, Flume, HDFS, MapReduce, YARN, Oozie, ZooKeeper, Cloudera Impala, Phoenix

Hadoop Administration: Hortonworks, Cloudera, Amazon EMR

BI Skills: Pentaho, IBM Data Stage

Operating Systems: Windows, Linux (CentOS, Ubuntu, Red Hat)

OLTP: CICS, IMS Transactions, TGADP (Web to Mainframe connectivity)

Databases: IBM DB2 with SQL, HBase, MongoDB

Database Tools: Query Management Facility (QMF), SPUFI, TOAD, SQL Workbench/J

Utilities: ISPF/TSO, DFSORT

Version Control: Changeman, Endevor, GitHub

Testing and Debugging Tools: Expediter, Smart Test, File-Aid, Abend-Aid

Quality Tools: HP Service Center (Peregrine), ClearQuest (CQ), HP Quality Center (QC)

PROFESSIONAL EXPERIENCE:

Confidential, SAN FRANCISCO

Big Data Engineer

Responsibilities:

  • Re-architected the entire batch system for incremental processing and set up the data pipeline to run under the Airflow scheduler (a minimal sketch follows this list)
  • Since most of the codebase was in Python, Airflow (also built in Python) was a more sensible choice than Oozie
  • The new incremental data pipeline delivered a direct EMR cost savings of 30% through reduced hardware requirements and kept the Impala database current for reporting
  • Work is currently underway to rewrite the reporting modules from Python to Spark to enable distributed processing
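A minimal sketch of what such an Airflow pipeline could look like; the DAG id, task names, scripts, and schedule below are illustrative assumptions, not the actual production pipeline:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-eng",          # hypothetical owner
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="incremental_batch_pipeline",   # hypothetical name
    default_args=default_args,
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Each run processes only the increment for the templated run date
    # ({{ ds }}) instead of rescanning the full history.
    extract = BashOperator(
        task_id="extract_increment",
        bash_command="python extract.py --date '{{ ds }}'",
    )
    load = BashOperator(
        task_id="load_to_impala",
        bash_command="python load_impala.py --date '{{ ds }}'",
    )
    extract >> load
```

Processing only each run's increment rather than the full history is what reduces the EMR hardware footprint while keeping the reporting data current.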

Confidential

Principal Engineer

Technologies Used: HBASE, HIVE, Impala, Sqoop, Spark, MapReduce, Pig, Apache Phoenix, Hortonworks Hadoop Distro, Python

Responsibilities:

  • Installing and setting up the Hadoop cluster with the Hortonworks distribution
  • Data modeling in HBase
  • Pre-processing scripts in Python to format the data for the HBase load (see the sketch after this list)
  • Working with the resource on the SLS side to agree on the benchmarking parameters
  • Deciding on the compression algorithm, region splits, and Bloom filters in HBase in alignment with SLS
  • Building bulk-load scripts for HBase using the importtsv utility
  • Performing bulk loads and capturing various parameters for benchmarking
  • Executing the SQL scripts from SLS on the HBase cluster using Apache Phoenix
  • Capturing, comparing, and reporting various benchmarking parameters
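A minimal sketch of the kind of Python pre-processing that formats data for an HBase bulk load via importtsv; the three-column input layout and the key-reversal scheme are assumptions for illustration:

```python
import csv
import sys

# Convert a comma-delimited extract (assumed columns: account_id, name,
# balance) into the tab-separated layout importtsv expects, row key first.
with open(sys.argv[1], newline="") as src, open(sys.argv[2], "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="\t")
    for account_id, name, balance in reader:
        # Reverse monotonically increasing ids so writes spread across
        # region splits instead of hot-spotting the last region.
        row_key = account_id[::-1]
        writer.writerow([row_key, name, balance])
```

The output could then be bulk-loaded with something like hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:name,cf:balance (table) (hdfs-path), where the reversed key complements the region splits chosen above.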

Confidential

Principal Engineer

Technologies Used: Hive, Pig, Sqoop, Spark, Spark SQL, Oozie, Flume, Oracle, Java, Hortonworks Hadoop Distro

Responsibilities:

  • Installing and setting up the Hadoop cluster with the Hortonworks distribution
  • Development and maintenance of scripts in Hive, Pig, and Sqoop
  • Setting up workflows in Oozie and scheduling them at the required times
  • Data analytics
  • REST APIs for UI-to-Hive connectivity (a minimal sketch follows this list)
  • Performance tuning of the batch scripts to meet SLAs
  • End-to-end performance and system testing
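A minimal sketch of a REST endpoint for UI-to-Hive connectivity, assuming Flask and PyHive against a HiveServer2 endpoint; the host, table, and column names are illustrative only:

```python
from flask import Flask, jsonify
from pyhive import hive

app = Flask(__name__)

@app.route("/api/orders/<order_date>")
def orders_for_date(order_date):
    # Assumed HiveServer2 host/port; "sales.orders" is a hypothetical table.
    conn = hive.connect(host="hive-server", port=10000, username="ui_svc")
    try:
        cur = conn.cursor()
        # PyHive's pyformat parameters avoid splicing raw strings into HiveQL.
        cur.execute(
            "SELECT order_id, amount FROM sales.orders WHERE order_dt = %(dt)s",
            {"dt": order_date},
        )
        rows = [{"order_id": oid, "amount": amt} for oid, amt in cur.fetchall()]
    finally:
        conn.close()
    return jsonify(rows)

if __name__ == "__main__":
    app.run(port=8080)
```

The UI calls the endpoint over HTTP and never talks to Hive directly, keeping credentials and query logic server-side.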

Confidential, New Jersey

Principal Engineer

Technologies Used: HDFS, MapReduce, Pig, Hive, Sqoop, DataStage, Impala, and shell scripting

Responsibilities:

  • Resolve production defects under Confidential and ensure business continuity
  • Data Analysis
  • Data Reporting
  • Data clean-ups by executing special-request, single-purpose COBOL programs
  • Defect Triaging
  • Applying tactical technical fixes to production problems and developing strategic solutions for them
  • Attend to and resolve any issues with batch jobs across Test, QA and Production environments
  • Performance tuning of batch jobs to increase cost savings
  • Setting up and scheduling new jobs and batch processes from the Test through Production regions
  • Provide 24/7 support for all production problems

Confidential

Analyst I - Apps Programmer

Responsibilities:

  • Develop High-Level Design (HLD) documents from the SRS
  • Review the HLD with project teams and the business
  • Prepare Low-Level Design (LLD) documents from the HLD
  • Develop and modify COBOL programs based on the requirements
  • Prepare Test plans
  • Unit, Regression & Performance Testing in Test environments
  • Peer reviews
  • Promotion of code from Test to Production environment using the version control tools
  • Data mining and support testing teams in QA environments
  • Support the guarantee period for new changes in Production

Confidential

Software Engineer

Responsibilities:

  • Analysis, Coding, Unit testing of individual components.
  • Preparation of test cases and testing the programs.
  • Review of other components.
  • Monitoring and estimating workloads.
  • Support for delivered components.
  • Delivering the programs to the client.
