Big Data Engineer Resume
San Francisco
SUMMARY:
- Over 8 years of experience in the software industry across a wide range of legacy and Big Data technologies
- Experience in Big Data Hadoop development and administration at Confidential, San Francisco
- Experience in Big Data Hadoop development and administration at Confidential
- 5 years and 5 months of experience in Mainframes and Big Data technologies in the Wealth Management vertical at Confidential
- Cloudera Certified Developer for Apache Hadoop
- Good work exposure across development, maintenance, reverse engineering, and production stability, spanning all phases of the SDLC
- Personal skills include systematic problem analysis, quick learning, good communication, dedication to work, teamwork, and effective time management
TECHNICAL SKILLS:
Mainframe Skills: COBOL, S-COBOL, VSAM, JCL, PL/1, REXX, Easytrieve
Hadoop Skills: Spark, Core Java, Python, Scala, Hadoop, Pig, Hive, Sqoop, Flume, HDFS, MapReduce, YARN, Oozie, ZooKeeper, Cloudera Impala, Phoenix
Hadoop Administration: Hortonworks, Cloudera, Amazon EMR
BI Skills: Pentaho, IBM DataStage
Operating Systems: Windows, Linux (CentOS, Ubuntu, Red Hat)
OLTP: CICS, IMS Transactions, TGADP (Web to Mainframe connectivity)
Databases: IBM DB2 with SQL, HBase, MongoDB
Database Tools: Query Management Facility (QMF), SPUFI, TOAD, SQL Workbench/J
Utilities: ISPF/TSO, DFSORT
Version Control: Changeman, Endevor, GitHub
Testing Tools: Expediter, Smart Test, File-Aid, Abend-Aid
Quality Tools: HP Service Center (Peregrine), ClearQuest (CQ), HP Quality Center (QC)
PROFESSIONAL EXPERIENCE:
Confidential, San Francisco
Big Data Engineer
Responsibilities:
- Re-architected the entire batch system for incremental processing and set up the data pipeline to run under the Airflow scheduler (a minimal DAG sketch follows this list).
- Since most of the codebase was in Python, Airflow (also built in Python) was a more sensible choice than Oozie.
- The new incremental data pipeline resulted in a direct EMR cost saving of 30% due to lower hardware requirements, and ensured up-to-date data availability in the Impala database for reporting.
- Work is currently underway to rewrite the reporting modules from Python to Spark to enable distributed processing.
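A minimal sketch of the kind of Airflow DAG that could schedule such an incremental pipeline; the DAG id, task names, schedule, and callables are illustrative assumptions rather than the actual production code.

```python
# Sketch of a daily incremental-processing DAG run under the Airflow scheduler.
# All names here (dag_id, task ids, callables) are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_increment():
    """Pull only the records added since the last successful run (placeholder)."""


def load_to_impala():
    """Refresh the Impala reporting tables with the new increment (placeholder)."""


default_args = {"owner": "data-eng", "retries": 1, "retry_delay": timedelta(minutes=5)}

with DAG(
    dag_id="incremental_batch_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_increment", python_callable=extract_increment)
    load = PythonOperator(task_id="load_to_impala", python_callable=load_to_impala)

    extract >> load  # run the incremental extract before refreshing the reporting tables
```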
Confidential
Principal Engineer
Technologies Used: HBase, Hive, Impala, Sqoop, Spark, MapReduce, Pig, Apache Phoenix, Hortonworks Hadoop Distro, Python
Responsibilities:
- Installing and Setting up the Hadoop cluster with Hortonworks distribution
- Data Modeling in HBASE
- Pre-processing scripts in Python to format the data for HBase load (see the sketch after this list)
- Work with the resource on the SLS side to agree upon the benchmarking parameters
- Decide upon the compression algorithm, splits, and Bloom filters in HBase in alignment with SLS
- Building bulk load scripts in HBase using the importtsv utility
- Performing bulk loads and capturing various parameters for benchmarking
- Execute the SQL scripts from SLS on the HBase cluster using Apache Phoenix
- Capturing, comparing and reporting various benchmarking parameters
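A minimal sketch of the kind of Python pre-processing that formats raw records into the tab-separated layout expected by the HBase importtsv bulk-load utility; the file names, column layout, and row-key scheme are illustrative assumptions.

```python
# Reshape raw CSV records into a TSV file for HBase bulk load via importtsv, e.g.:
#   hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
#     -Dimporttsv.columns=HBASE_ROW_KEY,cf:name,cf:amount <table> <hdfs_path>
# Input/output paths, column names, and the row-key scheme are hypothetical.
import csv

INPUT_CSV = "raw_records.csv"
OUTPUT_TSV = "hbase_load.tsv"


def build_row_key(record):
    """Compose a row key that spreads writes across region splits (assumed scheme)."""
    return f"{record['customer_id'][::-1]}|{record['txn_date']}"


with open(INPUT_CSV, newline="") as src, open(OUTPUT_TSV, "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst, delimiter="\t")
    for record in reader:
        # One TSV line per record: row key first, then the column-family values.
        writer.writerow([build_row_key(record), record["name"], record["amount"]])
```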
Confidential
Principal Engineer
Technologies Used: Hive, Pig, Sqoop, Spark, Spark SQL, Oozie, Flume, Oracle, Java, Hortonworks Hadoop Distro
Responsibilities:
- Installing and Setting up the Hadoop cluster with Hortonworks distribution
- Development and maintenance of scripts in Hive, Pig and Sqoop
- Setting up workflows in Oozie and scheduling them at the required times
- Data Analytics (see the Spark SQL sketch after this list)
- REST APIs for UI-to-Hive connectivity
- Performance tuning of the batch scripts to meet SLAs
- End-to-end performance and system testing
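A minimal PySpark sketch of the kind of Spark SQL job used for analytics over Hive tables; the database, table, and column names are illustrative assumptions, not the project's actual schema.

```python
# Run a Spark SQL aggregation over a Hive table and persist the result back to Hive.
# warehouse.transactions / warehouse.daily_totals are hypothetical table names.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive_batch_report")
    .enableHiveSupport()          # lets Spark SQL read tables from the Hive metastore
    .getOrCreate()
)

daily_totals = spark.sql("""
    SELECT txn_date, SUM(amount) AS total_amount
    FROM   warehouse.transactions
    GROUP  BY txn_date
""")

daily_totals.write.mode("overwrite").saveAsTable("warehouse.daily_totals")

spark.stop()
```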
Confidential, New Jersey
Principal Engineer
Technologies Used: HDFS, MapReduce, Pig, Hive, Sqoop, DataStage, Impala, and shell scripting
Responsibilities:
- Resolve production defects under Confidential and ensure business continuity
- Data Analysis
- Data Reporting
- Data clean-ups by executing special-request, single-purpose COBOL programs
- Defect Triaging
- Applying tactical technical solutions to production problems and coming up with strategic solutions for them
- Attend to and resolve any issues with batch jobs across Test, QA and Production environments
- Performance tuning of batch jobs to increase cost savings
- Setting up and scheduling new jobs and batch processes from Test through Production regions
- Provide 24/7 support for all production problems
Confidential
Analyst I - Apps Programmer
Responsibilities:
- Develop High-Level Design (HLD) documents from the SRS
- Review the HLD with project teams and business
- Prepare Low-Level Design (LLD) documents from the HLD
- Develop and modify COBOL programs based on the requirements
- Prepare Test plans
- Unit, Regression & Performance Testing in Test environments
- Peer reviews
- Promotion of code from Test to Production environment using the version control tools
- Data mining and supporting testing teams in QA environments
- Support guarantee period for new changes in Production
Confidential
Software Engineer
Responsibilities:
- Analysis, Coding, Unit testing of individual components.
- Preparation of test cases and testing the programs.
- Review of other components.
- Monitoring and estimating workloads.
- Support for delivered components.
- Delivering the programs to the client.