Data Engineer Resume
CA
SUMMARY:
- 10+ years of professional experience in the USA, including 2 years with Big Data technologies and the Hadoop stack.
- Experienced in manipulating and analyzing complex, high-volume, high-dimensionality data from varied sources.
- Strong experience working with HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka and HBase.
- Good exposure to Hive queries, MapReduce jobs and Spark jobs.
- Familiar with Spark Streaming and Spark MLlib.
- Worked on various Hadoop distributions (Cloudera, Amazon AWS).
- Strong database, object-oriented programming, development and troubleshooting skills.
- Involved in different phases of the SDLC (analysis, planning, scheduling, effort estimation, design, development, testing and delivery).
TECHNICAL SKILLS:
Languages: Python, JavaScript, C, Scala
Databases: Oracle, SQL Server, MS Access, HBase, Teradata
Hadoop Environment: Spark, Spark Streaming, MapReduce, HiveQL, Pig, Sqoop, Kafka, Flume, AWS, MLlib
Web: HTML, CSS, PHP, XML
Operating Systems: Linux, UNIX, Windows Server, Windows OS, Ubuntu
Data Analysis Tools: Python pandas
Methodologies: SDLC, Agile, Scrum
PROFESSIONAL EXPERIENCE:
Confidential, CA
Data Engineer
Technology: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Hive, Spark, Spark SQL, Python
Responsibilities:
- The system pulls information from multiple data sources and ingests it into a data lake, where it is used to identify trends and stock volume data for profitable stock purchasing or selling.
- Worked on a live 30-node Hadoop cluster running CDH5.
- Worked with 23 TB of unstructured and semi-structured data.
- Developed Python/Scala scripts to scrape data from sources and store it in buckets for analytics.
- Developed a Sqoop incremental import job, shell script and cron job for importing data into HDFS.
- Imported data from HDFS into Hive using Hive commands. Created Hive partitions by date and stock using the imported data.
- Developed Scala programs for Spark to perform mathematical calculations using RDDs and deployed them on the cluster.
Confidential, CA
Data Engineer
Responsibilities:
- Data-center engineer responsible for implementing day-to-day document migration for infrastructure projects using the DWR Cosmos web application.
- Maintained the dynamic MySQL database for the Cosmos website, which handles millions of dollars monthly.
- Provided admin support to end users by identifying and fixing bugs affecting access to the departments' website.
- Designed, coded and tested various apps for the Cosmos website.
Confidential, NY
Data Engineer/Analyst
Responsibilities:
- Data-center engineer responsible for implementing day-to-day document migration for infrastructure projects.
- Used and tested the Oracle-based PIMS data warehousing software to process monthly payments, budget change orders and overruns. Reported errors to the development team with screenshots so that they could be fixed and the software used to process payments.
- Created and managed used-versus-expected budget analyses and generated budget/variance reports at various management levels to procure additional funding.
Confidential, NY
Engineer/Analyst
Responsibilities:
- Used and tested the Oracle-based PIMS data warehousing software to process monthly payments, budget change orders and overruns. Reported errors to the development team with screenshots and followed up until they were fixed, so that the software could be used to process payments.
- Evaluated approved bids for various infrastructure projects, including civil, utility and electrical contractors; analyzed changes of scope and overruns for technical/commercial review in a timely manner to meet business unit expectations and contract completion deadlines. Presented negotiated change orders to the operating committee with reference to past projects' comparable pricing, economics and budget.
- Negotiated project savings by comparing procurement cost with pricing, current market/material cost, inflation and estimates/budget, and by applying rigorous vendor negotiation techniques.
Confidential
Project/Lead Engineer
Responsibilities:
- Worked as project lead/engineer on various projects, including preparing and reviewing budgets and schedules.
Confidential
Research Assistant
Responsibilities:
- Prepared an application for passenger and fare information that prompts for passenger names, ages, genders, starting and final destinations, and fares between stations, using MySQL as the backend. Wrote various SQL queries based on the required information.
- Prepared a GIS application for the South Carolina Department of Transportation, using GIS and MS Access, that marks highway deterioration based on the pavement marking lifecycle. The resulting research paper was submitted at the annual Confidential
- Developed a C/C++ program that designs infrastructure slabs based on given criteria. Its output can be used in architectural drawings to design any building slab without further calculations.
