Sr. Data Engineer Resume
SUMMARY:
- Involved in the design, development, optimization, and maintenance of data/ETL pipelines and their automation, as well as building dashboards, reports, and machine learning solutions.
- Expertise in developing applications using the Agile/Scrum methodology.
- Strong domain experience in Banking, Healthcare, and E-commerce.
- Experience working with machine learning and data science.
- Strong knowledge of Hadoop and its internal architecture.
- Knowledge of installing and setting up Hadoop clusters.
- Experience working with Hadoop ecosystem tools such as Pig, Hive, HBase, Oozie, and ZooKeeper.
- Expertise in building applications with Java, Hadoop (MapReduce, Hive), Spark, Scala, Tableau, R, QlikView, Omniture, machine learning, Python, VBA, UNIX, and QTP.
- Created and updated incidents and resolved problem tickets using HP Service Manager.
- Experienced with cloud infrastructure such as AWS EC2, CloudFront, S3, and RDS, as well as server configuration.
TECHNICAL SKILLS:
Languages & Tools: Hadoop, MapReduce, Tableau, QlikView, Spark, Scala, Python, Omniture, Java, Pig, Hive, Servlet, JSP, Spring, QTP Automation, and VBA
Web/GUI Tools: HTML, jQuery, Yahoo YUI, and Alloy UI
Operating Systems: CentOS, Ubuntu, UNIX, Windows 2000/2003
File Organization: VSAM, PS, PDS, and GDG
Databases: DB2, Oracle, MySQL and MongoDB
PROFESSIONAL EXPERIENCE:
Sr. Data Engineer
Confidential
Responsibilities:
- Participated in requirements gathering and in designing and developing solutions.
- Responsible for the design, development, optimization, and maintenance of data/ETL pipelines.
- Handled tasks related to machine learning (data science).
- Implemented an ML platform using Apache Zeppelin.
- Developed MapReduce programs to parse raw data, populate staging tables, and store the refined data in partitioned tables.
- Built and maintained the backend of the Business Dashboard, used by over 100 people daily for data analysis, reducing dependency on analytics teams.
- Worked with Omniture data and built a Hadoop cluster to process it.
- Worked with Omniture SiteCatalyst, Data Warehouse, Discover, and Omniture Admin.
- Built the listing-history fact table, a long-standing challenge for analytics because of its high complexity.
- Worked with business analysts from various teams to define and calculate marketplace metrics for analysis.
- Worked with business analysts to build reports that helped in understanding and analyzing the marketplace.
- Used HiveQL to build fact tables.
- Worked on debugging HiveQL and Hadoop issues for various teams.
- Worked on scheduling jobs using Azkaban.
- Worked on Bigfoot (an in-house product) support to help business users.
- Conducted training sessions for graduate trainees on the basics of Hadoop and Hive.
- Responsible for building data pipelines, maintaining data models, automation, and building dashboards and reports.
Environment: Hadoop 2.x, MapReduce, Hive 1.x, Azkaban, Spark, Scala, Python, Vertica, Tableau, Omniture, QlikView, Zeppelin, and data science (machine learning).
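The MapReduce parsing-and-partitioning work above can be sketched as a local, pure-Python simulation of the mapper/reducer contract; the record layout ("date|user|event") and field names are hypothetical, not taken from the actual project:

```python
from collections import defaultdict

# Local simulation of the MapReduce flow described above: parse raw
# records, then bucket the refined records by partition key (here, date),
# mirroring how refined data lands in date-partitioned Hive tables.

def mapper(raw_line):
    """Parse one raw record and emit a (partition_key, refined_record) pair."""
    date, user, event = raw_line.strip().split("|")
    return date, {"user": user, "event": event}

def reducer(pairs):
    """Group refined records into partitioned 'tables' keyed by date."""
    partitions = defaultdict(list)
    for key, record in pairs:
        partitions[key].append(record)
    return dict(partitions)

raw = ["2015-01-01|u1|click", "2015-01-01|u2|view", "2015-01-02|u1|buy"]
partitioned = reducer(mapper(line) for line in raw)
print(sorted(partitioned))  # one entry per date partition
```

In a real job, the reducer would write each group to its own Hive partition rather than an in-memory dict; the grouping logic is the same.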
Hadoop Developer
Confidential
Responsibilities:
- Designing and building solutions for agreed-upon requirements.
- Designing the Hadoop technical stack to be used for the project
- Applying business logic and transformations using Hive.
- Developing Visualizations in Tableau and publishing the workbooks to Tableau Server.
- Code Reviews.
- Query Optimizations.
- Preparing technical specification documents (HLDs, LLDs).
- Developing a generic utility to move raw data to AWS S3 with client-side encryption on the fly.
Environment: Hadoop 2.3.0, AWS EMR & S3, Hive, Sqoop, Pig, Java, Tableau, symmetric encryption, key management
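The S3 utility above follows an encrypt-then-upload streaming pattern: read fixed-size chunks, transform each before it leaves the client. A minimal sketch using only the standard library; the SHA-256 XOR keystream below is only an illustrative stand-in, NOT real cryptography (a production utility would use AES via a crypto library, with managed keys):

```python
import hashlib
from itertools import count

# Illustrative stand-in for client-side encryption on the fly. Each
# fixed-size chunk is XORed with a keystream block before "upload".
# NOT secure crypto -- real code would use AES with proper key management.

def keystream(key: bytes, chunk_size: int):
    """Yield deterministic keystream blocks derived from the key and a counter."""
    for i in count():
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        while len(block) < chunk_size:
            block += hashlib.sha256(block).digest()
        yield block[:chunk_size]

def encrypt_stream(chunks, key: bytes, chunk_size: int):
    """XOR each chunk with its keystream block; symmetric, so it also decrypts."""
    for chunk, ks in zip(chunks, keystream(key, chunk_size)):
        yield bytes(a ^ b for a, b in zip(chunk, ks))

key = b"demo-key"                         # hypothetical key for the sketch
data = [b"raw data ", b"to move  "]       # fixed-size 9-byte chunks
cipher = list(encrypt_stream(iter(data), key, 9))
plain = list(encrypt_stream(iter(cipher), key, 9))  # XOR is its own inverse
assert plain == data
```

The streaming shape (transform chunk-by-chunk, never materializing the whole file) is the part that carries over to the real utility.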
Hadoop Developer
Confidential
Responsibilities:
- Testing MapReduce programs for semi-structured and unstructured data.
- Preparing test plans and test cases.
- Reordering fields and correcting inconsistencies in the input files using Pig scripts.
- Testing Pig scripts in both local and HDFS execution modes.
- Testing Hive scripts.
- Writing MapReduce programs to explode the test data.
Environment: Hadoop 2.3.0, Hive, Sqoop, Pig, Java, Tableau, and R (machine learning)
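Testing MapReduce programs on semi-structured data, as described above, usually starts with unit-testing the mapper function locally before any cluster run. A sketch with a hypothetical mapper that tolerates malformed records (the "key=value" format is an assumption for illustration):

```python
# Local unit test for a mapper over semi-structured input: well-formed
# "key=value" records emit a pair; malformed lines are dropped, mirroring
# the defensive parsing a real mapper needs for messy input files.

def mapper(line):
    """Emit (key, value) for well-formed records; skip unstructured noise."""
    if "=" not in line:
        return []
    key, _, value = line.partition("=")
    return [(key.strip(), value.strip())]

test_lines = ["country=US", "garbage line", "device=mobile"]
emitted = [pair for line in test_lines for pair in mapper(line)]
assert emitted == [("country", "US"), ("device", "mobile")]
print("mapper tests passed")
```

Because the mapper is a pure function, the same assertions work unchanged whether it later runs locally or on HDFS.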
Hadoop Developer
Confidential
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
- Imported and exported data to and from HDFS using Sqoop.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior.
- Developed Hive queries to process the data and generate data cubes for visualization.
Environment: Hadoop, MapReduce, HDFS, Hive, Java, Pig, Sqoop, Flume, Oracle
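The user-behavior analysis above boils down to GROUP BY aggregations over event logs. A pure-Python equivalent of such a Hive query, with hypothetical event data standing in for the real tables:

```python
from collections import Counter

# Pure-Python equivalent of a Hive query like:
#   SELECT user, COUNT(*) FROM events GROUP BY user ORDER BY cnt DESC;
# sketching the user-behavior aggregation described above.

events = [
    ("u1", "view"), ("u1", "click"), ("u2", "view"),
    ("u1", "buy"),  ("u2", "view"),
]
events_per_user = Counter(user for user, _ in events)
print(events_per_user.most_common())  # most active users first
```

At Hadoop scale the same aggregation is expressed in HiveQL and executed as MapReduce jobs; the grouping semantics are identical.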