Sr Big Data/ETL Engineer Resume

Phoenix, AZ

PROFESSIONAL SUMMARY:

  • Over 10 years of IT experience with multinational clients as a Big Data/ETL Engineer on data warehouse and data analytics platform applications.
  • Extensive experience with big data platforms and big data analysis tools.
  • In-depth understanding of Hadoop architecture and its components: HDFS, MapReduce and YARN.
  • Expertise with tools in the Hadoop ecosystem, including HDFS, YARN, MapReduce, Hive, HBase, Sqoop, Oozie and ZooKeeper.
  • Experience with data Extraction, Transformation and Loading (ETL) processes using tools such as Ab Initio and Informatica.
  • Extensively worked with Ab Initio ETL components such as Conduct>It plans, psets and graphs.
  • Worked with Informatica ETL tools: Source Analyzer, Warehouse Designer, Mapping Designer, Transformation Developer and Mapplet Designer.
  • Expertise in implementing Apache Spark Core, Spark SQL and Spark Streaming jobs in Scala and Python (PySpark) for faster analysis and processing of big data.
  • Good experience implementing Kafka streaming applications using the Spark Kafka streaming API (see the sketch after this summary).
  • Strong experience with Hadoop distributions such as Cloudera, MapR and Hortonworks.
  • Experience developing MapReduce programs on Apache Hadoop to work with big data.
  • Expertise in writing Hadoop jobs for analyzing data using HiveQL.
  • Expertise in writing Hive UDFs to incorporate complex business logic into Hive queries when performing high-level data analysis.
  • Very good understanding of partitioning and bucketing concepts in Hive, with good experience using both managed and external Hive tables to optimize performance.
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop.
  • Good experience in HBase NoSQL database design, creating column families and performing CRUD operations on data.
  • Good experience implementing standalone Kafka producers and consumers using the Java API.
  • Experience working with Teradata, with strong expertise in SQL queries and stored procedures.
  • Extensively worked with Teradata utilities such as BTEQ, FastExport, FastLoad, MultiLoad and TPT scripts.
  • Experience in core and advanced Java technologies and Python.
  • Experience working with RDBMS: Teradata, Oracle, Sybase and MySQL.
  • Strong experience with and understanding of software development methodologies such as Agile (Scrum) and Waterfall.
  • Exposure to all aspects of the software development life cycle (SDLC), including analysis, planning, development, testing, implementation and post-production analysis of projects.
  • Handled several techno-functional responsibilities including estimates, identifying functional and technical gaps, requirements gathering, designing solutions, development, performance improvement and production support.
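
For illustration of the Spark and Kafka streaming experience above, the following is a minimal Scala sketch of a Spark Structured Streaming job that consumes a Kafka topic. The broker address, topic name and console sink are illustrative assumptions, not details taken from the engagements described in this resume.

import org.apache.spark.sql.SparkSession

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-stream-sketch")
      .getOrCreate()

    // Subscribe to a Kafka topic; the broker and topic names are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "events")
      .load()

    // Kafka delivers the payload as binary; cast it to a string for downstream parsing.
    val payload = raw.selectExpr("CAST(value AS STRING) AS event_json")

    // Write to the console for this sketch; a real job would target HDFS, Hive or HBase.
    val query = payload.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}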

TECHNICAL SKILLS:

Big Data Tools: Apache Spark, Spark SQL, Spark Streaming, PySpark, Hadoop, HDFS, YARN, MapReduce, Hive, HBase, Kafka, Oozie, Sqoop, Splunk, ZooKeeper

ETL Tools: Ab Initio 3.0.4.2/2.15, Informatica 8.1/9.1

Databases: Oracle, Teradata, Sybase, MySQL

Operating Systems: Unix, Linux, Windows

Languages: Java, Scala, Python

Reporting Tools: MicroStrategy, Tableau

IDE / Tools: Rally, Jenkins, Eclipse, IntelliJ, Maven, Log4j, Quality Center, Git, SVN

SDLC Methodologies: Waterfall, Agile/Scrum.

Messaging: Apache Kafka

NoSQL Databases: HBase

WORK EXPERIENCE:

Sr Big Data/ETL Engineer

Confidential, Phoenix, AZ

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop including MapReduce, Hive, Kafka, HBase and Apache Spark.
  • Application development to support existing Ab Initio/Teradata BAU jobs.
  • Involved in converting the existing ETL jobs (Ab Initio/Teradata) into the Hadoop system as MapReduce jobs and Hive.
  • Designed and developed data loading strategies and transformations so the business could analyze the datasets.
  • Interacted with the client and other vendors to integrate the solution with other interfaces.
  • Reviewed deliverables for technical completeness and to ensure the performance, operability, maintainability and scalability of the proposed technical solution.
  • Performed data integration, transformation and reduction by developing Spark jobs.
  • Created Spark Sessions, Spark Contexts and Spark SQL/Hive Contexts to process huge data sets.
  • Performed joins, aggregations, filters and other transformations on the datasets using Spark.
  • Used Scala programming language to develop Spark core and Spark SQL jobs to process Big Data.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames (see the sketch after this list).
  • Worked with the Spark ecosystem using Spark SQL and Scala queries on different formats such as text and Parquet files.
  • Worked on writing transformer/mapping MapReduce pipelines using Java.
  • Wrote MapReduce jobs to perform batch processing of huge amounts of data in daily scheduled jobs.
  • Involved in converting MapReduce jobs into Spark transformations and actions using Spark Datasets/RDDs in Scala.
  • Designed and developed ETL jobs that extract data from Hive tables, apply transformations and load the results into Hive external tables/HDFS.
  • Involved in creating External Hive tables and involved in data loading and writing Hive UDFs.
  • Wrote RESTful web services responsible for interacting with the HBase system.
  • Experienced in performing CRUD operations in HBase.
  • Implemented a real-time RESTful web service/API application using Spring Boot.
  • Involved in configuring Maven builds for Java, Scala, Hadoop, Spark and Web API modules.
  • Involved in creating repositories and branches, checking out branches, and pushing and pulling code and config files in Git.
  • Integrated Git and Maven with Jenkins to automate the code checkout, build and deployment processes.
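
For illustration of the Hive-to-Spark conversion work above, the following is a minimal Scala sketch of a Spark SQL job that joins, filters and aggregates Hive tables and loads the result into a Hive table. The database, table and column names are illustrative placeholders, not the actual production objects.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object HiveEtlSketch {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport lets Spark SQL read and write Hive metastore tables.
    val spark = SparkSession.builder()
      .appName("hive-etl-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Source tables (placeholder names).
    val txns  = spark.table("staging.transactions")
    val accts = spark.table("staging.accounts")

    // Join, filter and aggregate, mirroring the original Hive/SQL logic.
    val daily = txns.join(accts, Seq("account_id"))
      .filter(col("txn_amount") > 0)
      .groupBy(col("account_id"), col("txn_date"))
      .agg(sum("txn_amount").as("total_amount"))

    // Load into a target Hive table that is assumed to already exist.
    daily.write
      .mode("overwrite")
      .insertInto("warehouse.daily_account_totals")

    spark.stop()
  }
}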

Environment: HDFS, MapReduce, Spark, Ab Initio, Teradata, Hive, Impala, Sqoop, Flume, Kafka, HBase, Oozie, Java, SQL scripting, Linux shell scripting, Eclipse

Technical Lead

Confidential

Responsibilities:

  • Application development to support existing Ab Initio/Teradata BAU jobs.
  • Worked with Teradata SQL and stored procedures and Teradata utilities such as BTEQ, FastExport, FastLoad, MultiLoad and TPT scripts.
  • Initiated transformation tasks to migrate the existing data warehousing application platform from IDN (Teradata) to the Cornerstone big data platform.
  • Prepared the technical specifications (logical design, physical design, unit test cases, etc.).
  • Involved in converting the existing ETL jobs (Ab Initio/Teradata) into the Hadoop system as MapReduce jobs and Hive.
  • Migrated all ETL feed jobs from the IDN (Teradata) platform to the centralized big data platform, Cornerstone.
  • Migrated the IDN use case applications from Teradata into Hadoop (MapReduce and Hive); see the sketch after this list.
  • Conducted quality reviews of all deliverables.
  • Checked deliverables against project standards.
  • Prepared unit test cases and performed unit testing.
  • Supported QA and production warranty.
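
As an illustration of the Teradata-to-Hadoop migration work above, the following is a minimal Scala sketch of a MapReduce job that reproduces a simple Teradata-style aggregation (total amount per account) over a delimited feed file. The feed layout, class names and input/output paths are illustrative assumptions, not the actual Cornerstone code.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{DoubleWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import scala.util.Try

// Emits (account_id, amount) for each record of an assumed pipe-delimited feed:
// account_id|txn_date|amount
class AccountMapper extends Mapper[LongWritable, Text, Text, DoubleWritable] {
  private val outKey = new Text()
  private val outVal = new DoubleWritable()

  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, Text, DoubleWritable]#Context): Unit = {
    val fields = value.toString.split("\\|")
    if (fields.length >= 3) {
      // Skip rows whose amount column is not numeric.
      Try(fields(2).toDouble).foreach { amount =>
        outKey.set(fields(0))
        outVal.set(amount)
        context.write(outKey, outVal)
      }
    }
  }
}

// Sums the amounts per account, equivalent to SELECT account_id, SUM(amount) ... GROUP BY account_id.
class SumReducer extends Reducer[Text, DoubleWritable, Text, DoubleWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[DoubleWritable],
                      context: Reducer[Text, DoubleWritable, Text, DoubleWritable]#Context): Unit = {
    var total = 0.0
    val it = values.iterator()
    while (it.hasNext) total += it.next().get()
    context.write(key, new DoubleWritable(total))
  }
}

object AccountTotalsJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "account-totals-sketch")
    job.setJarByClass(classOf[AccountMapper])
    job.setMapperClass(classOf[AccountMapper])
    job.setCombinerClass(classOf[SumReducer])
    job.setReducerClass(classOf[SumReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[DoubleWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args(1)))  // HDFS output directory
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}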

Environment: Hadoop, HDFS, Yarn, MapReduce, Hive, Ab-Initio (GDE 1.15, Co>Operating System Version 2.15), UNIX, Sybase and Teradata.

ETL Senior Developer/Designer

Confidential

Responsibilities:

  • Gathered requirements from the business team, and analyzed and converted functional requirements into Extract, Transform and Load (ETL) technical design specifications.
  • Developed Ab Initio graphs using the Graphical Development Environment (GDE) with various Ab Initio components: input and output dataset/database, transformation, partition and de-partition components, etc.
  • Worked on Ab Initio Conduct>It plans to orchestrate the psets, graphs and program tasks of the ETL use case jobs.
  • Developed generic Ab Initio graphs to reuse common Extract, Transform and Load (ETL) components.
  • Developed Ab Initio graphs to extract the required data from different source databases using input file and database components.
  • Developed complex Ab Initio XFRs to derive new fields and meet various business requirements.
  • Used phases and checkpoints in the graphs to avoid deadlocks, improve performance and recover graphs from the last successful checkpoint.
  • Redesigned and modified Ab Initio graphs for performance tuning to reduce total ETL job processing time.
  • Modified existing Ab Initio graphs to accommodate new business requirements (change requests).
  • Worked extensively in the UNIX environment using shell scripts and wrapper scripts; responsible for writing the wrapper scripts that invoke the deployed Ab Initio graphs.
  • Used the Event Engine/Control-M scheduler to schedule the ETL jobs.
  • Participated in design reviews and code reviews to ensure that all solutions were aligned to pre-defined architectural specifications; identified and troubleshot application code-related issues, and reviewed and provided feedback on the final user documentation.
  • Developed test cases and plans to complete unit testing and support system testing.

Created sessions and configured workflows to extract data from various sources, transform it and load it into the data warehouse.

Environment: Ab-Initio- 3.0.4.2, LINUX, ORACLE 10g

ETL Sr Developer

Confidential

Responsibilities:

  • Application development in Ab Initio ETL: design, build and testing of technical solutions.
  • Prepared the technical specifications (source-to-target maps, validations and exception scenarios).
  • Designed and coded Ab Initio graphs based on the specifications.
  • Conducted quality reviews of all deliverables.
  • Checked deliverables against project standards.
  • Prepared unit test cases and performed unit testing.
  • Supported QA testing.

Environment: Ab-Initio (GDE 1.15, Co>Operating System Version 2.15), Unix, Oracle 10g

ETL Senior Developer

Confidential

Responsibilities:

  • Performed analysis of new CRs/changes.
  • Developed generic Ab Initio graphs to reuse common Extract, Transform and Load (ETL) components.
  • Developed Ab Initio graphs to extract the required data from different source databases using input file and database components.
  • Developed complex Ab Initio XFRs to derive new fields and meet various business requirements.
  • Prepared and modified HLD, LLD and technical documents.
  • Prepared unit test cases and performed unit testing.
  • Supported system testing, BAT and pre-production.

Environment: Ab-Initio (GDE 1.15, Co>Operating System Version 2.15), UNIX, Oracle 10g

ETL Developer

Confidential

Responsibilities:

  • Responsible for gathering and analyzing business requirements from business users.
  • Used Ab Initio components to develop graphs.
  • Worked with partition components such as Partition by Key and Partition by Expression; made efficient use of multifile systems for data parallelism.
  • Used lookups with the Reformat component to fetch matched records for the downstream process.
  • Used the Sort component to sort files and the Dedup Sort component to remove duplicate values.
  • Worked with de-partition components such as Gather and Merge.
  • Involved in Unit Testing of the developed components.

Environment: Ab-Initio (GDE 1.15, Co>Operating System Version 2.15), UNIX, Oracle 10g, JAVA

ETL Developer

Confidential

Responsibilities:

  • Developed design documents for the enhancements.
  • Developed extraction routines using the Informatica ETL tool.
  • Conducted quality reviews of all deliverables.
  • Checked deliverables against project standards.
  • Tested the changes and supported implementation.

Environment: Informatica PowerCenter 6.1/7.1, Java, Windows, Oracle 9i
