
Big Data Engineer Resume


Atlanta, GA

SUMMARY:

  • Over 11 years of experience in Information Technology, with in-depth knowledge of Analysis, Design, Development, Testing and Production/Maintenance in Distributed Environments and Data Warehouse Applications.
  • Around 4 years of working experience in Hadoop Big Data applications; worked extensively on systems handling large volumes of data.
  • Strong experience in Ab Initio consulting covering ETL, data mapping, transformation and loading from source to target databases in complex, high-volume environments.
  • Hands-on experience with major components of the Hadoop ecosystem, including HDFS, Hive, HBase, Pig, Sqoop, YARN, Apache Spark, Kafka, Oozie, ZooKeeper and MapReduce.
  • Experience with Cloudera Hadoop components such as Impala, Kudu and Hue, and knowledge of Hadoop cluster administration, monitoring and management using Cloudera Manager.
  • Experience importing data using Sqoop from heterogeneous sources such as RDBMS (MySQL, Oracle, DB2), Mainframe and XML into HDFS, and vice versa.
  • Working exposure to ingestion tools such as StreamSets and Apache NiFi.
  • Working experience with various file formats including XML, Avro and Parquet.
  • Working experience creating real-time data streaming solutions using Apache Spark/Spark Streaming and Kafka, and building Spark DataFrames using Python.
  • Hands-on experience with end-to-end data warehousing ETL routines, including data modelling/mapping, developing Ab Initio graphs, psets and Unix/Linux scripts, and testing and loading processes.
  • Well versed in Ab Initio parallelism techniques; implemented Ab Initio graphs using Component Parallelism, Pipeline Parallelism, Data Parallelism and Multi File System (MFS) techniques.
  • Expertise in Ab Initio GDE components (transform, database, sort and partition/de-partition) for creating, executing, testing and maintaining graphs; experienced with the Ab Initio Co>Operating System for application tuning and debugging strategies.
  • Very good understanding of the Agile Scrum and Waterfall software development life cycles.
  • Proficient in Structured Query Language (SQL) in RDBMS (DB2 UDB & ORACLE).
  • Good experience in various development, maintenance, testing and production support activities on the mainframe platform (COBOL, VSAM & JCL).
  • Working experience using EME air commands and the Ab Initio Data Profiler/Data Quality tools.
  • Experience with relational and BI Data Warehouse database systems design & development.
  • Working experience in PL/SQL, SQL*Plus, Bulk Copy (BCP), Stored Procedures, Functions & Packages.
  • Experience working in the Banking and Property & Casualty Insurance domains.
  • Worked as a Team Lead on enhancement/maintenance projects, coordinating with onshore/offshore teams.
  • Self-motivated with a results-oriented approach and the ability to learn new technologies quickly.
  • Good written, oral and interpersonal communication skills.

PROFESSIONAL EXPERIENCE:

Big Data Engineer

Confidential, Atlanta, GA

Responsibilities:

  • Imported data using Sqoop from various source systems such as Mainframe, Oracle, MySQL and DB2 into the Atlas Data Lake Raw Zone.
  • Created internal and external tables using HiveQL with partitioning and bucketing, enabling SAS users to perform their big data analytics (a HiveQL sketch follows this list).
  • Built HBase tables by leveraging HBase integration with Hive on the Analytics Zone.
  • Hands-on experience using Kafka and Spark Streaming to process streaming data for specific use cases (a streaming sketch follows this list).
  • Built workflows using the Apache Oozie workflow engine to manage and schedule Hadoop jobs.
  • Worked with RDDs and DataFrames (Spark SQL) using PySpark to analyze and process data.
  • Worked on Spark jobs that transform raw data to lake standards and export it to the Transformed & Analytics Zones (a PySpark sketch follows this list).
  • Applied partitioning and bucketing concepts in Hive and designed both managed and external tables to optimize performance.
  • Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping and aggregation and how they translate to MapReduce jobs.
  • Developed code to extract, transform and load (ETL) data from inbound HDFS files into various outbound files using complex business logic.
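
Illustrative sketch for the Hive table bullet above. This is a minimal example, not the project's actual code: the database, table, column names and paths (analytics_zone, raw_zone, customer_txn, /data/lake/...) are hypothetical placeholders, and the HiveQL is issued here through a PySpark session with Hive support, although the same statements could be run directly from Hive or Hue.

  from pyspark.sql import SparkSession

  # Hypothetical Spark session; enableHiveSupport lets spark.sql() reach the Hive metastore.
  spark = (SparkSession.builder
           .appName("hive-table-sketch")
           .enableHiveSupport()
           .getOrCreate())

  # Managed (internal) table: Hive owns both metadata and data.
  # Partitioned by load date and bucketed by customer_id for join/sampling performance.
  spark.sql("""
      CREATE TABLE IF NOT EXISTS analytics_zone.customer_txn (
          txn_id      BIGINT,
          customer_id BIGINT,
          amount      DECIMAL(12,2)
      )
      PARTITIONED BY (load_dt STRING)
      CLUSTERED BY (customer_id) INTO 8 BUCKETS
      STORED AS PARQUET
  """)

  # External table: Hive manages only the metadata; dropping it leaves the
  # raw-zone files in place for other consumers (e.g. SAS users).
  spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS raw_zone.customer_txn_ext (
          txn_id      BIGINT,
          customer_id BIGINT,
          amount      DECIMAL(12,2)
      )
      PARTITIONED BY (load_dt STRING)
      STORED AS PARQUET
      LOCATION '/data/lake/raw/customer_txn'
  """)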
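
Streaming sketch for the Kafka/Spark Streaming bullet above, using PySpark Structured Streaming. The broker address, topic name, event schema and output paths are assumptions for illustration only, and the job also requires the spark-sql-kafka connector package on the classpath.

  from pyspark.sql import SparkSession, functions as F
  from pyspark.sql.types import StructType, StructField, StringType, DoubleType

  spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

  # Hypothetical schema of the JSON events published to the topic.
  event_schema = StructType([
      StructField("txn_id", StringType()),
      StructField("customer_id", StringType()),
      StructField("amount", DoubleType()),
  ])

  # Read the stream from Kafka (broker and topic are placeholders).
  raw = (spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "transactions")
         .load())

  # Kafka delivers the value as bytes; cast to string and parse the JSON payload.
  events = (raw
            .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
            .select("e.*"))

  # Persist parsed events to the data lake as Parquet, with checkpointing for recovery.
  query = (events.writeStream
           .format("parquet")
           .option("path", "/data/lake/raw/transactions")
           .option("checkpointLocation", "/data/lake/checkpoints/transactions")
           .trigger(processingTime="1 minute")
           .start())

  query.awaitTermination()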
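
PySpark sketch for the raw-to-transformed Spark bullet above. The column names, standardization rules and zone paths are illustrative assumptions, not the actual lake standards.

  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.appName("raw-to-transformed-sketch").getOrCreate()

  # Read raw-zone files (path and columns are placeholders).
  raw_df = spark.read.parquet("/data/lake/raw/customer_txn")

  # Apply lake-standard style transformations: cast keys, derive a load-date
  # partition column and drop records that fail a basic completeness check.
  transformed_df = (raw_df
                    .withColumn("customer_id", F.col("customer_id").cast("bigint"))
                    .withColumn("load_dt", F.to_date("txn_ts"))
                    .filter(F.col("amount").isNotNull()))

  # Write to the transformed zone, partitioned by load date.
  (transformed_df.write
   .mode("overwrite")
   .partitionBy("load_dt")
   .parquet("/data/lake/transformed/customer_txn"))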

ETL Hadoop Specialist

Confidential, NYC, NY

Responsibilities:

  • Imported data using Sqoop from various source systems such as Oracle, MySQL and DB2 into the HDFS data zone.
  • Created Hive and HBase tables (using HBase integration) on the imported data, organized by Line of Business (LOB).
  • Worked in an ETL methodology supporting data analysis, extraction, transformation and loading in a corporate-wide ETL solution using Ab Initio.
  • Understood the requirements of end users/business analysts, reviewed the business specs and developed strategies for the ETL process.
  • Participated in requirements meetings with the business analysts and ETL architect to understand the source and data warehouse models and the cleansing/transformation rules.
  • Conducted the required source system analysis and data availability audits.
  • Used data extract analysis to supplement understanding of data completeness and accuracy.
  • Prepared and continuously improved the low-level design documents, high-level design documents and run books for the project.
  • Worked in an Agile model, continuously refining the business specs.
  • Created a unique archival mechanism that handles the ETL archival process.
  • Created a robust framework, using stored procedures and shell scripts, that orchestrates the whole ETL process.

Sr. Ab Initio Developer

Confidential, Warren, NJ

Responsibilities:

  • Involved in analyzing business needs and documenting functional and technical specifications based on user requirements, with extensive interaction with business users.
  • Developed source data profiling and analysis; reviewing data content and metadata facilitated data mapping and validated assumptions made in the business requirements.
  • Automated the development of Ab Initio graphs and functions utilizing metadata from the EME.
  • Performed transformations of source data with transform components in Ab Initio.
  • Made extensive use of lookup files when reading data from multiple sources where the data volume is limited.
  • Involved in project promotion from development to QA and QA to production.
  • Involved in Production implementation best practices.
  • Used various EME air commands during project promotion, such as air tag create and air project export.
  • Developed Ab Initio graphs using the GDE (Graphical Development Environment) to extract data from DB2 tables and mainframe files (legacy source systems) and load it into Oracle tables (target database).

Ab Initio Developer

Confidential, Ohio

Responsibilities:

  • Created Ab Initio graphs to load large volumes of data.
  • Used subgraphs to increase graph clarity and to impose reusable business restrictions, and tested various graphs for their functionality.
  • Involved in writing low-level design documents.
  • Mentored junior developers.
  • Developed jobs using Ab Initio as the ETL tool to perform the final loads.
  • The project centralized the reporting system, which previously relied on a hard-coded program.
  • Expertise in unit and system testing: using sample data, generating data and manipulating dates to verify the functionality, data quality and performance of graphs.
  • Documented complete graphs and their components.
  • Created unit test cases to identify the failure points.
  • Also involved in production support for mainframe jobs.
