Big Data Engineer Resume
Atlanta, GA
SUMMARY:
- Over 11 years of experience in Information Technology, with in-depth knowledge of Analysis, Design, Development, Testing, and Production/Maintenance in Distributed Environments and Data Warehouse Applications.
- Around 4 years of working experience in Hadoop Big Data applications, with extensive work on systems handling large volumes of data.
- Strong experience in Ab Initio consulting covering ETL, data mapping, transformation, and loading from source to target databases in complex, high-volume environments.
- Hands-on experience with major components of the Hadoop ecosystem, including HDFS, Hive, HBase, Pig, Sqoop, YARN, Apache Spark, Kafka, Oozie, ZooKeeper, and the MapReduce framework.
- Experience with Cloudera components such as Impala, Kudu, and Hue, and knowledge of Hadoop cluster administration, including monitoring and managing clusters with Cloudera Manager.
- Experience in importing data using Sqoop from heterogeneous systems such as RDBMS (MySQL, Oracle, DB2), Mainframe, and XML sources into HDFS, and vice versa.
- Working exposure to ingestion tools such as StreamSets and Apache NiFi.
- Working experience with various file formats, including XML, Avro, and Parquet.
- Working experience in creating real-time data streaming solutions using Apache Spark/Spark Streaming and Kafka, and in building Spark DataFrames with Python (a sketch follows this summary).
- Hands-on experience with end-to-end data warehousing ETL routines, including data modeling/mapping, developing Ab Initio graphs, psets, and Unix/Linux scripts, and testing and loading processes.
- Well versed in Ab Initio parallelism techniques; implemented Ab Initio graphs using component parallelism, pipeline parallelism, data parallelism, and Multi File System (MFS) techniques.
- Expertise in Ab Initio GDE components such as transform, database, sort, and partition/de-partition components for creating, executing, testing, and maintaining graphs, with experience in application tuning and debugging strategies on the Ab Initio Co>Operating System.
- Very good understanding of the Agile Scrum and Waterfall software development life cycles.
- Proficient in Structured Query Language (SQL) in RDBMS (DB2 UDB & ORACLE).
- Good experience in development, maintenance, testing, and production support activities on the Mainframe platform (COBOL, VSAM, and JCL).
- Working experience with EME air commands and Ab Initio Data Profiler/Data Quality.
- Experience in the design and development of relational and BI data warehouse database systems.
- Working experience in PL/SQL, SQL*Plus, Bulk Copy (BCP), Stored Procedures, Functions & Packages.
- Experience working in the Banking and Property & Casualty Insurance domains.
- Worked as a Team Lead on enhancement/maintenance projects, coordinating with onshore/offshore teams.
- Self-motivated with a results-oriented approach and the ability to learn new technologies rapidly.
- Good written, oral, and interpersonal communication skills.
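As an illustration of the Spark Streaming and Kafka work noted above, the minimal PySpark sketch below assumes a Kafka broker at localhost:9092, a hypothetical "events" topic, and illustrative column names and HDFS paths; it is a sketch of the pattern, not the production pipeline itself.

```python
# Minimal Structured Streaming sketch: Kafka -> typed DataFrame -> Parquet on HDFS.
# Broker address, topic name, schema, and paths are illustrative assumptions;
# submit with the matching spark-sql-kafka package on the classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

schema = (StructType()
          .add("event_id", StringType())
          .add("event_ts", TimestampType())
          .add("payload", StringType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "events")
       .load())

# Kafka values arrive as bytes; cast to string and parse the JSON body.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), schema).alias("e"))
          .select("e.*"))

query = (events.writeStream
         .format("parquet")
         .option("path", "/data/lake/raw/events")
         .option("checkpointLocation", "/data/lake/_checkpoints/events")
         .trigger(processingTime="1 minute")
         .start())

query.awaitTermination()
```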
PROFESSIONAL EXPERIENCE:
Big Data Engineer
Confidential, Atlanta, GA
Responsibilities:
- Imported data using Sqoop from source systems such as Mainframes, Oracle, MySQL, and DB2 into the Atlas Data Lake Raw Zone.
- Created internal and external tables using HiveQL with partitioning and bucketing, enabling SAS users to perform their big data analytics.
- Built HBase tables by leveraging HBase integration with Hive in the Analytics Zone.
- Used Kafka and Spark Streaming to process streaming data for specific use cases.
- Built workflows using the Apache Oozie workflow engine to manage and schedule Hadoop jobs.
- Used RDDs and DataFrames (Spark SQL) with PySpark to analyze and process data.
- Worked on Spark jobs that process raw data into lake-standard data and export it into the Transformed and Analytics Zones.
- Applied partitioning and bucketing concepts in Hive and designed both managed and external tables to optimize performance (a sketch follows this list).
- Solved performance issues in Hive and Pig scripts, drawing on an understanding of joins, grouping, and aggregation and how they translate into MapReduce jobs.
- Developed code to extract, transform, and load (ETL) data from inbound HDFS files into various outbound files using complex business logic.
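Below is a minimal sketch of the partitioned external Hive table pattern referenced in this list, issued through PySpark with Hive support. The database, table, column names, partition value, and HDFS location are hypothetical; the bucketing clause is noted in a comment because, in practice, bucketed Hive tables are usually created and loaded from Hive itself.

```python
# Sketch of an external, partitioned Hive table plus a partition-pruned read.
# Database, table, column names, and the HDFS location are illustrative.
# In Hive DDL the table would also carry
#   CLUSTERED BY (customer_id) INTO 32 BUCKETS
# for bucketed joins; it is omitted here since Spark's support for creating
# Hive bucketed serde tables varies by version.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partition-sketch")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.claims (
        claim_id     STRING,
        customer_id  STRING,
        claim_amount DOUBLE
    )
    PARTITIONED BY (load_date STRING)
    STORED AS PARQUET
    LOCATION '/data/lake/analytics/claims'
""")

# A typical partition-pruned aggregate that downstream analytics users would run.
summary = spark.sql("""
    SELECT customer_id, SUM(claim_amount) AS total_claims
    FROM analytics.claims
    WHERE load_date = '2020-01-01'
    GROUP BY customer_id
""")
summary.show()
```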
ETL Hadoop Specialist
Confidential, NYC, NY
Responsibilities:
- Imported data using Sqoop from source systems such as Oracle, MySQL, and DB2 into the HDFS data zone.
- Created Hive and HBase tables (using HBase integration) over the imported data, organized by Line of Business (LOB); see the sketch after this list.
- Worked within an ETL methodology supporting data analysis, extraction, transformation, and loading in a corporate-wide ETL solution using Ab Initio.
- Gathered requirements from end users and business analysts, reviewed the business specs, and developed strategies for the ETL process.
- Participated in requirements meetings with the business analysts and ETL architect to understand the source and data warehouse models and the cleansing/transformation rules.
- Conducted the required source system analysis and data availability audits.
- Used data extract analysis to supplement understanding of data completeness and accuracy.
- Prepared and continuously improved the low-level design documents, high-level design documents, and runbooks for the project.
- Worked in an Agile model, continuously improving the business specs.
- Created a unique archival mechanism that handles the ETL archival process.
- Created a robust framework, using stored procedures and shell scripts, that orchestrates the whole ETL process.
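As a hedged illustration of exposing Sqoop-landed data as Hive tables per LOB, the PySpark sketch below assumes hypothetical LOB names, a simple policies schema, comma-delimited text files (Sqoop's default output format), and illustrative HDFS paths.

```python
# Sketch: register Sqoop-landed, comma-delimited HDFS files as external Hive
# tables, one database per Line of Business. LOB names, the schema, and the
# paths are illustrative assumptions, not the actual project layout.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("lob-hive-tables-sketch")
         .enableHiveSupport()
         .getOrCreate())

lobs = ["auto", "home", "commercial"]   # hypothetical lines of business

for lob in lobs:
    spark.sql(f"CREATE DATABASE IF NOT EXISTS lake_{lob}")
    spark.sql(f"""
        CREATE EXTERNAL TABLE IF NOT EXISTS lake_{lob}.policies (
            policy_id    STRING,
            customer_id  STRING,
            premium      DOUBLE,
            effective_dt STRING
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        STORED AS TEXTFILE
        LOCATION '/data/zone/{lob}/policies'
    """)
```

The corresponding HBase-backed tables would typically be defined from Hive itself using the HBase storage handler rather than through Spark.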
Sr. Ab Initio Developer
Confidential, Warren, NJ
Responsibilities:
- Involved in analyzing business needs and documenting functional and technical specifications based on user requirements, with extensive interaction with business users.
- Developed source data profiling and analysis; reviewing data content and metadata facilitated data mapping and validated assumptions made in the business requirements.
- Automated the development of Ab Initio graphs and functions utilizing metadata from the EME.
- Performed transformations of source data with transform components in Ab Initio.
- Made wide use of lookup files when pulling data from multiple sources where the data volume is limited.
- Involved in project promotion from development to QA and QA to production.
- Involved in Production implementation best practices.
- Used various EME air commands in project promotion, such as air tag create and air project export.
- Developed Ab Initio graphs using the GDE (Graphical Development Environment) to extract data from DB2 tables and mainframe files (legacy source systems) and load it into Oracle tables (target database).
Ab Initio Developer
Confidential, Ohio
Responsibilities:
- Created Ab Initio graphs to load large volumes of data.
- Used subgraphs to increase graph clarity and to impose reusable business restrictions, and tested various graphs for their functionality.
- Involved in writing low-level design documents.
- Mentored junior developers.
- Developed jobs and used Ab Initio as an ETL tool to perform the final loads.
- The project centralized the reporting system, which previously relied on a hard-coded program.
- Expertise in unit and system testing using sample and generated data, and in verifying the functionality, data quality, and performance of graphs.
- Documented complete graphs and their components.
- Created unit test cases to identify the failure points.
- Also involved in production support for Mainframe jobs.