
Data Engineer Resume


SUMMARY

  • Data Engineer with strong experience in Cloud, Big Data, ETL and traditional RDBMS.
  • Highly skilled in AWS, Snowflake Database, Python, Oracle, Exadata, Informatica, SQL, PL/SQL, bash scripting, Hadoop, Hive, Databricks.
  • Extensive experience in assessing the relevance of data and translating business needs into a data model.
  • Migrated an on-premises enterprise data warehouse to a cloud-based Snowflake data warehousing solution and enhanced the data architecture to use Snowflake as a single data platform for all analytical purposes.
  • Excellent experience creating documents such as Technical Design documents, Impact Analysis documents, Source-to-Target Mapping documents, and Test Cases.
  • Extensive experience in designing and developing high-volume, transaction-based systems that support parallel processing.
  • Highly skilled in conceptual, logical, and physical database design.
  • Experience working on AWS Cloud services like Lambda, EMR, SNS, SQS, S3, CloudWatch, IAM, Redshift, RDS, EC2, DynamoDB.
  • Extensive knowledge of Spark Core APIs, DataFrames, and Spark SQL.
  • Designed Hive queries and Pig scripts to perform data analysis, data transfer, and table design.
  • Expertise in Hive queries, with extensive knowledge of joins.
  • Extensive knowledge of Sqoop imports and exports.
  • Created Hive scripts to extract, transform, load (ETL), and store data.
  • Strong expertise in designing and developing Informatica Mappings, Mapplets, Sessions, Tasks, and Workflows.
  • Highly skilled in writing and tuning SQL queries; data profiling; and writing database procedures, functions, packages, objects, triggers, exception handlers, cursors, cursor variables, analytical functions, bind variables, and bulk processing, with performance tuning using optimizer hints and Explain Plan.
  • Experience with Agile, Confluence, CI/CD, Bitbucket, GitLab, GitHub, and SourceTree.

TECHNICAL SKILLS

Databases: Snowflake, Exadata, Oracle, MySQL, Redshift

Database Languages: SQL, SQL*Plus, PL/SQL

Programming Languages: Python, Shell Scripting

Data Processing: Informatica 10.1.0, PowerExchange 10.2, DataStage 8

Scheduling Tools: Control-M, CloudWatch, Appworks, Airflow

Cloud Technologies: AWS, Snowflake

Big Data Technologies: HDFS, Hive, Pig, Apache Spark, Sqoop

PROFESSIONAL EXPERIENCE

Confidential

Data Engineer

Responsibilities:

  • Designed and developed components to ingest real-time stream data into Snowflake using DynamoDB, Lambda, S3, Snowpipe, and Snowflake Tasks, with Python as the programming language.
  • Developed an AWS Lambda function in Python to process DynamoDB stream events and write them into S3 buckets (see the sketch after this list).
  • Configured event notifications on S3 buckets for put events.
  • Set up a Snowflake stage and Snowpipe for continuous loading of data from S3 buckets into a landing table.
  • Developed Snowflake procedures to perform transformations, load the data into target tables, and purge the stage tables.
  • Created Snowflake Tasks to schedule the jobs.
  • Designed and developed Python scripts that run on AWS EMR to load data into Snowflake.
  • Created views in Snowflake to support reporting requirements.
  • Designed and developed Databricks notebooks to read data from sources such as S3, perform transformations, and load it into Snowflake.
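
A minimal sketch of the Lambda step described above, assuming hypothetical bucket, prefix, and environment-variable names that are not part of the original project details:

```python
import json
import os
from datetime import datetime, timezone

import boto3  # available in the AWS Lambda Python runtime

# Hypothetical bucket/prefix; the real names are not part of this description.
LANDING_BUCKET = os.environ.get("LANDING_BUCKET", "example-landing-bucket")
LANDING_PREFIX = os.environ.get("LANDING_PREFIX", "dynamodb-stream/")

s3 = boto3.client("s3")


def lambda_handler(event, context):
    """Flatten DynamoDB stream records into JSON lines and land them on S3,
    where the S3 put-event notification and Snowpipe pick them up."""
    lines = []
    for record in event.get("Records", []):
        if record.get("eventSource") != "aws:dynamodb":
            continue
        lines.append(json.dumps({
            "event_name": record.get("eventName"),           # INSERT / MODIFY / REMOVE
            "keys": record["dynamodb"].get("Keys", {}),
            "new_image": record["dynamodb"].get("NewImage", {}),
        }))

    if not lines:
        return {"written": 0}

    # One object per invocation, keyed by date and request id.
    key = (f"{LANDING_PREFIX}"
           f"{datetime.now(timezone.utc):%Y/%m/%d}/{context.aws_request_id}.json")
    s3.put_object(Bucket=LANDING_BUCKET, Key=key, Body="\n".join(lines).encode("utf-8"))
    return {"written": len(lines), "s3_key": key}
```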

Environment: Spark, Python, Databricks Delta Lake, AWS, Airflow, Snowflake, GitLab

Confidential, Chicago, IL

Data Engineer

Responsibilities:

  • Implemented a Google Analytics 360 project using Google BigQuery, Spark, Databricks Delta Lake, Snowflake, Airflow, and Python.
  • Led collaboration efforts in the administration of the Enterprise Data Lake/Warehouse, including security, scheduling, and troubleshooting in Snowflake.
  • Designed and developed a process to extract data from MySQL RDS onto S3 using PySpark (see the sketch after this list) and ingest it into Kinesis Firehose, which feeds Elasticsearch to support e-commerce search.
  • Performed Snowflake administration tasks such as setting up resource monitors, access control, credit and data usage monitoring, virtual warehouses, stages, schemas and tables, file formats, and users.
  • Designed a data model on the Snowflake database to support reporting requirements.
  • Designed secure views to restrict access to data as well as the underlying code.
  • Followed Agile Scrum methodology and was involved throughout the complete SDLC.
  • Developed PySpark code in Databricks to validate data and load it into the Snowflake database.
  • Wrote ad hoc SQL scripts using SnowSQL to validate data loads.
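
A minimal PySpark sketch of the MySQL-RDS-to-S3 extraction step described above; the JDBC URL, table, credentials, and partition column are placeholders, and the MySQL JDBC driver is assumed to be available on the cluster:

```python
from pyspark.sql import SparkSession

# Hypothetical connection details, table, and partition column; credentials would
# normally come from a secret store, not from the script.
JDBC_URL = "jdbc:mysql://example-rds-host:3306/ecom"
S3_TARGET = "s3://example-bucket/raw/orders/"

spark = SparkSession.builder.appName("mysql-rds-to-s3").getOrCreate()

# Read the source table from MySQL RDS over JDBC, split into partitions for parallelism.
orders = (
    spark.read.format("jdbc")
    .option("url", JDBC_URL)
    .option("dbtable", "orders")
    .option("user", "etl_user")
    .option("password", "***")
    .option("partitionColumn", "order_id")
    .option("lowerBound", "1")
    .option("upperBound", "10000000")
    .option("numPartitions", "8")
    .load()
)

# Light validation before landing the data on S3 as Parquet.
clean = orders.filter("order_id IS NOT NULL").dropDuplicates(["order_id"])
clean.write.mode("overwrite").partitionBy("order_date").parquet(S3_TARGET)
```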

Environment: Spark, Airflow, Databricks, Snowflake, AWS Cloud, AWS CLI, SQL

Confidential, Chicago, IL

Data Engineer

Responsibilities:

  • Interacted with business analysts and technology teams to create necessary requirements documentation such as ETL/ELT mappings, interface specifications, and business rules.
  • Collaborated with end users on data and reporting requirements, business objectives, and data analytics needs.
  • Improved the performance of the end-of-day batch process by introducing Oracle object features and reducing querying on the entity tables.
  • Responsible for information gathering through effective client communication.
  • Performed functional specification (FS) analysis, provided effort estimation, and was involved in prototype design.
  • Developed procedures in SAP HANA to load data for the Product interface.
  • Developed Informatica workflows and mappings.
  • Worked on Oracle cursors, REF cursors, exception handling, and collections such as associative arrays, nested tables, and VARRAYs.
  • Tuned SQL queries using advanced Oracle features such as optimizer hints, subqueries, scalar subqueries, the WITH clause, outer joins, bind variables, bulk collect, index skip scans, materialized views, and the query rewrite feature.
  • Created shell scripts that call Oracle functions and load data into Oracle tables from flat files using SQL*Loader or external tables (see the sketch after this list).
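
The shell-script flow from the last bullet, rendered as Python rather than Bash purely to keep these examples in one language; the control file, data file, connection details, and the PL/SQL function name are placeholders:

```python
import subprocess

import oracledb  # python-oracledb thin client

# Placeholder names; the actual control file, data file, and function are not given here.
CTL_FILE = "load_orders.ctl"
DATA_FILE = "orders.dat"
CONNECT = "etl_user/***@//example-host:1521/ORCLPDB1"

# Step 1: bulk-load the flat file into a staging table with SQL*Loader.
subprocess.run(
    ["sqlldr", f"userid={CONNECT}", f"control={CTL_FILE}", f"data={DATA_FILE}", "errors=0"],
    check=True,
)

# Step 2: call a PL/SQL function that moves validated rows from staging into the target table.
with oracledb.connect(user="etl_user", password="***",
                      dsn="example-host:1521/ORCLPDB1") as conn:
    with conn.cursor() as cur:
        rows_loaded = cur.callfunc("etl_pkg.load_orders", int)
        conn.commit()
print(f"Rows loaded: {rows_loaded}")
```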

Environment: Informatica, SAP HANA, Oracle, Autosys

Confidential, Greenville, SC

Oracle Developer

Responsibilities:

  • Involved in writing documents for various phases of the project lifecycle, including the analysis document, functional specification, technical specification, unit test cases, and deployment plan.
  • Worked on data extraction (interface) into flat files using UTL_FILE and data loads from flat files using SQL*Loader and Unix shell scripts.
  • Designed and developed PL/SQL stored procedures, functions, packages, and triggers to validate loaded data and transform it into the final objects.
  • Interacted with users to resolve system issues and data-related problems.
  • Created Hive scripts for ETL, created Hive tables, and wrote Hive queries.
  • Used Spark SQL to perform complex data manipulations and to work with large amounts of structured and semi-structured data stored in the cluster using DataFrames/Datasets (see the sketch after this list).
  • Wrote HiveQL queries and extended Hive functionality with custom UDFs, UDAFs, and UDTFs to process large amounts of data stored on HDFS.
  • Performed Hive incremental updates, merges, partitioning, bucketing, windowed operations, and efficient joins for faster data operations.
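
A minimal PySpark sketch of the Spark SQL / windowed incremental-merge pattern described above, using hypothetical Hive database, table, and column names:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

# Hypothetical database/table/column names used purely for illustration.
spark = (
    SparkSession.builder
    .appName("hive-incremental-merge")
    .enableHiveSupport()
    .getOrCreate()
)

base = spark.table("warehouse.customer")          # existing Hive table
delta = spark.table("staging.customer_updates")   # incremental feed

# Keep only the latest version of each customer across base + delta:
# the windowed-dedup pattern behind a Hive-style incremental merge.
merged = base.unionByName(delta)
latest = Window.partitionBy("customer_id").orderBy(F.col("updated_at").desc())

result = (
    merged
    .withColumn("rn", F.row_number().over(latest))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Overwrite the reporting table with the merged result.
result.write.mode("overwrite").format("parquet").saveAsTable("warehouse.customer_current")
```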

Environment: Hadoop, Hive, Python, Bash Scripting, SQL, PL/SQL, Oracle 10g/11g, SQL Server, Linux/UNIX, Bitbucket

Confidential 

Oracle Developer

Responsibilities:

  • Worked directly with user groups to analyze and specify business requirements and develop the software requirements.
  • Involved in requirements gathering, design of the logical flow of the system, and preparation of the system requirements specification (SRS) and technical design document.
  • Successfully designed and developed Oracle Forms and Reports for an in-house ERP system.
  • Developed test cases to conduct technical, functional, and performance tests and was involved in preparing user training manuals.
  • Involved in the development of Oracle packages, stored procedures, functions, and triggers using PL/SQL.
  • Optimized critical queries to eliminate full table scans and reduce disk I/O and sorts.
  • Developed SQL queries to fetch complex data from different tables in remote databases using joins and database links, formatted the results into reports, and kept logs (see the sketch after this list).
  • Involved in gathering and analyzing business requirements with product management for new release cycles.
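
A minimal sketch of a report query over a database link with bind variables, as described above; the connection details, database link, and column names are placeholders, and python-oracledb is used only for illustration:

```python
import csv
import datetime

import oracledb  # python-oracledb thin client

# Placeholder query: join a local table to a remote table over a database link,
# filtering with a bind variable instead of a literal.
QUERY = """
    SELECT o.order_id, o.order_date, c.customer_name
    FROM   orders o
    JOIN   customers@remote_erp c ON c.customer_id = o.customer_id
    WHERE  o.order_date >= :start_date
"""

with oracledb.connect(user="report_user", password="***",
                      dsn="example-host:1521/ERPPDB") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY, start_date=datetime.date(2020, 1, 1))
        with open("orders_report.csv", "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow([col[0] for col in cur.description])  # header row
            writer.writerows(cur)                                  # formatted report rows
```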
