
Data Integration Engineer Resume


Pleasanton, CA

SUMMARY

  • 9.9 years of experience in system analysis, design, development, and implementation of Data Warehousing systems, Data Lake extraction, cloud storage, and Snowflake
  • Strong knowledge of DWH concepts such as the Entity-Relationship model, facts and dimensions, slowly changing dimensions (SCD), and dimensional modeling (star and snowflake schemas); a minimal SCD Type 2 sketch follows this summary
  • Good understanding of Snowflake cloud technology
  • Experience in using Snowflake Clone and Time Travel.
  • Progressive experience in the field of Big Data technologies and cloud services, covering design, integration, and maintenance.
  • Prepared technical design and mapping documents and developed ETL data pipelines with error handling
  • Experience in various methodologies such as Waterfall and Agile.
  • Experience in data conversion and data migration projects.
  • Prepared deployment plans and worked on cutover activities
  • Studied, researched, and worked on communication systems such as networking and imaging at the University of Portsmouth, UK, and was awarded a graduate degree with major credits for the in-house project Re-Imaging of the Portrait
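
Illustrative only: a minimal SCD Type 2 sketch in Python against Snowflake, matching the pattern referenced in the summary above. The table and column names (customer_dim, customer_stage, address, effective_date, end_date, is_current) and the connection details are assumptions, not taken from any of the projects below.

    # Minimal SCD Type 2 sketch; table/column names and credentials are hypothetical.
    # Assumes the snowflake-connector-python package is installed.
    import snowflake.connector

    CLOSE_CHANGED_ROWS = """
    MERGE INTO customer_dim tgt
    USING customer_stage src
      ON tgt.customer_id = src.customer_id AND tgt.is_current = TRUE
    WHEN MATCHED AND tgt.address <> src.address THEN UPDATE SET
      end_date = CURRENT_DATE(),
      is_current = FALSE
    """

    INSERT_NEW_VERSIONS = """
    INSERT INTO customer_dim (customer_id, address, effective_date, end_date, is_current)
    SELECT src.customer_id, src.address, CURRENT_DATE(), NULL, TRUE
    FROM customer_stage src
    LEFT JOIN customer_dim tgt
      ON tgt.customer_id = src.customer_id AND tgt.is_current = TRUE
    WHERE tgt.customer_id IS NULL   -- new customer, or its current version was just closed above
    """

    def apply_scd2() -> None:
        """Close out changed dimension rows, then insert the new current versions."""
        conn = snowflake.connector.connect(account="my_account", user="etl_user",
                                           password="***", warehouse="etl_wh",
                                           database="edw", schema="dims")
        cur = conn.cursor()
        cur.execute(CLOSE_CHANGED_ROWS)
        cur.execute(INSERT_NEW_VERSIONS)
        conn.commit()
        cur.close()
        conn.close()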

TECHNICAL SKILLS

Hadoop Components: Hive, Sqoop, HDFS, PIG, HBase, Name Node, Data Node

Data Modeling: CA Erwin 9.x - Physical & Logical Modeling, Entities, Attributes, Cardinality

DW-ETL Tools: IBM Infosphere DataStage 11.5/9.1.2/8.5.2/8.0.1, Ascential DataStage 7.5.2, Information Analyzer, QualityStage, Informatica 9.5 PowerCenter, FiveTran

Databases: Mainframe DB2, Teradata R6/R7, Oracle 9i/10g/11g/12c, SQL Server

Cloud Services: AWS S3, SNS, SQS, Snowflake, Snow SQL, Snowpipe, clone, Time Travel

Others: Python, Unix Shell Scripting, Tivoli Work Scheduler, PeopleSoft EPM 8.9, BMC Remedy, Zena, Service Now, SMP & MPP Servers

PROFESSIONAL EXPERIENCE

Data Integration Engineer

Confidential

Responsibilities:

  • Involved in migrating objects from Oracle to Snowflake.
  • Worked on ETL jobs (FiveTran, Informatica) to migrate data from on-premise systems to AWS S3 and Snowflake, generating JSON and CSV files to support Catalog API integration.
  • Set up replication and clones for large tables, splitting them into multiple tables based on partitions to migrate data into the Snowflake data warehouse
  • Used a maximized warehouse cluster while running queries and tested different queries across multiple warehouse sizes
  • Worked on different ETL enhancements in data marts - Quotes, QMS, QNB, Policy, Claims, Corporate Legal and Legal.
  • Built Snowpipe pipelines for continuous data loads from AWS S3 into the Snowflake data warehouse (see the sketch after this role)
  • Cloned higher-environment data for code-change and performance testing and built Data Shares to review the data with users
  • Coordinated the setup of AWS SNS notifications to auto-trigger Snowpipe for continuous loads
  • Built ELT pipelines using Streams for delta processing and scheduled loads with Tasks in Snowflake
  • Coordinated the configuration of Kafka source and sink connectors for real-time streams into S3 and Snowflake

Environment: AWS S3, SNS, Snowflake, FiveTran, Snowpipe, Python, Informatica, Linux, Oracle, etc.
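
Illustrative only: a rough Python sketch of the Snowpipe, Stream, and Task setup described in this role. The stage, table, pipe, task, and warehouse names are assumptions, the raw table is assumed to have a single VARIANT column, and auto-ingest presumes the S3 event notifications (for example via SNS) are already wired to the pipe's notification channel.

    # Hypothetical object names; assumes an external stage over the S3 bucket already exists
    # and that raw.quotes_json has a single VARIANT column named payload.
    import snowflake.connector

    DDL_STATEMENTS = [
        # Continuous ingestion: Snowpipe with auto-ingest picks up new files landed in S3.
        """
        CREATE PIPE IF NOT EXISTS raw.quotes_pipe AUTO_INGEST = TRUE AS
          COPY INTO raw.quotes_json
          FROM @raw.s3_quotes_stage
          FILE_FORMAT = (TYPE = 'JSON')
        """,
        # Delta processing: a stream tracks changes on the raw table.
        "CREATE STREAM IF NOT EXISTS raw.quotes_stream ON TABLE raw.quotes_json",
        # A scheduled task moves only the delta into the curated table.
        """
        CREATE TASK IF NOT EXISTS curated.load_quotes
          WAREHOUSE = etl_wh
          SCHEDULE = '15 MINUTE'
          WHEN SYSTEM$STREAM_HAS_DATA('raw.quotes_stream')
        AS
          INSERT INTO curated.quotes (quote_id, amount, loaded_at)
          SELECT payload:quote_id::string, payload:amount::number, CURRENT_TIMESTAMP()
          FROM raw.quotes_stream
        """,
        # Tasks are created suspended; resume to start the schedule.
        "ALTER TASK curated.load_quotes RESUME",
    ]

    def create_pipeline(conn) -> None:
        """Run the DDL; assumes the connection's role has the required privileges."""
        cur = conn.cursor()
        for stmt in DDL_STATEMENTS:
            cur.execute(stmt)
        cur.close()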

Data Integration Developer

Confidential, Pleasanton, CA

Responsibilities:

  • Worked on a POC for Snowflake cloud data warehouse migration from Oracle and ETL connectivity using the Snowflake connector
  • Built ETL data pipelines using Python and DataStage to aggregate, cleanse, and migrate data across cloud data warehousing systems using staged data processing techniques, patterns, and best practices
  • Built reusable ETL frameworks for source-to-staging table loads using Runtime Column Propagation.
  • Coordinated the setup of AWS SNS notifications to auto-trigger Snowpipe for continuous loads
  • Built ELT pipelines using Streams for delta processing and scheduled loads with Tasks in Snowflake
  • Worked with Clone and Data Shares during unit testing and shared the results with business users
  • Built data pipelines using Python to load JSON files from AWS S3 external storage into Snowflake (see the sketch after this role).
  • Pulled enrichment data from APIs using Python requests and loaded the JSON responses into Snowflake.
  • Assisted in the creation of an enterprise operational data store and an analytical data warehouse
  • Recommended normalized and dimensional data models and data integration techniques such as database replication and change data capture (CDC), and built Python data pipelines to load semi-structured JSON data into Snowflake

Environment: AWS S3, SNS, Snowflake, IBM Infosphere, Informatica, Linux, DB2, Oracle, SQL Server, etc.
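
Illustrative only: a condensed version of the kind of Python pipeline described in this role: read a JSON file from S3 with boto3, enrich each record over a REST API with requests, and land the result in a Snowflake VARIANT column. The bucket, endpoint, table name, and response shape are assumptions.

    # Hypothetical bucket/endpoint/table names; assumes boto3, requests, and
    # snowflake-connector-python are installed and credentials are configured.
    import json

    import boto3
    import requests

    def load_policy_file(conn, bucket: str, key: str) -> None:
        """Read one JSON file from S3, enrich each record via an API, insert into Snowflake."""
        s3 = boto3.client("s3")
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        records = json.loads(body)

        cur = conn.cursor()
        for rec in records:
            # Enrichment call; the endpoint and response shape are assumptions.
            resp = requests.get(
                "https://api.example.com/policies/enrich",
                params={"policy_id": rec["policy_id"]},
                timeout=30,
            )
            resp.raise_for_status()
            rec["enrichment"] = resp.json()

            # Land the combined document in a VARIANT column.
            cur.execute(
                "INSERT INTO raw.policy_json (payload) SELECT PARSE_JSON(%s)",
                (json.dumps(rec),),
            )
        cur.close()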

Data Engineer

Confidential, Chicago, IL

Responsibilities:

  • Wrote Hive queries for data analysis to meet business requirements (see the sketch after this role).
  • Created Hive tables, used HiveQL, and worked with HBase tables for extractions
  • Worked on Sqoop scripts to export data from the Data Lake to Teradata
  • Good knowledge of Hadoop architecture and ecosystem components such as Job Tracker, Task Tracker, Name Node, Data Node, Application Master, Resource Manager, Node Manager, MapReduce, HDFS, Hive, Pig, and HBase.
  • Very good understanding of Cloudera and Hortonworks HDP 2.1 Hadoop architectures.

Environment: Hortonworks HDFS, Hive, HBase, Sqoop, IBM Infosphere 8.5, Teradata R6, IBM AIX 6.1, Zena Scheduler
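
Illustrative only: a small Python sketch of the kind of HiveQL analysis described in this role, run through HiveServer2 with the PyHive package. The host, database, table, and columns are assumptions.

    # Hypothetical host/database/table names; assumes the PyHive package is installed.
    from pyhive import hive

    def claims_by_region(host: str = "hiveserver2.example.com") -> list:
        """Return claim counts per region from a (hypothetical) external Hive table."""
        conn = hive.Connection(host=host, port=10000, database="claims_db")
        cur = conn.cursor()
        cur.execute(
            """
            SELECT region, COUNT(*) AS claim_cnt
            FROM claims_external          -- external table over files in HDFS
            WHERE claim_date >= '2015-01-01'
            GROUP BY region
            ORDER BY claim_cnt DESC
            """
        )
        rows = cur.fetchall()
        conn.close()
        return rows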

Technical Architect and Lead

Confidential, Pleasanton, CA

Responsibilities:

  • Involved in solutioning the architectural design and migration topology for an MPP system
  • Designed a topology distributing the Services tier, Client tier, Engine tier, and Metadata Repository across 3 physical nodes and 3 virtual nodes for a 6-node configuration
  • Prepared a detailed analysis of the job inventory and identified where to apply new 11.5 features
  • Used the Connector Migration Tool to replace deprecated stages such as DRS, Oracle OCI, Teradata API, and SQL Server with Oracle, Teradata, DB2, ODBC, and SQL Server connectors
  • Converted all staging Server jobs to Parallel jobs using an RCP template
  • Converted all Server dimension jobs to Parallel jobs, using database sequences for surrogate key generation (see the sketch after this role)

Environment: IBM Infosphere 8.5, 11.5, Cognos 10.5, SQL Server, Oracle 11g, 12c, Teradata R6, DB2, IBM AIX 6.1, Linux, PeopleSoft EPM 8.9, BMC Remedy 8.6
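
Illustrative only: the database-sequence approach to surrogate key generation used in the dimension-job conversion above, sketched as Oracle SQL issued from Python via cx_Oracle. The sequence, table, and column names are assumptions.

    # Hypothetical names; assumes the cx_Oracle package and an existing sequence, e.g.:
    #   CREATE SEQUENCE product_dim_sk_seq START WITH 1 INCREMENT BY 1 CACHE 1000
    import cx_Oracle

    LOAD_NEW_DIM_ROWS = """
    INSERT INTO product_dim (product_sk, product_id, product_name)
    SELECT product_dim_sk_seq.NEXTVAL, s.product_id, s.product_name
    FROM product_stage s
    WHERE NOT EXISTS (
        SELECT 1 FROM product_dim d WHERE d.product_id = s.product_id
    )
    """

    def load_dimension(dsn: str, user: str, password: str) -> None:
        """Assign surrogate keys from the sequence while inserting new dimension rows."""
        conn = cx_Oracle.connect(user=user, password=password, dsn=dsn)
        cur = conn.cursor()
        cur.execute(LOAD_NEW_DIM_ROWS)
        conn.commit()
        cur.close()
        conn.close()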

ETL Technical Architect and Lead

Confidential, Pleasanton, CA

Responsibilities:

  • Compared the data models of the financial and SCM data warehouses and integrated them to handle data from both the financial and SCM modules.
  • Extracted the DB table SQL scripts and imported them into Erwin through reverse engineering, compared both systems, and established the relationships per the data warehouse setup.
  • Developed an RCP template to run multiple staging jobs in parallel in DataStage.
  • Studied Informatica mappings to convert them into DataStage jobs with the same logic and merged the data into FDW without any impact on existing functionality.
  • Implemented dimensions and facts using Informatica 9.5 per the guidelines.
  • Improved the performance of several ETL jobs through table partitioning, proper indexing, dropping and rebuilding non-unique indexes for large loads, and SQL tuning with hints (see the sketch after this role).
  • Extensive experience working with cross-functional teams in onsite-offshore development models to meet tight schedules, business SLAs, and deliverables.

Environment: IBM Infosphere 8.5, Informatica 9.5.1, Cognos, Business Objects, SQL Oracle 11g, IBM AIX 6.1, PeopleSoft EPM 8.9, BMC Remedy 8.6
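
Illustrative only: Oracle-style statements for the tuning techniques listed in this role (taking a non-unique index offline for a large load, direct-path insert with parallel hints, then rebuilding), issued from Python via cx_Oracle. The table and index names are assumptions.

    # Hypothetical table/index names and connection details; assumes the cx_Oracle package.
    import cx_Oracle

    def bulk_load_fact(dsn: str, user: str, password: str) -> None:
        """Take the non-unique index offline for the load, then rebuild it afterwards."""
        conn = cx_Oracle.connect(user=user, password=password, dsn=dsn)
        cur = conn.cursor()

        # Skip index maintenance during the heavy insert.
        cur.execute("ALTER INDEX fact_policy_nui1 UNUSABLE")
        cur.execute("ALTER SESSION SET skip_unusable_indexes = TRUE")
        cur.execute("ALTER SESSION ENABLE PARALLEL DML")

        # Direct-path insert with parallel hints.
        cur.execute("""
            INSERT /*+ APPEND PARALLEL(fact_policy, 4) */ INTO fact_policy
            SELECT /*+ PARALLEL(s, 4) */ * FROM stg_policy s
        """)
        conn.commit()

        # Rebuild the index once the data is in place.
        cur.execute("ALTER INDEX fact_policy_nui1 REBUILD PARALLEL 4 NOLOGGING")
        cur.close()
        conn.close()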

ETL Technical Lead

Confidential, Pleasanton, CA

Responsibilities:

  • Played a key role in gathering statistics and started the proof of concept on the super cluster.
  • Executed the whole batch process on the Solaris platform in the Oracle Solution Center.
  • Executed the batch on both Oracle 11g and Oracle 12c and compared the results
  • Implemented checksum logic to capture delta changes in the ETL at month end and avoid huge volume spikes against the SLA (see the sketch after this role)

Environment: IBM Infosphere 8.5, SQL, Oracle 11g/12c, IBM AIX 6.1, Solaris, PeopleSoft EPM 8.9, PeopleSoft 9.2, Tivoli Work Scheduler and BMC Remedy 8.6.
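
Illustrative only: a plain-Python sketch of the checksum-based delta capture mentioned above. An MD5 is computed over the tracked business columns of each row and compared with the value stored from the previous run, so only new or changed rows flow through the month-end ETL. The key and column names are assumptions.

    # Hypothetical key/column names; previous_checksums maps business key -> stored checksum.
    import hashlib
    from typing import Dict, Iterable, Iterator, Tuple

    TRACKED_COLUMNS = ("status", "premium", "effective_date")

    def row_checksum(row: dict, columns=TRACKED_COLUMNS) -> str:
        """MD5 over a delimited concatenation of the tracked business columns."""
        joined = "|".join(str(row.get(c, "")) for c in columns)
        return hashlib.md5(joined.encode("utf-8")).hexdigest()

    def detect_deltas(current_rows: Iterable[dict],
                      previous_checksums: Dict[str, str],
                      key: str = "policy_id") -> Iterator[Tuple[dict, str]]:
        """Yield (row, checksum) for rows that are new or changed since the last run."""
        for row in current_rows:
            checksum = row_checksum(row)
            if previous_checksums.get(row[key]) != checksum:
                yield row, checksum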

DataStage Developer and Technical Lead

Confidential, Pleasanton, CA

Responsibilities:

  • Played a role as a solution consultant, assisting the data modeler in the full development cycle of financial data warehouse systems; responsible for creating project plans and executing them.
  • Coordinated with client managers, business architects, and data architects for sign-offs on data models, ETL design documents, testing documents, migrations, and end-user review specs.
  • Analyzed data using SQL and developed complex SQL queries for data analysis and validation (see the sketch after this role)
  • Analyzed performance and monitored workloads for capacity planning.
  • Performed performance tuning of the developed jobs by interpreting their performance statistics.
  • Led the team across DataStage, Informatica, and Cognos and managed both offshore and onshore teams to meet the project deliverables.
  • Utilized web services to publish data to the website using the XML stage.
  • Extensive experience working with cross-functional teams in onsite-offshore development models to meet tight schedules, business SLAs, and deliverables.
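
Illustrative only: the kind of SQL-based validation referred to above, reconciling a row count and a numeric total between a staging table and the warehouse fact over any DB-API connection. The table and column names are assumptions, and the query uses Oracle's DUAL.

    # Hypothetical table/column names; RECON_SQL uses Oracle's DUAL pseudo-table.
    RECON_SQL = """
    SELECT 'row_count' AS metric,
           (SELECT COUNT(*) FROM stg_gl_txn)      AS source_value,
           (SELECT COUNT(*) FROM fdw_gl_txn_fact) AS target_value
    FROM dual
    UNION ALL
    SELECT 'amount_total',
           (SELECT SUM(txn_amount) FROM stg_gl_txn),
           (SELECT SUM(txn_amount) FROM fdw_gl_txn_fact)
    FROM dual
    """

    def validate_load(conn) -> bool:
        """Return True only if every reconciliation metric matches between source and target."""
        cur = conn.cursor()
        cur.execute(RECON_SQL)
        ok = all(source == target for _metric, source, target in cur.fetchall())
        cur.close()
        return ok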

DataStage Developer and Team Lead

Confidential, Pleasanton, CA

Responsibilities:

  • Analyzed the source data (Oracle, DB2, flat files) and worked with business users and developers to develop the model.
  • Identified the list of DataStage jobs, by category, eligible for upgrade
  • Upgraded DataStage 7.5.2 to 8.5.2, set up all the projects, and scheduled them through the Tivoli Work Scheduler.
  • Performed unit testing and supported user acceptance testing (UAT) for every code change and enhancement.
  • Ran parallel test runs of both versions on different servers in Prod and on different mount points for all non-Prod environments.
  • Worked with various source, business, and automation teams in executing the project.

Datastage Developer

Confidential

Responsibilities:

  • Involved in troubleshooting and performance testing of DataStage Server jobs and the SQL involved.
  • Extensively worked on DataStage Enterprise Edition 7.5: Administrator, Designer, Director, and Manager.
  • Extensively used DataStage Designer to design and develop ETL jobs for extracting, transforming, and loading data
  • Improved the performance of the designed jobs using various tuning strategies such as table partitioning and tuning the SQL per the query plan.
