
ETL Tech Lead and Developer Resume

SUMMARY

  • Strong knowledge of dimensional modeling, DWH/system architecture, and ETL/ELT frameworks, including design, implementation, best practices, and troubleshooting.
  • Extensive experience in maintaining data mapping documents, business matrices, and other data design artifacts that define technical data specifications and transformation rules.
  • Release management, code deployment across environments, and support/maintenance.
  • Ownership of overall deliverables: handled work distribution, time management, and prioritization, and assisted with supporting tasks to ensure on-time project completion.
  • Worked in Waterfall and familiar with Agile methodologies.
  • Excellent verbal and written communication, with the ability to effectively present ideas at the functional and technical level.
  • Regular contributor of ideas and process-improvement suggestions, always looking for creative and innovative solutions and new features.
  • A proven hunger to learn new technologies and translate them into working solutions by performing proofs of concept.
  • 10+ years of demonstrated experience with IBM InfoSphere Information Server tools, with a strong command of the IBM client tools (DataStage & QualityStage Designer, Administrator, Director, Admin Console, FastTrack, Information Analyzer (IA), Business Glossary, Metadata Workbench, and ISD).
  • Implemented SCD Type 1 and Type 2 jobs.
  • Implemented jobs using the Hive Connector, File Connector, CFF, Sequential File, Dataset, Vertical Pivot, Horizontal Pivot, Join, Lookup, Funnel, Filter, Sort, CDC, SCD, Remove Duplicates, Copy, Transformer, Surrogate Key, Oracle Enterprise, Oracle Connector, DB2, Shared Container, Teradata Connector, Netezza Connector, Netezza Enterprise, and Web Service stages, and most of the sequence stages.
  • Implemented DataStage server routines to update job-related audit and control tables.
  • ETL of data to/from heterogeneous technology platforms such as HDFS files, XML files, major RDBMS platforms, flat files (with header and trailer), and SAP.
  • Data integration with mainframes.
  • Dynamic parameterization using parameter sets and global parameters; partitioning and performance tuning of DataStage jobs.
  • Handled administration tasks such as installations (server, client, and fix packs), creating/updating/deleting users, project deployment, ODBC setup, starting and stopping services, and killing job process IDs.
  • 3+ years of demonstrated experience with Talend Open Studio, Talend Integration Suite Enterprise - Big Data Edition, and Hadoop tools, with a strong command of the following:
  • Implemented mappings in Talend using tMap, tJoin, tReplicate, tParallelize, tConvertType, tFlowToIterate, tAggregate, tSortRow, tFlowMeter, tLogCatcher, tRowGenerator, tNormalize, tDenormalize, tSetGlobalVar, tHashInput, tHashOutput, tAggregateRow, tWarn, tFilter, tGlobalmap, tDie, etc.
  • Experience using Talend features such as context variables, triggers, and connectors for databases and flat files, including tOracle, tTeradataFastLoad, tTeradataMultiLoad, tTeradataTPTUtility, tHDFSInput, tHDFSOutput, tFileCopy, tFileInputDelimited, and tFileExist.
  • Experience creating Joblets in Talend for processes reused across most jobs in a project, such as Start Job and Commit Job.
  • Created context, global, and temp variables and used them across jobs.
  • Implemented Talend jobs using various RDBMS components such as Oracle and Teradata.
  • Implemented Talend jobs to process XML files.
  • Implemented sequences using OnComponentOk and OnSubjobOk trigger links.
  • Involved in building jobs, scheduling jobs through Autosys, and deploying through TAC.
  • Implemented SCD Type 1 and Type 2 jobs.
  • Excellent understanding/knowledge of Hadoop architecture and its components, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Expertise in the Cloudera Hadoop distribution and Talend Hadoop data pipelines.
  • Knowledge of Spark and YARN.
  • Knowledge of importing and exporting data with Sqoop between HDFS and relational database systems (see the Sqoop sketch after this list).
  • Deep understanding of and hands-on experience with Informatica PowerCenter:
  • Informatica client tools (Designer, Workflow Monitor/Manager, Repository Manager).
  • Creation of mappings, mapplets, transformations, workflows, worklets, sessions, tasks, etc.
  • Ability to handle highly complex mappings, debugging, and efficient code migration.
  • ETL of data to/from heterogeneous technology platforms such as XML files, major RDBMS platforms, and flat files (with header and trailer).
  • Data integration with mainframes.
  • Dynamic parameterization, partitioning, performance tuning of Informatica objects, configuring concurrent workflows, defining target load order, constraint-based loading, third-party loading, etc.
  • Informatica Admin Console: creating/starting/stopping services and installing repositories.
  • Strong understanding of BI/data warehousing architecture, concepts, and ETL strategies.
  • Change data capture, audit tables, reject processing and data reconciliation, etc.
  • Data cleansing, building reference data, data normalization and denormalization.
  • Loading slowly changing dimensions (Type I, II, III), constraint-based loading, target load order, etc.
  • Dimensional modeling (star/snowflake schema) and DW bus architecture, different types of dimensions, factless fact tables, maintaining hierarchical attributes, etc.
  • Pre-aggregation of fact tables.
  • Deep understanding of and hands-on experience with Oracle, including:
  • Writing medium to complex queries, subqueries, and joins.
  • Working with tables, views, synonyms, indexes, constraints, and data types.
  • Analytic functions, hierarchical queries, and aggregate queries (see the SQL sketch after this list).
  • Deep understanding of and hands-on experience with Teradata, including:
  • Writing medium to complex queries, subqueries, and joins.
  • Working with Teradata utilities (SQL, BTEQ, FastLoad, MultiLoad, FastExport).
  • Working with tables, views, synonyms, indexes, constraints, and data types.
  • Analytic functions, hierarchical queries, and aggregate queries.
  • Strong knowledge of UNIX shell scripting (bash), with experience in the following (see the shell sketch after this list):
  • Directory structures, permissions, and NFS mounts.
  • Major commands, pmcmd, variables, exit status, variable evaluation, and processing command-line arguments and environment variables.
  • File operations, text editors, control structures, shell functions, I/O redirection, and logging mechanisms.
  • Process control and monitoring, foreground and background processes.
  • FTP, NDM, DB connectivity, and spooling data.
  • Autosys scheduling:
  • Box jobs, command jobs, and file watcher jobs.
  • Setting up job dependencies, starting conditions, date conditions, and calendar scheduling.
  • Creating/updating/deleting Autosys box/command/file watcher jobs by writing and deploying JIL (Job Information Language); see the JIL sketch after this list.
  • Setting up Autosys virtual machines and machine definitions.
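
As a minimal illustration of the Sqoop usage mentioned above, the commands below sketch a table import into HDFS and an export back out; the connection string, schema/table names, and HDFS paths are placeholders rather than values from any actual engagement.

# Hypothetical Sqoop import: pull an Oracle table into HDFS as delimited text
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCLPDB \
  --username etl_user -P \
  --table SALES.ORDERS \
  --target-dir /data/staging/orders \
  --num-mappers 4

# Hypothetical Sqoop export: push processed HDFS files back into a relational table
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCLPDB \
  --username etl_user -P \
  --table SALES.ORDERS_SUMMARY \
  --export-dir /data/output/orders_summary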
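
The queries below are a small sketch of the analytic and hierarchical SQL referenced above, run through SQL*Plus; the connection variables, tables, and columns are illustrative only.

# Hypothetical SQL*Plus session; DB_USER/DB_PASS/DB_TNS are placeholder connection details
sqlplus -s "$DB_USER/$DB_PASS@$DB_TNS" <<'SQL'
-- analytic function: keep the latest order per customer
SELECT customer_id, order_id, order_dt
  FROM (SELECT customer_id, order_id, order_dt,
               ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_dt DESC) AS rn
          FROM orders)
 WHERE rn = 1;

-- hierarchical query: walk a reporting tree from the top
SELECT LEVEL, emp_id, mgr_id
  FROM employees
 START WITH mgr_id IS NULL
 CONNECT BY PRIOR emp_id = mgr_id;
SQL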
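
The shell fragment below sketches the argument handling, logging, and exit-status checks described above; host names, paths, and file names are hypothetical.

#!/bin/bash
# Hypothetical file-collection wrapper: argument check, logging, transfer, status handling
SRC_PATH=$1                                   # remote source path passed as the first argument
LOG_FILE=/var/log/etl/$(basename "$0").$(date +%Y%m%d).log

log() { echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOG_FILE"; }

if [ -z "$SRC_PATH" ]; then
    log "ERROR: source path argument missing"
    exit 1
fi

# pull the daily feed from a placeholder remote host into the landing area
scp "etl_user@remotehost:$SRC_PATH/daily_feed.dat" /landing/ >> "$LOG_FILE" 2>&1
if [ $? -ne 0 ]; then
    log "ERROR: file transfer failed"
    exit 2
fi

log "File transfer completed successfully"
exit 0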
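
A compact sketch of the kind of JIL referenced above, defining a box job, a file watcher, and a command job; the job names, machine, schedule, and paths are hypothetical, and the definitions are fed to the jil utility.

# Hypothetical Autosys definitions loaded through the jil command
jil <<'EOF'
insert_job: bx_daily_load    job_type: b
owner: etluser
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "02:00"

insert_job: fw_src_file      job_type: f
box_name: bx_daily_load
watch_file: /landing/daily_feed.dat
machine: etl_host

insert_job: cmd_run_load     job_type: c
box_name: bx_daily_load
condition: s(fw_src_file)
command: /apps/etl/scripts/run_daily_load.sh
machine: etl_host
EOF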

TECHNICAL SKILLS

Tools: IBM InfoSphere DataStage & QualityStage 11.5.0.0.1/11.3/9.1/8.7/8.5/8.0/7.5, IBM Business Glossary, Metadata Workbench, FastTrack, ISD and Information Analyzer (IA), Informatica PowerCenter 9.1/9.6.1, Talend Integration Suite - Big Data Edition 6.1/Talend Open Studio 6.1, Hadoop Cloudera, Hadoop Hortonworks, Ambari, SQL Developer, TOAD, Teradata SQL Assistant, WinSCP, Notepad++, TextPad, Autosys r11.3 & JIL, PuTTY, SVN, Business Objects (BODS)

Databases: Oracle 10g/11g, Teradata 14.10, Netezza, Hive

Languages: SQL Programming, UNIX Shell Scripting (bash/ksh), Autosys JIL, C & C++

Others: Dimensional Data Modeling, ER Modeling

Domain Knowledge: Banking and Financial Services, Manufacturing, Scientific Sector

Operating Systems: UNIX (SUSE Linux/Sun Solaris/IBM AIX), Windows 9X/XP/7/8

Methodologies: Waterfall, Agile

PROFESSIONAL EXPERIENCE

Confidential

ETL Tech Lead and Developer

Environment: IBM InfoSphere DataStage & QualityStage 11.5 Fix Pack 1 on Linux, Hortonworks Hadoop cluster, Hive, Oracle 11g, TOAD, PuTTY, Autosys r11.3, SQL Developer, SVN, UNIX shell scripts, SQL scripts, SQL*Plus, UNIX/Linux, Windows XP/7

Responsibilities:

  • ETL Tech Lead and Developer; led a team of 3 DataStage developers and 1 ETL tester.
  • Gathered project requirements from the client.
  • Interacted with end users to understand the business requirements.
  • Experience leading onsite and offshore team members throughout the project life cycle.
  • Managed DataStage environments (DEV, UAT, and Production).
  • Prepared SRS, HLD, and LLD documents based on client requirements.
  • Prepared the project architecture document and presented/reviewed the ETL architecture with the client to obtain approval to proceed with implementation.
  • Prepared Technical Specification Documents.
  • Prepared Unit Test Cases, System Integration Test Cases, and Performance Testing documents.
  • Resolved client issues promptly without delays.
  • Kept the client updated on current issues.
  • Assigned tasks to the team and maintained the status of assigned tasks using DML Tracker.
  • Prepared status reports on a daily basis.
  • Developed complex ETL jobs.
  • Implemented complex mappings such as slowly changing dimensions using the Change Data Capture (CDC) stage.
  • Performed data profiling on existing tables before pulling data from the source systems.
  • Developed reconciliation jobs to match record counts in the source systems against the target data.
  • Designed and developed the jobs for extracting, transforming, integrating, and loading data using DataStage Designer.
  • Designed the ETL processes using DataStage to load data from HDFS, Hive, Oracle, fixed-width files, and flat files.
  • Developed parallel jobs using parallel stages such as Hive Connector, File Connector, Horizontal Pivot, Vertical Pivot, Join, Lookup, Sort, Transformer, Funnel, Remove Duplicates, Aggregator, Dataset, Sequential File, Filter, Oracle Connector, Checksum, and Change Capture.
  • Developed job sequencers with proper job dependencies, job control stages, and triggers.
  • Developed DataStage job sequences using User Variables Activity, Job Activity, Execute Command, Email Activity, Loop Activity, and Terminator Activity; used DataStage Director and its run-time engine to monitor running jobs.
  • Designed and implemented the audit flow and tables and integrated them into this project.
  • Prepared project deployment and back-out plan documents.
  • Responsible for creating and reviewing the IDD document for each DataStage job flow.
  • Performed DataStage admin activities for the project.
  • Worked extensively on error handling.
  • Good understanding of data warehouse concepts and star schema and snowflake schema modeling techniques.
  • Used DataStage Director to run, monitor, and test jobs in development and to obtain performance statistics.
  • Developed UNIX shell scripts to fetch source files, execute ETL jobs, and purge and archive data (see the dsjob sketch after this list).
  • Handled version management of code and documents using SVN and VSS.
  • Created Autosys JILs and scheduled the jobs.
  • Provided SIT and UAT support.
  • Involved in IQA and EQA reviews of code and test cases.
  • Involved in performance tuning of jobs using performance statistics.
  • Involved in production implementation and post-implementation support.
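
As a sketch of the shell-script job execution mentioned above, the wrapper below invokes a DataStage job through the dsjob command line; the project, job, and parameter names are hypothetical, and the status-code handling reflects the usual dsjob convention, which should be confirmed per installation.

#!/bin/bash
# Hypothetical dsjob wrapper: run one DataStage job and translate its status code
PROJECT=DW_PROJECT
JOB=jb_load_orders
RUN_DATE=${1:-$(date +%Y-%m-%d)}              # default the run date if not supplied

dsjob -run -jobstatus -param pRunDate="$RUN_DATE" "$PROJECT" "$JOB"
RC=$?

# with -jobstatus, dsjob typically returns 1 (finished OK) or 2 (finished with warnings)
if [ "$RC" -eq 1 ] || [ "$RC" -eq 2 ]; then
    echo "$JOB completed with status $RC"
    exit 0
else
    echo "$JOB failed with status $RC" >&2
    exit 1
fi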

Confidential, Hartford, CT

ETL Tech Lead and Developer

Environment: IBM InfoSphere DataStage & QualityStage 11.3 on Linux, Cloudera Hadoop cluster, Teradata 14.10, TOAD, PuTTY, Autosys r11.3, SQL Developer, SVN, UNIX shell scripts, SQL, NDM, BTEQ scripts, SQL*Plus, UNIX/Linux, Windows XP/7, Talend Big Data 6.1

Responsibilities:

  • ETL Tech Lead and Developer; led a team of 8 DataStage developers and 4 ETL testers.
  • Gathered project requirements from the client.
  • Interacted with end users to understand the business requirements.
  • Experience leading onsite and offshore team members throughout the project life cycle.
  • Managed DataStage environments (DEV, UAT, and Production).
  • Prepared SRS, HLD, and LLD documents based on client requirements.
  • Prepared the project architecture document and presented/reviewed the ETL architecture with the client to obtain approval to proceed with implementation.
  • Prepared Technical Specification Documents.
  • Prepared Unit Test Cases, System Integration Test Cases, and Performance Testing documents.
  • Resolved client issues promptly without delays.
  • Kept the client updated on current issues.
  • Assigned tasks to the team and maintained the status of assigned tasks using DML Tracker.
  • Prepared status reports on a daily basis.
  • Developed complex ETL jobs using Talend and DataStage.
  • Implemented complex jobs in Talend using components such as tMap, tJoin, tReplicate, tParallelize, tAggregateRow, tDie, tUniqRow, tFlowToIterate, tSortRow, tFilterRow, tTeradataFastLoad, tTeradataMultiLoad, tTeradataTPTUtility, tHDFSInput, tHDFSOutput, etc., and created various complex mappings.
  • Created context, global, and temp variables and used them across jobs.
  • Implemented Talend jobs using various RDBMS components such as Oracle and Teradata.
  • Implemented sequences using OnComponentOk and OnSubjobOk trigger links.
  • Implemented Talend jobs to process XML files.
  • Involved in building jobs, scheduling jobs through Autosys, and deploying through TAC.
  • Extensively used Hadoop command-line utilities to push, get, and list files (see the HDFS/Hive sketch after this list).
  • Assigned schemas and created Hive tables.
  • Loaded processed Hadoop data files into Hive tables.
  • Implemented complex mappings such as slowly changing dimensions using the Change Data Capture (CDC) stage.
  • Performed data profiling on existing tables before pulling data from the source systems.
  • Developed reconciliation jobs to match record counts in the source systems against the target data.
  • Experienced with QualityStage for data profiling, standardization, matching, and survivorship.
  • Designed and developed the jobs for extracting, transforming, integrating, and loading data using DataStage Designer.
  • Designed the ETL processes using DataStage to load data from mainframe, XML, fixed-width files, and flat files.
  • Developed parallel jobs using parallel stages such as Horizontal Pivot, Vertical Pivot, Join, Lookup, Sort, Transformer, Funnel, Remove Duplicates, Aggregator, Dataset, Sequential File, Filter, Oracle Connector, Teradata Connector, and Change Capture.
  • Developed job sequencers with proper job dependencies, job control stages, and triggers.
  • Developed DataStage job sequences using User Variables Activity, Job Activity, Execute Command, Email Activity, Loop Activity, and Terminator Activity; used DataStage Director and its run-time engine to monitor running jobs.
  • Designed and implemented the audit flow and tables and integrated them into this project.
  • Prepared project deployment and back-out plan documents.
  • Responsible for creating and reviewing the IDD document for each DataStage job flow.
  • Performed DataStage admin activities for the project.
  • Worked extensively on error handling.
  • Good understanding of data warehouse concepts and star schema and snowflake schema modeling techniques.
  • Used DataStage Director to run, monitor, and test jobs in development and to obtain performance statistics.
  • Developed UNIX shell scripts to fetch source files, execute ETL jobs, and purge and archive data.
  • Handled version management of code and documents using SVN and BOA SharePoint.
  • Created Autosys JILs and scheduled the jobs.
  • Provided SIT and UAT support.
  • Involved in IQA and EQA reviews of code and test cases.
  • Involved in performance tuning of jobs using performance statistics.
  • Involved in production implementation and post-implementation support.
  • Developed BTEQ scripts to load and update data in target tables (see the BTEQ sketch below).
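
The commands below sketch the HDFS file handling and Hive table setup described above; the paths, database, table, and column definitions are placeholders, not values from the actual project.

# Hypothetical flow: land a processed file in HDFS and expose it through an external Hive table
hdfs dfs -mkdir -p /data/processed/orders
hdfs dfs -put /staging/orders_daily.dat /data/processed/orders/
hdfs dfs -ls /data/processed/orders

hive -e "
CREATE DATABASE IF NOT EXISTS dw;
CREATE EXTERNAL TABLE IF NOT EXISTS dw.orders (
  order_id     BIGINT,
  customer_id  BIGINT,
  order_amt    DECIMAL(12,2),
  order_dt     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION '/data/processed/orders';
"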
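
A minimal BTEQ sketch of the load/update pattern mentioned in the last bullet; the logon string, databases, tables, and columns are placeholders, and only the final statement's error code is checked in this simplified form.

#!/bin/bash
# Hypothetical BTEQ upsert: update matching rows, then insert the new ones
bteq <<'EOF'
.LOGON tdprod/etl_user,password;

UPDATE tgt
FROM dw.customer_dim tgt, stg.customer src
SET cust_name = src.cust_name
WHERE tgt.cust_id = src.cust_id;

INSERT INTO dw.customer_dim (cust_id, cust_name)
SELECT s.cust_id, s.cust_name
FROM stg.customer s
WHERE NOT EXISTS (SELECT 1 FROM dw.customer_dim d WHERE d.cust_id = s.cust_id);

.IF ERRORCODE <> 0 THEN .QUIT 8;
.QUIT 0;
EOF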
