Senior ETL Lead Developer/Data Engineer Resume
SUMMARY
- A passionate developer with 11 years of experience in design, development, implementation, testing, and production support/maintenance of Enterprise Data Warehouses (EDW), Data Marts, and Data Lakes using Informatica, Oracle, Teradata, and Big Data technologies.
- Solid experience in the Healthcare & Pharmacy domains; have taken HIPAA training to protect PHI/PII healthcare data and worked on both public and private sector projects.
- Good experience in the Finance, Telecommunications & GIS domains.
- Worked on Waterfall and Agile projects; good experience with relational and dimensional data warehouses/data marts.
- Strong work experience with data extraction, transformation, and loading using the ETL tool Informatica PowerCenter.
- Experience in converting Oracle/Informatica projects to a big data environment, converting Informatica ETL mappings/jobs to PySpark programs, and loading data into HDFS.
- Good understanding of the Hadoop Distributed File System (HDFS) and of using the Hue portal to access HDFS files and Hive/Impala databases.
- Experience in creating Sqoop scripts for history data loads from Oracle to the big data environment.
- Automated big data/Informatica jobs using the Control-M scheduler.
- Strong experience in writing complex Oracle SQL/PL/SQL queries, stored procedures, triggers, and functions. Worked in both OLTP and OLAP environments.
- Experience with data management and with creating data sets and data pipelines.
- Participate in multi-organizational teams to determine and codify business requirements and models/designs; responsible for ETL design, development, and support activities.
- Conduct facilitated discussions with customers/business users to review current and gather additional business requirements; analyze and write business requirements, design documents, scope documents, program specifications, and system documentation.
- Design test procedures and test systems for validity and reliability; ensure logical and systematic conversion of customer requirements into total systems solutions that acknowledge scope, schedule, and cost constraints. Provide data analysis, data architecture, and design assistance to functional/systems analysts and users.
- Proficient in data warehousing techniques; extensively work on complex mappings using Type 1, Type 2, and Type 3 Slowly Changing Dimension techniques, CDC (Change Data Capture), audit tables, and error-handling techniques for the ETL process (a minimal PySpark sketch of the SCD/CDC pattern follows this summary).
- Expertise in performance tuning, gap analysis, debugging, identifying the bugs in existing mappings/workflows by analyzing the data flow and evaluating transformations.
- Resolving issues and delivering projects on time to the end customer, including effective communication/coordination with the client/business team.
- Expertise in creating/enhancing Informatica mappings, mapplets, sessions, worklets, and workflows, and in working with transformations such as Source Qualifier, Router, Filter, Expression, Joiner, Aggregator, Sequence Generator, Lookup, Sorter, Union, and Transaction Control.
- Developed transformation logic, designed various complex mappings and reusable mapplets, and used parameters/variables to apply object-oriented techniques and facilitate reusability of the code.
- Worked on advanced Informatica concepts such as partitioning and Pushdown Optimization to improve load performance; also created a drop/create index procedure so the Informatica bulk load option could deliver high performance while loading history data into the DWH.
- Good experience with dimensional modeling techniques such as Star, Snowflake, and Hybrid schemas, and good knowledge of SDLC methodologies.
- Strong experience with data warehouse fact types (additive, semi-additive, non-additive, and factless fact tables) and dimension types (conformed, junk, degenerate, and role-playing dimensions).
- Experience in using Teradata utilities such as BTEQ, FastExport (FEXP), MultiLoad (MLOAD), FastLoad (FLOAD), and Teradata Parallel Transporter (TPT), and in developing data flow paths for loading, transforming, and maintaining the data warehouse. Extensively worked with Teradata SQL Assistant and shell scripting.
- Experience in troubleshooting Teradata scripts, fixing bugs, addressing production issues, and performance tuning. Collected statistics on all tables used in queries as needed and created tables, temporary tables, and views as required.
- Expertise in physical modeling, including the use of primary, secondary, PPI, and join indexes.
- Created UNIX shell scripts to automate and schedule Informatica jobs, Teradata scripts, and error reporting in Teradata.
- Experience in data visualization and creating dashboards using Tableau, BO & SAS.
- Experience in data modeling; used Erwin/Microsoft Visio to develop new logical, conceptual, and physical models for data and system flows, and performed data profiling before starting development.
- Experience in Informatica admin activities such as migrating Informatica objects (mappings, workflows, source files) across multiple environments.
- Involved in job implementation/enhancements in the Autosys, WLM, Control-M & DAC automation tools, and in running production jobs in batch using scheduling tools as per their schedules.
- Lead experience as well, managing a team of six in my latest project.
- Dedicated, proactive, and enthusiastic about learning new technologies and tools.
- Hands-on experience in handling/supporting production systems at the application level, with the ability to handle application-level tickets during development phases and in the production environment.
- Exceptional verbal and written communication skills & excellent team player.
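The following is a minimal PySpark sketch of the SCD Type 2 / hash-based CDC pattern mentioned in this summary. It is illustrative only: all table, column, and key names (stg.customer_daily, dw.dim_customer, customer_id, name, address) are hypothetical, and the staging extract is assumed to carry the same business columns as the dimension minus the SCD housekeeping columns.

```python
# Minimal PySpark sketch of an SCD Type 2 rebuild with MD5-based change detection.
# Names are placeholders; schemas are assumed to line up as described above.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("scd2_cdc_sketch")
         .enableHiveSupport()
         .getOrCreate())

src = spark.table("stg.customer_daily").alias("src")          # today's extract
dim = spark.table("dw.dim_customer")
hist = dim.filter(F.col("current_flag") == "N")               # already-expired versions
curr = dim.filter(F.col("current_flag") == "Y").alias("tgt")  # active versions

# MD5 over the tracked attributes on each side: differing hashes mean the row changed (CDC)
src_hash = F.md5(F.concat_ws("|", F.col("src.name"), F.col("src.address")))
tgt_hash = F.md5(F.concat_ws("|", F.col("tgt.name"), F.col("tgt.address")))

matched = src.join(curr, F.col("src.customer_id") == F.col("tgt.customer_id"))
unchanged = matched.filter(src_hash == tgt_hash).select("tgt.*")
changed = matched.filter(src_hash != tgt_hash)

expired = (changed.select("tgt.*")                            # close out the old version
           .withColumn("current_flag", F.lit("N"))
           .withColumn("expiry_date", F.current_date()))
new_version = (changed.select("src.*")                        # open a new current version
               .withColumn("current_flag", F.lit("Y"))
               .withColumn("effective_date", F.current_date())
               .withColumn("expiry_date", F.to_date(F.lit("9999-12-31"))))
brand_new = (src.join(curr, F.col("src.customer_id") == F.col("tgt.customer_id"), "left_anti")
             .withColumn("current_flag", F.lit("Y"))
             .withColumn("effective_date", F.current_date())
             .withColumn("expiry_date", F.to_date(F.lit("9999-12-31"))))
absent_today = curr.join(src, F.col("tgt.customer_id") == F.col("src.customer_id"), "left_anti")

# HDFS/Hive offer no in-place updates, so the dimension is rebuilt and written to a
# staging table that a later step swaps in.
rebuilt = (hist.unionByName(unchanged).unionByName(expired)
           .unionByName(new_version).unionByName(brand_new).unionByName(absent_today))
rebuilt.write.mode("overwrite").format("parquet").saveAsTable("dw.dim_customer_stg")
```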
TECHNICAL SKILLS
RDBMS: Teradata V15/14/13/12, Oracle 11g/10g, SQL Server 2014/2013/2012
Big Data Technologies: Hadoop (HDFS), PySpark, Sqoop, Hive, Impala, StreamSets
ETL Tools: Informatica PowerCenter 10.1/9.x/8.x, Informatica Developer (IDQ)
Teradata Utilities: Import Utilities, Teradata FastLoad, MLoad, FastExport, TPT
Cloud Technologies: IICS (Informatica Intelligent Cloud Services), Oracle cloud
Scheduling Tools: Autosys, Control-M V9/8, WLM, DAC
Query Tools: Teradata SQL Assistant, SQL Developer, Toad
Languages: Python, XML, Java basics
Reporting Tools: Tableau, Business Objects (BO), SAS
Version Tools: Tortoise SVN
Data Modeling Tools: Erwin
CR/SR/Incident Tools: ServiceNow, Remedy, RTC, TSRM, Pac2000
Other Tools: Putty, SSH, WinSCP
MS Office: Excel, Word, PowerPoint, Access, Projects, MS Visio
Operating Systems: Windows, Unix, Linux, Ubuntu
Datawarehouse Schemas: Star, Snowflake, Hybrid
PROFESSIONAL EXPERIENCE
Confidential
Senior ETL Lead Developer/Data Engineer
Responsibilities:
- Processing Provider, Managed Care, Finance, Recipient, Enrollment, Eligibility, Claims (Dental, Physician, Inpatient/Outpatient, and Pharmacy, including NDC, ICD-9/10, drug, DRG, diagnosis/procedure, and payer/amount data), Prior Authorization, Electronic Visit Verification, and Episode of Care data into the warehouse on a daily/weekly/monthly/quarterly basis. The project uses the Informatica PowerCenter ETL tool to pull source data from various RDBMS systems, the main source being MITS (Medicaid Information Technology System), and loads the data into OHHSEDW.
- Also converting all existing Informatica jobs to the Innovative Ohio big data platform using Hadoop, PySpark, Sqoop, Hive, Impala, StreamSets, Python, and UNIX shell scripting.
- Strong experience in analyzing/understanding business requirements, converting them to technical requirements, and implementing code using BRD and mapping documents.
- Extracting data from multiple RDBMS sources like Oracle, Teradata, SQL Server.
- Processing XML, JSON, and delimited files using Informatica, and transforming and loading them into the data warehouse/data mart.
- Development/Enhancement of Informatica Mappings using various transformations for enhancements and improved reusability of the code using mapplets/ worklets and variables/parameters concepts.
- Converting Informatica ETL mappings to PySpark programs, applying all the transformation logic in PySpark (Spark with Python), and loading data into HDFS (Hadoop Distributed File System) in Parquet file format (a condensed sketch follows this list).
- Creating Hive tables; good at writing HQL (Hive Query Language) queries.
- Creating Impala views in the big data environment for business users' data analytics.
- Developed Sqoop scripts for history data loads from RDBMS sources (Oracle, Teradata, SQL Server) to Hadoop (a wrapper sketch follows this list).
- Involved from scratch in the big data environment setup for EDW projects; worked with admins on role creation, project folder creation on the edge node and in HDFS, and 7078 form submissions for access requests.
- Creating data pipelines in StreamSets for the FTP file transfer process from one server to another.
- Good working knowledge of Hadoop MapReduce and the YARN job execution process.
- Collecting statistics on Hive tables on a weekly basis for good performance.
- Experience with data management, creating data pipelines using StreamSets, and creating data sets using PySpark.
- Automating PySpark and Sqoop jobs using the Control-M scheduler and creating UNIX scripts to call PySpark/Sqoop programs through Control-M.
- Designing/developing various complex Informatica ETL mappings, using parameters and variables, and developing reusable worklets and mapplets.
- Working on Slowly Changing Dimensions (SCD) Type 1 and Type 2.
- Performing CDC (Change Data Capture) loads using the Informatica MD5 function, date/audit columns, or the Oracle ORA_HASH() function.
- Experience in using connected/unconnected lookups, used transaction control transformation to split the source data into multiple files based on given criteria.
- Troubleshooting the Teradata scripts and analyzing informatica Session/ workflow logs for error handling and to find out the root cause.
- Expert in optimizing Informatica mappings/sessions; used Informatica partitioning and Pushdown Optimization (PDO) to improve ETL job performance. Identifying performance bottlenecks and optimizing the respective databases and processes for performance improvement.
- Preparing/Updating ETL Tech design documents for each release.
- Expert in writing complex Oracle/Teradata SQL queries to join multiple source tables to extract the data & loading into stage tables. Applying most of the transformation logic at the query level like simple transformations, joins, filters etc. for good performance.
- Working with Teradata SQL Assistant to execute queries against the Teradata system and using Teradata Viewpoint to monitor Teradata jobs/queries.
- Good experience in tuning the Teradata scripts to avoid spool space and other issues.
- Creating Global Temporary Tables for intermediate staging process instead of complex subqueries/derived tables.
- Creating BTEQ & Fast Export scripts to export data from Teradata system and loading into Staging area or DWH.
- Working with source teams to create secondary indexes on columns frequently used in join/filter conditions, using explain plans and diagnostic statistics for better performance.
- Working with traditional as well as dimensional modeling techniques such as Star, Snowflake, and Hybrid schemas.
- Performing unit testing in DEV and preparing unit test documents.
- Performing code reviews before migrating the ETL/DB/Control-M/SAS objects from DEV to higher environments.
- Working with Modelers/Architects on tables design & identifying and applying audit and redaction security policies on tables to protect PHI/PII data and to avoid unauthorized access to sensitive data.
- Working with Informatica, Oracle, SAS, Control-M admins for Change Tickets implementations, UAT migrations, Production fixes etc.
- Working with business team and tuning their SAS queries/Tableau reports for better performance.
- Giving production support for ODM - EDW projects & running the production jobs in batch process on daily/weekly/monthly/quarterly basis.
- Performing impact analysis, Root Cause Analysis (RCA) and resolving the production Issues/defects and closing the tickets without missing SLA.
- Handling PHI/PII healthcare data and working with DBAs to create audit/redaction policies and HIPAA rules on sensitive data tables.
- Processing COVID-19 data to provide Medicaid services to eligible recipients in Ohio.
- Creating change Requests (CR)/Service requests (SR)/Incidents (INC) using RTC/Remedy/ServiceNow tools. Handling CR submissions for Release and working with business team for B-CAB approvals to implement the changes in production.
- Taking the initiative to automate ETL processes/jobs efficiently to avoid manual intervention during history data loads.
- Managing a team of six, taking the initiative to train new resources, and preparing documentation for KT sessions.
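As a condensed illustration of the Informatica-to-PySpark conversion pattern described above (pull a daily slice from an RDBMS source, apply the mapping's transformation logic, land Parquet in HDFS, register a Hive table): the JDBC URL, credentials, schema/table names, columns, and paths below are assumptions for illustration, not actual project values.

```python
# Hypothetical sketch of one converted mapping re-implemented in PySpark.
# Connection details, table/column names, and paths are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("claims_conversion_sketch")
         .enableHiveSupport()
         .getOrCreate())

# Source Qualifier equivalent: push the daily selection down to Oracle over JDBC
claims = (spark.read.format("jdbc")
          .option("url", "jdbc:oracle:thin:@//oracle-host:1521/SRC_SERVICE")
          .option("query", "SELECT * FROM mits.claims WHERE load_dt = TRUNC(SYSDATE)")
          .option("user", "etl_user")
          .option("password", "****")
          .option("fetchsize", "10000")
          .load())

# Expression/Filter transformation logic expressed as DataFrame operations
transformed = (claims
               .filter(F.col("claim_status").isin("PAID", "DENIED"))
               .withColumn("paid_amt", F.coalesce(F.col("paid_amt"), F.lit(0)))
               .withColumn("load_ts", F.current_timestamp()))

# Target: Parquet files in HDFS, registered in the Hive metastore so Hive/Impala
# views can be layered on top (Impala may additionally need a metadata refresh).
(transformed.write
 .mode("overwrite")
 .format("parquet")
 .partitionBy("claim_status")
 .option("path", "hdfs:///data/edw/claims/daily")
 .saveAsTable("edw.claims_daily"))
```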
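Similarly, a minimal sketch of the kind of wrapper a Control-M job could call for the Sqoop history loads; the connection string, password file location, table names, and HDFS target directories are assumptions for illustration.

```python
#!/usr/bin/env python
# Hypothetical wrapper for a Sqoop history load, of the kind a Control-M job
# could invoke. Connection details, table names, and HDFS paths are placeholders.
import subprocess
import sys

def sqoop_history_load(table: str, target_dir: str) -> int:
    """Build and run a sqoop import for one source table; return its exit code."""
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//oracle-host:1521/SRC_SERVICE",
        "--username", "etl_user",
        "--password-file", "/user/etl_user/.sqoop_pwd",   # keep the password off the command line
        "--table", table,
        "--target-dir", target_dir,
        "--as-parquetfile",
        "--num-mappers", "8",
    ]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    # Control-M passes the source table and target directory as arguments and
    # treats a non-zero exit code as a job failure.
    sys.exit(sqoop_history_load(sys.argv[1], sys.argv[2]))
```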
Confidential
Sr. ETL (Teradata/Informatica) Developer
Responsibilities:
- Analyzed the specifications provided by the business, converted them to technical requirements, and implemented the code using BRD, FSD, SIA, and mapping documents.
- Development/Enhancement of Informatica Mappings using various transformations for enhancements and improved reusability of the code using mapplets/ worklets and variables/parameters concepts.
- Analyzing the business requirements and performing the impact analysis to further develop the design and actual development.
- Designed various complex ETL mappings and written complex queries to join multiple source tables.
- Sourced data from OLTP/ODS RDBMS systems and delimited files, cleansed it using various data profiling techniques, and loaded it into the warehouse through the ETL process.
- Worked on Slowly Changing Dimensions (SCD) Type 1, Type 2, and Type 3.
- Developed FastLoad and MultiLoad scripts, and developed BTEQ scripts to process the data on the staging server.
- Using explain plans, analyzed and modified indexes and rewrote queries with derived or temporary tables to improve performance. Also utilized Teradata Viewpoint.
- Used BTEQ and SQL Assistant (Query man) front-end tools to issue SQL commands matching the business requirements to Teradata RDBMS.
- Analyzed and designed USI and NUSI based on the columns used in Joins during data retrieval and to improve the performance of the scripts.
- Prepared Tech design documents for each release and used framework to generate one to one mappings and creating/inserting data to the metadata tables to make use of Framework.
- Creating and validating BTEQ scripts, UNIX Shell scripts and performed code reviews before migrating the code from lower to next higher environment.
- Strong in writing complex SQL queries to join multiple source tables to get the data into stage area.
- Tuned Teradata scripts by collecting statistics and defining indexes; worked with DBAs on database performance tuning and analyzed queries using Explain.
- Worked on troubleshooting the Teradata scripts, fixing bugs and addressing production issues without missing SLAs.
- Performed unit testing at DEV and prepared test case documents with all the test cases.
- Performed code reviews before migrating the code (Informatica/UNIX objects) from DEV to higher environments (ST, UAT, RT, PROD) to ensure the code stays in sync across all environments.
- Created/enhanced UNIX shell scripts to run BTEQ or to call Informatica workflows.
- Automated the ETL jobs using Autosys and created new jobs/changed existing jobs to stay in sync with upstream changes.
- Involved in identifying the performance bottlenecks and optimizing respective databases and processes for performance improvement.
- Creating change tickets/assigning Work orders to Informatica/Teradata/Autosys teams to implement the code in Production.
- Creating databases/roles in Teradata as per the SOR requirement.
- Resolved production issues and closed tickets, documenting the solutions, without missing SLAs.
- Utilized PAC2000 tool to create/track change tickets/work orders/work requests/problem tickets etc. in Production.
- Creating/Maintaining Production Artifacts to log the production issues and update schedules.
- Used Tortoise tool to track the code versions and SQL Developer tool to check the data in source system (Oracle).
- Communicated with Teradata/Oracle DBAs in case of any DBA related issues.
- Involved in ECHO calls to handle production issues effectively when complicated issues required many other teams to work on them.
- Implemented an error-reporting script to send the list of error tables to the group daily, showing the records filtered out in production (a hedged sketch follows this list).
- Prepared user manuals to help users understand the process.
- Gave training to new/junior team members.
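As an illustration only, a hedged sketch of what such a daily error-table report could look like in Python, assuming the teradatasql driver; the host, credentials, error-table names, SMTP host, and e-mail addresses are placeholders, not values from the project.

```python
# Hypothetical daily error-table report: count rows filtered into ET/UV error
# tables and mail the summary to the support group. All names are placeholders.
import smtplib
from email.message import EmailMessage

import teradatasql  # Teradata's Python DBAPI driver (pip install teradatasql)

ERROR_TABLES = ["STG_DB.CLAIMS_ET", "STG_DB.CLAIMS_UV", "STG_DB.PROVIDER_ET"]

def collect_error_counts():
    """Return (table, row_count) pairs for each configured error table."""
    counts = []
    con = teradatasql.connect(host="td-host", user="etl_user", password="****")
    try:
        cur = con.cursor()
        for table in ERROR_TABLES:
            cur.execute("SELECT COUNT(*) FROM " + table)
            counts.append((table, cur.fetchone()[0]))
    finally:
        con.close()
    return counts

def send_report(counts):
    """Email the error-table summary to the support distribution list."""
    body = "\n".join(f"{table}: {count} rejected rows" for table, count in counts)
    msg = EmailMessage()
    msg["Subject"] = "Daily Teradata error table report"
    msg["From"] = "etl-batch@example.com"
    msg["To"] = "dw-support@example.com"
    msg.set_content(body or "No rejected rows today.")
    with smtplib.SMTP("smtp.example.com") as smtp:
        smtp.send_message(msg)

if __name__ == "__main__":
    send_report(collect_error_counts())
```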
Confidential
ETL Developer
Responsibilities:
- As part of the DELTA upgrade, converted all SQL Server Agent jobs to Informatica workflows as per the business requirements.
- Created/enhanced BTEQ/MLOAD/FLOAD scripts to load the data into Teradata database.
- Attended business meetings with client/SME for the clarification on requirements.
- Created/Modified mappings/sessions/workflows and developed Teradata scripts to load the data from files to tables and implemented new jobs in control-M tool.
- Performed unit testing and addressed all defects by tracing back through the code.
- Prepared Control-M setup sheets to create new jobs/change existing jobs in Test and Production.
- Monitored jobs and fixed production issues without missing SLAs; provided RCA (Root Cause Analysis) for the issues.
- Logged the solutions to the production issues in prod artifacts for future reference.
- Used TSRM tool to create/update/track the change tickets/work orders/Issues in Test and Production.
- Provided support for outages and patching that impacted DELTA loads.
- Effectively handled all types of production issues, such as unavailability of source files, format/index-related issues, network-related issues, and Teradata/SQL Server connection issues.
- Converted all WLM jobs to Control-M jobs as part of the upgrade.
- Prepared shell scripts to automate the ETL jobs in Control-M (an illustrative wrapper sketch follows this list).
- Prepared DB Objects, code migration forms to migrate the code from DEV to higher environments.
- Effectively communicated with business users/SME in case of any delays/issues in loading the data.
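As an illustration, a hedged Python equivalent of the kind of wrapper script a Control-M job could call to start an Informatica workflow via pmcmd; the integration service, domain, folder, credentials, and workflow names are placeholders, and the exact pmcmd options depend on the PowerCenter environment.

```python
#!/usr/bin/env python
# Hypothetical Control-M wrapper that starts an Informatica workflow with pmcmd
# and propagates the exit code so the scheduler can detect success or failure.
# Service, domain, folder, credentials, and workflow names are placeholders.
import subprocess
import sys

def run_workflow(folder: str, workflow: str) -> int:
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", "INT_SVC_DEV",   # integration service
        "-d", "Domain_ETL",     # PowerCenter domain
        "-u", "etl_user",
        "-p", "****",
        "-f", folder,
        "-wait",                # block until the workflow finishes
        workflow,
    ]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    # e.g. run_workflow.py EDW_FOLDER wf_load_claims
    sys.exit(run_workflow(sys.argv[1], sys.argv[2]))
```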