Lead Consultant - Big Data Resume
Plano, TX
SUMMARY
- 9+ years of IT industry experience in Requirements gathering, Analysis, ETL Design, Development, Testing and Implementation of Data warehousing systems.
- Experience working with Big Data technologies on Cloudera Hadoop, including Hive, Pig, MapReduce and Spark, and with the SnapLogic ETL tool.
- Experience in cloud implementation using Amazon Web Services components such as S3 storage, Elastic MapReduce (EMR), Amazon Redshift and AWS console management.
- Knowledge of security governance using Apache Sentry and AWS roles and policies.
- Good knowledge of job scheduling via cron, Oozie and Airflow.
- Extensive experience in development of various business warehouse applications using the ETL tool Informatica PowerCenter 9.0.1/8.6.1/7.1.2.
- Good experience working with various data sources such as IBM AS/400, DB2/400, Oracle, Teradata, SQL Server and flat files.
- Experience working with Informatica PowerExchange for DB2 on IBM i Series (AS/400), including configuration and implementation of the Change Data Capture (CDC) option.
- Extensively worked on Informatica PowerCenter components - PowerCenter Designer, Workflow Manager (to create workflows and sessions), Workflow Monitor and Repository Manager.
- Extensively worked on Informatica Designer components - Source Analyzer, Transformation Developer, Mapping Designer and Mapplet Designer.
- Extensively worked on Informatica PowerCenter transformations such as Source Qualifier, Lookup, Filter, Expression, Router, Normalizer, Joiner, Update Strategy, Rank, Aggregator, Stored Procedure, Sorter, Sequence Generator and XML Source Qualifier.
- Good expertise in performance tuning on both the database and Informatica sides - source, target, mapping, transformation and session - to make sessions more efficient.
- Good experience handling large data volumes, complex business logic and aggressive deadlines.
- Good experience in Unix Shell Scripting and ETL Process Automation using Shell Programming and Informatica.
- Good understanding of ETL concepts such as Data Profiling, Data Quality, Push-Down Optimization, Slowly Changing Dimensions, Change Data Capture, Data Validation etc.
- Extensive experience in delivering critical solutions on various levels of project planning and execution, Quality Assurance and System/Business Analysis.
- Extensive experience in writing complex SQL queries, PL/SQL, Stored Procedures, Packages, Views, Synonyms, and Triggers.
- Experience in tasks like Project tracking, Mentoring, Version Controls, Software Change Request (SCR / SCM) management, Project Deliveries / Quality Control and Migration.
- Good experience working in an onsite-offshore model, with the leadership qualities to lead a team and guide it to execute projects to high standards.
- Good experience in various Industry verticals like Life Sciences, Manufacturing & logistics, Retail & Consumer and Banking and Financial Services.
- Good communication and interpersonal skills; a proactive self-starter and excellent team player.
- Ability to learn and adapt to new technologies quickly.
TECHNICAL SKILLS
ETL Tools: Big Data - Hadoop (MapReduce, Hive, Impala, Pig, Apache Sqoop), SnapLogic, Informatica PowerCenter 9.1/8.6.1/7.1.0 and Informatica PowerExchange
Databases & Tools: Oracle 11g/10g, IBM AS/400, DB2/400, MS SQL Server 2008/2005, Teradata, IBM DB2 9.5/8.0/7.0, MS Access, Toad, Erwin, SQL*Plus; Hadoop - HDFS; AWS Cloud - RDS; NoSQL DB - DynamoDB and Redshift
Languages: Unix Shell Scripting, Perl Scripting, SQL, PL/SQL, Python
Reporting Tools: Business Objects and Cognos
Scheduling & Versioning Tools: IBM Tivoli, Autosys, Oozie & MKS, ClearCase
Environment: IBM AS/400, UNIX, Linux, IBM AIX 4.2/4.3, Win XP/7
PROFESSIONAL EXPERIENCE
Confidential, Plano TX
Lead Consultant - Big Data
Responsibilities:
- Managing/Coordinating the Enhancement releases from requirement gathering, Design, Development, System testing, QA and Production deployment between teams in Agile mode.
- Involved in Requirement gathering, Conceptual, Logical and Physical Data model design by following the Enterprise Standards and Naming Conventions.
- Closely working with Operation Architect Group to design the Technical Specification.
- Gathering gaps during technical review sessions and making sure that everything is documented and signed off properly.
- Clearly defining the scope and assumptions of interfaces and proactively communicating risks and mitigation plans to business stakeholders.
- Involved in coding the ETL flow using SnapLogic and cloud implementations using AWS EMR and Spark (a minimal transformation sketch follows this list).
- Creating Test Plans for Unit Testing for designed jobs.
- Coordinating with the offshore team for daily task updates to make sure the project stays on track for its deadlines.
- Maintaining security governance using Sentry, AWS roles and policies in the cloud, and other internal tools.
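A minimal PySpark sketch of the kind of EMR transformation referenced above, assuming hypothetical S3 paths and column names (the actual flow also used SnapLogic pipelines):

```python
# Minimal PySpark sketch of an EMR-style transformation; the bucket,
# paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("emr_etl_sketch").getOrCreate()

# Read raw source extracts landed in S3 (illustrative path only).
orders = spark.read.parquet("s3://example-bucket/landing/orders/")

# Simple aggregation written back to S3 for downstream consumers
# such as Redshift.
daily_totals = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "region")
    .agg(F.sum("amount").alias("total_amount"),
         F.countDistinct("order_id").alias("order_count"))
)

daily_totals.write.mode("overwrite").parquet(
    "s3://example-bucket/curated/daily_order_totals/"
)
```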
Confidential, Portland OR
Sr. Hadoop Engineer
Responsibilities:
- Attended Business calls to gather and analyze the requirements.
- Interacted with business analysts and modelers for better understanding of individual subject areas and modified specifications to reflect accurate user needs.
- Helped to architect the production and pre-prod (development) Hadoop clusters.
- Helped set up Cloudera's CDH 4.3.1 and upgraded to CDH 5.4 after multiple interactions with the Cloudera team.
- Involved in data modeling and high-level design for the Hadoop transformation.
- Produced the Detailed Design document/Technical specification and got Business Approval.
- Designed and developed ETL objects; also Sqooped data from the source systems into the Hadoop file system.
- Implemented the direct connection with AWS by opening the firewall between AWS and the Confidential network for external source systems.
- Developed UNIX shell scripts for importing data from the source systems and bringing it into HDFS through AWS S3 storage.
- Developed Hive queries to aggregate the click-stream data that was imported into HDFS using Sqoop.
- Developed Pig scripts that can perform multiple aggregations on a single data set.
- Archived HDFS data in AWS S3 storage and pushed it to Redshift for all other downstream processes.
- Configured the above jobs in Oozie and Airflow, and used AWS Lambda functions to set up dependencies on external data sources (a sketch of such an Airflow pipeline follows this list).
- Maintained security governance using Sentry and AWS roles and policies.
- Reviewed development work and fixed any issues.
- Performed Unit testing & provided Test comments.
- Produced Data extracts during system testing and got Business Approval.
- Coordinated with other teams - Upstream and Downstream teams for any impacts and Testing Team during Development and System testing.
- Debugged issues & provided fixes during the UAT phase.
- Gathered status from team members in the daily scrum to ensure the project stayed on schedule.
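A hedged sketch of an Airflow DAG wiring a Sqoop import to a Hive aggregation, similar in shape to the pipeline above; the connection string, tables, paths and schedule are hypothetical, and the real jobs also ran under Oozie:

```python
# Hypothetical Airflow DAG: Sqoop import into HDFS, then a Hive
# aggregation over the imported click-stream data.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # airflow.operators.bash in Airflow 2.x

default_args = {"owner": "etl", "retries": 1, "retry_delay": timedelta(minutes=10)}

with DAG(
    dag_id="clickstream_daily_load",
    default_args=default_args,
    start_date=datetime(2017, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    # Pull the day's click-stream rows from the source RDBMS into HDFS.
    sqoop_import = BashOperator(
        task_id="sqoop_import_clicks",
        bash_command=(
            "sqoop import "
            "--connect jdbc:oracle:thin:@//src-db.example.com:1521/ORCL "
            "--username etl_user --password-file /user/etl/.sqoop_pwd "
            "--table CLICK_EVENTS --target-dir /data/raw/clicks/{{ ds }} -m 4"
        ),
    )

    # Aggregate the imported data with Hive for downstream reporting.
    hive_aggregate = BashOperator(
        task_id="hive_aggregate_clicks",
        bash_command=(
            'hive -e "INSERT OVERWRITE TABLE clicks_daily '
            "PARTITION (load_dt='{{ ds }}') "
            "SELECT page_id, COUNT(*) FROM clicks_raw "
            "WHERE load_dt='{{ ds }}' GROUP BY page_id\""
        ),
    )

    sqoop_import >> hive_aggregate
```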
Confidential, Chicago IL
ETL Lead
Responsibilities:
- Analysis and design of ETL processes.
- As onsite technical lead, interacted with business users and business analysts to understand and document the requirements and translate them into technical specifications and ETL mappings in Informatica.
- Developed Business Information Model and identified the key transactions for each of the Source Systems.
- Provided technical assistance by responding to inquiries regarding errors, problems, or questions with programs/interfaces.
- Interacted with business analysts and modelers for better understanding of individual subject areas and modified specifications to reflect accurate user needs.
- Involved in sessions with SMEs and business users to analyze and gather the high level requirements.
- Managing/Coordinating the Enhancement releases from requirement gathering, Design, Development, System testing, QA and Production deployment between teams.
- Mapped the source and target databases by studying the specifications and analyzing the required transforms.
- As a technical lead, involved in snowflake data modeling, including both LDM and PDM preparation.
- Involved in performance tuning at various levels of the ETL mappings.
- Drafted Test Plans, Back out plans and Contingency measures during the Production Patches/Fixes for the existing applications.
- Developed UNIX shell scripts for scheduling the Autosys jobs (a small wrapper sketch follows this list).
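The Autosys trigger scripts noted above were written in UNIX shell; the sketch below shows the same idea as a small Python wrapper around the Autosys sendevent CLI, with a hypothetical job name:

```python
# Illustrative wrapper around the Autosys CLI; the job name is a
# hypothetical placeholder, not an actual job from the project.
import subprocess
import sys

JOB_NAME = "dw_nightly_load_box"


def start_job(job_name: str) -> int:
    """Force-start an Autosys job and return the sendevent exit code."""
    result = subprocess.run(
        ["sendevent", "-E", "FORCE_STARTJOB", "-J", job_name],
        capture_output=True,
        text=True,
    )
    print(result.stdout or result.stderr)
    return result.returncode


if __name__ == "__main__":
    sys.exit(start_job(JOB_NAME))
```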
Confidential
ETL Lead
Responsibilities:
- Managing/Coordinating the Enhancement releases from requirement gathering, Design, Development, System testing, QA and Production deployment between teams.
- Involved in Requirement gathering, Conceptual, Logical and Physical Data model design by following the Enterprise Standards and Naming Conventions.
- Closely worked with Operation Architect Group to design the Technical Specification.
- Gathered gaps during technical review sessions and made sure that everything was documented and signed off properly.
- Clearly defined the scope and assumptions of interfaces and proactively communicated risks and mitigation plans to business stakeholders.
- Designed and Prepared Functional specification documents, Technical specification documents, Mapping Documents for Source to Target mapping with ETL transformation rules.
- Extensively involved in Requirement Gathering, Data Model Design and Technical Design.
- Involved in Analyzing the Risk and Benefits of each requirement and change request and prioritize them accordingly.
- Followed and Implemented Best Practice methodologies.
- Optimized/tuned existing Informatica mappings at various levels by removing bottlenecks to get better performance without disturbing existing system functionality.
- Involved in loading huge volumes of data from source to target.
- Wrote UNIX scripts to run the Autosys jobs from UNIX.
- Extensively worked on CDC (Change Data Capture) implementation involving Type 2 dimension loads, such as the Vehicle and Dealer dimensions (a sketch of the Type 2 logic follows this section).
- Used the Informatica tool to develop processes for extracting, cleansing, transforming, integrating, and loading data into data warehouse database.
- Created Test Plans for Unit Testing/Integration testing.
- Audited database logs of unsuccessful records that failed to be created or updated with appropriate reasons.
Environment: Informatica 9.1, 8.1, Erwin 7.2, Oracle 10G, 11G, Autosys, SQL/PLSQL.
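A hedged, in-memory illustration of the Type 2 (SCD2) logic behind the CDC dimension loads described above; in the project this was built as Informatica mappings, and the vehicle attributes below are illustrative only:

```python
# Simplified Type 2 dimension maintenance: expire the changed current
# row and insert a new current version. Field names are illustrative.
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List

HIGH_DATE = date(9999, 12, 31)


@dataclass
class DimRow:
    vehicle_id: str          # natural key from the source system
    model: str
    dealer_code: str
    eff_start: date
    eff_end: date = HIGH_DATE
    current_flag: str = "Y"


def apply_scd2(dim: List[DimRow], changes: Dict[str, dict], load_dt: date) -> List[DimRow]:
    """Apply a CDC change set to the dimension using Type 2 rules."""
    out = list(dim)
    current = {r.vehicle_id: r for r in out if r.current_flag == "Y"}
    for key, new_vals in changes.items():
        old = current.get(key)
        if old and (old.model, old.dealer_code) == (new_vals["model"], new_vals["dealer_code"]):
            continue  # no attribute change, nothing to do
        if old:
            # Close out the existing version as of the load date.
            out[out.index(old)] = replace(old, eff_end=load_dt, current_flag="N")
        # Insert the new current version (also handles brand-new keys).
        out.append(DimRow(key, new_vals["model"], new_vals["dealer_code"], eff_start=load_dt))
    return out
```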
Confidential
Sr. ETL Developer
Responsibilities:
- Analyzed the requirements from the business and designed the ETL jobs.
- Architecture design and review, Technical and Functional Design (High and Low Level).
- Helped the team resolve critical, unfamiliar technical issues, with timely communication to multiple stakeholders.
- Involved in preparing Logical and Physical data model, Source Data Analysis, ETL Process flow and the schema objects with necessary Indexes, Partitions.
- Worked in the implementation of DW Incremental (pull and load) that constitutes different subject areas like (Planning, Actual, Financial Transaction, Compensation Request, Agreement, Party, Contact Point) for different source systems.
- Gained knowledge of Informatica administration.
- Interacted with business analysts and modelers for better understanding of individual subject areas and modified specifications to reflect accurate user needs.
- Mapped the source and target databases by studying the specifications and analyzing the required transforms.
- Migrated 8.5 and 8.6 jobs to Informatica 9.1 and successfully tested all jobs.
- Used stages like Transformer, Sequential File, Dataset, Oracle Enterprise, ODBC, Lookup, Join, Aggregator, Remove Duplicates, Sort and CDC.
- Involved in Analyzing the Risk and Benefits of each requirement and change request and prioritize them accordingly.
- Optimized/tuned existing Informatica jobs at various levels by removing bottlenecks to get better performance without disturbing existing system functionality.
- Involved in loading huge volumes of data from source to target.
- Wrote UNIX scripts to run the Informatica jobs from UNIX (a sketch follows the Environment line below).
- Used the Informatica Designer to develop processes for extracting, cleansing, transforming, integrating, and loading data into data warehouse database.
- Created Test Plans for Unit Testing for designed jobs.
- Audited the log database for unsuccessful records that failed to be created or updated, with the appropriate reasons.
Environment: Informatica 8.5/8.6 and 9.1, Oracle, Mainframes, UNIX Shell Scripting.
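The UNIX scripts noted above launched PowerCenter jobs from the command line; one common way to do that is Informatica's pmcmd utility, sketched here as a Python wrapper with hypothetical service, domain, folder and workflow names (credentials come from the environment):

```python
# Illustrative wrapper around pmcmd; service, domain, folder and
# workflow names are placeholders, and credentials are read from
# environment variables rather than hard-coded.
import os
import subprocess
import sys


def run_workflow(folder: str, workflow: str) -> int:
    """Start a PowerCenter workflow with pmcmd and wait for it to finish."""
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", "INT_SVC_DEV",    # integration service (placeholder)
        "-d", "Domain_DEV",      # domain name (placeholder)
        "-u", os.environ["INFA_USER"],
        "-p", os.environ["INFA_PASSWORD"],
        "-f", folder,
        "-wait", workflow,
    ]
    return subprocess.call(cmd)


if __name__ == "__main__":
    sys.exit(run_workflow("DW_LOADS", "wf_load_sales_fact"))
```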
Confidential
ETL Developer
Responsibilities:
- Involved in identifying the Key Performance Indicators (KPIs), Dimension information by analyzing the BRD and discussing with Business Analysts to build the dimensional model.
- Involved in preparing Logical and Physical data model, Source Data Analysis, ETL Process flow and the schema objects with necessary Indexes, Partitions.
- Analyzed the existing systems used by different users and established the feeds to these systems from the data warehouse.
- Designed, developed and tested the DataStage jobs using Designer and Director based on business user requirements and business rules to load data from source flat files and RDBMS tables into target tables.
- According to the business logic, created various transformations such as Source Qualifier, Lookup, Stored Procedure, Sequence Generator, Router, Filter, Aggregator, Joiner, Expression, Union, Update Strategy and Sorter.
- Involved in fixing invalid mappings, testing of stored procedures and functions, and unit and integration testing of Informatica sessions, batches and the target data.
- Mapped existing sources/feeds to the data warehouse fields.
- Used Workflow Manager for Creating, Validating, Testing and running the sequential and concurrent Sessions and scheduling them to run at specified time.
- Built the ETL layer to get the feeds into the staging/ODS tables.
- Built the ETL layer to load data from the ODS to the data mart.
- Actively participated in code review meetings for each designed job and explained the code to the tech lead.
- Provided technical assistance by responding to inquiries regarding errors, problems, or questions with programs/interfaces.
- Written Unit test cases and Prepared Unit test results and discussed with functional team.
- Created parameters and environment variables to run the same job for different schemas (a parameter-file sketch follows the Environment line below).
- Did performance tuning at the source, transformation, target and administration levels.
- Performed data validation checks and added new columns.
Environment: Informatica 8.1, UNIX, Oracle9i, SQL-Programmer.
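A hedged sketch of generating a PowerCenter parameter file per target schema so the same workflow can run against different schemas, as described above; the folder, workflow, session, connection and parameter names are hypothetical:

```python
# Generate one parameter file per schema; all names are placeholders.
SCHEMAS = ["SALES_EU", "SALES_NA"]

TEMPLATE = """[DW_LOADS.WF:wf_load_dim_customer.ST:s_m_load_dim_customer]
$$TARGET_SCHEMA={schema}
$DBConnection_TGT=CONN_{schema}
"""


def write_param_files() -> None:
    for schema in SCHEMAS:
        path = f"param_{schema.lower()}.prm"
        with open(path, "w") as fh:
            fh.write(TEMPLATE.format(schema=schema))
        print(f"wrote {path}")


if __name__ == "__main__":
    write_param_files()
```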