Data Engineer Resume
Menlo Park, CA
SUMMARY:
- 11+ years of experience in data warehousing, with in-depth knowledge of Informatica Power Center.
- Experienced working in Agile-based environments.
- Extensively worked in all phases of the software development life cycle across various data warehouse projects (requirements analysis, data profiling, data certification, high-level and low-level design, coding, performance tuning, system testing support, UAT, and implementation).
- Strong knowledge of ETL processes using Informatica Power Center 9.6.0 and earlier versions.
- Experienced in backend database programming with Oracle 9i, Teradata 13.11, Vertica 8.1.0-3, DB2, and Netezza 7.1.0.
- Good knowledge of other ETL tools such as DataStage 7.5.2 and Oracle Warehouse Builder 10g Release 2.
- Extensive scripting experience with SQL, PL/SQL, BTEQ, MLOAD, and FastLoad, along with tuning techniques.
- Extensive experience with data modeling tools: CA Erwin r9.5, Power Designer 16.1.0, and ERStudio 10.0.
- Basic knowledge of R scripting.
- Moderate knowledge of Python.
- Domain expertise in insurance, electric and utility, banking, and retail projects.
- Expertise in data analysis, reject analysis, and preparation of data mitigation plans.
- Worked with transactions involving large data volumes amounting to millions of records.
- Expertise in data conversion, data migration, and data warehousing projects.
- Strong analytical and problem-solving skills.
- Experienced in reverse engineering and documenting data lineage.
- Moderate knowledge of Informatica MDM 9.1.
- Extensive client interaction and preparation of business requirements designs; also experienced in preparing RFPs and resource planning.
- Experienced in proactively identifying issues and escalating and communicating them to business clients.
- Experienced in preparing test plans for business processes and applications.
- Experienced in delivering training to wide audiences.
- Excellent business and communication skills; able to master new skills and technologies at a fast pace; dedicated, hard-working team player.
- Professional attitude; self-motivated and multi-task oriented, with a strong work ethic and the ability to work effectively.
- Very proactive; took the initiative to assume additional responsibilities in every project.
TECHNICAL SKILLS:
ETL Tools: Informatica Power Center 9.6.0, DataStage 7.5.2, DataStage Version Control tool, OWB (Oracle Warehouse Builder) 10g Release 2
Schedulers: Control-M, Crontab
Databases: Oracle 9i, Teradata 13.11, DB2, Microsoft SQL Server 2008, Netezza 7.1.0.
Operating Systems: Windows NT / 2000 / XP / 2003, UNIX
Front End Tools: TOAD, SQL Plus, SQL Developer, AQT, Teradata SQL, SquirrelSQL
Programming Languages: SQL, PL/SQL, Unix Shell Scripting, R Scripting.
Project Management: Estimates/Scheduling, Resource Allocation, Project Implementation Planning, Tracking and Reporting
Software Development Methodologies: Agile, Waterfall
Scripting Languages: BTEQ, MLOAD, Fast Load, Fast Export, Unix scripting
Data Modeling: CA Erwin r9.5, Microsoft Visio, Power Designer 16.1.0, ERStudio 10.0
Other tools: JIRA, Rally, GitHub, Mercurial Source Control Management tool.
PROFESSIONAL EXPERIENCE:
Confidential, Menlo Park, CA
Data Engineer
Tools: Python 2.7.5, Presto 0.185-8-gbacbf06, Hive, Daiquery tool, UNIX scripting, Dataswarm framework (internal), Vertica 8.1.0-3, Informatica Power Center 9.5.1, GitHub, Mercurial Source Control Management tool.
Responsibilities:
- Built Python pipelines using internal Dataswarm operators to query, load, and copy data into Vertica/Hive tables across multiple clusters.
- Designed and implemented logic to improve table landing times.
- Optimized Vertica and Hive queries.
- As a data engineer, provided solutions and designed, coded, optimized, and implemented business logic in Python pipelines using internal Dataswarm operators (HiveQLOperator, PrestoOperator, VerticaOperator, PrestoInsertOperator, VerticaClusterCopyOperator, etc.); see the sketch after this list.
- Enhanced and supported all pipelines in production.
- Performed reverse engineering of physical data models from databases, SQL scripts, and Excel spreadsheets during the Vertica-to-Presto conversion project.
- Helped fix data issues during the Vertica-to-Presto migration.
- Conducted design and review meetings to get design approval from the project's stakeholders.
- Responsible for translating requirements into technical specifications and developing design documents such as mapping documents and the data dictionary.
- Involved in data profiling and performed data analysis based on the requirements, which helped catch many sourcing issues upfront.
- Created mappings and workflows in Informatica to query and load data from Salesforce and Omega.
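A minimal sketch of what one of these pipelines looks like. Dataswarm is an internal framework, so the import path, operator signatures, and dependency-chaining syntax below are assumptions modeled on Airflow-style schedulers; only the operator names come from the project itself, and all table and cluster names are placeholders.

```python
# Hypothetical Dataswarm-style pipeline: stage events in Hive, summarize
# via Presto, then copy the summary out to the serving Vertica clusters.
from dataswarm.operators import (  # assumed import path
    HiveQLOperator,
    PrestoInsertOperator,
    VerticaClusterCopyOperator,
)

# Stage one day's raw events into a Hive staging partition.
stage_events = HiveQLOperator(
    task_id="stage_events",
    hiveql="""
        INSERT OVERWRITE TABLE staging.daily_events PARTITION (ds = '<DATEID>')
        SELECT user_id, event_type, event_ts
        FROM raw.events
        WHERE ds = '<DATEID>'
    """,
)

# Aggregate the staged partition into a reporting table through Presto.
load_summary = PrestoInsertOperator(
    task_id="load_summary",
    table="reporting.daily_event_summary",
    partition="ds=<DATEID>",
    sql="""
        SELECT event_type, COUNT(*) AS event_cnt
        FROM staging.daily_events
        WHERE ds = '<DATEID>'
        GROUP BY event_type
    """,
)

# Fan the summary out to the Vertica clusters that back the dashboards.
copy_to_vertica = VerticaClusterCopyOperator(
    task_id="copy_to_vertica",
    source_table="reporting.daily_event_summary",
    target_clusters=["vertica_cluster_a", "vertica_cluster_b"],  # placeholders
)

# Assumed Airflow-style dependency chaining.
stage_events >> load_summary >> copy_to_vertica
```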
Confidential, San Francisco, CA
Technology Lead/Data Modeler
Tools: Netezza 7.1, SVN, ERStudio 10.0, JIRA, Aginity, Teradata 13.11, PuTTY.
Responsibilities:
- Responsible for understanding the requirements in the BRD; developed the technical design document and mapping document.
- Worked with the product team to understand task priorities before the start of each sprint cycle.
- Suggested and made changes to the data model based on the requirements and obtained the required approvals from the Architecture team.
- Developed and fine-tuned SQL scripts in Netezza; implemented scenario-based soft-delete logic in the scripts to support a common CDC automation script (see the sketch after this list).
- Developed UNIX wrapper scripts to execute the SQL code snippets.
- Documented test results and wrote an execution support manual for the system support team.
- Documented and communicated source data issues and data outliers to the product and source teams.
- Responsible for migrating code across environments and testing; supported QA, UAT, and implementation.
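For illustration, a minimal sketch of the scenario-based soft-delete pattern, written here as Python driving a generic DB-API connection to Netezza; the table and column names are placeholders, not the project's actual schema.

```python
# Soft-delete sketch: rows that disappear from the latest source extract are
# flagged rather than physically deleted, so the common CDC automation can
# pick the change up downstream. Schema names are illustrative only.
SOFT_DELETE_SQL = """
    UPDATE edw.customer_dim
    SET    delete_flag = 'Y',
           updt_ts     = CURRENT_TIMESTAMP
    WHERE  delete_flag = 'N'
      AND  NOT EXISTS (
               SELECT 1
               FROM   stage.customer_src s
               WHERE  s.customer_id = edw.customer_dim.customer_id
           )
"""

def apply_soft_deletes(conn):
    """Flag target rows that are missing from the staged source snapshot."""
    cur = conn.cursor()
    cur.execute(SOFT_DELETE_SQL)
    conn.commit()
    cur.close()
```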
Confidential, San Francisco, CA
Technology Lead/Data Modeler
Tools: Informatica Power Center 9.6.0, Oracle 11g, R v2.15.1, GitHub, ERStudio 10.0, JIRA, Toad, Teradata 13.11
Responsibilities:
- Worked with business teams to understand the business need behind each request.
- Responsible for translating requirements into technical specifications and developing design documents: data model, data dictionary, mapping documents, and support docs. Developed mappings, sessions, and workflows; performed performance tuning; scheduled workflows in crontab; developed UNIX and R scripts; and supported testing, UAT, and implementation.
- Conducted review meetings with the Data Architect to get sign-off on the solution approach.
- Involved in data profiling and performed data analysis based on the requirements, which helped catch sourcing issues upfront and was much appreciated by the analytics team.
- Created mappings to import data from heterogeneous sources such as VSAM files, XML files, and delimited flat files.
- Developed stored procedures to implement business logic.
- Created mapplets and reusable transformations in Informatica, using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, and XML.
- Developed BTEQ and MLOAD scripts.
- Utilized Informatica repository tables to perform fast and accurate impact analysis, identifying impacted mappings and tables (see the sketch after this list).
- Updated data models for ad hoc requests.
- Created an execution support manual for the system support team.
- Used GitHub for version control.
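A hedged sketch of that repository-driven impact analysis: querying the PowerCenter MX repository views for mappings that read from a given source table. View and column names vary by repository version, so treat the ones below as assumptions to verify against your repository's MX view documentation.

```python
# Impact-analysis sketch against the Informatica repository database.
# The MX view and column names are assumptions; confirm them for your
# PowerCenter version before relying on the results.
IMPACT_SQL = """
    SELECT DISTINCT subject_area, mapping_name
    FROM   rep_src_mapping              -- assumed MX view name
    WHERE  source_name = :tbl
    ORDER  BY subject_area, mapping_name
"""

def impacted_mappings(conn, table_name):
    """Return (subject_area, mapping_name) pairs that read table_name."""
    cur = conn.cursor()
    cur.execute(IMPACT_SQL, {"tbl": table_name})
    rows = cur.fetchall()
    cur.close()
    return rows

# Example: which mappings would a change to the CUSTOMER source touch?
# for area, mapping in impacted_mappings(conn, "CUSTOMER"):
#     print(area, mapping)
```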
Confidential, Chicago, IL
Technology Lead/Data Modeler
Tools: Informatica Power Center 9.5.1, DataStage 8.1, Teradata 13.11, Oracle 9i, Microsoft SQL Server 2008, SVN tool, Sybase Power Designer 16.1.
Responsibilities:
- Worked as a Technology Lead in an onsite/offshore model, leading a team of four.
- Served as the team's technical consultant for solutions, design reviews, code reviews, and any Informatica/DataStage issues.
- Created conceptual, logical, and physical data models using best practices and company standards to ensure high data quality and reduced redundancy.
- Performed reverse engineering of physical data models from databases, SQL scripts, and Excel spreadsheets.
- Conducted design and review meetings to get design approval from the project's stakeholders.
- Responsible for translating requirements into technical specifications and developing design documents such as mapping documents and the data dictionary.
- Involved in data profiling and performed data analysis based on the requirements, which helped catch many sourcing issues upfront.
- Created mappings, mapplets, and reusable transformations in Informatica, using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, and XML.
- Created mappings to import data from heterogeneous sources such as VSAM files, XML files, and delimited flat files.
- Developed stored procedures to implement business logic.
- Fine-tuned mappings using pushdown optimization, session partitioning, bulk load, FastLoad, etc.
- Worked with parameter files and created parameter values across mappings (see the example after this list).
- Used Informatica Workflow Manager, Workflow Monitor, and Repository Manager.
- Worked with the finance team, project stakeholders, Data Architects, and Data Definition Owners across different subject areas.
- Involved in all phases of development, from requirements analysis, data profiling, and data certification through high-level and low-level design, coding, performance tuning, and testing; supported system testing, UAT, and implementation.
- Updated the system architecture flow and data models.
- Created an execution support manual for the system support team.
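For context, parameter files let the same mapping run against different connections and dates in each environment. Below is a minimal illustrative example in the standard PowerCenter parameter-file format; the folder, workflow, session, connection, and parameter names are placeholders, not the project's actual configuration.

```
[Global]
$$RUN_DATE=01/15/2015

[EDW_Folder.WF:wf_customer_load.ST:s_m_customer_load]
$DBConnection_Source=ORA_SRC_CONN
$DBConnection_Target=TD_TGT_CONN
$$LOAD_TYPE=INCREMENTAL
```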
Confidential, Dublin, OH
Senior ETL Lead
Tools: Informatica Power Center 9.5.1, Informatica Power Exchange, Teradata 13.11, Oracle 9i, SVN tool, Control-M
Responsibilities:
- Worked in an onsite/offshore model managing a team of 7, as part of a larger 45-member team.
- Involved mostly in analyzing existing SQL/BTEQ/PL-SQL scripts and Informatica jobs, documenting the data lineage, and identifying the actual SOR (system of record).
- Documented every layer of the SOR data lineage with all the transformations involved.
- The business identified 20 source systems, each comprising close to 500 SOR tables.
- Actively involved in coordinating with business teams to understand the priorities.
Confidential, Columbus, OH
Technology Lead
Tools: Informatica Power Center 9.5.1, Teradata 13.11, Oracle 9i, Microsoft SQL Server 2008, SVN tool, CA Erwin r9.5
Responsibilities:
- Worked as a Technology Lead in an onsite/offshore model, leading a team of six.
- Key player on the team, assigned the design of extracts/mappings involving millions of records and complex logic.
- Served as the team's technical consultant for optimal solutions, design reviews, code reviews, and any Informatica issues.
- Developed several reusable components during the course of the project, which considerably reduced development effort.
- Created mappings, mapplets, and reusable transformations in Informatica, using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, and XML.
- Created mappings to import data from heterogeneous sources such as VSAM files, XML files, and delimited flat files.
- Developed stored procedures to implement business logic.
- Fine-tuned mappings using pushdown optimization, session partitioning, bulk load, FastLoad, etc.
- Worked with parameter files and created parameter values across mappings.
- Used Informatica Workflow Manager, Workflow Monitor, and Repository Manager.
- Worked with project stakeholders, Data Architects, and Data Definition Owners across different subject areas.
- Involved in all phases of development, from requirements analysis, data profiling, and data certification through high-level and low-level design, coding, performance tuning, and testing; supported system testing, UAT, and implementation.
- Involved in data profiling and performed data analysis according to the mapping/transformation rules, which helped catch many sourcing issues upfront.
- Translated business needs into data models supporting long-term solutions.
- Worked with the Application Development team to implement data strategies, built data flows, and developed conceptual data models.
- Created conceptual, logical and physical data models using best practices and company standards to ensure high data quality and reduced redundancy.
- Developed best practices for standard naming conventions and coding practices to ensure consistency of data models.
- Performed reverse engineering of physical data models from databases and SQL scripts.
- Evaluated data models and physical databases for variances or discrepancies and validated business data objects for accuracy and completeness.
- Conducted design and review meetings to get design approval from the project's stakeholders.
- Responsible for translating requirements into technical specifications and developing design documents such as mapping documents and the data dictionary.
- Mentored team on Informatica and data modeling concepts.
- Involved in planning, effort tracking and estimations using SMC methodology.
- Updated the system architecture flow and data models.
- Supported System Testing and UAT.
- Created execution support manual for the system support team.
- As technical team lead, took responsibility for onsite/offshore coordination.
Confidential, Columbus, OH
Technology Lead
Tools: Informatica Power Center 9.5.1, Teradata 13.11, Oracle 9i, SVN tool, Microsoft SQL Server 2008, CA Erwin r9.5
Responsibilities:
- Worked as a Technology Lead in an onsite/offshore model, leading a team of five.
- Involved in all phases of development, from requirements analysis, data profiling, and data certification through high-level and low-level design, coding, and testing; supported system testing, UAT, and implementation.
- Conducted design and review meetings to get design approval from the project's stakeholders.
- Mentored team on Informatica and data modeling concepts.
- Involved in Data profiling and data analysis.
- Translated business needs into data models supporting long-term solutions.
- Worked with the Application Development team to implement data strategies, build data flows and develop conceptual data models.
- Created logical and physical data models using best practices and company standards to ensure high data quality and reduced redundancy.
- Developed best practices for standard naming conventions and coding practices to ensure consistency of data models.
- Evaluated data models and physical databases for variances or discrepancies and validated business data objects for accuracy and completeness.
- Handled critical projects with frequently changing requirements and provided fast turnaround.
- Developed several reusable components during the course of the project, which considerably reduced development effort.
- Created mapplets and reusable transformations in Informatica, using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, and XML.
- Created mappings to import data from heterogeneous sources such as VSAM files, XML files, and delimited flat files.
- Developed stored procedures to implement business logic.
- Updated the system architecture flow and data models.
- Used SVN as the version control tool.
- Used JCL scripts to schedule BTEQ execution.
- Developed BTEQ and MLOAD scripts to calculate the ARPH metrics (see the sketch after this list).
- Developed Informatica mappings to load data from files into Oracle and from Teradata into Oracle databases.
- Produced a flexible design that accommodated the business needs of change requests with minimal design changes.
- Developed reusable logic in stored procedures used by different BTEQs to load data for all SOCS hierarchy levels, which greatly reduced coding and testing effort.
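A minimal sketch of how such a BTEQ step might be wrapped for the scheduler, assuming the bteq client is on PATH and credentials come from the environment; the metric SQL and table names are placeholders, not the project's actual ARPH logic.

```python
# Illustrative BTEQ wrapper: feed a script to bteq on stdin and fail the
# scheduled job on a non-zero exit code. All names below are placeholders.
import os
import subprocess

BTEQ_SCRIPT = """
.LOGON {host}/{user},{password};
INSERT INTO metrics.arph_daily (calendar_dt, arph)
SELECT calendar_dt,
       SUM(revenue_amt) / NULLIFZERO(SUM(household_cnt))
FROM   edw.revenue_fact
GROUP  BY calendar_dt;
.IF ERRORCODE <> 0 THEN .QUIT 8;
.LOGOFF;
.QUIT 0;
"""

def run_bteq():
    script = BTEQ_SCRIPT.format(
        host=os.environ["TD_HOST"],
        user=os.environ["TD_USER"],
        password=os.environ["TD_PASSWORD"],
    )
    # BTEQ reads its commands from stdin; propagate failure to the scheduler.
    result = subprocess.run(["bteq"], input=script, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"bteq exited with code {result.returncode}")

if __name__ == "__main__":
    run_bteq()
```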
Confidential
Technology Analyst
Tools: Informatica Power Center 9.5.1, Oracle 9i, Microsoft SQL Server 2008, SQL, PL/SQL, Control-M scheduler, CA Erwin r9.5
Responsibilities:
- Worked as a Technology Lead in an onsite/offshore model, leading a team of 4 offshore.
- Involved in preparing technical design specifications, coding, and testing; involved in peer code reviews.
- Responsible for getting all requirements from the onsite coordinator and analyzing them without any gaps.
- Developed a standards document and coding checklist, which helped the team avoid defects at an early stage.
- Mentored new team members on functional, technical, and process aspects as well as data modeling concepts.
- Translated business needs into data models supporting long-term solutions.
- Worked with the Application Development team to implement data strategies and build data flows.
- Developed several reusable components during the course of the project, which considerably reduced development effort.
- Created mapplets and reusable transformations in Informatica, using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, and XML.
- Created mappings to import data from heterogeneous sources such as VSAM files, XML files, and delimited flat files.
- Developed stored procedures to implement business logic.
- Developed logical and physical data models to support new and existing projects.
- Handled critical projects with frequently changing requirements and provided fast turnaround.
- Anchored timesheet management from offshore.
- Handled QA reviews and cutover activities very well.
- Delivered quality code with zero defects, which was highly appreciated by the client.
Confidential, Irwindale, CA
Technology Analyst
Tools: Informatica Power Center 9.5.1, DataStage 7.5.2, Teradata 13.11, Oracle 9i, Microsoft SQL Server 2008, SVN, CA Erwin r9.5
Responsibilities:
- Tracked project-level resource and effort estimates for the data profiling of each object.
- Analyzed legacy data and its behavior under transformation before loading into the SAP database.
- During the initial phase of the project, data profiling was identified as necessary due to data mismatches across multiple legacy systems; at that time I was the only data profiling resource. Based on the quality and importance of this work, we won the data cleansing project, staffed by a 6-member team that I led.
- Received a 7/7 customer satisfaction rating for the data cleansing project.
- Data profiling results also greatly helped the development team with their job designs.
- Delivered multiple presentations on data profiling reports, the data cleansing approach, and daily status to different stakeholders such as the Process, Development, SAP, and Business teams.
- Worked as a Technology Lead in an onsite/offshore model, leading a team of six.
- Involved in gathering requirements from the business, data profiling, preparing technical design specifications, coding, testing, and migrating code to the QA and PROD environments.
- Translated business needs into data models supporting long-term solutions.
- Created logical and physical data models using best practices and company standards to ensure high data quality and reduced redundancy.
- Presented data profiling results to the client, which helped identify missing requirements; this was highly appreciated as it reduced project cost.
- Involved in writing functional and technical test cases for validation.
- Responsible for migrating existing DataStage mappings using the Version Control tool to various environments (QA, PT, and PROD).
- Expert in using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, and XML.
- Created mappings to import data from heterogeneous sources such as VSAM files, XML files, and delimited flat files.
- Developed stored procedures to implement business logic.
- Worked with parameter files and created parameter values across mappings.
- Used Informatica Workflow Manager, Workflow Monitor, and Repository Manager.
- Anchored timesheet management and ensured on-time approvals.
- Handled QA reviews and cutover activities very well.
- Delivered quality code with minimal defects, which was highly appreciated by the client.
- Served as the data profiling (DP) anchor for the project.
Confidential
Technology Analyst
Technologies: OWB ETL tool, Oracle 9i
Responsibilities:
- Worked as a developer and was a good team player.
- Mastered the OWB ETL tool within a couple of weeks.
- Involved in coding and testing; conducted peer code reviews.
- Worked in a highly secure ODC environment, handling very critical information.
- Mentored new team members on technical and process aspects.
- Experienced working on a project with stringent timelines and frequently changing requirements.
- Delivered quality code with minimal defects, which was highly appreciated by the client.
Confidential
Senior Developer
Technologies: Informatica Power Center, DataStage 7.5.2, DataStage Version Control tool, Oracle 9i (SQL, PL/SQL), UNIX scripting
Responsibilities:
- Understood customer needs and established credibility with the customer by providing input on solutions.
- Integrated client feedback within agreed timelines to ensure the highest level of client satisfaction; proactively identified red flags and escalated them to the manager for immediate action.
- Prioritized tasks of different relative importance by understanding the criticalities and interdependencies in the project.
- Proactively planned and identified critical inputs for the planning process by analyzing historical data; observed work output, identified contingencies, and promptly escalated them.
- Designed software modules that adhered to coding and design standards to solve the business problems outlined in the functional design specifications.
- Extensive client interaction and preparation of business requirements designs.
- Built code with the intention of achieving good performance in both execution time and readability.
- Ensured that new code properly reused existing code.
- Applied sound testing techniques to plan and execute unit testing; prepared unit test scripts intended to cover as many scenarios as possible.
- Performed the role of onsite coordinator: understood requirements and sent detailed explanatory emails to offshore counterparts, ensuring seamless project delivery.
- Analyzed legacy data against the transformations before loading into the SAP database.
- Tracked defects to closure.
- Developed several reusable components in Informatica that helped the team through to the project's end.
- Effectively designed the project's delta loads, which went live without any issues from the ETL side; all delta jobs were planned during the last phase of the project, with only one round of testing before go-live.
- Handled critical objects with frequently changing requirements and provided fast turnaround.
- Responsible for migrating DataStage mappings using the Version Control tool to various environments (QA, PT, and PROD).
- Developed a standards document and coding checklist, which helped the team avoid defects at an early stage.
- Handled QA reviews and cutover activities very well.
Confidential
Senior Systems Engineer
Tools: DataStage 7.5.2, Oracle 9i, SVN tool, CA Erwin r9.5, Toad
Responsibilities:
- Worked as a developer and was a good team player.
- Responsible for getting all requirements from the onsite counterpart and identifying dependencies among objects.
- Analyzed all requirements and ensured every clarification was answered before moving to the design phase.
- Involved in preparing technical design specifications, coding, and testing; involved in peer code reviews.
- Responsible for migrating mappings using the Version Control tool to various environments (QA, PT, and PROD).
- Prepared a BOK on source file validations that was used extensively by the team; took the initiative in developing the execution procedure manual for one object, enabling the team to develop manuals for their own objects.
- Handled QA reviews and cutover activities very well.
- Delivered quality code with minimal defects, which was highly appreciated by the client.
Confidential
Developer
Tools: DataStage 7.5.2, Oracle 9i, Microsoft Visio
Responsibilities:
- Interacted with the client and users to collect requirements.
- Involved in the analysis, design, and development of all client requirements.
- Analyzed all requirements and ensured every clarification was answered before moving to the design phase.
- Extensively worked on developing the test case, test pass, resource allocation, employee master, and test strategy modules per client requirements.
- Created and used shared containers and local containers for DataStage jobs; used several stages, including Transformer, Aggregator, Lookup, Join, Merge, Pivot, CDC, Data Set, and FTP.
- Created jobs in DataStage to import data from heterogeneous data sources such as VSAM files and XML files.
- Used DataStage Administrator to set up project environment variables and other settings.
- Used DataStage Director to run, schedule, and monitor jobs and analyze logs.
- Used parallel processing techniques to improve job performance.