
Data Engineer Resume


Jersey City, NJ

SUMMARY:

  • 12+ years of work experience in Data Engineering, Data Visualization and Data Warehousing using cutting-edge technologies.
  • 3+ years of data engineering experience using Big Data technologies such as Hadoop, Spark, Scala, Python, Sqoop, Kafka, Hive, NoSQL databases, HBase and MapReduce.
  • 2+ years of data visualization experience using Tableau, QlikView and Power BI.
  • 5+ years of strong experience in Data Warehousing and ETL using Informatica PowerCenter 10.1/9.1/9.0.1/8.x/7.x/6.1, PowerExchange 9.1/8.6/8.1, Oracle 11g/10g/9i, Teradata 13/12/V2R6 and Erwin.
  • 2 years of experience in IBM Netezza data warehouse development using Netezza Database 7.2.1.
  • 3 years of experience in real-time data warehouse development using the CDC tool Informatica PowerExchange 9.1/8.6/8.1.
  • Experience in the Data Warehouse/Data Mart Development Life Cycle using dimensional modeling of Star and Snowflake schemas, OLAP, ROLAP, MOLAP, fact and dimension tables, and logical and physical data modeling using ERWIN 7.5/4.2 and MS Visio.
  • Business Intelligence experience using OBIEE 11g/10g, Business Objects XI R2 and MS Access Reports.
  • Extensive experience using Oracle 11g/10g/9i, DB2 8/7, MS SQL Server 2008/2000, Teradata 13/12/V2R6, MS Access 7.0/2000, Erwin, XML, SQL, PL/SQL, SQL*Plus, SQL*Loader and MS SQL Developer 2000 on Windows 7/XP and Sun Solaris.
  • Worked with Teradata loading utilities like MultiLoad, FastLoad, TPump and BTEQ.
  • Extensively worked on Oracle functions, cursors, stored procedures, packages and triggers.
  • Experience in data modeling, creating LDMs and PDMs for Star and Snowflake schemas using MS Visio and ERWIN 7.1/4.5.
  • Exposure to the overall SDLC, including requirement gathering, development, testing, debugging, deployment, documentation and production support.
  • Excellent working knowledge of UNIX shell scripting and job scheduling on multiple platforms; experienced with the UNIX command line and Linux.
  • Experience preparing ETL design documentation, user test case documentation and standard operating procedure (SOP) documentation.
  • Experience working in an onsite-offshore model.
  • Worked as Agile Scrum call facilitator, tracking status and discussions using Basecamp and Rally.
  • Extensive experience in effort estimation, work distribution, status-report tracking and updates, team coordination and client updates.
  • Highly motivated to take independent responsibility, with the ability to contribute as a productive team member.
  • Provided production support, including resolution and closure of trouble tickets.
  • Experienced in using the IDQ tool for profiling, applying rules and developing mappings to move data from source to target systems.
  • Worked on multiple projects using the Informatica Developer (IDQ) tool, versions 9.1.0 and 9.5.1.

TECHNICAL SKILLS:

ETL Technology: Informatica PowerCenter 10.1/9.5/9.1/8.x/7.1/6.1, PowerExchange 9.1/8.6/8.1

Data Modeling: Star Schema Modeling, Snowflake Modeling, MS Visio, Erwin

Databases: Oracle 11g/10g/9i, MS SQL Server 2008/2000, MS Access, DB2, Teradata 14/13/12/V2R6, Sybase, Netezza, NoSQL DB

Reporting Tools: Business Objects XIR2, OBIEE 11g/10g

Languages: Python, C, C++, SQL, PL/SQL, HTML, UNIX Scripting, Scala

Other Tools: Toad, Harvest, SCM, Putty, Tidal, Autosys, ESP, Tableau, QlikView, Power BI

Operating Systems: Linux, Sun Solaris, AIX, Windows 7/XP/2000/98, Mac OS

Version Control: GitHub, SVN (Tortoise as SVN client, Subclipse for Eclipse), Clear Case

PROFESSIONAL EXPERIENCE:

Confidential, Jersey City, NJ

Data Engineer

Responsibilities:

  • Developed data pipelines using Python and Hive to load data into the data lake; performed data analysis and data mapping for several data sources.
  • Created real-time streaming pipelines using Kafka (see the sketch after this list).
  • Worked with data scientists to create tables in Hive for running data models.
  • Converted several Informatica workflows to Hadoop using Spark and Scala.
  • Converted the EDW to a Big Data platform.
  • Implemented Pass-Through, Auto Hash, User-Defined Hash Key and Database partitioning for performance tuning.
  • Analyzed the Bulk Load option and third-party loaders recommended by Informatica.
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation and loading of massive structured and unstructured data.
  • Performed data mapping of source-to-target data sets.
  • Loaded aggregate data into a relational database for reporting, dashboarding and ad-hoc analysis.
  • Oversaw the inbound and outbound interface development process by working closely with functional analysts, developers and testers.
  • Extracted data from legacy systems into the staging area using ETL jobs and SQL queries.
  • Developed SQL scripts and procedures for business rules using UNIX shell and NZSQL for Netezza.
  • Assessed the Netezza environment for implementation of the ELT solutions.
  • Performed structural harmonization by extracting data from different staging tables and integrating multiple source tables from staging into an alignment schema table.
  • Performed requirement analysis and designed and created a Data Services Layer in DENODO Express, feeding it to downstream dashboard systems.
  • Created custom DENODO views by joining tables from multiple data sources.
  • Designed and developed high-quality integration solutions using the DENODO virtualization tool (reading data from multiple sources, including Oracle, Hadoop and MySQL).
  • Designed and deployed rich graphic visualizations with drill-down and drop-down menu options and parameters using Tableau.
  • Strong dashboard design experience and a passionate practitioner of effective data visualization, with familiarity with best practices around visualization and design.
  • Designed, developed, tested and maintained Tableau functional reports based on user requirements.
  • Experience in the Agile methodology project execution model and expertise in coordinating teams across multiple locations.
  • Created Jira tickets and checked code into SVN branches using TortoiseSVN.
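
A minimal sketch of the kind of real-time Kafka producer described above, assuming the kafka-python client; the broker address, topic name and record layout are illustrative, not the actual project values:

    from kafka import KafkaProducer  # pip install kafka-python
    import json

    # Hypothetical broker and topic; real values came from the cluster config.
    producer = KafkaProducer(
        bootstrap_servers=["broker1:9092"],
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    def publish(record):
        # Send each source record to the streaming topic and flush to confirm delivery.
        producer.send("source_events", value=record)
        producer.flush()

    publish({"id": 1, "status": "NEW"})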

Environment: Hadoop, Python, Spark, Scala, Hive, Kafka, Informatica Power Center 10.2/10.1, Oracle 11g, Netezza, NoSQL, NZSQL, DENODO, UNIX Shell Scripting, SQL, PL/SQL, TOAD, MS Access, MS Visio, Utilities BTEQ, FLOAD and TPUMP, Putty, WinScp, WinCvs, Linux, Tableau, QlikView, TortoiseSVN, UltraEdit.

Confidential, San Francisco, CA

Data Analyst

Responsibilities:

  • Defined and validated protocols for clinical studies and handled trial responsibilities throughout the data-management lifecycle.
  • Worked on the CED-EDC (Electronic Data Capture) source system, database design and hypothesis testing.
  • Supported clinical trials for CROs (Contract Research Organizations) by providing meticulous data management: designed and maintained databases, queries, reports, graphics and data-analysis tools; performed data entry, check reviews, database audits and coding; and defined and validated study protocols.
  • Worked in Oracle Clinical on the design, testing and implementation of study databases.
  • Developed clear clinical data sets enabling the standardized collection and analysis of massive amounts of cross-boundary data content in a timely manner and with a high level of accuracy.
  • Tracked progress of clinical studies, ensuring projects met timelines and quality expectations.
  • Oversaw the data-management lifecycle of large clinical trials, composing and verifying reports and results.
  • Wrote simple and complex SQL, PL/SQL functions, procedures and packages, and created Oracle objects - tables, materialized views, triggers, synonyms, user-defined data types, nested tables and collections.
  • Interacted with business users to collect and analyze requirements, then designed and recommended solutions.
  • Worked extensively with SQL, PL/SQL, the Oracle database and many other Oracle facilities, such as Import/Export, SQL*Loader and SQL*Plus.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW (see the sketch after this list).
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Prepared the BRS (Business Requirement Specification) document detailing the requirements.
  • Good understanding of database objects and the ability to triage issues.
  • Involved in PL/SQL code review and modification for the development of new requirements.
  • Created materialized views required for the application.
  • Handled changes to compile scripts according to database changes.
  • Developed stored procedures to extract data from different sources and load it into the data warehouse.
  • Analyzed data and mapped data requirements, developing stored procedures, functions and triggers.
  • Uploaded data from flat files into databases and validated the data with PL/SQL procedures.
  • Maintained the daily batch cycle and provided 24-hour production support.
  • Prepared test cases and participated in unit testing and system integration testing.
  • Utilized SQL*Loader to load flat files into database tables.
  • Created SQL*Loader scripts to load data into temporary staging tables.
  • Worked on the ETL process of loading data from different sources and the data validation process from the staging area to the Actavis data warehouse.
  • Worked with the ETL team on loading data from the staging area to the data warehouse; provided all business rules for loading data into the database.
  • Proficient in ETL (Extract - Transform - Load) using SQL Server Integration Services 2012 (SSIS) and Informatica PowerCenter.
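
The production MapReduce jobs were written in Java; the sketch below shows the same clean-and-deduplicate pattern in Python using the mrjob library, with a hypothetical pipe-delimited record layout:

    from mrjob.job import MRJob  # pip install mrjob

    class CleanRecords(MRJob):
        """Drop malformed rows and deduplicate records per subject id."""

        def mapper(self, _, line):
            fields = line.strip().split("|")
            # Keep only rows with the expected field count and a non-empty key.
            if len(fields) == 5 and fields[0]:
                yield fields[0], "|".join(f.strip() for f in fields)

        def reducer(self, key, values):
            # Emit each distinct record once per key before loading to staging.
            for record in sorted(set(values)):
                yield key, record

    if __name__ == "__main__":
        CleanRecords.run()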

Environment: Informatica Power Center 10.1/9.5/9.1, Oracle 11g, UNIX Shell Scripting, SQL, PL/SQL, TOAD, MS Access, MS Visio, Utilities BTEQ, FLOAD and TPUMP, Putty, WinScp, WinCvs, Linux, QlikView, SSIS, SSRS, SCM, Rally, UltraEdit, BizTalk, Dell Boomi.

Confidential, Marlborough, MA

Sr. Informatica Developer

Responsibilities:

  • Interacted with the client on a regular basis to discuss day-to-day issues and matters.
  • Provided support for Informatica workflows/mappings (developed using Informatica 9.1.0) running in the production environment at the client location.
  • Conducted training and Knowledge Transfer (KT) sessions for onsite and offshore developers and testers on domain and functional areas.
  • Used clinical trials data and compound data to launch blockbuster drugs more quickly, frequently and cost-effectively.
  • Maintained persistent data quality to ensure the ongoing integrity of clinical data and compound data.
  • Integrated clinical trials data and compound data from operational and analytical applications into the enterprise data integration layer.
  • Configured a flexible data model for all clinical trials data.
  • Ensured top-quality deliverables from HCL to the client.
  • Provided support for code developed using the Data Warehouse Administration Console (DAC) to schedule ETL jobs.
  • Involved in data analysis and handling ad-hoc requests by interacting with business analysts, the client and customers, and resolved issues as part of production support.
  • Reviewed project/task status and issues with the HCL offshore team, ensuring on-time completion of projects.
  • Developed UNIX shell scripts for automation, enhancing and streamlining existing manual procedures used at the client location (see the sketch after this list).
  • Involved in the development of Informatica mappings and the preparation of design documents (DD), technical design documents (TDD) and user acceptance testing (UAT) documents.
  • Actively participated in the technical proposal for upgrading existing ETL and OBIEE code at the client location, in order to make use of advanced features of the newer Informatica version.
  • Made use of various HCL proprietary frameworks and techniques for requirements gathering and business process maps to understand the current process.
  • Experienced in using the IDQ tool for profiling, applying rules and developing mappings to move data from source to target systems.
  • Created test plans and test data for extraction and transformation processes and resolved data issues following the data standards.
  • Performed strong data analysis to ensure the accuracy and integrity of data in the context of business functionality.
  • Worked on the Informatica Developer (IDQ) tool, versions 9.1.0 and 9.5.1.
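
A minimal sketch of the kind of automation used to kick off a production workflow; the shell scripts called Informatica's pmcmd CLI directly, and this Python equivalent assumes hypothetical domain, service, folder and workflow names:

    import subprocess
    import sys

    # Hypothetical connection values; the real ones lived in environment config.
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", "INT_SVC_DEV", "-d", "DOMAIN_DEV",
        "-u", "etl_user", "-p", "********",
        "-f", "CLINICAL_DW", "-wait", "wf_load_compound_data",
    ]

    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
    if result.returncode != 0:
        # A non-zero exit from pmcmd means the workflow failed to start or run.
        sys.exit(result.returncode)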

Environment: Informatica Power Center 9.5/9.1, Informatica Power Exchange 9.1/8.6, IDQ 9.5.1, Oracle 11g, UNIX Shell Scripting, SQL, PL/SQL, TOAD, MS Access, MS Visio, Tidal, Utilities BTEQ, FLOAD and TPUMP, Putty, WinScp, WinCvs, Linux, BaseCamp, SCM, Rally, SmartCapa, UltraEdit.

Confidential, Watertown, MA

Sr. Informatica Developer

Responsibilities:

  • Developed technical specifications of the ETL process flow.
  • Worked on the design and development of Informatica mappings and workflows to load data into the staging area, data warehouse and data marts in SQL Server and Oracle.
  • Worked on various issues in existing Informatica mappings to produce correct output.
  • Debugged mappings by creating logic that assigns a severity level to each error and sends the error rows to an error table so that they can be corrected and re-loaded into the target system.
  • Analyzed the existing system and developed business documentation (TRD) on the changes required.
  • Analyzed existing mappings and created a DLD through reverse engineering.
  • Analyzed existing Health Plan issues and redesigned as required.
  • Involved in unit testing, event and thread testing, and system testing.
  • Involved in gathering business scope and technical requirements and created technical specifications.
  • Implemented Slowly Changing Dimensions - Type I & II in different mappings as per the requirements (see the sketch after this list).
  • Performed unit and integration testing in User Acceptance Test (UAT), Operational Acceptance Test (OAT), Production Support Environment (PSE) and Production environments.
  • Created HLD, LLD, UTC and migration documents.
  • Designed and developed workflows as per the ETL specification for the stage load and warehouse load.
  • Worked with production support systems that required immediate support.
  • Monitored and tuned ETL processes for performance improvements; identified, researched and resolved data warehouse load issues.
  • Created synonyms for copies of time dimensions, used the Sequence Generator transformation to create sequences for generalized dimension keys, the Stored Procedure transformation for encoding and decoding functions, and the Lookup transformation to identify slowly changing dimensions.
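
The Type II logic lived inside the Informatica mappings; the sketch below is a hypothetical in-memory Python illustration of the expire-and-insert pattern (key and column names are invented for the example):

    from datetime import date

    def apply_scd2(dim_rows, incoming, key="cust_id", tracked=("address",)):
        """Expire the current dimension row if a tracked column changed,
        then insert the new version with fresh effective dates."""
        current = next(
            (r for r in dim_rows if r[key] == incoming[key] and r["is_current"]),
            None,
        )
        if current and any(current[c] != incoming[c] for c in tracked):
            current["is_current"] = False       # close out the old version
            current["end_date"] = date.today()
        if current is None or not current["is_current"]:
            dim_rows.append({**incoming, "is_current": True,
                             "start_date": date.today(), "end_date": None})

    rows = []
    apply_scd2(rows, {"cust_id": 1, "address": "10 Main St"})
    apply_scd2(rows, {"cust_id": 1, "address": "22 Oak Ave"})
    print(len(rows))  # 2 versions: one expired, one current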

Environment: Informatica Power Center 9.5/9.1, Informatica Power Exchange 9.1/8.6, Oracle 11g, UNIX Shell Scripting, SQL, PL/SQL, TOAD, MS Access, MS Visio, Tidal, Utilities BTEQ, FLOAD and TPUMP, Putty, WinScp, WinCvs, Linux, BaseCamp, SCM, Rally, BizTalk, Dell Boomi, Cognos Framework Manager.

Confidential, New York

Sr. Informatica Developer

Responsibilities:

  • As a lead member of the ETL team, responsible for analyzing, designing and developing ETL strategies and processes, writing ETL specifications for developers, and handling ETL and Informatica development, administration and mentoring.
  • Participated in business analysis, ETL requirements gathering, physical and logical data modeling and documentation.
  • As Scrum Master, managed stand-ups, backlogs and sprint planning meetings.
  • Delivered quality solutions on time and within budget using approved scheduling tools, techniques and methodologies. Accustomed to working well with internal and external stakeholders at multiple levels, navigating the organization and successfully completing complex projects under tight deadlines.
  • Performed self and peer reviews of Informatica and Oracle objects.
  • Designed the data transformation mappings and data quality verification programs using Informatica and PL/SQL.
  • Designed the ETL processes using Informatica to load data from Mainframe DB2, Oracle, SQL Server, flat files, XML files and Excel files into the target Teradata warehouse database.
  • Designed reusable transformations, mapplets and reusable tasks, and designed worklets as per the dependencies of various sessions and parent-child tables.
  • Worked on performance tuning of Informatica code; worked extensively on cache customization, partitioning, pushdown optimization and transformation tuning.
  • Created PowerExchange registrations and configured them in PowerCenter to load data in real-time mode.
  • Worked on the Teradata utilities BTEQ, MLOAD and TPUMP, and tuned SQL (see the sketch after this list).
  • Created Oracle stored procedures, packages and triggers; worked on analytical queries to format reports; created materialized views to store summarized data.
  • Investigated, debugged and fixed problems with Informatica mappings and workflows.
  • Implemented Slowly Changing Dimensions - Type I & II in different mappings as per the requirements.
  • Performed ETL, UNIX script and database code migrations across environments.
  • Created mapping parameters, session parameters, mapping variables and session variables; partitioned sessions and used incremental aggregation for the fact load.
  • Participated in the decision support team to analyze user requirements and translate them to the technical team for new and change requests.
  • Performed unit and integration testing in User Acceptance Test (UAT), Operational Acceptance Test (OAT), Production Support Environment (PSE) and Production environments.
  • Created HLD, LLD, UTC and migration documents.
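
Most of the Teradata work ran through BTEQ and MLOAD scripts; purely as a Python sketch of the kind of aggregate query involved, assuming the teradatasql driver and hypothetical host, credentials and table names:

    import teradatasql  # pip install teradatasql

    # Hypothetical connection values and table name, for illustration only.
    with teradatasql.connect(host="tdprod", user="etl_user",
                             password="********") as con:
        cur = con.cursor()
        # Top accounts by transaction amount from the fact table.
        cur.execute(
            "SELECT TOP 10 acct_id, SUM(txn_amt) AS total_amt "
            "FROM dw.fact_txn GROUP BY acct_id ORDER BY total_amt DESC"
        )
        for acct_id, total_amt in cur.fetchall():
            print(acct_id, total_amt)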

Environment: Informatica Power Center 9.1/8.6, Informatica Power Exchange 9.1/8.6, Oracle 11g, UNIX Shell Scripting, SQL, PL/SQL, TOAD, MS Access, MS Visio, Autosys, Teradata 13, Utilities BTEQ, FLOAD and TPUMP, Putty, WinScp, DB2 Mainframe, Linux, BaseCamp, SCM, Rally.

Confidential, Minnetonka, MN

Informatica Developer

Responsibilities:

  • Designed and developed logical/physical data models with forward/reverse engineering using Erwin 7.2.
  • Designed and developed workflows as per the ETL specification for the stage load and warehouse load.
  • Provided on-call production support; efficiently tracked HEAT tickets, resolved production issues in a timely manner, and handled proactive escalation (where appropriate), resolution and closure of trouble tickets.
  • Designed ETL functional specifications and converted them into technical specifications.
  • Interacted with management to identify dimensions and measures.
  • Reviewed source systems and proposed a data acquisition strategy.
  • Developed an ETL methodology to custom fit the ETL needs of sales.
  • Built data collection and transformation mappings and designed the data warehouse data model.
  • Responsible for developing the mappings for the pre-existing procedures in the CDW, as the procedures were being removed from the cloned CDW.
  • Responsible for data analysis of the target systems, since several target systems depend on CDW data.
  • Delivered the test plan, test specification and test report documents as per the Guidant document management system.
  • Coordinated with team members to collect information about system testing of the cloned CDW.
  • Created Web Services sources and targets; customized web services mappings and workflows; worked extensively on XML sources, targets and transformations.
  • Actively involved in coordinating all testing-related issues with the end users and the testing team.
  • Executed test cases to ensure the product meets the specifications and the Life Cycle Services.
  • Worked on data analysis reports in Business Objects.
  • Developed shell scripts for job automation that generate a log file for every job (see the sketch after this list).
  • Prepared run books, migration documents, and production monitoring and support handbooks for daily, weekly and monthly processing.
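
A minimal Python sketch of that per-job logging pattern; the production scripts were UNIX shell, and the job name, command and log path here are illustrative:

    import datetime
    import logging
    import pathlib
    import subprocess

    # Hypothetical job name and log directory, mirroring one log file per run.
    job = "ld_sales_stage"
    log_file = pathlib.Path("/tmp") / f"{job}_{datetime.date.today()}.log"
    logging.basicConfig(filename=str(log_file), level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")

    logging.info("starting %s", job)
    # Placeholder command standing in for the real ETL invocation.
    rc = subprocess.run(["true"]).returncode
    logging.info("finished %s rc=%d", job, rc)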

Environment: Informatica PowerCenter 8.6, Informatica PowerConnect, Informatica PowerExchange 8.6, Oracle 11g, SQL * Loader, Data Pump, TOAD, Business Objects XI/R2, MS SQL Server 2005, Sun Solaris, UNIX, Windows NT 4.0, MQ series.

Confidential, Santa Clara, CA

ETL Developer

Responsibilities:

  • Involved in requirement gathering and study of the source-to-target mapping document.
  • Involved in data transfer from the OLTP systems forming the extracted sources.
  • Interpreted logical and physical data models for business users to determine common data definitions and establish the referential integrity of the system.
  • Analyzed the sources, then transformed, mapped and loaded the data into targets using PowerCenter Designer.
  • Designed and developed Oracle PL/SQL procedures.
  • Designed and developed Oracle PL/SQL and shell scripts for data import/export, data conversion and data cleansing.
  • Participated in the design of the Star and Snowflake schema data model.
  • Tested Informatica workflows using various transformations for extracting data from flat files and Oracle and loading aggregated data into the staging and target data marts.
  • Involved in the creation of test cases and test scripts and the logging of defects using Test Director.
  • Developed materialized views and organized indexes to improve the performance of SQL queries.
  • Tested Informatica mappings, tuned them for better performance and implemented various performance-tuning techniques.
  • Responsible for tuning ETL procedures and star schemas to optimize load and query performance.
  • Wrote stored procedures and triggers.
  • Performed system testing, regression testing and functional testing manually.
  • Created test plans that describe each test in minute detail.
  • Developed the defect tracking report using Test Director.
  • Created MQ Series sources and targets and used the XML Parser and XML Generator transformations (see the sketch after this list).
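
A minimal Python sketch of the kind of field extraction the XML Parser transformation performed on MQ Series payloads; the message layout below is hypothetical:

    import xml.etree.ElementTree as ET

    # Hypothetical MQ message payload for illustration.
    msg = "<order><id>42</id><amount>19.99</amount></order>"

    root = ET.fromstring(msg)
    record = {
        "id": int(root.findtext("id")),
        "amount": float(root.findtext("amount")),
    }
    print(record)  # {'id': 42, 'amount': 19.99}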

Environment: Informatica Power Center 8.6, Informatica PowerExchange 8.6, Oracle 11g, XML Files, SQL*PLUS, SQL*Loader, TOAD, Windows 2000, UNIX, Import/Export Utilities, MQ Series, Shell Scripts.

Confidential, St. Louis, MO

ETL Developer

Responsibilities:

  • Involved in the creation of the logical data model for ETL mappings and the process flow diagrams.
  • Worked in SQL Developer to write SQL code for data manipulation.
  • Worked on the Informatica versioned repository with the object check-in and check-out feature.
  • Used the Debugger extensively to validate the mappings and gain troubleshooting information about data and error conditions.
  • Provided guidance to less experienced personnel and conducted quality assurance activities such as peer reviews.
  • Participated in the business analysis process and the development of ETL requirements specifications.
  • Worked with production support systems that required immediate support.
  • Developed, executed and maintained appropriate ETL development best practices and procedures.
  • Assisted in the development of test plans for assigned projects.
  • Monitored and tuned ETL processes for performance improvements; identified, researched and resolved data warehouse load issues.
  • Involved in unit testing of the mappings and SQL code.
  • Developed mappings to load data into slowly changing dimensions.
  • Involved in performance tuning of sources and targets, mappings, sessions and workflows.
  • Worked on various Teradata utilities like BTEQ and FLOAD, and created procedures.
  • Worked with connected and unconnected Lookups, reusable transformations and mapplets.
  • Utilized UNIX shell scripts to add headers to the flat-file targets (see the sketch after this list).
  • Involved in designing the star schema and populating the fact table and associated dimension tables.
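
A minimal sketch of the header step; production used UNIX shell, and this Python version assumes a hypothetical file name and column list:

    # Hypothetical target file and header row, for illustration only.
    target = "/data/out/sales_extract.dat"
    header = "ACCT_ID|TXN_DT|AMT\n"

    # Read the body, then rewrite the file with the header prepended,
    # mirroring the shell pattern of writing header + data to a temp file.
    with open(target) as f:
        body = f.read()
    with open(target, "w") as f:
        f.write(header + body)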

Environment: Oracle 11g, SQL Developer, SQL, Informatica Power Center 8.1, Sybase, Windows XP, Visio 2000, Business Objects XIR2, ESP, SCM, Putty, WinScp, Teradata V2R5.

Confidential, Minneapolis, MN

ETL Developer

Responsibilities:

  • Developed complex mappings to extract source data from heterogeneous sources (SQL Server, Oracle and flat files), applied the proper transformation rules and loaded it into the Data Warehouse.
  • Involved in identifying bugs in existing mappings by analyzing the data flow and evaluating transformations using the Debugger.
  • Implemented various performance-tuning techniques on sources, targets, mappings and workflows.
  • Worked closely with the Production Control team to schedule shell scripts, Informatica workflows and PL/SQL code in Autosys.
  • Conducted database testing to check constraints, field sizes, indexes, stored procedures, etc.
  • Tracked, reviewed and analyzed defects.
  • Conducted UAT (User Acceptance Testing) with the user community.
  • Developed Korn shell scripts to run from Informatica pre-session and post-session commands (see the sketch after this list).
  • Extracted data from VSAM and XML files.
  • Involved in data transfer from the OLTP systems forming the extracted sources.
  • Interpreted logical and physical data models for business users to determine common data definitions and establish the referential integrity of the system.
  • Analyzed the sources, then transformed, mapped and loaded the data into targets using PowerCenter Designer.
  • Designed and developed Oracle PL/SQL procedures.
  • Designed and developed Oracle PL/SQL and shell scripts for data import/export, data conversion and data cleansing.
  • Participated in the design of the Star and Snowflake schema data model.
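
A minimal Python sketch of the kind of post-session step those Korn shell scripts performed, here archiving processed flat files with a run-date suffix; the directories and file pattern are hypothetical:

    import datetime
    import pathlib
    import shutil

    # Hypothetical source and archive directories for the post-session step.
    src_dir = pathlib.Path("/data/in")
    arch_dir = pathlib.Path("/data/archive")
    stamp = datetime.date.today().isoformat()

    for f in src_dir.glob("*.dat"):
        # Move each processed file aside so the next run starts clean.
        shutil.move(str(f), str(arch_dir / f"{f.stem}_{stamp}{f.suffix}"))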

Environment: Informatica PowerCenter 7.1/6.1, Oracle 10g, SQL Server 2000, Erwin 3.5, XML, TOAD, HP-UX 11.11, Harvest, Sun Solaris, DB2 Mainframe.
