
Data Integration/Hadoop Consultant Resume

Horsham, PA

SUMMARY:

  • Over 10 years of experience developing Data Warehousing projects using Business Intelligence tools. Adept at combining business-driven objectives with technology, applying best practices, integrating solutions into corporate systems, and delivering on time, while setting strategic and tactical direction focused on realistic solutions. Expert at planning and coordinating detailed tasks within quality standards and established deadlines. Understands the business process and helps turn technology into solutions that deliver effective results.

TECHNICAL SKILLS:

BI/ETL Tools: Informatica PowerCenter, Informatica Data Quality (IDQ), Informatica PowerExchange, SSIS

Big Data Technologies: Hadoop, MapReduce, Hive, Pig, Spark, Scala, HBase, ZooKeeper

Reporting Tools: Tableau, SSRS

Databases: Oracle, SQL Server, Teradata, DB2, Sybase, MS Access

Languages: SQL, PLSQL, Java

Data Warehousing: Star Schema, Snowflake Schema

Unix: Shell Scripting

Query Tools: Toad, SQL Developer, SQL Server Management Studio, Teradata SQL Assistant

Teradata Utilities: FastLoad, MultiLoad (MLoad), BTEQ scripts

Data Modeling: Relational/Dimensional Data Modeling

XML Tools: Altova XMLSpy

Editors: Notepad++, Editplus, Textpad

Unix GUI: Putty, WinSCP, CoreFTP

Defect Tracking Tools: HP Service Now, Rational ClearQuest, JIRA

Operating Systems: Windows, Unix

Office Tools: MS Word, Excel, PowerPoint, MS Visio

Other Tools: Web Services, SoapUI, Erwin, Robo-FTP, SVN, DOS, Batch Files

PROFESSIONAL EXPERIENCE:

Confidential, Horsham, PA

Data Integration/Hadoop Consultant

Responsibilities:

  • Involved in business requirement analysis and ETL technical design discussions; prepared ETL high-level technical design documents.
  • Conducted data analysis and captured data mapping specification documents.
  • Worked with business analysts to identify the appropriate data elements for required capabilities.
  • Updated status and planned releases through daily status meetings.
  • Performed data reconciliation to identify data anomalies, escalated data issues requiring process re-engineering, and developed solutions to monitor and report quality performance indicator metrics.
  • Conducted root cause analysis on data quality issues and implemented new standards and processes to prevent recurrence, including taking necessary corrective action.
  • Optimized Hive join queries to improve results for ad-hoc Hive queries (see the sketch after this list).
  • Executed Pig scripts and Hive queries for optimal data-driven solutions.
  • Exported result sets from Hive to MySQL using Sqoop.
  • Loaded datasets into Hive for ETL operations.
  • Used Pig as an ETL tool for transformations such as joins, filters, and data set aggregations.
  • Designed Hive tables to load data to and from external files.
  • Wrote and implemented Apache Pig scripts to load data into and out of Hive.
  • Created RDDs using Spark and applied transformations and actions to manipulate and load data.
  • Developed different sets of mappings in Informatica for data integration.
  • Worked on major components in the Hadoop ecosystem including Hive, Pig, HBase, HBase-Hive integration, Scala, Sqoop, and Flume.
  • Responsible for migrating objects into higher environments and for complete data validation.
  • Used Informatica Cloud connectors to upload organization data to Amazon Redshift and Salesforce for reporting analytics.
  • Implemented different Informatica Cloud apps for data synchronization, data replication, and scheduled tasks.
  • Responsible for providing status on data loads to management on a daily basis.
  • Actively responded to data issues reported by end users.
  • Created dashboards, scorecards, prompts, and KPI metrics in OBIEE.
  • Tuned data loads for performance improvement in Informatica and the database.
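A minimal HiveQL sketch of the join optimization described above; the tables and columns (web_logs, dim_customer) are illustrative assumptions, not objects from the engagement:

    -- Enable map-side joins so the small dimension table is broadcast
    -- to each mapper instead of forcing a full shuffle join.
    SET hive.auto.convert.join = true;
    SET hive.mapjoin.smalltable.filesize = 25000000;  -- ~25 MB threshold

    SELECT c.customer_id,
           c.segment,
           COUNT(*) AS page_views
    FROM   web_logs l
    JOIN   dim_customer c
      ON   l.customer_id = c.customer_id
    WHERE  l.log_date = '2016-01-31'
    GROUP  BY c.customer_id, c.segment;

A result set like this could then be pushed to the relational side with Sqoop's export tool, per the Sqoop bullet above.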

Environment: Informatica PowerCenter 9.6, Informatica Data Quality, Hive, Pig, Spark, Scala, MapReduce, Cloudera CDH 5.1, MySQL, Flat Files, Amazon Redshift, Oracle, Oracle SQL Developer, SQL, PL/SQL, Shell Scripting, OBIEE, DAC Scheduler.

Confidential, Newark, NJ

ETL Technical Lead/Hadoop Consultant

Responsibilities:

  • Led ETL development for new projects and enhancements in a health care data warehouse.
  • Involved in business requirement analysis and ETL technical design discussions; prepared ETL high-level technical design documents.
  • Conducted data analysis, captured data mappings, and implemented DQ rules using IDQ.
  • Applied a solid understanding of data profiling and data quality rules and implemented them using IDQ.
  • Remediated data-related issues on the LOB/SOR side and opened JIRA issues for tracking.
  • Tested and validated DQ rule implementations by executing SQL against the data.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive for efficient data access (see the sketch after this list).
  • Designed Hive external tables with dynamic partitioning and buckets, using a shared metastore instead of Derby.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Implemented solutions for ingesting data from various sources and processing it using Big Data technologies such as Hive, Pig, Sqoop, HBase, and MapReduce.
  • Developed Hive scripts, Pig scripts, and Unix shell scripts for all ETL loading processes, converting files to Parquet in the Hadoop file system.
  • Generated dashboards with quick filters, parameters, and sets to handle views more efficiently in Tableau.
  • Published workbooks with user filters so that only the appropriate teams could view them.
  • Generated context filters and used performance actions while handling huge volumes of data.
  • Used Informatica Cloud connectors to upload organization data to Amazon Redshift and Salesforce for reporting analytics.
  • Implemented different Informatica Cloud apps for data synchronization, data replication, and scheduled tasks.
  • Designed ETL jobs and prepared job dependencies to run through the Tivoli scheduler.
  • Coordinated with the SQA team during defect status calls and resolved defects, ensuring smooth execution of the test plans.
  • Worked on resolving project/production bugs and provided on-call support.
  • Developed prototype mappings for processing XML sources using XML transformations and Web Service transformations to fetch live information.
  • Created deployment documents and migrated code to the production environment.
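A minimal HiveQL sketch of the external-table design with dynamic partitions and buckets mentioned above; the table, columns, and HDFS path are hypothetical:

    -- External table: dropping it leaves the underlying HDFS files intact.
    CREATE EXTERNAL TABLE IF NOT EXISTS claims_ext (
      member_id  BIGINT,
      claim_id   STRING,
      claim_amt  DECIMAL(12,2)
    )
    PARTITIONED BY (load_dt STRING)
    CLUSTERED BY (member_id) INTO 32 BUCKETS
    STORED AS PARQUET
    LOCATION '/data/warehouse/claims_ext';

    -- Dynamic-partition insert: Hive derives load_dt from the last
    -- column of the SELECT list.
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;
    SET hive.enforce.bucketing = true;

    INSERT OVERWRITE TABLE claims_ext PARTITION (load_dt)
    SELECT member_id, claim_id, claim_amt, load_dt
    FROM   claims_staging;

Bucketing on the join key keeps related rows in the same file, which speeds up joins and sampling on large tables.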

Environment: Informatica PowerCenter 9.6, Informatica Data Quality 9.1, XML, Web Services, Oracle, SQL Server, Pig, Hive, Spark, Scala, MapReduce, ZooKeeper, Amazon Redshift, SQL Developer, SQL Server Management Studio, SQL, PL/SQL, Shell Scripting, Tableau, Tivoli scheduler.

Confidential, Indianapolis, IN

Data Quality Developer

Responsibilities:

  • Gathered and analyzed requirements by coordinating with business users and prepared the technical specification documents.
  • Provided technical knowledge of Extract/Transform/Load (ETL) solutions for Business Intelligence projects.
  • Used Informatica Data Quality to cleanse and standardize data and to perform column and rule profiling on multiple legacy files (a sample profiling query is sketched after this list).
  • Took full ownership of assigned tasks, issues/risks, and other project demands to meet project milestones.
  • Actively participated in regularly scheduled IT project team status meetings.
  • Scheduled DQ rules for execution through the ESP scheduler.
  • Developed data conversion mappings from legacy systems using Informatica as the ETL tool.
  • Migrated data quality rules to UAT, executed them, and turned the rules over to data stewards for UAT validation.
  • Migrated the rules to PROD, where data stewards validated the results in Tableau.
  • Monitored data quality rules at the requested frequency and communicated rule failures to data stewards for root cause analysis.
  • Leveraged reference tables to standardize data.
  • Responsible for implementation and release management activities.
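A minimal column-profiling sketch of the kind of SQL used to cross-check IDQ results; the table and columns (legacy_members, member_id, zip_code) are hypothetical, and the length function varies by platform (CHARACTER_LENGTH on Teradata, LEN on SQL Server):

    -- Row count, null rate, and distinctness for a candidate key,
    -- plus a simple format rule on a text column.
    SELECT COUNT(*)                                           AS total_rows,
           SUM(CASE WHEN member_id IS NULL THEN 1 ELSE 0 END) AS null_member_id,
           COUNT(DISTINCT member_id)                          AS distinct_member_id,
           SUM(CASE WHEN CHARACTER_LENGTH(TRIM(zip_code)) <> 5
                    THEN 1 ELSE 0 END)                        AS bad_zip_length
    FROM   legacy_members;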

Environment: Informatica PowerCenter 9.6, Informatica Data Quality 9.5, Flat Files, Teradata, SQL Server, Teradata SQL Assistant, BTEQ Scripts, SQL Server Management Studio, SQL, PL/SQL, Shell Scripting, ESP scheduler

Confidential, Newark, NJ

Technical Lead/Data Analyst

Responsibilities:

  • Responsible for reviewing BRD documents and translating them into technical specifications.
  • Worked from technical specifications to independently develop, modify, and maintain moderately complex software solutions.
  • Analyzed and troubleshot problems and made recommendations for remediation.
  • Led cross-functional teams in a technical capacity.
  • Led design/code reviews, architecture discussions, and infrastructure reviews.
  • Managed interactions with internal and external stakeholders.
  • Responsible for writing Unix shell scripts for different aspects of the CBR audit balancing process.
  • Responsible for processing mainframe files into SQL Server using PowerExchange as part of the Nasco Pod extract.
  • Created data maps for the mainframe source, performed row tests of the data, and integrated them into the PowerCenter client tool.
  • Actively participated in resolving data issues, provided resolutions, and deployed code to higher environments.

Environment: Informatica PowerCenter 9.6, Informatica PowerExchange 9.1, Flat Files, Oracle, SQL Server, Mainframe Sources, Oracle SQL Developer, SQL Server Management Studio, SQL, PL/SQL, Shell Scripting, Tivoli scheduler

Confidential, Wilmington, DE

Database Developer

Responsibilities:

  • Gathered requirements and translated the BSD into technical documents by interacting with the business units; delivered application code that was fully tested and met the business requirements.
  • Worked on various data sources such as Oracle, SQL Server, and flat files.
  • Designed, developed, and maintained data extraction and transformation processes, ensuring that data was properly loaded into and extracted out of our systems.
  • Created and maintained PL/SQL scripts and stored procedures (a minimal example is sketched after this list).
  • Investigated data quality issues and implemented appropriate solutions.
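A minimal PL/SQL sketch of a typical staging-to-target load procedure; the tables and procedure name (stage_orders, orders, etl_audit, load_orders_from_stage) are hypothetical:

    CREATE OR REPLACE PROCEDURE load_orders_from_stage AS
      v_rows PLS_INTEGER;
    BEGIN
      -- Move validated rows from staging to the target table.
      INSERT INTO orders (order_id, customer_id, order_amt, load_ts)
      SELECT order_id, customer_id, order_amt, SYSDATE
      FROM   stage_orders
      WHERE  order_amt IS NOT NULL;      -- simple quality gate

      v_rows := SQL%ROWCOUNT;

      -- Record the load in an audit table, then commit once.
      INSERT INTO etl_audit (proc_name, row_count, run_ts)
      VALUES ('LOAD_ORDERS_FROM_STAGE', v_rows, SYSDATE);

      COMMIT;
    EXCEPTION
      WHEN OTHERS THEN
        ROLLBACK;
        RAISE;                           -- surface the error to the caller
    END load_orders_from_stage;
    /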

Environment: Flat Files, Oracle, SQL Server, Toad v13, SQL, PL/SQL, Unix Shell Scripting

Confidential, New Jersey, NJ

ETL Lead Developer

Responsibilities:

  • Gathered requirements and interacted with the business units to understand them during the design, development, and testing of solutions.
  • Worked on various data sources such as Teradata and flat files.
  • Created mapping specification documents required for various feeds such as WKH, IMS, and MCK.
  • Used Informatica PowerCenter 9 to pull data from the different sources and load it into the ABBI system.
  • Responded quickly and effectively to production issues and took responsibility for seeing those issues through to resolution.
  • Prepared the issue tracker during different phases and passed it to the business to fix the issues.
  • Wrote SQL queries for data validation against the source databases and checked the target files generated (a sample validation query is sketched after this list).
  • Interacted with the offshore team, assigned tasks, and was responsible for deliverables.
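A minimal source-side validation sketch of the kind run before comparing against a generated target file; the table, columns (src_db.claims, claim_id, feed_dt), and date are hypothetical:

    -- Capture the expected row count and check for duplicate keys,
    -- then compare src_rows against the record count in the target file.
    SELECT feed_dt,
           COUNT(*)                 AS src_rows,
           COUNT(DISTINCT claim_id) AS distinct_keys
    FROM   src_db.claims
    WHERE  feed_dt = DATE '2012-06-30'
    GROUP  BY feed_dt;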

Environment: Informatica PowerCenter 9, Flat Files, Teradata V2R5, Sybase, Teradata SQL Assistant v12, SQL, PL/SQL, Shell Scripting

Confidential, Phoenix, AZ

ETL Lead

Responsibilities:

  • Interacted with the business units to understand the requirements and gathered feedback during the design, development, and testing of solutions.
  • Worked on various data sources such as Sybase, Teradata, and flat files.
  • Created mapping specification documents required for various feeds such as ICruse and DQME.
  • Used Informatica PowerCenter 8.6.1 to pull data from the different sources and create the target flat files.
  • Responded quickly and effectively to production issues and took responsibility for seeing those issues through to resolution.
  • Wrote SQL queries for data validation against the source databases and checked the target files generated.

Environment: Informatica PowerCenter 8.6.1, Flat Files, Teradata V2R5, Sybase, Teradata SQL Assistant v12, WLM Scheduling Tool, Toad 9.1, SQL, PL/SQL, Shell Scripting

Confidential, Richmond, VA

Technical Lead/Developer

Responsibilities:

  • Created mapping specification documents required for various feeds such as Virginia, Empire, and Seniors.
  • Used Informatica PowerCenter 8.6.1 to pull data from the master table and create the target flat files.
  • Developed mappings per Informatica standards and created sessions in the Workflow Manager to run the logic developed in the mappings.
  • Used different task types such as Email, Command, and Event-Wait tasks in the Workflow Manager.
  • Wrote SQL queries for data validation against the source databases and checked the target files generated.

Environment: Informatica PowerCenter 8.6.1, Flat Files, Teradata V2R5, Oracle 10g, Teradata SQL Assistant v12, SQL, PL/SQL, Shell Scripting

Confidential, Groton, CT

Sr ETL Developer

Responsibilities:

  • Gathered design requirements and analyzed source system data.
  • Created mapping design documents based on the business requirements.
  • Used Informatica PowerCenter 8.1 to pull data from the existing source systems and generate the target flat files.
  • Developed and implemented Informatica mappings for the different stages of ETL.
  • Implemented performance tuning methods to optimize developed mappings.
  • Debugged sessions using the session logs.
  • Created parameter files for the connection objects.
  • Performed extensive testing and wrote SQL queries to verify the loading of the data.
  • Ran the Data Sync utility on the target flat files to load them into P2L.
  • Used the Training Migration Utility (TMU) to load activity and course completions into the backend tables.

Environment: Informatica PowerCenter 8.1, Flat Files, MS SQL Server 2005, Oracle 10g, Data Sync Utility, Training Migration Utility, Toad 9.1, SQL, PL/SQL, Shell Scripting

Confidential, North Haven, CT

Sr Informatica Developer

Responsibilities:

  • Interacted with the business units to understand the requirements and gathered feedback during the design, development, and testing of solutions.
  • Worked on various data sources such as Oracle, Teradata, and flat files.
  • Extracted data from various source systems into the Landing Zone by creating Informatica mappings using the Teradata FastLoad connections.
  • Created BTEQ scripts to process the business logic from the Landing Zone to the Common Staging Area (CSA) (a minimal BTEQ sketch follows this list).
  • Created mapplets and reusable transformations.
  • Performed unit testing after successfully loading the data into the Landing Zone (LZ).
  • Wrote SQL queries for data validation of the target tables.
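A minimal BTEQ sketch of a Landing Zone-to-CSA step; the logon string, databases, tables, and business rule are hypothetical:

    .LOGON tdprod/etl_user,etl_password;
    .SET WIDTH 200;

    -- Apply business logic while moving data from LZ to CSA.
    INSERT INTO csa_db.member_fact
    SELECT member_id,
           plan_code,
           COALESCE(claim_amt, 0)        -- example rule: default NULLs to 0
    FROM   lz_db.member_stage
    WHERE  load_dt = CURRENT_DATE;

    .IF ERRORCODE <> 0 THEN .QUIT 8;     -- fail the job on any SQL error

    .LOGOFF;
    .QUIT 0;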

Environment: Informatica PowerCenter 8.1.6, Oracle 10g, Teradata V2R5, Flat Files, Toad, SQL, PL/SQL, BTEQ Scripts, Unix Scripting.

Confidential, Indianapolis, IN

Informatica Developer

Responsibilities:

  • Created interfaces from existing global systems to the new JD Edwards EnterpriseOne system and vice versa.
  • Created one-time data migrations of all conversions to the JD Edwards EnterpriseOne system.
  • Employed IDQ to help identify data issues and implemented cleansing procedures for existing interfaces/migrations.
  • Worked with business units to identify mapping specifications.
  • Used an MS Access database for the one-time data conversions of translations supplied in Excel or CSV format.
  • Created mapplets and reusable transformations.
  • Implemented an error handling strategy for all interfaces to avoid errors in the live environment.
  • Applied the Data Masking transformation, which masks sensitive data by transforming it into de-identified, realistic data that resembles the original.
  • Used Robo-FTP software to encrypt and decrypt files.
  • Implemented performance tuning methods to optimize developed mappings.

Environment: Informatica PowerCenter 8.1.1, Informatica Data Quality 8.1, DB2, Sybase, Flat Files, XML, MS Access 2003, MS SQL Server 2005, Oracle 10g, iSeries Navigator, Robo-FTP, Toad 3.2, SQL, PL/SQL, Shell Scripting

Confidential, Thousand Oaks, CA

Informatica Developer

Responsibilities:

  • Analyzed business requirements and performed source system analysis.
  • Prepared documents and created detailed mapping descriptions to specify the logic for interfaces.
  • Extracted data from the EDH (Enterprise Data Hub) schema, applied the logic required by client specifications, and loaded it into the staging table and flat files.
  • Used Informatica PowerCenter 7.1.4 for extraction, transformation, and loading of data into the staging area and target flat files.
  • Worked on various data sources such as Oracle, Sybase, XML, and flat files.
  • Designed, developed, tested, and documented Informatica mappings, sessions, and workflows based on standards and specifications.
  • Developed mappings with different transformations in the Mapping Designer and used the Workflow Manager to run and schedule sessions.
  • Created SQL*Loader scripts to load test data into EDH using SQLLDR for development and QA testing.

Environment: Informatica PowerCenter 7.1.4, Oracle 9.2, Sybase, SQL, PL/SQL, SQL*Loader, Toad 7.2, MS Windows 2003, and Unix

Confidential

Warehouse Management

Responsibilities:

  • Analyzed business requirements and gathered information for the development of several small applications
  • Designed and developed user interfaces
  • Wrote PL/SQL packages
  • Used SQL*Loader to load data into the Oracle database
  • Wrote PL/SQL stored procedures/functions to build business rules for loading data into the database
  • Participated in Oracle performance tuning
  • Performed unit and system integration testing

Environment: SQL, PL/SQL, SQL*Loader, Oracle, and MS Windows NT

Confidential

Graduate Assistant

Responsibilities:

  • Duties included maintaining Windows workstations throughout the Engineering Department.
  • Performed installation, configuration, and security updates as daily duties.
  • Tracked down network issues and checked the LAN.
  • Assisted students in solving problems pertaining to C and AutoCAD assignments and general computer-related issues.
  • Made sure that no one in the lab misused any of the equipment in any manner.
  • Assisted the instructors and mentored the students in the lab.
  • Maintained the systems by restricting permissions so that users could not download software onto any computer or reconfigure it (e.g., change screen settings).
