
BI Data Engineer Resume

San Francisco, CA

SUMMARY

  • Data Warehouse/Business Intelligence professional with 9 years of experience; worked in all project phases, including business requirements gathering/analysis, development, testing, implementation, deployment, maintenance, and production support.
  • Expertise in Extraction, Transformation, and Loading (ETL) of data from various sources into data warehouses and data marts using Informatica PowerCenter 9.6/9.1/8.1/7.1 (administrator activities, Repository Manager, Mapping Designer, Workflow Manager, Workflow Monitor, Worklets, Mapplets, Transformations, Partitions, version control, and performance tuning).
  • Experience in Big Data technologies such as AWS and Hadoop, with in-depth knowledge of Hadoop architecture, JobTracker, TaskTracker, NameNode, DataNode, MapReduce, and managed and external tables.
  • Experience in working on multi-terabyte analytical systems.
  • Extensive experience in Travel, Banking, Logistics and Retail domains.
  • Hands-on experience with AWS technologies including EMR, EC2, and S3, and with frameworks to launch clusters dynamically and cost-effectively.
  • Experience with Hadoop ecosystem components such as HDFS, MapReduce, HBase, Sqoop, Pig, ZooKeeper, YARN, and Oozie.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Extensive experience in performance tuning at the Informatica and Oracle levels, using advanced techniques for faster execution.
  • Expertise in scheduling BI/ETL jobs using Control-M, Cron jobs and Informatica Scheduler.
  • Skilled in writing and debugging Unix Shell Scripts.
  • Experience with end-to-end Informatica Data Quality (IDQ) workflows, from profiling source data to loading golden records.
  • Sound knowledge of Data Warehouse concepts: data modeling using Star/Snowflake schemas, Normalization, OLAP, ODS, EDW, Data Marts, and fact and dimension tables.
  • Strong skills in Oracle programming, including SQL, PL/SQL packages, stored procedures, functions, cursors, triggers, materialized views, indexes, partitions, temporary tables, collections, and exception handling.
  • Experience in visualization tools like MicroStrategy 8.x/9.x (MicroStrategy Desktop, Web Interface), MicroStrategy Narrowcast, QlikView and OBIEE.
  • Experience in production support (rotational on-call) for ETL and reporting jobs, debugging and fixing issues within SLAs; also involved in maintenance work, coordinating with other teams, and managing services for all BI servers.
  • Experience in working closely with business product owners within the Agile/Scrum development process to deliver as per business requirements.
  • Quick learner, able to work in a team as well as individually; enjoys working on cutting-edge technologies and challenging problems and leveraging opportunities to enhance skills.

TECHNICAL SKILLS

ETL: Informatica 9.x, 8.x, 7.x, SQL*Loader

Big Data: AWS, S3, EC2, EMR, Hadoop, Hive, Sqoop, Oozie, HBase

OLAP: MicroStrategy, OBIEE, QlikView

RDBMS: Oracle 9i/10g/11g/12c, Teradata, MySQL, Sybase, SQL Server

Languages: Shell Script, SQL, HQL, PL/SQL, Python, Java

Scheduling: Control-M, Cron, Autosys

Version control: GIT, Stash, VSS 6, Perforce

Streaming: Kafka, Storm

Other Tools: MS Office, TOAD, SQL Developer, IDQ, Rapid SQL, Jira, Splunk

PROFESSIONAL EXPERIENCE

Confidential, San Francisco, CA

BI Data Engineer

Responsibilities:

  • Played a major role in migrating jobs from the Hadoop ecosystem to the AWS cloud and performed testing to make sure data loaded as expected.
  • Played a lead role in on-call production support and helped form a new offshore support team; also worked closely with product owners on multiple development projects.
  • Created QA framework jobs to push JSON messages to an Apache Kafka topic; the messages pass through the Storm topology and load into S3 buckets, and the jobs verify that the S3 data is as expected.
  • Created a script to launch the EMR cluster and add steps that run the Hive jobs loading data into S3 buckets (see the sketch after this list).
  • Created Sqoop jobs to import data from the data warehouse into HDFS and export it back (see the sketch after this list).
  • Played a major role in migrating Informatica jobs from 9.1 to 9.6 and Oracle from 11g to 12c.
  • Created Splunk alerts for all BI API jobs to flag invalid messages.
  • Worked closely with business product owners within the Agile/Scrum development process to deliver as per business requirements.
  • Worked extensively on Informatica performance tuning to find potential bottlenecks and took appropriate actions that improved performance significantly.
  • Created a QA framework for DW and AWS jobs to improve data quality and automated it in Control-M to run daily.
  • Developed Informatica mappings to process high volumes of data and build a data warehouse from sources such as Oracle, Teradata, flat files, and XML files.
  • Created a script that calls an API to fetch source files from FTP servers and transfer them to the ETL server.
  • Installed and configured PowerCenter 9.6 on the UNIX platform, upgraded from PowerCenter 9.1, and installed hotfixes, utilities, and patches released by Informatica Corporation.
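A minimal sketch of the kind of EMR launch script described above, using the AWS CLI; the cluster sizing, release label, bucket names, and Hive script path are placeholders rather than actual project values:

#!/bin/bash
# Launch a transient EMR cluster that runs a Hive step and then terminates,
# so the cluster exists (and is billed) only for the duration of the load.
# All names, paths, and sizes below are illustrative placeholders.
aws emr create-cluster \
  --name "bi-hive-load-$(date +%Y%m%d)" \
  --release-label emr-5.29.0 \
  --applications Name=Hive \
  --instance-type m4.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --log-uri s3://example-bi-logs/emr/ \
  --auto-terminate \
  --steps 'Type=HIVE,Name=load-to-s3,ActionOnFailure=CONTINUE,Args=[-f,s3://example-bi-scripts/load_fact_orders.hql,-d,OUTPUT=s3://example-bi-dw/fact_orders/]'

The --auto-terminate flag is what keeps the cluster cost-effective: it shuts the cluster down as soon as the Hive step finishes.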
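Similarly, a hedged example of the Sqoop import/export jobs mentioned above; the JDBC URL, credentials, table names, and HDFS paths are assumptions for illustration:

#!/bin/bash
# Illustrative Sqoop import: pull a warehouse table into HDFS in parallel.
sqoop import \
  --connect jdbc:oracle:thin:@//dw-host.example.com:1521/DWPROD \
  --username bi_etl \
  --password-file /user/bi_etl/.sqoop.password \
  --table SALES.FACT_ORDERS \
  --target-dir /data/raw/fact_orders \
  --split-by ORDER_ID \
  --num-mappers 4

# The reverse direction (HDFS back to the warehouse) uses sqoop export.
sqoop export \
  --connect jdbc:oracle:thin:@//dw-host.example.com:1521/DWPROD \
  --username bi_etl \
  --password-file /user/bi_etl/.sqoop.password \
  --table SALES.FACT_ORDERS_STG \
  --export-dir /data/processed/fact_orders \
  --num-mappers 4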

Confidential, San Francisco, CA

BI Engineer

Responsibilities:

  • Ensured that all support requests were properly approved, documented, communicated, and closed within SLA; also handled rotational on-call production support for the ETL and reporting batch runs, debugging issues whenever they occurred.
  • Delivered time savings of 15% by implementing functionality in Informatica that provided a faster, more efficient way to restart and recover failed ETL jobs.
  • Created run books for development projects so the team is familiar with the job processes and has the documentation handy during production support.
  • Created a shell script to clean up unwanted old files per the retention policy and free up server space, and scheduled it as a cron job; this reduced cost and resolved out-of-space issues on almost all BI servers (see the sketch after this list).
  • Created Hive managed and external tables defined with static and dynamic partitions (see the sketch after this list).
  • Developed Hive scripts to migrate ETL jobs and load data into HDFS.
  • Created deployment groups and labels in Informatica to promote code to other environments such as QA and PROD.
  • Managed user accounts and provided required access to BI systems such as Informatica, MicroStrategy, QlikView, Hadoop, and AWS; also involved in maintenance work, coordinating with other teams, and managing services for all BI servers.
  • Involved in creating dashboards/scorecards in QlikView and MicroStrategy.
  • Created Visio diagrams of production mappings so any new user can understand the load process.
  • Created jobs in Control-M and cron to automate the ETL jobs with the required dependencies.
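A minimal sketch of the retention cleanup script and cron schedule described above; the directories, the 30-day retention window, and the schedule are placeholders:

#!/bin/bash
# cleanup_old_files.sh - illustrative retention cleanup for BI servers.
# Deletes files older than the retention window to free up disk space.
RETENTION_DAYS=30
for dir in /opt/bi/logs /opt/bi/archive /opt/bi/tmp; do
    # -mtime +N matches files last modified more than N days ago
    find "$dir" -type f -mtime +"$RETENTION_DAYS" -print -delete
done

# Example crontab entry to run the cleanup daily at 02:00:
#   0 2 * * * /opt/bi/scripts/cleanup_old_files.sh >> /opt/bi/logs/cleanup.log 2>&1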
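And a hedged sketch of Hive managed/external tables with static and dynamic partitions, driven through the hive CLI; the table names, columns, and locations are illustrative only:

#!/bin/bash
# Illustrative Hive DDL/DML: an external table over raw HDFS data and a
# managed ORC table loaded with dynamic partitions. Names are placeholders.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS raw_orders (
    order_id    BIGINT,
    customer_id BIGINT,
    amount      DECIMAL(12,2)
)
PARTITIONED BY (load_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/raw/orders';

-- Static partition: register one day of data explicitly
ALTER TABLE raw_orders ADD IF NOT EXISTS PARTITION (load_date='2017-06-01');

CREATE TABLE IF NOT EXISTS dw_orders (
    order_id    BIGINT,
    customer_id BIGINT,
    amount      DECIMAL(12,2)
)
PARTITIONED BY (load_date STRING)
STORED AS ORC;

-- Dynamic partitions: Hive derives load_date from the SELECT output
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE dw_orders PARTITION (load_date)
SELECT order_id, customer_id, amount, load_date FROM raw_orders;
"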

Confidential, NJ

ETL Consultant

Responsibilities:

  • Collaborated with Business Users for requirements gathering, business analysis, functional and technical design.
  • Coordinated with offshore team to provide technical and functional direction to ensure quality deliverables.
  • Extensively used Informatica (ETL) to load data from a wide range of sources such as Oracle databases and flat files (fixed-width or delimited).
  • Responsible for extracting and transforming data from different source systems and loading it into the staging area using Informatica and PL/SQL.
  • Worked on performance tuning and optimization using indexes, partitions, BULK COLLECT, FORALL, explain plans, temporary tables, optimizer hints, SQL trace, TKPROF, and DBMS_PROFILER (see the sketch after this list).
  • Utilized Informatica IDQ for initial data profiling and for matching and removing duplicate data.
  • Created packages and procedures to load data from source to staging and called them from Informatica.
  • Responsible for deploying the Informatica objects to other environments using deployment groups.
  • Responsible for user and folder creation in Informatica, assigning users to the appropriate groups, and supporting integration services, hotfixes, etc.
  • Responsible for creating and importing all the required sources and targets into the shared folder.
  • Created test plan documents with unit test cases (UTC) and executed them.
  • Extensively used mapping parameters and variables to simplify the code during the development phase.
  • Worked extensively on Informatica performance tuning to find potential bottlenecks in the source, Informatica, and target systems.
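A hedged sketch of the SQL trace / TKPROF tuning workflow mentioned above, driven from the shell; the connection details, query, and trace-file path are placeholders and depend on the database's diagnostic settings:

#!/bin/bash
# Illustrative SQL trace + TKPROF run for investigating a slow staging query.
sqlplus -s bi_etl/"$BI_ETL_PWD"@DWPROD <<'SQL'
-- Tag the trace file so it is easy to locate in the trace directory
ALTER SESSION SET tracefile_identifier = 'stg_orders_tuning';
ALTER SESSION SET sql_trace = TRUE;

-- The statement under investigation (placeholder query)
SELECT COUNT(*)
FROM   stg_orders o JOIN dim_customer c ON c.customer_id = o.customer_id;

ALTER SESSION SET sql_trace = FALSE;
EXIT
SQL

# Format the raw trace into a readable report, sorted by elapsed execution time
tkprof /u01/app/oracle/diag/rdbms/dwprod/DWPROD/trace/*stg_orders_tuning*.trc \
       stg_orders_tuning.txt sys=no sort=exeela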

Confidential, CT

ETL Consultant

Responsibilities:

  • Worked closely with the Clients to gather the requirements and understand the business.
  • Used shortcuts to reuse objects without creating multiple copies in the repository and to automatically inherit changes made to the source.
  • Developed a number of complex Informatica mappings, mapplets, and reusable transformations to facilitate one-time, daily, monthly, and yearly data loads.
  • Worked with the Informatica Data Quality (IDQ) toolkit: analysis, data cleansing, data matching, data conversion, duplicate elimination, and IDQ's exception handling and monitoring capabilities.
  • Migrated existing mappings to new environments, covering both version and environment migration.
  • Created mappings to extract data from the sources and load it into the data warehouse, extensively using transformations such as Lookup (static/dynamic), Router, Aggregator, Source Qualifier, Joiner, Expression, and Sequence Generator.
  • Worked with different sources such as Oracle, MS SQL Server, and flat files.
  • Responsible for deploying the Informatica objects into QA and DEV environments.
  • Optimized the mappings by changing the logic to improve performance.
  • Analyzed source data coming from different systems (Oracle, SQL Server, flat files) and worked with business users and developers to develop the model.
  • Removed bottlenecks at the source, transformation, and target levels for optimal use of sources, transformations, and target loads.

Confidential

ETL Developer

Responsibilities:

  • Used Informatica PowerCenter to create mappings, sessions, and workflows that populate dimension, fact, and lookup tables simultaneously from different source systems (Oracle DB and flat files).
  • Designed the dimensional model and data load process using SCD Type II.
  • Extensively used transformations such as Static/Dynamic Lookup, Router, Aggregator, Source Qualifier, Joiner, Expression, and Sequence Generator.
  • Interacted with the client and gathered requirements for further enhancements to the project.
  • Deployed and monitored the Informatica workflows in the UAT environment.
  • Created IDQ mappings and profiles for source data profiling and produced scorecards, charts, etc. to view the number of null values, wrong date formats, and the like.
  • Involved in identifying bottlenecks in sources, targets, mappings, and sessions and resolved them with performance tuning techniques such as increasing block size, data cache size, and sequence buffer length.
  • Developed UNIX shell scripts to create parameter files and to rename and compress files (see the sketch after this list).
  • Developed test plans and scripts, conducted testing, and worked with business partners to conduct end-user acceptance testing.
  • Developed various reports with conditional formatting using Presentation Services in OBIEE.
  • Developed hierarchies to support drill-down reports in OBIEE.
  • Formatted the reports to meet user requirements using multi-dimensional analysis such as slice-and-dice and drill-down.
  • Created unit test cases (UTC) and test scenarios to test the reports against the business requirements.
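A minimal sketch of the kind of utility shell script described above for creating a parameter file and renaming/compressing processed files; the directory paths, folder/workflow names, and file pattern are placeholders:

#!/bin/bash
# Illustrative ETL utility script: build an Informatica parameter file for the
# current run date, then rename and compress processed source files.
# All paths, folder/workflow names, and file patterns are placeholders.
RUN_DATE=$(date +%Y%m%d)
PARAM_DIR=/opt/etl/param
SRC_DIR=/opt/etl/inbound
ARCH_DIR=/opt/etl/archive

# 1. Create the parameter file consumed by the workflow
cat > "$PARAM_DIR/wf_load_orders.param" <<EOF
[FOLDER_DW.WF:wf_load_orders]
\$\$RUN_DATE=$RUN_DATE
\$\$SRC_FILE=orders_$RUN_DATE.dat
EOF

# 2. Rename processed files with the run date, archive, and compress them
for f in "$SRC_DIR"/orders_*.dat; do
    [ -e "$f" ] || continue                      # skip when nothing matched
    dest="$ARCH_DIR/$(basename "${f%.dat}")_$RUN_DATE.dat"
    mv "$f" "$dest"                              # rename with the run date
    gzip -f "$dest"                              # compress to save space
done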
