Solution Engineer - Big Data Resume
Atlanta, GA
SUMMARY:
- 20+ years of experience in the IT industry, with 15+ years in Enterprise Data technologies building EDW and Big Data solutions for world-renowned companies in Entertainment, Healthcare, Hospitality and Retail
- Worked on all EDW phases, including Data Analysis, Data Transformation, Data Migration, Data Movement, Data Profiling and Data Quality, using Cloud ETL/ELT products and SQL-based tools
- Migrated Big Data from Teradata to the Hortonworks Big Data Platform (Hadoop) on Google Cloud for data warehouse offloading, using Syncsort DMX-h and native Hadoop ecosystem tools
- Migrated Big Data from ParAccel Matrix to Amazon Redshift (AWS Cloud) for EDW data migration, using Informatica and Amazon native API tools
- Hands-on development experience in Informatica (all major product versions); hands-on experience with external (CAWA, Control-M) and built-in (Informatica) scheduler tools
- Well-versed in Normalized, Star-Schema, Snowflake and Schema-on-Read data models (see the schema-on-read sketch after this summary)
- Extensively used ETL/ELT methodology for architecting data extraction, transformation and loading processes in a corporate-wide Data Integration Solution using Informatica software products and Velocity Best Practice methodologies
- Experience with StreamSets, SnapLogic, Talend and Pentaho tools for building pipelines on Big Data platforms
- Experienced in traditional databases, Cloud databases and Big Data Hadoop (Cloudera and Hortonworks)
- Excellent SQL tuning and SQL coding knowledge, along with Oracle PL/SQL programming and Unix shell scripting.
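
A minimal PySpark sketch of the schema-on-read pattern referenced above, where a schema is applied when raw files are read rather than when they are stored; the path and column names are hypothetical.

# Schema-on-read: raw JSON lands untyped in HDFS; the schema is
# declared when the data is consumed, not when it is loaded.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("schema_on_read_demo").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_ts", TimestampType()),
])

events = (spark.read
          .schema(event_schema)              # schema applied on read
          .json("hdfs:///landing/events/"))  # hypothetical raw landing zone

events.createOrReplaceTempView("events")
spark.sql("SELECT event_id, amount FROM events WHERE amount > 100").show()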
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, GA
Solution Engineer - Big Data
Responsibilities:
- Provided Informatica BDM administration support for Cloudera Big Data Hadoop platform
- Designed and developed ETL/ELT processes in Informatica BDM for data offloading into Hadoop using the Native, Hive, Blaze and Spark execution modes
- Designed and built Tableau dashboards for ad hoc reporting
- Hands-on experience in code deployment using Informatica, Python and Bitbucket
- Troubleshot production ETL/ELT processes for problem resolution
- Hands-on development and administration experience with the StreamSets Data Collector tool for data ingestion into Hadoop using Kafka and Spark streaming
- Designed and developed pipelines for data ingestion into the Hadoop platform for both real-time and batch use cases (a streaming sketch follows this section).
- Set up the Kerberos security model in Hadoop for user authentication and authorization for Informatica BDM and StreamSets
- Hands-on experience with Cloudera Manager, Hue and Beeline tools
Environment: Informatica BDM 10.1.1, StreamSets 2.7.2, Teradata, Sybase, Cloudera Hadoop ecosystem tools - HDFS, Hive, HBase, Sqoop, Kafka, Spark, Tableau
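
A minimal sketch of the real-time ingestion pattern above, using Spark Structured Streaming to land a Kafka topic in HDFS as Parquet; broker, topic and paths are hypothetical, and the production pipelines were built in StreamSets Data Collector and Informatica BDM.

# Requires the spark-sql-kafka package on the Spark classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka_ingest_demo").getOrCreate()

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
       .option("subscribe", "ingest.events")                # hypothetical topic
       .option("startingOffsets", "latest")
       .load())

# Kafka delivers key/value as binary; cast before landing.
parsed = raw.selectExpr("CAST(key AS STRING) AS k",
                        "CAST(value AS STRING) AS v",
                        "timestamp")

query = (parsed.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/landing/events")       # hypothetical path
         .option("checkpointLocation", "hdfs:///chk/events")  # needed for recovery
         .start())
query.awaitTermination()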
Confidential, Centreville, VA
Sr. Architect - Big Data
Responsibilities:
- Migrated on-premise EDW data from ParAccel Matrix to the AWS Redshift Cloud DW platform using Informatica PowerCenter and Amazon native APIs (a load sketch follows this section)
- Migrated on-premise data from Oracle to AWS Redshift using Informatica PowerCenter and Amazon native APIs
- Designed and developed ETL code using Informatica and Amazon native APIs
- Built data encryption routines and embedded them in the ETL code to encrypt sensitive data
- Set up job schedules in Control-M to load dimensions and facts in Amazon Redshift
- Performed end-to-end ETL testing for successful production deployment and post-production support
- Created mappings and mapplets to integrate Carfax's data from varied sources such as Oracle, flat files and MySQL databases, and loaded it into the landing tables of the Informatica MDM Hub
- Defined the Trust and Validation rules and set up the match/merge rule sets to get the right master records
- Used the Hierarchies tool to configure entity base objects, entity types, relationship base objects, relationship types, profiles, and put and display packages, and used the entity types as subject areas in IDD
- Successfully implemented IDD using hierarchy configuration; created subject area groups, subject areas, subject area children, IDD display packages in the Hub, and search queries
- Deployed new MDM Hub for portals in conjunction with user interface on IDD application.
Environment: Informatica MDM Hub, Informatica PowerCenter, Amazon APIs, AWS Redshift, Control-M
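
A minimal sketch of the Redshift load pattern above, assuming a hypothetical bucket, cluster endpoint, table and IAM role: a file is staged in S3 with boto3, then pulled into the target table with a Redshift COPY issued over psycopg2.

import boto3
import psycopg2

# Stage the extract in S3 (placeholder bucket and key).
s3 = boto3.client("s3")
s3.upload_file("dim_customer.csv", "etl-staging-bucket", "redshift/dim_customer.csv")

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder endpoint
    port=5439, dbname="edw", user="etl_user", password="***")
with conn, conn.cursor() as cur:
    # COPY loads directly from S3 into the target table in parallel.
    cur.execute("""
        COPY edw.dim_customer
        FROM 's3://etl-staging-bucket/redshift/dim_customer.csv'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
        CSV IGNOREHEADER 1;
    """)
conn.close()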
Confidential, Atlanta, GA
Sr. Advisor - Technical
Responsibilities:
- Migrated data from Teradata to the Hortonworks Hadoop Big Data platform using Syncsort and native Hadoop tools (an offload sketch follows this section)
- Designed and developed end-to-end data integration solutions to offload data into Hadoop
- Converted Informatica ETL processes to Syncsort DMX-h ETL processes
- Designed and developed real-time streaming data solutions to load reservation data using Kafka and Spark
- Architected, designed, developed and supported ETL processes in Informatica PowerCenter for data loads into the EDW
- Performed end-to-end ETL and data testing for successful production deployment
- Extensively involved in Data Migration efforts from Oracle database to Teradata Database
- Worked with business and technical teams to plan, design, and implement phases of ETL initiatives
Environment: Hortonworks Hadoop, Oracle 11g RAC, Teradata 6700, Informatica PowerCenter 9.6, Syncsort DMX, Identity Resolution, Initiate, Trillium
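
A minimal sketch of the offload pattern above, assuming hypothetical host, table and credentials: Spark pulls a Teradata table over JDBC in parallel and lands it as an ORC-backed Hive table on Hortonworks. The production migration itself used Syncsort DMX-h.

# Requires the Teradata JDBC driver (terajdbc) on the Spark classpath.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("td_offload_demo")
         .enableHiveSupport()
         .getOrCreate())

df = (spark.read.format("jdbc")
      .option("url", "jdbc:teradata://td-host/DATABASE=edw")  # placeholder host
      .option("dbtable", "edw.reservations")                  # placeholder table
      .option("user", "etl_user").option("password", "***")
      .option("partitionColumn", "reservation_id")            # parallel extract
      .option("lowerBound", "1").option("upperBound", "100000000")
      .option("numPartitions", "16")
      .load())

# Land the extract as an ORC-backed Hive table for downstream querying.
df.write.format("orc").mode("overwrite").saveAsTable("edw_offload.reservations")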
Confidential, Cincinnati, OH
Application Support Manager
Responsibilities:
- Involved in capacity planning, ensuring accurate and complete system and user testing by managing the documentation of test plans
- Worked on defect identification, resolution and production migration
- Involved in post-production activities to ensure a smooth transition to production support teams
Environment: Oracle 9i, Informatica PowerCenter 7.1.2, Solaris 10