Sr. Staff ETL and Data Warehouse Resume
Newark, CA
SUMMARY:
- Over 10 years of experience in the IT industry in analysis, design, and development in client-server environments, with a focus on data warehousing applications using ETL and BI tools such as Talend, Informatica, and Tableau with Oracle and SQL Server.
- Worked on data warehouse and business intelligence projects alongside Informatica and Talend (ETL) teams.
- Love to deliver compelling stories through data and to collaborate and connect with customers, stakeholders, and other data scientists.
- Excellent knowledge of the data warehouse development life cycle, requirements gathering and analysis, dimensional modeling, metadata management, and implementation of star and snowflake schemas, incremental loads/change data capture, and slowly changing dimensions (a minimal SCD sketch follows this summary).
- Worked on multi-clustered, highly available environments and set up sustainable big data solutions using Hadoop, Greenplum, and Postgres.
- Set up cloud-based systems on Amazon AWS for processing massive volumes of data on Hadoop and for visualization.
- Involved in designing high-performance, fault-tolerant ETL/ELT processes, data analysis, data modeling, performance tuning, and version control.
- Expert in performance tuning methods, partitioning, and pushdown optimization for large-volume data loads.
- Provided leadership and best practices for installation, administration, backup and recovery, unit testing, release management, and design of features and functionality on DV and BI projects.
- Experience integrating various data sources with multiple relational databases such as Postgres, Oracle, SFDC, Teradata, SQL Server, MS Access, and DB2, as well as XML sources.
- Maintained Talend code, maps, jobs, and parameters; created and managed users; set up LDAP environments; and troubleshot and resolved issues.
- Created appropriate indexes, used optimizer hints, rebuilt indexes, and used explain plans for SQL tuning.
- Worked with heterogeneous data sources such as Postgres, Oracle, SQL Server 2005, Essbase, Sybase, flat files, XML files, and DB2.
- Strong experience with client interaction and with understanding business applications, business data flows, and data relationships.
- Good communication skills with strong ability to interact with end-users, customers, and team members.
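To illustrate the incremental load and slowly changing dimension work mentioned above, here is a minimal SCD Type 2 sketch in plain SQL. The tables and columns (dim_customer, stg_customer, segment) are hypothetical, and the change-detection logic is a simplified example rather than any specific project's implementation.

```sql
-- Minimal SCD Type 2 sketch (hypothetical dim_customer / stg_customer tables).
-- Step 1: expire the current dimension row when the incoming attributes differ.
UPDATE dim_customer d
SET    end_date   = CURRENT_DATE,
       is_current = 'N'
WHERE  d.is_current = 'Y'
  AND  EXISTS (
         SELECT 1
         FROM   stg_customer s
         WHERE  s.customer_id = d.customer_id
           AND  (s.customer_name <> d.customer_name OR s.segment <> d.segment)
       );

-- Step 2: insert a new current version for new or changed customers.
INSERT INTO dim_customer (customer_id, customer_name, segment, start_date, end_date, is_current)
SELECT s.customer_id, s.customer_name, s.segment, CURRENT_DATE, DATE '9999-12-31', 'Y'
FROM   stg_customer s
LEFT JOIN dim_customer d
       ON d.customer_id = s.customer_id AND d.is_current = 'Y'
WHERE  d.customer_id IS NULL
   OR  s.customer_name <> d.customer_name
   OR  s.segment <> d.segment;
```

In practice this expire-and-insert pattern runs once per batch, with change detection more often driven by CDC flags or row checksums than by explicit column comparisons.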
TECHNICAL SKILLS:
Big Data & Databases: Postgres, Greenplum, Apache Hadoop, Cloudera, Hortonworks, Redshift.
ETL Tools: Talend, Informatica Power Center, Pentaho Data Integration (Kettle).
Hadoop Sub-Projects: Hive, Pig, Spark, Flume, ZooKeeper, Sqoop.
Linux Flavors: Ubuntu 12.04.2 LTS, CentOS 6.3, Red Hat Linux
Monitoring Tools: Cloudera Manager, Apache Hadoop Web UI, Ganglia, New Relic
Database Tools: T-SQL, phpPgAdmin, PgStudio, Toad for Cloud DB, Teradata SQL.
Languages: SQL, UNIX Shell Script (Bash, ksh), Python.
Cloud Platforms: Predix, OpenStack, AWS, S3.
Visualization Tools: Tableau, Domo, QlikView, Pentaho DV.
PROFESSIONAL EXPERIENCE:
Confidential
Sr. Staff ETL and Data Warehouse
Responsibilities:
- Worked with different business units (BUs) to accommodate their data warehouse and ETL needs using Talend.
- Processed, cleansed, and verified the integrity of data using Talend and SQL.
- Created data pipelines from various sources, keeping performance in mind for downstream visualizations.
- Administered ETL tools and defined data retention policies.
- Designed and created the data model for the data warehouse and the ETL architecture to populate its tables.
- Worked on real-time processing with Spark and other supporting technologies.
- Identified and eliminated bottlenecks in various programs and processes handling enterprise-wide, high-volume data.
- Responsible for keeping various projects on track on a sprint basis while working with constantly changing requirements.
- Developed best-practice methodology and documentation to support different aspects of the project.
Environment: Predix, Talend, Tableau, AWS, ServiceNow, Postgres, MySQL, Python.
Confidential
Lead Data Engineer, ETL
Responsibilities:
- Gathered and analyzed requirements after meetings with business users and documented meeting outcomes.
- Designed and implemented the ETL data model and created staging, source, and target tables in the Oracle database.
- Troubleshot query build issues and errors.
- Worked with the Talend ETL tool, developing and scheduling jobs in Talend Integration Suite. Extensively used ETL concepts to load data from Postgres and flat files into Postgres tables.
- Extensively worked with tools such as phpPgAdmin, Toad, and SQL Developer to test and implement the ETL process.
- Modified reports and Talend ETL jobs based on feedback from QA testers and users in the development and staging environments.
- Interfaced with data miners and analysts to extract, transform, and load data from a wide range of heterogeneous data sources using SQL and Talend.
- Worked with legacy files and structured, semi-structured, and unstructured data sets, determining key takeaways or action items and communicating them visually.
- Connected the ETL layer to Hadoop HDFS to pull and push data for the data science team.
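A common hand-off pattern for this kind of HDFS exchange is a Hive external table over the files the ETL job lands; the sketch below is only an illustration with a hypothetical path and schema, not the actual project layout.

```sql
-- Hypothetical HiveQL sketch: expose ETL output landed on HDFS to the data
-- science team as an external table (path and columns are illustrative).
CREATE EXTERNAL TABLE IF NOT EXISTS etl_out_orders (
  order_id     BIGINT,
  customer_id  BIGINT,
  order_date   STRING,
  amount       DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/etl_out/orders';
```

The downstream team can then query the landed files directly in HiveQL without maintaining a separate copy of the data.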
Environment: Talend, Postgres, MySQL, Oracle, HFS (Hitachi File System), Hadoop, Shell Script, Hive, Pig, Flume.
Confidential
Sr. Talend Consultant
Responsibilities:
- Designed, developed, unit tested, and supported ETL mappings and scripts for data marts using Talend.
- Designed and created the data model for the data warehouse and the ETL architecture to populate its tables.
- Worked and negotiated with different source vendors/managers within VMware to access their data sets and implemented ETL processes to continuously obtain data with defined parameters.
- Created a multi-year roadmap for Talend to ingest data into Hadoop HDFS and create visualizations in Tableau.
- Created visualization dashboards for more than a dozen online reports, helping clients identify opportunities to optimize products, services, and contracts.
- Worked closely with the UX group to upgrade and enhance the platform and user experience by creating new interfaces and infographics depicting complex data sets.
- Eliminated bottlenecks in various programs and Talend jobs that processed enterprise-wide, high-volume data.
- Worked on different admin activities and with the infrastructure team, including security implementations.
Environment: Talend, Oracle, Pivotal Hadoop, ESXi, Java, Cassandra, Shell Script, Hive, HBase, Pig, Flume.
Confidential
Sr. Consultant
Responsibilities:
- Evaluated different ways to visualize data and stored it in the best possible way to be sustainable in the given environment.
- Built a platform for an in-memory reporting system reporting on HDFS and its related sub-projects.
- Worked on the initial implementation of Informatica Power Center 9, shell scripts, and other tools.
- Implemented file management systems for XML and unstructured Excel files.
- Extracted master and detail record sets from the XML source and established one-to-many relationships in the database using Informatica while populating it.
- Contributed to the technical documentation for dimension and fact tables.
- Wrote high-level test plans and scripts and tested the Informatica builds and scripts.
- Created, updated, and maintained project documents, including business requirements, functional and non-functional requirements, functional design, big data documentation, etc.
- Created different metrics to analyze data for business users, such as the Service Attach and Multiyear metrics.
- Served as the point of contact for all data-related and big data implementations.
Environment: Informatica PC, OBIEE, HDFS, Apache Hadoop, CDH3, Shell Script, Hive, HBase, Pig, Flume, UC4.
Confidential, Newark, CA
Lead Consultant, ETL and Hadoop Implementation
Responsibilities:
- Started cloud-based services with AWS and HDFS; worked on installing Hadoop and its components (HDFS, MapReduce, Pig, Hive) and NoSQL databases such as MongoDB.
- Performed data analysis and worked to understand business requirements.
- Wrote MapReduce jobs and processed billions of records for analysis.
- Worked on large-scale Hadoop environment build and support, including design, performance tuning, and monitoring.
- Implemented file validation and file management systems for Informatica processing.
- Managed and monitored the Hadoop distribution using Ganglia and Puppet ED.
- Created, updated, and maintained project documents, including business requirements, functional and non-functional requirements, functional design, data mapping, etc.
- Identified opportunities for process optimization, process redesign, and development of new processes.
Environment: Hadoop, CDH, Map Reduce, Pig, Hive, HP Vertica Database, Squirrel SQL.
Confidential, San Ramon, CA
Sr. Staff Software Analyst, ETL and Data Visualization
Responsibilities:
- Designed and developed ETL routines using Informatica Power Center, managing data flow into multiple targets.
- Worked on information gathering from different sources within PGE for implementing the ETO project.
- Participated in BI and database product evaluations, POCs, and business decisions.
- Analyzed different sources, such as XML, database, and Excel-based legacy systems, to extract and aggregate data into the new system; gained experience with Agile methodology.
- Worked as Informatica administrator and performed admin activities, e.g., installation, patching, and environment creation.
- Worked on the initial implementation of Informatica Power Center 9, shell scripts, and other tools.
- Implemented file management systems for XML and unstructured Excel files.
- Extracted master and detail record sets from the XML source and established one-to-many relationships in the database using Informatica while populating it (see the master-detail sketch after this list).
- Contributed to the technical documentation for dimension and fact tables.
- Wrote high-level test plans and scripts and tested the Informatica builds and scripts.
- Worked on performance improvement and tuned mappings, sessions, and workflows for better performance.
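A minimal sketch of the master-detail structure referenced above, assuming hypothetical order tables: the XML header maps to a master table and the repeating child elements map to a detail table whose foreign key enforces the one-to-many relationship.

```sql
-- Hypothetical master/detail sketch: XML header rows land in a master table,
-- repeating line items land in a detail table keyed back to the master.
CREATE TABLE order_master (
  order_id    NUMBER PRIMARY KEY,
  customer_id NUMBER NOT NULL,
  order_date  DATE   NOT NULL
);

CREATE TABLE order_detail (
  order_id  NUMBER       NOT NULL REFERENCES order_master (order_id),
  line_no   NUMBER       NOT NULL,
  item_code VARCHAR2(30) NOT NULL,
  quantity  NUMBER       NOT NULL,
  CONSTRAINT pk_order_detail PRIMARY KEY (order_id, line_no)
);
```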
Environment: Oracle 10g, PL/SQL, Informatica Power Center 9.0.1, Business Objects reports, XML source, Toad, Linux
Confidential, Miami
Hadoop Consultant
Responsibilities:
- Implemented an ECTL process to move data from the existing legacy system.
- Identified opportunities for process optimization, process redesign, and development of new processes.
- Led the BI team, designed the data model for Sales, and implemented shell scripts.
- Evaluated tools and worked on a POC for a big data implementation and Hadoop environment setup.
- Worked on MapReduce jobs, Hive, HBase, Pig, and NoSQL (MongoDB).
- Designed the star schema with its physical and logical models, as well as the parameter and control tables (see the star schema sketch after this list).
- Responsible for the design, development, testing, and documentation of the Informatica mappings, transformations, and workflows based on LIME standards.
- Populated the data warehouse in Oracle through Informatica PC 8.6; participated in end-to-end development and was involved in the decision-making process, meetings, and conference calls.
- Designed mappings for the staging area and was responsible for developing and analyzing requirements and working out solutions.
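A minimal star schema sketch along the lines described above, with hypothetical sales tables; the real model, surrogate keys, and control-table columns would differ.

```sql
-- Hypothetical star schema sketch: one fact table keyed to two dimensions,
-- plus a simple control table for load bookkeeping.
CREATE TABLE dim_date (
  date_key    NUMBER      PRIMARY KEY,
  calendar_dt DATE        NOT NULL,
  fiscal_qtr  VARCHAR2(6) NOT NULL
);

CREATE TABLE dim_product (
  product_key  NUMBER       PRIMARY KEY,
  product_code VARCHAR2(30) NOT NULL,
  product_name VARCHAR2(100)
);

CREATE TABLE fact_sales (
  date_key    NUMBER       NOT NULL REFERENCES dim_date (date_key),
  product_key NUMBER       NOT NULL REFERENCES dim_product (product_key),
  units_sold  NUMBER       NOT NULL,
  sales_amt   NUMBER(12,2) NOT NULL
);

CREATE TABLE etl_control (
  load_id     NUMBER       PRIMARY KEY,
  table_name  VARCHAR2(30) NOT NULL,
  last_run_dt DATE         NOT NULL
);
```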
Environment: Oracle 10g, PL/SQL, Informatica Power Center 8.6.0, Hadoop, HBase, Hive, Pig, NoSQL DBs
Confidential
Lead ETL Consultant
Responsibilities:
- Designed, developed, and implemented an ECTL process to move existing tables in Oracle to the DW. Performed data-oriented tasks on master data, especially patient and customer data, such as standardizing, cleansing, merging, and de-duping rules, along with UAT in each state.
- Responsible for the design, development, testing, and documentation of the Informatica mappings, transformations, and workflows based on Gene standards.
- Identified opportunities for process optimization, process redesign, and development of new processes.
- Initiated, defined, managed the implementation of, and enforced data QA processes; interacted with other teams, including the data quality team. Populated the data warehouse in Oracle through Informatica PC 8.1 and performed performance tuning in Informatica and Oracle SQL.
- Used Informatica to populate Oracle tables from a flat-file-based BI system.
- Designed mappings for the Landing, CADS, and CDM areas and was responsible for developing and analyzing requirements and working out solutions. Tuned SQL queries, mappings, and PL/SQL blocks.
- Performed incremental loads of dimension tables and worked in a highly normalized schema. Used stage, work, and DW table concepts to load data and applied star schema concepts. Used Informatica to schedule workflows through Workflow Manager.
Environment: Oracle 9i, PL/SQL, Informatica Power Center 8.1.6, UNIX Shell Script, Business Objects.
Confidential, Costa Mesa, CA
Team Lead
Responsibilities:
- Responsible for preparing the BRD and technical specifications from the business requirements; participated in developing the ETL environment, processes, programs, and scripts to acquire data from source systems and populate target systems feeding the data warehouse.
- Analyzed the business and documented it for development. Created tables and used Informatica to populate Oracle tables from the COBOL GDG file-based EBS system.
- Mentored a team of 3 Informatica developers and prepared them for development.
- Worked with different OBIEE tools such as the Admin Tool, Answers, Dashboards, and Pivot Tables.
- Created the Physical and Logical layers and worked extensively in the different OBIEE layers along with reports. Designed the star schema model. Extracted and loaded DB2 tables for a customer.
- Wrote Teradata BTEQ scripts, FastLoad, MultiLoad, etc., to support the project.
- Performed administrative tasks as Informatica administrator and participated in data-oriented tasks on master data projects, especially members/payments, such as standardizing, cleansing, merging, and de-duping rules, along with UAT in each state.
- Responsible for the design, development, testing, and documentation of the Informatica mappings, reports, and workflows based on AAA standards. Implemented DAC in this project for scheduling.
Environment: Oracle R12, PL/SQL, Informatica Power Center 8.6, Oracle EBS, OBIEE, Answers, Dashboard, DAC, DB2 Source and target, Cobol Copy Book.
Confidential, San Francisco, CA
Sr. ETL Consultant
Responsibilities:
- Provided technical guidance to the programming team on overall methodology, practices, and procedures for support and development to ensure understanding of standards and of the profiling and cleansing process and methodology.
- Interacted with the team to facilitate development, provided data quality reports, performed software migration activities, and accumulated data for the B2B solution.
- Worked in Informatica 8.x to create and deploy the business rules that populate data into tables.
- Created snapshots, summary tables, and views in the database to reduce system overhead and provide the best-quality data for reporting (see the summary-table sketch after this list); worked on cash management and the configuration of DAC.
- Created presentation layer tables by dragging the appropriate BMM layer logical columns in OBIEE.
- Provided overall direction and guidance for ETL development and support of the Prescription Solutions data mart. Applied Velocity best practices to the project work.
- Created and used reusable mapplets and transformations in Informatica Power Center.
- Responsible for the design, development, testing, and documentation of the Informatica mappings.
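As an illustration of the summary tables mentioned above, here is a hedged Oracle-style materialized view sketch; the claim table and columns are hypothetical placeholders, and the actual summary definitions were project specific.

```sql
-- Hypothetical summary-table sketch: pre-aggregate prescription/claim detail
-- into a materialized view so reports avoid scanning the base table.
CREATE MATERIALIZED VIEW mv_rx_monthly_summary
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT plan_id,
       TRUNC(fill_date, 'MM') AS fill_month,
       COUNT(*)               AS rx_count,
       SUM(paid_amount)       AS total_paid
FROM   rx_claims
GROUP BY plan_id, TRUNC(fill_date, 'MM');
```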
Environment: Windows, UNIX, Informatica 8.x, Oracle 10g, SQL, UNIX Shell Script, OBIEE, Answers, Admin tool.
Confidential, San Jose, CA
Senior Data Warehouse ETL Developer
Responsibilities:
- Responsible for preparing the technical specifications from the business requirements.
- Analyzed requirements and worked out solutions. Developed and maintained detailed project documentation.
- Responsible for the design, development, testing, and documentation of the Informatica mappings, PL/SQL, transformations, and jobs based on PayPal standards.
- Initiated, defined, managed the implementation of, and enforced DW data QA processes; interacted with the QA team and the data quality team.
- Identified opportunities for process optimization, process redesign, and development of new processes.
- Anticipated and resolved data integration issues across applications and analyzed data sources to highlight data quality issues. Performed performance analysis of Teradata scripts.
- Migrated SAS code to Teradata BTEQ scripts to perform scoring, taking into account various parameters such as login details and transaction dollar amounts. Worked with marketing data for various reports.
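A hedged sketch of what such a set-based scoring query can look like in Teradata-style SQL; the account tables, 30-day aggregates, and weights below are purely illustrative placeholders, not the migrated SAS logic.

```sql
-- Hypothetical scoring sketch: combine login activity and transaction amounts
-- into a simple weighted score per account (weights are illustrative only).
SELECT a.account_id,
       (0.4 * COALESCE(l.login_cnt_30d, 0)
      + 0.6 * COALESCE(t.txn_amt_30d, 0) / 100) AS risk_score
FROM   accounts a
LEFT JOIN login_summary       l ON l.account_id = a.account_id
LEFT JOIN transaction_summary t ON t.account_id = a.account_id;
```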
Environment: Oracle 9i, Informatica PC 8.1, PL/SQL, Teradata V2R6, Teradata SQL Assistant, FastLoad, BTEQ scripts, SAS code, ClearCase, Java, Perl scripts, XML source.