ETL Pentaho Developer Resume
Bensalem, PA
SUMMARY
- Strong experience in the complete Software Development Life Cycle (SDLC), including business requirements gathering, system analysis and design, development, and implementation of data warehouses.
- Strong experience in installation of Pentaho Data Integration 7.0.1
- Extensively used Pentaho, Jasper, and BIRT client tools: Pentaho Report Designer, Pentaho Data Integration, Pentaho Schema Workbench, Pentaho Design Studio, Pentaho Kettle, Pentaho BI Server, Jaspersoft iReport, JasperReports Server, and BIRT Report Designer.
- Experience in using Oracle 10g/9i/8i, Netezza, Google BigQuery, PostgreSQL, Sybase, Teradata, MS SQL Server 2005, MS Access 2007, SQL, and PL/SQL.
- Strong experience in SQL Server 2008 R2/2008/2005/2000 with Business Intelligence using SQL Server Integration Services (SSIS).
- Excellent hands-on experience creating measure groups, scorecards, and Key Performance Indicators (KPIs), and defining actions, translations, and perspectives in SSAS.
- Worked with dimensional data warehouses in star and snowflake schemas and with slowly changing dimensions; created slowly growing target mappings and Type 1/2/3 dimension mappings.
- Proficient in transforming data from various sources (flat files, XML, Sybase, Oracle) into a data warehouse using ETL tools.
- Extensively worked on transformations such as Source Qualifier, Joiner, Filter, Router, Expression, Lookup, Aggregator, Sorter, Normalizer, Update Strategy, Sequence Generator and Stored Procedure transformations.
- Extensive experience in developing Informatica Mappings/Mapplets using various transformations for extraction, transformation, and loading of data from multiple sources into a data warehouse, and in creating workflows with worklets and tasks and scheduling the workflows.
- Extensively involved in resolving Informatica and database performance issues; expertise in error handling, debugging, and problem fixing in Informatica.
- Experience in logical and physical data modeling using ER/Studio. Expertise in using SQL*Loader to load data from external files into Oracle databases.
- Extensive experience in writing PL/SQL program units such as triggers, procedures, functions, and packages in UNIX and Windows environments (an illustrative sketch follows this summary).
- Exposure to onshore and offshore support activities.
- Excellent verbal and written communication skills, a clear understanding of business procedures, and the ability to work individually or as part of a team.
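A minimal sketch of the kind of PL/SQL program unit referenced above; every object, column, and parameter name here is hypothetical and not drawn from any of the projects below:

```sql
-- Hypothetical PL/SQL procedure: upsert one row into a customer dimension.
-- dim_customer and its columns are illustrative names, not a real schema.
CREATE OR REPLACE PROCEDURE upsert_dim_customer (
    p_customer_id   IN NUMBER,
    p_customer_name IN VARCHAR2
) AS
BEGIN
    MERGE INTO dim_customer d
    USING (SELECT p_customer_id   AS customer_id,
                  p_customer_name AS customer_name
             FROM dual) s
       ON (d.customer_id = s.customer_id)
     WHEN MATCHED THEN
       UPDATE SET d.customer_name = s.customer_name,
                  d.updated_at    = SYSDATE
     WHEN NOT MATCHED THEN
       INSERT (customer_id, customer_name, created_at)
       VALUES (s.customer_id, s.customer_name, SYSDATE);
END upsert_dim_customer;
/
```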
TECHNICAL SKILLS
ETL Tools: Pentaho Data Integration (Kettle) 7.0.1, Informatica Power Center 9.5.1/9.1/8.6/8.5/8.1/7.1/6.1, Informatica Data Analyzer 8.5/8.1, Informatica Power Exchange, IDQ, Data Cleansing, SSIS.
Data Modeling: Erwin 7.2/7.0, Toad, Oracle Designer, PL/SQL Developer 5.1.4.
Databases: PostgreSQL, AWS Cloud, Netezza, MariaDB, Teradata, Sybase, Oracle 11g/10g/9i/8i, MS SQL 7.0, SQL Server 2005/2000, Google BigQuery, Plx, MS Access.
Scheduling Tools: Crontab, Google Clarinet, Tidal, AutoSys, Control-M, Pentaho Scheduler, Rundeck.
Programming: C, C++, SQL, PL/SQL, T-SQL, SQL*Plus, HTML, UNIX Shell Scripting (AIX), Java, JavaScript.
Reporting Tools: Pentaho Report Designer, Tableau, Business Objects XI 3.1/XI R2, Crystal Reports 2008/XI, Cognos 8.4, SSRS, SSAS
Methodologies: Data Modeling-Logical/Physical/Dimensional, Star/Snowflake schemas, Fact and Dimension Tables
PROFESSIONAL EXPERIENCE
ETL Pentaho Developer
Confidential - Bensalem, PA
Responsibilities:
- Designed documents for creating ETL pipeline transformations and jobs.
- Created various user accounts in Pentaho Enterprise Console and assigned various roles for different users
- Created various mapping documents for source and target mapping
- Used various input step types in PDI for parallel data access.
- Created connections to various databases in PDI.
- Identified performance issues in existing sources, targets and mappings by analyzing the data flow, evaluating transformations and tuned accordingly for better performance.
- Used various types of inputs and outputs in Pentaho Kettle including Database Tables, Text Files, Excel files and CSV files.
- Automated data transfer processes and mail notifications using the FTP task and Send Mail task in transformations.
- Worked on various types of CDS Global internal applications using the Pentaho tool; validated that the environment settings and database connections are recognized within the Pentaho environment.
- Ensured that the required cron environment settings were included in the execution scripts for all ETL jobs and transformations.
- Good experience designing jobs and transformations that load data sequentially for initial and incremental loads (a representative pattern is sketched after this role's environment line).
- Good experience using various PDI steps to cleanse and load data per business needs.
- Good experience configuring the Data Integration server to run jobs in local, remote, and clustered modes.
Environment: Pentaho Data Integration (PDI), MariaDB, MySQL, in FOCUS Reporting Solutions, FTP, Slack, Shell, Crontab, Unix, SVN, Java, Apache Ant
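A minimal sketch of the incremental-load pattern mentioned above, roughly as it might appear in a PDI Table Input step; the table, columns, and the ${LAST_LOAD_TS} variable are hypothetical:

```sql
-- Hypothetical incremental extract: pull only rows changed since the last run.
-- ${LAST_LOAD_TS} is assumed to be a PDI variable set by the job, e.g. read
-- from a load-control table before this transformation runs.
SELECT
    order_id,
    customer_id,
    order_status,
    updated_at
FROM source_db.orders
WHERE updated_at > '${LAST_LOAD_TS}'
ORDER BY updated_at;
```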
ETL Informatica / Pentaho Developer
Confidential - Downingtown, PA
Responsibilities:
- Implemented various loads such as daily, weekly, quarterly, and on-demand loads using an incremental loading strategy and Change Data Capture (CDC) concepts.
- Involved in the analysis of source systems, business requirements, and identification of business rules; responsible for developing, supporting, and maintaining the ETL process using Informatica.
- Identified performance issues in existing sources, targets and mappings by analyzing the data flow, evaluating transformations and tuned accordingly for better performance.
- Made use of various Power Center Designer transformations such as Source Qualifier, Connected and Unconnected Lookup, Expression, Filter, Router, Sorter, Aggregator, Joiner, Rank, Sequence Generator, Union, and Update Strategy transformations while creating mapplets/mappings.
- Automated data transfer processes and mail notifications using the FTP task and Send Mail task in transformations.
- Used various types of inputs and outputs in Pentaho Kettle including Database Tables, MS Access, Text Files, Excel files and CSV files.
- Designed and implemented a Business Intelligence platform from scratch; integrated it with upstream systems using Hadoop, Pig, Hive, and other Big Data components for various functionalities. Made the platform more resilient and reduced the configuration required so that clients can be onboarded with minimal setup.
- Maintained a specific dashboard on the Google BigQuery database by cleaning up the metadata of completed courses, in addition to educating category owners on how to classify classes in gLearn.
- Worked on the conversion of an internal learning system (gLearn) from a SQL database to a non-relational database, specifically Google BigQuery.
- Worked with stakeholders by providing business requirements for the ETL layer in BigQuery, testing the ETL, and working with metrics and internal stakeholders to ensure acceptance.
- Created and ran SQL queries on a weekly/monthly basis, enabling internal teams to analyze course-related data (a representative query is sketched after this role's environment line).
- Supported QA and various cross-functional users in all their data needs; experience in Google Cloud Platform services such as APIs & Services, Storage (Classic/Application), CloudWatch, and IAM.
- Scheduled jobs using the Clarinet tool to run all Pentaho jobs and transformations nightly, weekly, and monthly.
- Identify, document and communicate BI and ETL best practices and industry accepted development methodologies and techniques.
- Troubleshoot BI tool problems and provide technical support as needed. Perform other tasks as assigned.
- Worked very closely with Project Manager to understand the requirement of reporting solutions to be built.
- Good experience configuring the Data Integration server to run jobs in local, remote, and clustered modes.
- Good experience designing advanced reports, analysis reports, and dashboards per client requirements.
Environment: Informatica Power Center (Power Center Repository Manager, Designer, Workflow Manager, Workflow Monitor, Power Exchange, Cloud), Pentaho Data Integration, Pentaho Report Designer, Google BigQuery, GCP, Plx Visualization Tool, Oracle, Sybase, Clarinet, Cider, Critique, GUTS, Shell, Unix, Git, Tableau Server, JavaScript.
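An illustrative sketch of the kind of weekly course-completion query described in this role; the project, dataset, table, and column names are hypothetical and not taken from the actual gLearn schema:

```sql
-- Hypothetical BigQuery query: weekly count of distinct learners completing
-- courses per category over the last 90 days. All names are illustrative.
SELECT
  course_category,
  FORMAT_DATE('%Y-%W', completion_date) AS completion_week,
  COUNT(DISTINCT user_id)               AS learners_completed
FROM `my-project.learning.course_completions`
WHERE completion_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
GROUP BY course_category, completion_week
ORDER BY completion_week DESC, learners_completed DESC;
```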
ETL Developer
Confidential - Denver, CO
Responsibilities:
- Hands-on experience with Pentaho Data Integration 7.1, creating jobs and transformations and promoting them hierarchically from development to QA and UAT servers.
- Identified and analyzed data discrepancies and data quality issues and worked to ensure data consistency and integrity.
- Created ETL transformations and jobs using Pentaho Data Integration Designer (Kettle/Spoon) and scheduled them using cron jobs.
- Used JavaScript, Regex and Java to build custom filters and steps which would satisfy the business requirements of the end user
- Created and deployed Pentaho jobs to all environments.
- Working on code changes to the existing Pentaho ETL tool and deploying them per client requirements.
- Working on a proof-of-concept (POC) project to migrate all data into the Snowflake cloud data warehouse.
- Working with S3 buckets and Snowpipe to pull data into Snowflake (a representative setup is sketched after this role's environment line).
- Validating and performance-testing the ETL tools Matillion and Fivetran.
- Created various user accounts in Pentaho Enterprise Console and assigned various roles for different users
- Created various mapping documents for source and target mapping
- Used various input step types in PDI for parallel data access.
- Created connections to various databases in PDI.
- Identified performance issues in existing sources, targets and mappings by analyzing the data flow, evaluating transformations and tuned accordingly for better performance.
Environment: Pentaho Data Integration (PDI), PostgreSQL, MicroStrategy, Flat Files, JIRA, MySQL, FTP, Slack, Shell, Crontab, Unix, Bitbucket, SnowSQL, Snowflake, S3, IAM.
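An illustrative sketch of the S3-to-Snowflake Snowpipe flow described in this role; the database, schema, stage, pipe, table, bucket, and storage-integration names are all hypothetical:

```sql
-- Hypothetical Snowflake objects for continuous loading from S3 via Snowpipe.
-- Assumes a storage integration (s3_etl_integration) already grants Snowflake
-- read access to the bucket; all object names are illustrative.
CREATE OR REPLACE STAGE etl_poc.raw.s3_orders_stage
  URL = 's3://example-etl-poc-bucket/orders/'
  STORAGE_INTEGRATION = s3_etl_integration
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

CREATE OR REPLACE TABLE etl_poc.raw.orders (
  order_id    NUMBER,
  customer_id NUMBER,
  order_date  DATE,
  amount      NUMBER(12,2)
);

-- AUTO_INGEST lets S3 event notifications trigger the pipe as new files land.
CREATE OR REPLACE PIPE etl_poc.raw.orders_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO etl_poc.raw.orders
  FROM @etl_poc.raw.s3_orders_stage;
```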