ETL/Big Data Developer Resume
Charlotte
SUMMARY:
- Over ten years of experience in data management, ETL architecture, and the design and development of complex IT applications in large enterprises using Informatica PowerCenter, Hadoop/Big Data, SQL, and UNIX.
- Designed and customized data models for the data warehouse supporting data from multiple sources in real time. Involved in building the ETL architecture and source-to-target mappings to load data into the data warehouse. Created mapping documents to outline data flow from sources to targets.
- Seven years of experience in Informatica ETL, Oracle, and UNIX environments, including design, development, testing, and scheduling.
- Strong working experience in data warehousing concepts and Star Schema and Snowflake Schema methodologies. Experience in data modeling and dimensional modeling of large databases. Working experience with SQL, PL/SQL, Informatica MDM, Business Objects, and Hyperion Essbase.
- Worked extensively with complex Informatica mappings using transformations such as Expression, Filter, Joiner, Router, Union, Lookup, Stored Procedure, Aggregator, Update Strategy, Normalizer, Sorter, HTTP, XML, SQL, and Java.
- Good experience in moving code between environments, understanding production issues, and fixing them on time.
- Good experience in writing UNIX/Linux shell scripts. Good working experience with scheduling tools such as Autosys, UC4, and Tidal.
- Good working experience with Teradata, Oracle, and Netezza databases.
- Experience using Teradata SQL Assistant, Teradata Administrator, and data load/export utilities such as BTEQ, FastLoad, MultiLoad, and FastExport, with exposure to TPump, on UNIX/Windows environments, and in running batch processes for Teradata.
- Experienced in data cleansing, data profiling, and data analysis, along with troubleshooting Teradata scripts, fixing bugs, and addressing production issues.
- Experience in developing data integration programs involving ETL tools including Teradata, Informatica, shell programming, and ETL scheduling tools in UNIX environments.
- One and a half years of experience in a Hadoop/Big Data development environment. Experience in big data development using open-source tools including Hadoop Core, Sqoop, Pig, Hive, MapReduce, Scala, and Spark.
- Experience in creating Hive internal/external tables, creating Oozie workflows, and pulling structured, semi-structured, unstructured, and streaming data from different sources and loading it. Good amount of experience in a core Java development environment.
- Good experience in identifying performance bottlenecks and in performance tuning of sources, targets, mappings, sessions, and system resources.
TECHNICAL SKILLS:
ETL Tools: Informatica 10.1, Informatica MDM
OLAP/Reporting Tools: Business Objects, Hyperion Essbase
Database Tools: Hive, Oracle, Toad, SQL Server, Teradata
Hadoop Ecosystem: HDFS, YARN, Hive, HBase, Spark, Kafka, Flume, Sqoop, Pig, Oozie, Zookeeper, MapReduce, Mahout, R
Programming Languages: UNIX shell scripting, Java, Python
Operating Systems: Windows 2012, AIX, Solaris, Linux, UNIX
Core Skills: Requirements gathering, Design and redesign techniques, Database Modeling, Development, Testing and Performance tuning
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte
ETL/Big Data Developer
Responsibilities:
- Analyzed existing databases and data flows. Developed data flow diagrams.
- Involved in the lift and shift of existing mappings and workflows from the Confidential environment to the Confidential environment.
- Involved in creating new mappings per business requirements and moving the code across environments.
- Developed Autosys scheduler scripts to automate the archival process and schedule the jobs.
- Created Autosys jobs for new workflows and updated the JILs when modifications were required (a sample JIL sketch follows this list).
- Created mapplets, reusable transformations, and parameter files containing mapping parameters and variables.
- Performance tuning of mappings, transformations and sessions to optimize session performance.
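For illustration, a minimal sketch of registering an Autosys job through JIL from a shell script, as described above; the job name, machine, owner, schedule, and paths are hypothetical placeholders rather than actual project values.

```sh
#!/bin/sh
# Hypothetical sketch: register an Autosys job that runs a nightly archival script.
# Job name, machine, owner, schedule, and file paths are placeholders only.
jil << 'EOF'
insert_job: wf_daily_archive   job_type: CMD
command: /apps/etl/scripts/run_archive.sh
machine: etlprod01
owner: etladmin
start_times: "02:00"
days_of_week: mo,tu,we,th,fr
std_out_file: /apps/etl/logs/wf_daily_archive.out
std_err_file: /apps/etl/logs/wf_daily_archive.err
alarm_if_fail: 1
EOF
```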
Confidential, TX
ETL/Hadoop Consultant
Responsibilities:
- Developed complex ETL mappings to extract data from flat files and the HANA database and load it into the EDW and the Teradata database.
- Understood the business requirements and converted them into technical specifications.
- Worked on migrating Informatica mappings between environments for development, testing, and production implementation. The mappings involved extensive use of transformations such as Aggregator, Filter, Joiner, Expression, Lookup, Update Strategy, and Sequence Generator. Used the Debugger to test and fix mappings.
- Created complex mappings using Aggregator, Lookup, Joiner, Update Strategy, Router, Filter, Expression, Stored Procedure, Union, HTTP, SQL, Normalizer, and XML Source Qualifier transformations.
- Created mapplets, reusable transformations, and parameter files containing mapping parameters and variables.
- Performance tuned the workflows by identifying the bottlenecks in targets, sources, mappings, sessions and workflows and eliminated them.
- Responsible for implementing the Informatica CDC logic to process the delta data.
- Responsible for identifying bottlenecks and performance tuning long-running jobs; optimized performance at the source, target, mapping, and session levels.
- Implemented Concurrent workflow process to run workflows in parallel
- Loaded data from various data sources and legacy systems into the Teradata production and development warehouses using BTEQ, FastExport, MultiLoad, FastLoad, and Informatica (see the BTEQ sketch after this list).
- Developed validation rules and an error-messaging system to notify users of any unexpected data from the source.
- Worked extensively with connected Lookup transformations using a dynamic cache.
- Troubleshot production issues and resolved problems on time.
- Developed Autosys scheduler scripts to automate the archival process and schedule the jobs.
- Expertise in creating and testing mappings in the Hadoop environment.
- Expertise in creating Hive connections and Sqoop connections.
- Expertise in reading source data from HDFS and loading it into Hive tables (a sketch follows this project's environment listing below).
- Expertise in UNIX shell scripting, FTP, SFTP, and file management in various UNIX environments.
- Coordinating with remote teams for Production Support along with any Informatica enhancements.
- Involved in migrating the Informatica ETL application and database objects through various environments such as Development, Testing, UAT, and Production.
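For illustration, a minimal sketch of the shell-wrapped BTEQ load pattern referenced in the responsibilities above; the TDPID, logon, staging table, and flat-file names are assumed placeholders, not actual project values.

```sh
#!/bin/sh
# Hypothetical sketch of a BTEQ load step driven from a shell script.
# The TDPID, user, staging table, and input file are illustrative placeholders.
bteq << 'EOF'
.LOGON tdprod/etl_user,********
.IMPORT VARTEXT '|' FILE = /data/stage/orders_delta.dat
.QUIET ON
.REPEAT *
USING (order_id VARCHAR(20), order_dt VARCHAR(10), amount VARCHAR(18))
INSERT INTO edw_stg.orders_stg (order_id, order_dt, amount)
VALUES (:order_id, :order_dt, :amount);
.LOGOFF
.QUIT
EOF
```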
Environment: Informatica PowerCenter 10.1, Workflow Manager, Workflow Monitor, Informatica PowerConnect/PowerExchange, Data Analyzer, PL/SQL, SQL Developer, Teradata 13.10, Teradata SQL Assistant, UNIX, Oracle 11g, Erwin, SQL Server
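The HDFS-to-Hive loading listed in the responsibilities above can be sketched roughly as follows; the database, table schema, and HDFS paths are assumptions for illustration only.

```sh
#!/bin/sh
# Hypothetical sketch: stage a source file in HDFS and expose it as a Hive external table.
# Database, table, columns, and paths are illustrative assumptions.
hdfs dfs -mkdir -p /data/raw/orders
hdfs dfs -put /landing/orders.csv /data/raw/orders/

hive -e "
CREATE DATABASE IF NOT EXISTS edw_stage;
CREATE EXTERNAL TABLE IF NOT EXISTS edw_stage.orders_raw (
  order_id  STRING,
  order_dt  STRING,
  amount    DECIMAL(18,2)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/raw/orders';
"
```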
Confidential, NY
Informatica Lead/Developer
Responsibilities:
- Involved in Analysis, coding, testing, user acceptance testing, production implementation and system support for the Enterprise Data Warehouse Application.
- Understood the business requirements and converted them into technical specifications.
- Created complex mappings using Aggregator, Expression, Filter, Sequence Generator, Update Strategy, SQL, Union, Lookup, Joiner, XML Source Qualifier, and Unconnected Lookup transformations.
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts.
- Used various transformations such as Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer. Used Type 1 and Type 2 SCD mappings to update Slowly Changing Dimension tables (a simplified Type 2 sketch follows this list).
- Developed mappings to load into staging tables and then to Dimensions and Facts.
- Created Informatica mappings, wrote UNIX shell scripts, and modified PL/SQL scripts.
- Created Pre/Post Session SQL commands in sessions and mappings on the target instance.
- Responsible for Performance tuning at various levels during the development.
- Identified performance issues in existing sources, targets and mappings by analyzing the data flow, evaluating transformations and tuned accordingly for better performance.
- Provided support for ad hoc requests from application and business users during User Acceptance Testing.
- Migrated code to QA and PROD, including UNIX shell scripts, source files, and Informatica mapping and parameter files.
- Executed complete end-to-end runs and debugged issues.
- Created, modified, and debugged UNIX shell scripts.
- Responsible for moving code across Dev, QA, and Production environments and for validation testing.
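A simplified, hypothetical sketch of the Type 2 SCD pattern referenced above, driven from a shell script; the connect string, dimension and staging table names, and the tracked attribute are illustrative assumptions.

```sh
#!/bin/sh
# Hypothetical sketch of a Type 2 SCD load step: expire changed current rows,
# then insert new versions. Connect string, tables, and columns are placeholders.
sqlplus -s 'etl_user/********@EDWDB' << 'EOF'
-- Expire the current dimension row when a tracked attribute has changed
UPDATE dim_customer d
   SET d.current_flag = 'N',
       d.effective_end_dt = SYSDATE
 WHERE d.current_flag = 'Y'
   AND EXISTS (SELECT 1
                 FROM stg_customer s
                WHERE s.customer_id = d.customer_id
                  AND s.address <> d.address);

-- Insert a new current version for any staged customer without a current row
INSERT INTO dim_customer
       (customer_key, customer_id, address, effective_start_dt, effective_end_dt, current_flag)
SELECT dim_customer_seq.NEXTVAL, s.customer_id, s.address, SYSDATE, NULL, 'Y'
  FROM stg_customer s
 WHERE NOT EXISTS (SELECT 1
                     FROM dim_customer d
                    WHERE d.customer_id = s.customer_id
                      AND d.current_flag = 'Y');

COMMIT;
EXIT;
EOF
```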
Environment: Informatica PowerCenter 9.2, Oracle, Teradata, UNIX
Confidential, KS
Hadoop Data Analyst/ ETL Developer
Responsibilities:
- Analyzed existing databases and data flows. Developed data flow diagrams
- Analyzed existing data marts, removed duplicates & consolidated key data attributes
- Developed Sqoop import jobs for data migration from legacy platforms to big data platforms
- Worked with data stewards and created metadata repository for single source of truth
- Prepared migration strategy for data ingestion from legacy platforms to HDFS
- Developed data ingestion jobs using Pig, Sqoop, Hive, and UNIX shell in HDFS (a Sqoop import sketch follows this list).
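A minimal sketch of the kind of Sqoop import job described above; the JDBC URL, credentials file, source table, and Hive target are assumed placeholders rather than actual project values.

```sh
#!/bin/sh
# Hypothetical sketch of a Sqoop import from a legacy Oracle source into Hive.
# JDBC URL, credentials file, source table, and Hive database/table are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@//legacy-db:1521/ORCL \
  --username etl_user \
  --password-file /user/etl/.ora_pwd \
  --table CUSTOMERS \
  --split-by CUSTOMER_ID \
  --num-mappers 4 \
  --hive-import \
  --hive-database edw_stage \
  --hive-table customers_raw
```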
Environment: Hadoop, HDFS, Sqoop, Spark, Pig, Hive, Oracle, Cloudera, and R Programming
Confidential, Chicago, IL
Java Developer
Responsibilities:
- Worked as a developer on this project, setting up development environments and helping expand the Hadoop cluster.
- Coded using the Struts framework, JSP, and Java Servlets.
- Interacted in client meetings and checked test cases.
- Prepared the LLD and fixed bugs during the ST and UAT stages.
- Involved in resolving the Production Issues
Environment: Java, Servlets, JSP, Struts, Hibernate, SQL Server 2008, Jboss
Confidential, Evendale, OH
Informatica & MDM Developer
Responsibilities:
- Used Informatica Manager to manage the Informatica repository, define custom routines and transforms, import and export items between different Informatica systems, and exchange metadata with other data warehousing tools. Dealt extensively with data from Oracle 11g and SQL Server. Worked on extraction, transformation, and loading of data.
- Performance tuning of mappings, transformations and sessions to optimize session performance.
- Tuned the performance of Informatica sessions for large data files by increasing block size, data cache size, sequence buffer length, and the target-based commit interval. Involved in developing Python scripts.
Environment: Informatica MDM 9.1, Oracle 10g, Python 3.1
Confidential, Greensboro, NC
ETL Developer
Responsibilities:
- Worked with the business users to get the business rule modifications in development and testing phases and analyzed claims data through rigorous evaluation methodology.
- Wrote PL/SQL, stored procedures & triggers for implementing business rules and transformations.
- Worked on Informatica tools including Source Analyzer, Warehouse Designer, Mapping Designer, Transformation Developer, Repository Manager, and Server Manager.
- Contributed to Performance Tuning of the mappings/sessions/workflows
- Conducting and participating in the Peer review of the project deliverables.
- Documented the Informatica Mappings and workflow process, error handling of ETL procedures
- Used Business Objects for reporting. Interacted with Users for analyzing various Reports.
Environment: Informatica PowerCenter 8.1/8.6, Microsoft SQL Server 7, SQL, Oracle 10g, Teradata
Confidential, New York, NY
Informatica Developer
Responsibilities:
- Did extensive analysis on the business requirements and gathered information for the development of several small applications.
- Developed ETL mappings to extract data from Oracle and load it into the Netezza database.
- Performed error validation on the data moving from Oracle to the Netezza database.
- Tested the mappings and checked the quality of the deliverables.
- Conducting and participating in the Peer review of the project deliverables.
- Bug Analysis and fixing.
- Contributed to Performance Tuning of the mappings/sessions/workflows
Environment: Informatica PowerCenter 7.6, SQL Server 2005, Oracle 9i, TOAD, SQL*Plus, and Windows
Confidential
ETL Production Support Engineer
Responsibilities:
- Performed daily loads and handled production failure issues for the Alizes application.
- Worked on SQL stored procedures, functions and packages in Oracle.
- Checking for the Data Quality on a daily basis.
- Wrote test cases for ETL to compare the source and target database systems (a sample reconciliation check follows this list).
- Production Monitoring and Support.
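A small, hypothetical sketch of the kind of source-to-target reconciliation check mentioned above; the connect strings, schemas, table names, and alert address are illustrative only.

```sh
#!/bin/sh
# Hypothetical sketch: compare source and target row counts and alert on mismatch.
# Connect strings, schemas, tables, and the mail recipient are placeholders.
SRC_CNT=$(sqlplus -s 'etl_user/********@SRCDB' << 'EOF' | tr -d '[:space:]'
SET HEADING OFF FEEDBACK OFF PAGESIZE 0
SELECT COUNT(*) FROM src_schema.orders;
EXIT;
EOF
)
TGT_CNT=$(sqlplus -s 'etl_user/********@DWHDB' << 'EOF' | tr -d '[:space:]'
SET HEADING OFF FEEDBACK OFF PAGESIZE 0
SELECT COUNT(*) FROM dwh_schema.orders_fact;
EXIT;
EOF
)

if [ "$SRC_CNT" -ne "$TGT_CNT" ]; then
  echo "Row count mismatch: source=$SRC_CNT target=$TGT_CNT" \
    | mailx -s "ETL reconciliation alert" support@example.com
  exit 1
fi
echo "Counts match: $SRC_CNT rows"
```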
Environment: Informatica 8.1, Oracle 10g, SQL, UNIX, SQL Plus, Testing, MS Access, Windows XP.