Talend / Hadoop Integration Developer Resume
Brentwood, TN
PROFESSIONAL SUMMARY:
- Highly skilled ETL/BI developer with 9 years of experience in tools like Talend (6.x/5.x), DataStage 11.3/8.7 and Informatica, and in developing and administering ETL mappings.
- Good experience in all phases of software development life cycle (SDLC) including system design, development, integration, testing, deployment and delivery of applications.
- Extracted data using the Pentaho Data Integration (Kettle) designer from flat files and other RDBMS sources like SQL Server 2005/2008 and Oracle 10g into the staging area and populated it into the data warehouse.
- Used Kettle designer to design and create Pentaho Transformations and jobs.
- Created source and target connections in Talend Integration Cloud to pull files from the mainframe and place them on the ESB server.
- Experience in Big Data technologies like Hadoop/MapReduce, Pig, Hive and Sqoop.
- Experience in building data warehouses using AWS Redshift, S3, HBase, etc.
- Experienced in integrating various data sources like Oracle 11g/10g/9i, IBM DB2, MS SQL Server, MySQL, Snowflake, Teradata, Netezza, XML files and mainframe sources into the staging area and different target databases.
- Expertise in creating mappings in Talend using tMap, tJoin, tReplicate, tParallelize, tConvertType, tFlowToIterate, tSortRow, tFlowMeter, tLogCatcher, tRowGenerator, tNormalize, tDenormalize, tSetGlobalVar, tHashInput, tHashOutput, tJava, tJavaRow, tAggregateRow, tFilter, tGlobalMap, etc.
- Expertise in Informatica MDM Hub Match and Merge Rules, Batch Jobs and Batch Groups.
- Experience using Informatica IDQ for data quality checks, and Informatica MDM for filtering duplicate data, project deployment and metadata management.
- Created Snowflake schemas by normalizing the dimension tables as appropriate, and created a sub-dimension named Demographic as a subset of the Customer dimension.
- Experience in all stages of ETL: requirement gathering, designing and developing various mappings, unit testing, integration testing and regression testing.
- Extensive Experience in designing and developing complex mappings applying various transformations such as Expression, Aggregator, Lookup, Source qualifier, Update strategy, Filter, Router, Sequence generator, Rank, Stored procedure, Joiner and Sorter.
- Experienced in Data Analysis/Validation and Profiling based on the business and functional requirements of the Project.
- Good experience in installation of Talend and Informatica Power Exchange.
- Hands on experience in Pentaho Business Intelligence Server Studio and PHP.
- Hands on experience in developing and monitoring SSIS/SSRS Packages and outstanding knowledge of high availability SQL Server solutions, including replication.
- Proficient in the implementation of Data Cleanup procedures, Stored Procedures, Scripts and execution of Test plans for loading the data successfully into the various Target types (Relational and flat file).
- Experienced in writing stored procedures/SQL scripts with various Teradata utilities like MLOAD, FASTLOAD, TPUMP, FASTEXPORT.
- Good experience in developing jobs for OLTP & OLAP databases.
- Extensive experience in SQL scripting and shell scripting in Linux and Windows-based environments.
- Experience with data modeling tools: Erwin, Visio, Sybase PowerDesigner.
- Experience in working on Enterprise Job scheduling tools like Autosys.
TECHNICAL SKILLS:
Operating Systems: Windows 2008/2007/2005/NT/XP, UNIX, MS-DOS
ETL Tools: Talend 6.x/5.x, Informatica PowerCenter 9.x/8.x, Informatica MDM, Pentaho, SSIS
Databases: Oracle 12c/11g/10g, MS SQL Server 2012/2008/2005, MySQL, DB2 v8.1, Netezza, Teradata, Snowflake, HBase
Methodologies: Data Modeling - Logical/Physical, Dimensional Modeling - Star/Snowflake Schema
Programming Skills: C++, Eclipse, Shell Scripting (K-Shell, C-Shell), PL/SQL, Hadoop, Pig, Hive, Java, JavaScript, CSS
Cloud/Web Services: Microsoft Azure
Testing Tools: QTP, WinRunner, LoadRunner, Quality Center, Test Director, ClearTest, ClearCase
PROFESSIONAL EXPERIENCE:
Confidential, Brentwood, TN
Talend / Hadoop Integration Developer
Responsibilities:
- Participated in Requirement gathering, Business Analysis, User meetings and translating user inputs into ETL mapping documents.
- Worked closely with Business Analysts to review the business specifications of the project and to gather ETL requirements.
- Set up new users, projects and tasks across multiple TAC environments.
- Created complex mappings in Talend using tMap, tJoin, tXML, tReplicate, tParallelize, tJava, tJavaRow, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
- Created Joblets in Talend for processes reused across most jobs in a project, such as Start Job and Commit Job.
- Developed complex Talend ETL jobs to migrate the data from flat files to database.
- Pulled files from the mainframe into the Talend execution server using multiple FTP components.
- Developed Talend ESB services and deployed them to ESB servers on different instances.
- Worked on Talend Data Integration Cloud components like tFileInputPositional, tFileList, tActionInput, tActionOutput, tActionReject, tActionLog, tActionFailure, jCloudLog, tSplitRow, tContextLoad, tFileArchive, tFileDelete and tFixedFlowInput.
- Worked with Data mapping team to understand the source to target mapping rules.
- Created source and target connections in Talend Integration Cloud to pull files from the mainframe and place them on the ESB server.
- Published 250+ jobs to Talend Integration Cloud.
- Worked on dynamic schemas in Integration Actions, enabling web users to define new columns at runtime when they create Flows.
- Created test cases for Integration Actions.
- Published Integration Actions to Talend Integration Cloud for cloud users to build Flows.
- Prepared ETL mapping Documents for every mapping and Data Migration document for smooth transfer of project from development to testing environment and then to production environment.
- Performed Unit testing and System testing to validate data loads in the target.
Environment: Talend 6.3.1, Talend Integration Cloud, ESB services, JavaScript, SQL Server, Git, Eclipse, JUnit, XML files, flat files, mainframe files, SSIS, Microsoft Office, Web Services.
Confidential, Schaumburg, IL
Sr. Talend Developer
Responsibilities:
- Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse.
- Performed data manipulations using various Talend components like tMap, tJavaRow, tJava, tOracleRow, tOracleInput, tOracleOutput, tMSSQLInput and many more.
- Designed and customized data models for the Data warehouse, supporting data from multiple sources in real time.
- Designed ETL process using Talend Tool to load from Sources to Targets through data Transformations.
- Extensive experience with Pentaho Designer, Pentaho Kettle, Pentaho BI Server and the BIRT report designer.
- Developed advanced Oracle stored procedures and handled SQL performance tuning (a PL/SQL sketch follows this list).
- Involved in creating the mapping documents with the transformation logic for implementing a few enhancements to the existing system.
- Monitored and supported the Talend jobs scheduled through Talend Administration Center (TAC).
- Developed the Talend mappings using various transformations, sessions and workflows. Teradata was the target database; the sources were a combination of flat files, Oracle tables, Excel files and a Teradata database.
- Loaded data into Teradata target tables using Teradata utilities (FastLoad, MultiLoad and FastExport) and queried the target database using Teradata SQL and BTEQ for validation (a validation-query sketch follows this list).
- Used Talend to Extract, Transform and Load data into Netezza Data Warehouse from various sources like Oracle and flat files.
- Created connections to databases like SQL Server, Oracle and Netezza, as well as application connections.
- Created mapping documents to outline data flow from sources to targets.
- Prepared Talend job-level LLD documents and worked with the modeling team to understand the Big Data Hive table structures and physical design.
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts.
- Responsible for developing, supporting and maintaining the ETL (Extract, Transform and Load) processes using Talend.
- Maintained source definitions, transformation rules and target definitions using Informatica Repository Manager.
- Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings.
- Developed mapping parameters and variables to support SQL override.
- Developed Talend ESB services and deployed them to ESB servers on different instances.
- Created mapplets & reusable transformations to use them in different mappings.
- Developed mappings to load into staging tables and then to Dimensions and Facts.
- Developed Talend jobs to load data into Hive tables and HDFS files, and developed Talend jobs to integrate data from the Hive tables into the Teradata system.
- Worked on different tasks in workflows like sessions, event raise, event wait, decision, email, command, worklets, assignment, timer and scheduling of the workflow.
- Performed unit testing and code reviews, and moved code into UAT and PROD.
- Designed the Talend ETL flow to load the data into Hive tables and created the Talend jobs to load the data into Oracle and Hive tables.
- Migrated the code into the testing environment and supported the QA team and UAT users.
- Created detailed Unit Test Document with all possible Test cases/Scripts.
- Worked with high volumes of data and tracked performance of Talend job runs and sessions.
- Conducted code reviews of code developed by my teammates before moving it into QA.
- Experience in batch scripting on Windows, Windows 32-bit commands, quoting and escaping.
- Used Talend reusable components like routines, context variables and globalMap variables.
- Provided support to develop the entire warehouse architecture and plan the ETL process.
- Knowledge of Teradata utility scripts like FastLoad and MultiLoad to load data from various source systems to Teradata.
- Modified existing mappings for enhancements of new business requirements.
- Prepared migration document to move the mappings from development to testing and then to production repositories.
- Configured the Hive tables to load the profitability system in the Talend ETL repository and created the Hadoop connection for the HDFS cluster in the Talend ETL repository.
- Worked as a fully contributing team member under broad guidance, with independent planning and execution responsibilities.
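The PL/SQL below is an illustrative sketch only of the kind of staged-load procedure referenced above; the procedure, table and column names (prc_load_orders, stg_orders, dw_orders) are hypothetical and not taken from the engagement.

    -- Hypothetical Oracle PL/SQL load procedure: insert new staging rows into the
    -- warehouse table and report how many rows were loaded. Names are illustrative only.
    CREATE OR REPLACE PROCEDURE prc_load_orders (p_rows_loaded OUT NUMBER) AS
    BEGIN
      INSERT INTO dw_orders (order_id, customer_id, order_amt, load_dt)
      SELECT order_id, customer_id, order_amt, SYSDATE
      FROM   stg_orders s
      WHERE  NOT EXISTS (SELECT 1 FROM dw_orders d WHERE d.order_id = s.order_id);

      p_rows_loaded := SQL%ROWCOUNT;   -- rows inserted in this run
      COMMIT;
    EXCEPTION
      WHEN OTHERS THEN
        ROLLBACK;
        RAISE;
    END prc_load_orders;
    /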
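The query below is a minimal sketch of the kind of post-load validation run with Teradata SQL and BTEQ; the database and table names (stg_db.sales_stg, edw_db.sales_fact) and the load_dt column are assumptions for illustration.

    -- Hypothetical Teradata validation query run after a FastLoad/MultiLoad:
    -- compare staging and target row counts for the current load date.
    SELECT 'STG' AS src, COUNT(*) AS row_cnt
    FROM   stg_db.sales_stg
    WHERE  load_dt = CURRENT_DATE
    UNION ALL
    SELECT 'TGT', COUNT(*)
    FROM   edw_db.sales_fact
    WHERE  load_dt = CURRENT_DATE;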
Environment: Talend 6.3.1, Hive, Pig, Hadoop, Sqoop, PL/SQL, Oracle 12c/11g, Erwin, JSON, Autosys, SQL Server 2012, Teradata, Netezza, Sybase, SSIS, UNIX, Profiles, Workflow & Approval processes, Data Loader, Reports, Custom Objects, Custom Tabs, Data Management, Lead processes, Record types.
Confidential, Warren, MI
Sr. ETL Talend Developer
Responsibilities:
- Participated in Client Meetings and gathered business requirements and analyzed them.
- Designed, developed, tested, implemented and supported data warehousing ETL using Talend and Hadoop technologies.
- Designed and implemented ETL processes to import data from and into Microsoft Azure.
- Researched, analyzed and prepared logical and physical data models for new applications, and optimized the data structures to enhance data load times and end-user data access response times.
- Created Pig and Hive scripts to process various types of data sets and load them into a data warehouse built on Hive.
- Developed stored procedures and views in Snowflake and used them in Talend for loading dimensions and facts.
- Developed merge scripts to UPSERT data into Snowflake from an ETL source (see the MERGE sketch after this list).
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics (a sample query follows this list).
- Created complex mappings in Talend using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaRow, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
- Created Joblets in Talend for processes reused across most jobs in a project, such as Start Job and Commit Job.
- Developed jobs to move inbound files to vendor server location based on monthly, weekly and daily frequency.
- Implemented Change Data Capture technology in Talend to load deltas into the data warehouse.
- Performed ETL using different sources like databases, flat files and XML files.
- Migrated the Snowflake database to Windows Azure and updated the connection strings based on requirements.
- Managed and reviewed Hadoop log files.
- Wrote ETL jobs to read from web APIs using REST and HTTP calls and load the data into HDFS using Java and Talend.
- Shared responsibility for administration of Hadoop, Hive, Pig and Talend.
- Managed ETL jobs using Talend Administration Center (TAC) and administered the Talend ETL tool in the development and production environments.
- Used GitHub and SVN for version control of the code and implemented branching for different environments.
- Tested raw data and executed performance scripts.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
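As a sketch of the UPSERT pattern mentioned above, the Snowflake MERGE below assumes a hypothetical staging table and customer dimension; database, schema and column names are illustrative, not the project's.

    -- Hypothetical Snowflake MERGE (upsert) from an ETL staging table into a dimension.
    MERGE INTO dw.core.customer_dim tgt
    USING dw.staging.customer_stg src
      ON tgt.customer_id = src.customer_id
    WHEN MATCHED THEN UPDATE SET
      customer_name = src.customer_name,
      segment       = src.segment,
      update_ts     = CURRENT_TIMESTAMP()
    WHEN NOT MATCHED THEN INSERT (customer_id, customer_name, segment, insert_ts)
      VALUES (src.customer_id, src.customer_name, src.segment, CURRENT_TIMESTAMP());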
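A minimal HiveQL sketch of the trend-comparison idea above; the curated and EDW reference table names (curated.sales_daily, edw_ref.product_sales_hist) and the 1.5x threshold are assumptions for illustration.

    -- Hypothetical HiveQL: flag products whose current-day sales run well above
    -- the historical daily average kept in an EDW reference table.
    SELECT c.product_id,
           c.daily_sales,
           h.avg_daily_sales,
           c.daily_sales / h.avg_daily_sales AS lift
    FROM   curated.sales_daily c
    JOIN   edw_ref.product_sales_hist h
      ON   c.product_id = h.product_id
    WHERE  c.sales_dt = current_date()
      AND  c.daily_sales > 1.5 * h.avg_daily_sales;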
Environment: Talend 6.1, Pentaho, HDFS, HBase, MapReduce, Git, JavaScript, Snowflake, Eclipse, XML, JUnit, Microsoft Azure, Hadoop, Apache Pig, Hive, JSON, Elasticsearch, Web Services, Pentaho Kettle, Microsoft Office.
Confidential
ETL Informatica Developer.
Responsibilities:
- Using Informatica PowerCenter Designer, analyzed the source data to extract and transform it from various source systems (Oracle 10g, DB2, SQL Server and flat files), incorporating business rules through the different objects and functions that the tool supports.
- Using Informatica PowerCenter created mappings and mapplets to transform the data according to the business rules.
- Used various transformations like Source Qualifier, Joiner, Lookup, SQL, Router, Filter, Expression and Update Strategy.
- Implemented slowly changing dimensions (SCD) for some of the tables as per user requirements (see the SCD sketch after this list).
- Wrote Teradata SQL, BTEQ, MLoad, OleLoad, FastLoad and FastExport for ad-hoc queries, and built UNIX shell scripts to run the ETL interfaces (BTEQ, FastLoad or FastExport) via Hummingbird and Control-M software.
- Developed stored procedures, used them in the Stored Procedure transformation for data processing, and worked with data migration tools.
- Created and ran the workflows using Workflow Manager in Informatica; maintained source definitions, transformation rules and target definitions using Informatica Repository Manager.
- Documented Informatica mappings in Excel spreadsheets.
- Tuned the Informatica mappings for optimal load performance.
- Used the BTEQ, FEXP (FastExport), FLOAD (FastLoad) and MLOAD (MultiLoad) Teradata utilities to export and load data to/from flat files.
- Analyzed, identified and fixed bad data and imported data from Salesforce CRM to Oracle; handled upstream data integration and migration processes in predefined schemas.
- Involved in creating the UNIX scripts and jobs to handle Informatica workflows and Teradata utilities like BTEQ, MLoad, FastExport and TPT scripts.
- Extensively involved in data extraction, transformation and loading (the ETL process) from source to target systems using Informatica.
- Created and configured workflows and sessions to transport the data to target warehouse Oracle tables using Informatica Workflow Manager; worked with Informatica web services and web-portal applications.
- Hands-on experience with Windows 32-bit commands, quoting and escaping.
- Managed the migration of SQL Server 2008 databases to SQL Server 2012.
- Generated reports using OBIEE for future business use.
- Created SQL scripts to load the custom data into Development, Test and production Instances using Import/Export. Created scripts to create custom Tables and Views.
- Carried primary responsibility for problem determination and resolution for each SAP application system database server and application server.
- Worked along with the UNIX team, writing UNIX shell scripts to customize the server scheduling jobs.
- Constantly interacted with business users to discuss requirements.
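The actual SCD logic was implemented as Informatica mappings; the SQL below is only a sketch of Type 2 behavior (expire the current row, then insert the new version) with hypothetical table, column and sequence names.

    -- Sketch of SCD Type 2: close out changed rows, then insert the new versions.
    UPDATE dw_customer_dim d
    SET    current_flag = 'N',
           effective_end_dt = SYSDATE
    WHERE  d.current_flag = 'Y'
      AND  EXISTS (SELECT 1
                   FROM   stg_customer s
                   WHERE  s.customer_id = d.customer_id
                     AND  s.address <> d.address);

    INSERT INTO dw_customer_dim
           (customer_key, customer_id, address, effective_start_dt, effective_end_dt, current_flag)
    SELECT dw_customer_seq.NEXTVAL, s.customer_id, s.address, SYSDATE, NULL, 'Y'
    FROM   stg_customer s
    WHERE  NOT EXISTS (SELECT 1
                       FROM   dw_customer_dim d
                       WHERE  d.customer_id = s.customer_id
                         AND  d.current_flag = 'Y'
                         AND  d.address = s.address);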
Environment: Informatica PowerCenter Designer, Informatica Repository Manager, OBIEE, Oracle 10g/9i, DB2 6.1, Erwin, TOAD, SAP, UNIX (SunOS), PL/SQL, SQL Developer, Java/J2EE, Struts, JDBC, JUnit, ANT, HTML, DHTML, JSP, JavaScript, XML, Apache Tomcat, MS Excel.
Confidential
Informatica Developer
Responsibilities:
- Created Technical Design Specifications, Unit test document based on functional design specifications provided by Business Analyst.
- Designed and developed ETL Processes based on business rules, job control mechanism using Informatica Power Center.
- Worked extensively on complex mappings using source qualifier, joiner, expressions, aggregators, filters, Lookup, update strategy, stored procedure transformations, etc.
- Used workflow monitor to monitor the jobs, reviewed session/workflow logs that were generated for each session to resolve issues, used Informatica debugger to identify issues in mapping execution.
- Re-engineered many existing mappings to support new and changing business requirements.
- Monitored production jobs on a daily basis and worked on issues relating to the job failure and restarted failed jobs after correcting the errors.
- Developed reusable transformations, mapplets, sessions and worklets to keep the Informatica code modular and reusable as required.
- Performed unit testing, system integration testing, and supported user acceptance testing.
- Performance tuned SQL statements, Informatica mappings, used Informatica parallelism options to speed up data loading to meet defined SLA.
- Used Informatica PowerCenter to make changes to the existing ETL mappings in each of the environments.
- Collaborated with Project Manager, Tech Lead, Developers, QA teams and Business SMEs to ensure delivered solutions optimally support the achievement of business outcomes.
- Used a Scrum/Agile approach to resolve defects and monitor daily progress.
- Supported Informatica and non-Informatica code migration between environments (DEV/QA/PRD).
- Developed PL/SQL procedures for processing business logic in the database and use them as a Stored Procedure Transformation.
- Developed Oracle PL/SQL Packages, Procedures, Functions and Database Triggers.
- Performed data warehouse data modeling based on client requirements using Erwin (conceptual, logical and physical data modeling).
- Performed dimensional modeling for the creation of Star and Snowflake schemas (see the DDL sketch after this list).
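A compact, illustrative star-schema DDL for the modeling work described above; the fact and dimension names and columns are hypothetical, kept to a minimum for the sketch.

    -- Illustrative star schema: one fact table referencing two dimensions.
    CREATE TABLE dim_date (
      date_key     INTEGER PRIMARY KEY,
      calendar_dt  DATE,
      fiscal_month VARCHAR(10)
    );

    CREATE TABLE dim_product (
      product_key  INTEGER PRIMARY KEY,
      product_id   VARCHAR(20),
      product_name VARCHAR(100)
    );

    CREATE TABLE fact_sales (
      date_key     INTEGER REFERENCES dim_date (date_key),
      product_key  INTEGER REFERENCES dim_product (product_key),
      sales_amt    DECIMAL(12,2),
      units_sold   INTEGER
    );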
Environment: Informatica PowerCenter 8.1.1, Repository Manager, Designer, Workflow Manager, Oracle 9i/10g, SQL Server 2008/2005, Teradata, XML files, flat files, CSV files, PL/SQL (stored procedures, triggers, packages), Erwin, MS Visio, TOAD, Windows.