AWS Services, Spark, Sqoop, Big Data Talend Lead Application Developer Resume
Chicago, IL
PROFESSIONAL SUMMARY:
- 11+ years of experience with Big Data Talend ETL and AWS services (Redshift data warehouse and EC2 clusters) as an application lead.
- Extensive work with AWS services, Big Data, Apache Spark, and Sqoop.
- Extensively worked with Talend 6.2/6.3.1 Open Studio and Enterprise Edition.
- Big Data and Hadoop skills: knowledge of and experience with Hadoop architecture, Cloudera, Hive, and Talend Big Data (ETL).
- Experience with SOA architecture, TOGAF, and MDM modeling.
- Web services and related protocols and tools: SOAP, JAXB, XSD, WSDL, SoapUI, and RESTful web services.
- Experience with Talend Data Quality (DQ), which provides powerful data profiling for analyzing source data.
- Experience with the Big Data ecosystem and Java/J2EE technologies.
- Extensive knowledge of data modeling/architecture and database administration, with specialization in various ETL platforms (Talend, Cognos Data Manager).
- Experience with Talend DI installation, administration, and development for data warehousing and application integration.
- Experience in the following domains: Retail, Finance, Health Care, Investment Banking, Telecom, and Insurance.
- Involved in generating complex XMLs for recipe management, calling web services, and integrating Salesforce with Informatica to load data into Salesforce and create dashboards for revenue data.
- Strong experience in designing and developing business intelligence solutions in data warehousing/decision support systems using ETL tools.
- Expertise in data modeling techniques: dimensional/star schema and snowflake modeling, and Slowly Changing Dimensions (SCD Types 1, 2, and 3).
- Implemented Talend jobs with complex transformations using tMap and multiple lookups.
- Debugged ETL job errors, performed ETL sanity checks, and handled production deployments in TAC (Talend Administration Center) using SVN.
- Tracked daily data loads and monthly data extracts and sent them to the client for verification.
- Extensive experience in designing and developing complex mappings with varied transformation logic; used 70+ components in designing Talend jobs.
- Knowledge of SCD Types 1 and 2 and their implementation using Talend.
- Expert in using Talend troubleshooting and the ODI debugger to understand errors in jobs and mappings, and in using the tMap expression editor to evaluate complex expressions and inspect transformed data to resolve mapping issues.
- Expertise in enhancements/bug fixes, performance tuning, troubleshooting, impact analysis, unit testing, integration testing, UAT, and research.
- Excellent understanding of data warehousing concepts and best practices; involved in the full data warehousing development life cycle.
- Good exposure to projects run on an onsite/offshore model.
- Expert in data warehouse performance tuning.
- Handled file tasks such as merging flat files, creating and deleting temporary files, and renaming files to reflect their generation date; worked on 50+ complex jobs in recent times.
- Good knowledge of and hands-on experience with databases: Amazon Redshift, Oracle (SQL, PL/SQL), MS Access, MySQL, and SQL Server.
- Professional communication skills for analysis, design, and reporting with internal and external clients.
TECHNICAL SKILLS:
Applications: AWS Services, Apache Spark, Sqoop
ETL Tools: Talend Big Data, ODI, Cognos Data Manager, Informatica
Languages: C/C++, HTML, SQL, PL/SQL
Databases: Amazon Redshift, Oracle 10g/11g, SQL Server 2012, MySQL
Tools: Redshift, SQL Server Management Studio, TOAD, MySQL Workbench, Open Studio
Operating Systems: Windows, Linux, UNIX
Scheduling Tools: Control-M, AutoSys, Jenkins
EXPERIENCE:
Confidential, Chicago, IL
AWS Services, Spark, Sqoop, Big Data Talend Lead Application Developer
Responsibilities:
- Worked closely with IT directors and senior technical managers on data warehouse design and maintenance, and was involved in modifying technical specifications.
- As part of development, enhancement, and migration projects, migrated the entire data warehouse using AWS services and the Apache Spark and Sqoop applications, driven primarily by Scala scripts.
- Used Apache Spark to move large files ranging from roughly 10 GB to 1 TB.
- The Spark application picks up source files, loads them into Parquet format, applies transformations, and writes them out in the desired target file format (see the Scala sketch after this list).
- Used Sqoop to migrate data from one database to another, dumping all tables in full.
- Ran and maintained the Spark application with Scala scripting; set up the Spark and Sqoop applications using IntelliJ.
- Migrated the entire network to AWS cloud services, creating Virtual Private Clouds and servers; as part of this, migrated all sources from the Altegra legacy environment to the AWS cloud (CHC 2.0).
- Installed SQL Server on EC2 instances in the new environment and created a high-availability solution with the existing Vault-Mart servers.
- Loaded data into the data warehouse using Talend Big Data (Hadoop) ETL components, AWS S3 buckets, and AWS services for the Redshift database.
- Designed Big Data Talend jobs to pick up files from AWS S3 buckets and load them into the AWS Redshift database.
- Checked the daily status of production jobs and notified business users.
- Involved in extraction, transformation, and loading of data; ingested data from different sources and loaded it into Redshift.
- As part of AWS Redshift database maintenance, ran VACUUM and ANALYZE on Redshift tables.
- Used Jenkins and Rundeck for deployments.
- Designed and implemented the ETL process using Talend Enterprise Big Data Edition to load data from source to target databases.
- Extracted data from flat files and XML files using Talend, with Java as the backend language, and used Talend to load the data into the warehouse systems.
- Created and managed schema objects such as tables, views, indexes, stored procedures, and triggers, and maintained referential integrity.
- Created SSIS packages to populate data from various data sources.
- Created packages using SSIS for data extraction from flat files, Excel files, and OLE DB to SQL Server.
- Extracted transformed data from Hadoop to destination systems as one-off jobs, batch processes, or Hadoop streaming processes.
- Worked on error-handling techniques and tuned the ETL flow for better performance.
- Scheduled jobs in Job Conductor.
- Migrated code and release documents from DEV to QA (UAT) and Production.
- Worked in an Agile environment.
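
A minimal Scala sketch of the Spark file-conversion flow described above (pick up raw files, stage them as Parquet, transform, write the target format). The S3 paths, column names, and cleanup rules are hypothetical placeholders, not the project's actual code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}

object FileConversionJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LegacyFileConversion") // hypothetical job name
      .getOrCreate()

    // Pick up the raw pipe-delimited source files (paths and options are placeholders)
    val raw = spark.read
      .option("header", "true")
      .option("delimiter", "|")
      .csv("s3://legacy-landing/claims/*.txt")

    // Stage the data as Parquet so downstream steps work on a columnar format
    raw.write.mode("overwrite").parquet("s3://staging/claims_parquet/")

    // Apply transformations on the staged Parquet and write the desired target layout
    val staged = spark.read.parquet("s3://staging/claims_parquet/")
    staged
      .filter(col("claim_status").isNotNull)                              // example cleanup rule
      .withColumn("service_date", to_date(col("service_dt"), "yyyyMMdd")) // example type conversion
      .write.mode("overwrite")
      .option("header", "true")
      .csv("s3://warehouse-extracts/claims_clean/")                       // target format is illustrative

    spark.stop()
  }
}
```

In the real jobs the transformation step would carry the project's business rules; the filter and date parsing here are illustrative only.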
Environment: AWS Services, Apache Spark, Sqoop, Talend Big Data 6.4.1, Redshift, DataStage, SSIS, SQL Server, PuTTY, GitHub.
Confidential, Oakbrook, IL
ETL Talend BigData Developer
Responsibilities:
- Worked closely with data architects on designing database tables and modified technical documentation.
- Used Talend Big Data components for Hadoop, AWS S3 buckets, and AWS services for Redshift.
- Involved in extraction, transformation, and loading of data.
- Worked with the offshore team on day-to-day tasks, reviewed their work, and collected status updates in daily meetings.
- Ingested data from different sources and loaded it into Redshift (see the sketch after this list).
- Developed jobs to write data to and read data from AWS S3 buckets using components such as tS3Connection, tS3BucketExist, tS3Get, and tS3Put.
- Designed and implemented the ETL process using Talend Enterprise Big Data Edition to load data from source to target databases.
- Extracted data from flat files and XML files using Talend, with Java as the backend language.
- Used Talend to load the data into the warehouse systems.
- Used 20+ Talend components (tMap, tFileList, tJava, tLogRow, tOracleInput, tOracleOutput, tSendMail, etc.).
- Used the debugger and breakpoints to view transformation output and debug mappings.
- Performed automation testing of web-based applications and services, with proficiency in Java and the Bulk API.
- Developed ETL mappings for various sources (.txt, .csv, .xml) and loaded data from these sources into relational tables with Talend Enterprise Edition.
- Worked with global context variables and context variables, and used 70+ Talend components to create jobs.
- Designed logical and physical data models with Erwin.
- Worked on the installation, design, and management of MS SQL Server 2008.
- Created and managed schema objects such as tables, views, indexes, stored procedures, and triggers, and maintained referential integrity.
- Created SSIS packages to populate data from various data sources.
- Created packages using SSIS for data extraction from flat files, Excel files, and OLE DB to SQL Server.
- Extracted transformed data from Hadoop to destination systems as one-off jobs, batch processes, or Hadoop streaming processes.
- Worked on error-handling techniques and tuned the ETL flow for better performance.
- Scheduled jobs in Job Conductor.
- Extensively used Talend components such as tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tOracleInput, tOracleOutput, tFileList, and tDelimited.
- Migrated code and release documents from DEV to QA (UAT) and Production.
- Worked in an Agile environment.
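
A hedged sketch of the S3-to-Redshift load pattern referenced above. In the project this step was driven by Talend jobs and components; the snippet below only shows the equivalent step as a plain JDBC call issuing a Redshift COPY, with a placeholder cluster endpoint, IAM role, table, and bucket (the Redshift JDBC driver is assumed to be on the classpath).

```scala
import java.sql.DriverManager

object RedshiftS3Load {
  def main(args: Array[String]): Unit = {
    // Placeholder JDBC endpoint and credentials -- not the project's real values
    val url  = "jdbc:redshift://example-cluster.xxxxx.us-east-1.redshift.amazonaws.com:5439/dw"
    val conn = DriverManager.getConnection(url, sys.env("REDSHIFT_USER"), sys.env("REDSHIFT_PASS"))
    try {
      val stmt = conn.createStatement()

      // Bulk-load the staged S3 files into the target table with a COPY command
      stmt.execute(
        """COPY stage.claims
          |FROM 's3://warehouse-extracts/claims_clean/'
          |IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
          |FORMAT AS CSV
          |IGNOREHEADER 1""".stripMargin)

      // Routine table maintenance after large loads (reclaims space, refreshes statistics)
      stmt.execute("VACUUM stage.claims")
      stmt.execute("ANALYZE stage.claims")
    } finally {
      conn.close()
    }
  }
}
```

COPY bulk-loads the staged files in parallel; the follow-up VACUUM and ANALYZE statements are the routine post-load maintenance mentioned for Redshift tables elsewhere in this resume.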
Environment: Talend Enterprise Big Data 6.3.1, Redshift, DataStage, SSIS, SSRS, SQL Server, PuTTY, GitHub.
Confidential, Culver City, LA
ETL Talend Developer
Responsibilities:
- Worked closely with data modelers and data architects on table design and modification of technical specifications.
- Involved in extraction, transformation, and loading of data.
- Worked with the offshore team on day-to-day tasks, reviewed their work, and collected status updates in daily meetings.
- Worked with the business team on ad-hoc requests and created reports based on user requirements using Redshift.
- Worked with different APIs to pull data using curl and load it into the Redshift database.
- Ingested data from different sources and loaded it into Redshift.
- Loaded files into the S3 bucket and copied them into Redshift tables.
- Worked with a Tableau developer to build dashboards.
- Worked with different sources such as SQL Server and flat files.
- Designed and implemented the ETL process using Talend Enterprise Big Data Edition to load data from source to target databases.
- Extracted data from flat files and XML files using Talend, with Java as the backend language.
- Implemented Slowly Changing Dimensions (SCD) Type 1 and Type 2 to capture changes using Talend ETL (see the sketch after this list).
- Used Talend to load the data into the warehouse systems.
- Used 20+ Talend components (tMap, tFileList, tJava, tLogRow, tOracleInput, tOracleOutput, tSendMail, etc.).
- Used the debugger and breakpoints to view transformation output and debug mappings.
- Administered the Talend server.
- Performed automation testing of web-based applications and services, with proficiency in Java and the Bulk API.
- Developed ETL mappings for various sources (.txt, .csv, .xml) and loaded data from these sources into relational tables with Talend Enterprise Edition.
- Worked with global context variables and context variables, and used 100+ Talend components to create jobs.
- Extracted transformed data from Hadoop to destination systems as one-off jobs, batch processes, or Hadoop streaming processes.
- Used TAC (Talend Administration Center) to schedule jobs.
- Extensively used Talend components such as tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tOracleInput, tOracleOutput, tFileList, and tDelimited.
- Scheduled ETL mappings on a daily, weekly, monthly, and yearly basis.
- Worked on project documentation, prepared source-to-target mapping specifications with the business logic, and was involved in data modeling.
- Created workflows using tasks such as sessions, control, decision, email, command, worklets, and assignment, and worked on workflow scheduling.
- Verified logs to confirm that all relevant jobs completed successfully and on time, and provided production support to resolve production issues.
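
The SCD logic above was implemented with Talend components; purely as an illustration, the sketch below expresses the Type 2 pattern (expire the changed current row, insert a fresh current version) as SQL issued from Scala over JDBC. The connection string, dimension/staging tables, and tracked columns are hypothetical.

```scala
import java.sql.DriverManager

object ScdType2Load {
  def main(args: Array[String]): Unit = {
    // Placeholder connection details and table/column names, purely illustrative
    val url  = "jdbc:redshift://example-cluster.xxxxx.us-east-1.redshift.amazonaws.com:5439/dw"
    val conn = DriverManager.getConnection(url, sys.env("REDSHIFT_USER"), sys.env("REDSHIFT_PASS"))
    try {
      val stmt = conn.createStatement()

      // Type 2, step 1: close out current rows whose tracked attributes changed in the staged feed
      stmt.execute(
        """UPDATE dim_customer
          |SET current_flag = 'N', effective_end = CURRENT_DATE
          |FROM stage_customer s
          |WHERE dim_customer.customer_id = s.customer_id
          |  AND dim_customer.current_flag = 'Y'
          |  AND (dim_customer.address <> s.address OR dim_customer.segment <> s.segment)""".stripMargin)

      // Type 2, step 2: insert a fresh current version for new and just-expired customers
      stmt.execute(
        """INSERT INTO dim_customer
          |  (customer_id, address, segment, effective_start, effective_end, current_flag)
          |SELECT s.customer_id, s.address, s.segment, CURRENT_DATE, NULL, 'Y'
          |FROM stage_customer s
          |LEFT JOIN dim_customer d
          |  ON d.customer_id = s.customer_id AND d.current_flag = 'Y'
          |WHERE d.customer_id IS NULL""".stripMargin)
    } finally {
      conn.close()
    }
  }
}
```

A Type 1 change would instead overwrite the attribute in place with a simple UPDATE, keeping no history.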
Environment: Talend Open Studio 6.0.1, Redshift, Tableau, Cloudberry.
Capgemini
ETL Talend Developer
Responsibilities:
- Worked closely with data architects on table design and was involved in modifying technical specifications.
- Involved in extraction, transformation, and loading of data.
- Worked with the offshore team on day-to-day tasks, reviewed their work, and collected status updates in daily meetings.
- Worked with different sources such as Oracle, SQL Server, and flat files.
- Designed and implemented the ETL process using Talend Enterprise Big Data Edition to load data from source to target databases.
- Expertise in high-availability configuration and monitoring of Hadoop master nodes.
- Extracted data from Oracle, flat files, and XML files using Talend, with Java as the backend language.
- Implemented Slowly Changing Dimensions (SCD) Type 1 and Type 2 to capture changes using Talend ETL.
- Did a POC (in-house project) on Talend to load data into the warehouse systems.
- Used 20+ Talend components (tMap, tFileList, tJava, tLogRow, tOracleInput, tOracleOutput, tSendMail, etc.).
- Used the debugger and breakpoints to view transformation output and debug mappings.
- Administered the Talend server.
- Developed ETL mappings for various sources and loaded data from these sources into relational tables with Talend Enterprise Edition.
- Worked with global context variables and context variables, and created jobs using Talend.
- Extracted transformed data from Hadoop to destination systems.
- Worked on error-handling techniques and tuned the ETL flow for better performance.
- Worked extensively with TAC (Talend Administration Center), scheduling jobs in Job Conductor.
- Extensively used Talend components such as tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tOracleInput, tOracleOutput, tFileList, and tDelimited.
- Worked with Oracle SQL Developer while implementing unit testing of Talend ETL jobs.
- Scheduled ETL mappings on a daily, weekly, monthly, and yearly basis.
- Worked on a Big Data POC involving loading data into HDFS and creating MapReduce jobs (see the sketch after this list).
- Worked on project documentation, prepared source-to-target mapping specifications with the business logic, and was involved in data modeling.
- Worked on migrating data warehouses from the existing SQL Server to an Oracle database.
- Performed performance tuning of mappings and sessions by identifying bottlenecks and implementing effective transformation logic.
- Created workflows using tasks such as sessions, control, decision, email, command, worklets, and assignment, and worked on workflow scheduling.
- Verified logs to confirm that all relevant jobs completed successfully and on time, and provided production support to resolve production issues.
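
A minimal Scala sketch of the HDFS-loading half of the Big Data POC mentioned above, using the Hadoop FileSystem API. The NameNode URI, local export file, and landing directory are hypothetical, and the MapReduce job that would later consume the data is omitted.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import java.net.URI

object HdfsLoadPoc {
  def main(args: Array[String]): Unit = {
    // Hypothetical NameNode URI and paths for the POC
    val conf = new Configuration()
    val fs   = FileSystem.get(new URI("hdfs://namenode:8020"), conf)

    val localFile = new Path("file:///data/exports/orders_20150101.csv")
    val hdfsDir   = new Path("/poc/raw/orders/")

    // Create the landing directory if needed and copy the local export into HDFS,
    // where a MapReduce job would later read it
    if (!fs.exists(hdfsDir)) fs.mkdirs(hdfsDir)
    fs.copyFromLocalFile(localFile, hdfsDir)

    fs.close()
  }
}
```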
Environment: Talend Open Studio 5.0.1, Cognos Data Manager, UNIX, Oracle, SQL Server, TOAD, AutoSys.
Confidential
ETL Developer
Responsibilities:
- The objective of this project was to prepare a flexible ETL architecture and design suitable for First Data; the scope was limited to defining the architecture and design for the Finance and Inventory data marts.
- Designed and created the complete end-to-end ETL process using Talend jobs and created test cases for validating the data in the data marts and the data warehouse.
- Worked through all phases of full life cycle development, including requirements gathering, technical specifications, analysis, design, implementation, testing, and operations of large-scale, enterprise-wide applications.
- Developed MySQL stored procedures, triggers, cursors, joins, and views, and optimized databases.
- Captured data daily from OLTP systems and various XML, Excel, and CSV sources and loaded it with Talend ETL tools.
- Scheduled jobs every day before 7:00 PM and delivered all presentation reports before 8:00 AM German bank business hours, per the SLA.
- Scheduled the ETL jobs for Vanquish and Arab National Bank daily.
- Fixed bugs on ETL failures as top priority.
- Scheduled jobs using Control-M on a daily basis.
- Developed the Framework Manager model and scheduled Cognos reports.
- Provided production support on an on-call basis.
- Worked with UNIX as part of ETL; for failed jobs, removed and reloaded the files through UNIX and resumed services.
- Scheduled jobs monthly and weekly in the Cognos environment.
Environment: Informatica PowerCenter, Cognos 10.1, Control-M, SQL, Talend.
Confidential
BI Developer/Onsite Coordinator
Responsibilities:
- Responsible for handling three individual projects: building the data warehouse from scratch for Contracts, the CARS project, and the ForeCast-DeAgg project.
- Used the data mapping document to create all data objects and code all required stored procedures for the Raw, Stage, Data Warehouse, and Data Mart environments for the Contracts, Transaction Management, Usage, Billing, and Pricing cubes.
- Extensively used MS SQL Server 2008 R2 T-SQL views, stored procedures, inline and table-valued functions, triggers, and SSIS packages for the end-to-end ETL process.
- Interacted with customers daily for requirement clarifications and status updates, and coordinated with the offshore team.
- Installed Cognos 10.1 on 64-bit Confidential servers with SQL Server 2008.
- Prepared the ETL architecture and developed the Framework Manager model.
- Captured the paths of data items against DMR using Transformer.
- Developed the PowerPlay-based reports in Report Studio and checked the output against PowerPlay Web.
- Tested the reports against SQL and fixed bugs in Cognos objects during life cycle promotion across multiple releases from development to QA and production.
- Created cubes with Transformer.
- Delivered reports to end users with the best performance by completing report tuning.
Environment: MS SQL Server 2008 R2, MSSQL Server BI-SSIS and SSRS, Cognos 10.1, Transformer, Windows Server 2008.
Confidential
BI Cognos Developer
Responsibilities:
- Interacted with clients daily for requirement clarifications and status updates.
- Migrated PowerPlay reports to Cognos 8.4 Report Studio by developing new reports and formatting them to match the PowerPlay report look.
- Captured the paths of data items against DMR using Transformer.
- Developed the Transformer model using hierarchies, general dimensions, and the time dimension.
- Developed these attributes so the Transformer package could be published.
- Developed the PowerPlay-based reports in Report Studio and checked the output against PowerPlay Web.
- Captured the logic from the information provided in the TDD and translated it into filters in the reports.
- Created crosstab reports using Analysis Studio.
- Wrote complex ad-hoc queries to retrieve data using the necessary joins, unions, and aggregate functions according to the client's requirements, and analyzed the data in spreadsheets.
- Installed and configured Cognos 8.4, Framework Manager, and IIS.
- Designed and developed Cognos reports for products.
- Generated drill-down, drill-through, and sub-reports, and scheduled and deployed them in SSRS Report Manager.
- Tested the reports against SQL.
- Fixed bugs in Cognos objects during life cycle promotion across multiple releases from development to QA and production.
- Created test cases and technical specifications.
- Delivered reports to end users with the best performance by completing report tuning.
Environment: Cognos 8.4, PowerPlay, Transformer, SQL Server, Windows XP.
Confidential
SQL Server Developer
Responsibilities:
- Responsible for handling three individual projects: building the data warehouse from scratch for Contracts, the CARS project, and the ForeCast-DeAgg project.
- Used the data mapping document to create all data objects and code all required stored procedures for the Raw, Stage, Data Warehouse, and Data Mart environments for the Contracts, Transaction Management, Usage, Billing, and Pricing cubes.
- Extensively used SSIS packages for the end-to-end process, designed test cases for validating data in the data marts and the data warehouse, and maintained data integrity and validity throughout the process of building the DWH.
- Used TFS and Visual Studio 2008 schema and database comparisons to synchronize the DEV, SIT, and PROD environments.
Environment: MS SQL Server 2008 R2/2005, MS SQL Server BIDS, MS Visual Studio 2008, MS Visual Studio 2005, Windows XP, Team Foundation Server, Performance Point Monitoring Server.