Sr. ETL/Talend Developer Resume
Overland Park, KS
SUMMARY
- 8+ years of professional IT experience, including 3 years of experience in Big Data related technologies.
- Hands-on experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, YARN, Cassandra, Hive, Pig, HBase, Sqoop, Oozie, and Flume.
- Good knowledge of MapReduce design patterns.
- Experience with distributed systems, large-scale non-relational data stores, NoSQL map-reduce systems, data modeling, database performance tuning, and multi-terabyte data warehouses.
- Worked extensively with Hive and Pig for data analysis.
- Experience in managing HBase databases and using them for quick data updates and modifications.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Experience in ingesting log data into HDFS using Flume.
- Experience in managing and reviewing Hadoop log files.
- Involved in developing complex ETL transformations and performance tuning.
- Experience in developing, supporting, and maintaining ETL (Extract, Transform, Load) processes using Talend Integration Suite.
- Experience in scheduling Talend jobs using the Talend scheduler available in the Admin Console.
- Experience using Spark to process large streams of data.
- Experienced in running MapReduce and Spark jobs over YARN.
- Good knowledge of MongoDB concepts and architecture.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Excellent working knowledge of System Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC) and Defect Life Cycle.
- Well versed in installing, configuring, supporting, and managing Big Data and the underlying infrastructure of Hadoop clusters.
PROFESSIONAL EXPERIENCE
Talend+Hadoop Developer
Confidential - Overland Park, KS
Responsibilities:
- Worked closely with Business Analysts to review the project's business specifications and gather ETL requirements.
- Created Talend jobs to copy files between servers using Talend FTP components.
- Created and managed source-to-target mapping documents for all fact and dimension tables.
- Analyzed source data to assess data quality using Talend Data Quality.
- Wrote SQL queries with joins to access data from Oracle and MySQL.
- Created many complex ETL jobs for data exchange from and to Database Server and various other systems including RDBMS, XML, CSV, and Flat file structures.
- Integrated Java code inside Talend Studio using components such as tJavaRow, tJava, and tJavaFlex, along with custom Routines (a sample routine sketch appears after this list).
- Used Talend's debug mode to troubleshoot jobs and fix errors.
- Responsible for developing, supporting, and maintaining ETL (Extract, Transform, Load) processes using Talend Integration Suite.
- Conducted JAD sessions with business users and SMEs for a better understanding of the reporting requirements.
- Developed Talend jobs to populate claims data into the data warehouse star schema.
- Used Talend Admin Console Job Conductor to schedule ETL jobs on a daily, weekly, monthly, and yearly basis.
- Worked on various Talend components such as tMap, tFilterRow, tAggregateRow, tFileExist, tFileCopy, tFileList, and tDie.
- Worked extensively with the Talend Admin Console and scheduled jobs in Job Conductor.
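The sketch below illustrates the kind of custom Java routine referenced above; Talend routines are plain Java classes whose static methods can be called from tMap or tJavaRow expressions. The class name, field width, and padding rule are hypothetical examples, not the project's actual code.

    // Illustrative Talend routine (a plain Java class under the Routines node);
    // the class/method names and the 10-character rule are assumptions for the example.
    package routines;

    public class ClaimUtils {

        /**
         * Normalizes a claim identifier before it is mapped to the warehouse:
         * trims whitespace, upper-cases, and left-pads to a fixed width.
         * Callable from tMap / tJavaRow expressions,
         * e.g. ClaimUtils.normalizeClaimId(row1.claim_id).
         */
        public static String normalizeClaimId(String raw) {
            if (raw == null || raw.trim().isEmpty()) {
                return null;                   // let a downstream tFilterRow reject empty keys
            }
            String id = raw.trim().toUpperCase();
            StringBuilder padded = new StringBuilder(id);
            while (padded.length() < 10) {     // pad to 10 characters (assumed business rule)
                padded.insert(0, '0');
            }
            return padded.toString();
        }
    }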
Environment: Talend Open Studio, Talend Administrator Console, MS SQL Server 2012/2008, Oracle 11g, Hive, HDFS, Java, Sqoop, TOAD, UNIX.
Hadoop+Talend Developer
Confidential, Ottawa, KS
Responsibilities:
- Processed the data and pushed valid records to HDFS; imported data from MySQL into HDFS using Sqoop.
- Tuned MapReduce, Pig, and Hive jobs to increase performance and decrease execution time.
- Compressed files downloaded from the servers before storing them in the cluster to save cluster resources.
- Wrote core Java programs to convert JSON files to CSV or TSV for further processing (a sketch appears after this list).
- Optimized existing long-running MapReduce and Pig jobs for better performance and accurate results.
- Created Hive databases and tables over the HDFS data and wrote HiveQL queries against those tables.
- Scheduled Hadoop and UNIX jobs using Oozie.
- Worked with NoSQL databases such as HBase.
- Wrote Pig and Hive UDFs for processing and analyzing log files (a sample UDF sketch also appears after this list).
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Visualized complex data analysis on dashboards as per business requirements.
- Integrated Hive, Pig, and MapReduce jobs with Elasticsearch to publish metrics to the dashboards.
- Scheduled Talend jobs using the Talend scheduler available in the Admin Console.
- Deployed Talend jobs to development, test, and production environments.
- Involved in the analysis, design, and testing phases and responsible for documenting technical specifications.
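The sketch below illustrates the JSON-to-CSV conversion mentioned above, assuming Jackson for parsing an array of flat JSON records; the column names and file layout are hypothetical examples, not the project's actual format.

    // Minimal JSON-to-CSV sketch; field names and paths are illustrative assumptions.
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    import java.io.File;
    import java.io.PrintWriter;

    public class JsonToCsv {
        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();
            // Assumes the input file holds a JSON array of flat records.
            JsonNode records = mapper.readTree(new File(args[0]));

            try (PrintWriter out = new PrintWriter(args[1])) {
                out.println("id,timestamp,status");          // assumed columns
                for (JsonNode rec : records) {
                    out.println(String.join(",",
                            rec.path("id").asText(),
                            rec.path("timestamp").asText(),
                            rec.path("status").asText()));
                }
            }
        }
    }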
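The following is a minimal sketch of a Hive UDF of the sort used for log analysis; the class name and the assumption that the status code is the ninth space-separated field are illustrative, not the actual project UDF. Such a UDF is packaged in a JAR and registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.

    // Sketch of a simple Hive UDF for log-field extraction; parsing rule is assumed.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class ExtractStatusCode extends UDF {

        // Pulls the HTTP status code out of a raw access-log line (assumed common log format).
        public Text evaluate(Text logLine) {
            if (logLine == null) {
                return null;
            }
            String[] parts = logLine.toString().split(" ");
            // Assume the status code is the 9th space-separated field.
            return parts.length > 8 ? new Text(parts[8]) : null;
        }
    }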
Environment: Hadoop 2.x, YARN, HDFS, MapReduce, Pig, Hive, HBase, Shell Scripting, Java, Oozie, Talend Open Studio, Linux.
Sr. ETL/Talend Developer
Confidential, Overland Park, KS
Responsibilities:
- Participated in client meetings and gathered and analyzed business requirements.
- Designed, developed, tested, implemented, and supported data warehousing ETL using Talend and Hadoop technologies.
- Designed and implemented ETL processes to move data into and out of Microsoft Azure.
- Created Pig and Hive scripts to process various types of data sets and load them into a data warehouse built on Hive.
- Developed stored procedures and views in Snowflake and used them in Talend for loading dimensions and facts.
- Developed merge scripts to upsert data into Snowflake from ETL sources.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Created joblets in Talend for reusable processes, such as job start and commit steps, shared across most jobs in a project.
- Developed jobs to move inbound files to vendor server locations on daily, weekly, and monthly schedules.
- Implemented Change Data Capture (CDC) in Talend to load deltas into the data warehouse.
- Performed ETL from different sources, including databases, flat files, and XML files.
- Migrated the Snowflake database to Windows Azure and updated connection strings as required.
- Managed and reviewed Hadoop log files.
- Wrote ETL jobs in Java and Talend to read data from web APIs via REST/HTTP calls and load it into HDFS (a sketch appears after this list).
- Shared responsibility for administering Hadoop, Hive, Pig, and Talend.
- Tested raw data and executed performance scripts.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
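The sketch below illustrates the REST-to-HDFS loading pattern mentioned above in plain Java, using HttpURLConnection and the Hadoop FileSystem API; the endpoint URL and target path are placeholders, and the production jobs wrapped similar logic in Talend components.

    // Minimal REST-to-HDFS sketch; URL and HDFS path are placeholders.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class RestToHdfs {
        public static void main(String[] args) throws Exception {
            // Call the web API over HTTP and read the response body.
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("https://api.example.com/claims").openConnection();
            conn.setRequestMethod("GET");

            // Write the response straight into HDFS; core-site.xml on the classpath
            // supplies the fs.defaultFS setting.
            Configuration conf = new Configuration();
            try (InputStream in = conn.getInputStream();
                 FileSystem fs = FileSystem.get(conf)) {
                FSDataOutputStream out = fs.create(new Path("/data/inbound/claims.json"));
                IOUtils.copyBytes(in, out, 4096, true);  // copies and closes both streams
            }
        }
    }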
Environment: Talend, HDFS, HBase, MapReduce, Eclipse, XML, JUnit, Microsoft Azure, Hadoop, Apache Pig, Hive, Java, Elasticsearch, Web Services, Microsoft Office
SQL SERVER SSIS ETL Developer
Confidential, Overland Park, KS
Responsibilities:
- Created scripts for data validation and massaging of legacy data in the staging area before moving it to the Decision Support System (DSS).
- Scheduled jobs to run SSIS packages nightly to feed daily data into the DSS.
- Wrote documentation for the packages, scripts and the jobs created for DTS to SSIS migration.
- Created various reports, including graphs, charts, matrix, drill-down, drill-through, parameterized, sub-reports, and linked reports.
- Deployed the reports to the Report Manager.
- Performed trouble-shooting and maintenance of the reports with any enhancements or changes as needed from time to time.
- Set up report subscriptions as required by BRD and the management.
- Dropped and recreated Indexes in the SQL Server DSS while migrating legacy data to SQL Server 2005.
- Used BCP to transfer data.
- Generated server-side T-SQL scripts for data manipulation and validation, and created snapshots and materialized views for remote instances.
- Modified existing databases by adding and removing tables, thereby altering referential integrity, primary key constraints, and relationships according to requirements.
- Created traces in SQL Profiler and used the Database Engine Tuning Advisor for performance tuning of stored procedures and scripts.
- Documented the migration process, reports and the DSS structure and various objects.
- Worked comfortably with a combination of Agile and Spiral methodologies.
- Facilitated meetings between various development teams, DBAs, and Testing teams for timely progress of the migration process.
- Participated in QA testing during UAT.
- Mentored and monitored team development and provided regular team status updates to the management.
Environment: SQL Server 2005/2000, DTS, SSIS, SSRS, SQL Profiler, .NET Framework 3.5, C#, Visual SourceSafe 2005.