ETL Developer Resume
SUMMARY
- Overall 10+ years of work experience in the IT industry.
- Extensive hands-on experience with ETL tools (Informatica, DataStage, SSIS) and related technologies including Oracle, SQL Server, Unix, Hive, Sqoop, Scala, and Spark.
- Experience building analytics platforms using Python.
- Very strong Python coding experience with a solid understanding of Python internals.
- Expertise in NumPy, SciPy, and Pandas.
- Performed ETL operations using Python and SQLAlchemy (a minimal sketch follows this summary).
- Expert in building early-warning systems (C-PEWS, MC-EWS) that give business owners analysis for decision making and help lower-level management take immediate action on regular business transactions.
- Designed the data model, schema, and data-processing flow for the C-PEWS and MC-EWS applications.
- Proficient in Data warehouse concepts, Oracle PL/SQL and Business Intelligence concepts.
- Provide estimates for ETL deliverables and oversee progress to ensure quality delivery.
- Work with business stakeholders, analysts, and other programmers to develop new solutions or enhance existing ones.
- Implemented change data capture (CDC) methods for incremental loads.
- Implemented Slowly Changing Dimension (SCD) Type 1, Type 2, and Type 3 methods on SMART using Informatica.
- Created several repository mappings and queries facilitating rapid analysis, troubleshooting, code verification, and deployment.
- Extensive knowledge of developing Informatica mappings, mapplets, and workflows, and of scheduling jobs in serial and parallel mode.
- Worked extensively on data warehousing projects using Informatica 8.6/9.6 and DataStage.
- Worked with various transformations such as Aggregator, Expression, Filter, Joiner, Lookup, Router, and Union.
- Created sequential and parallel jobs and scheduled DataStage jobs on TWS.
- Worked with various DataStage stages such as Transformer, Aggregator, Filter, Join, Lookup, Funnel, and Sequential File.
- Extracted data from Oracle and loaded it into Hive, and from Hive back into Oracle.
- Set up Rundeck for processing HDFS data using Talend.
- Extensive experience building Talend jobs, promoting them to the BDPASS environment, executing them as shell scripts, and scheduling those scripts in TWS.
- Worked with databases such as Oracle and SQL Server; experience integrating data sources including Oracle, flat files, and SQL Server.
- Extensive experience with Oracle back-end objects such as database triggers, stored procedures, views, materialized views, synonyms, constraints, sequences, functions, packages, and exception handling using PL/SQL.
- Implemented ETL processes on big data (HDFS) using Sqoop and Hive.
- Migrated regular ETL jobs from Informatica to the big data platform.
- Good knowledge of Oracle performance tuning, indexing, partitioning techniques, and ref cursors.
- Good knowledge of Oracle BULK COLLECT and collections (VARRAYs, nested tables, associative arrays).
- Extensive knowledge on SQL, UNIX commands and Shell scripting.
- Good knowledge of Hive, Sqoop, Scala, and Spark.
- Assess new tools and software, perform POCs, and offer recommendations to improve the enterprise landscape.
- Good experience as a Hadoop/Spark developer using big data technologies across the Hadoop and Spark ecosystems.
- Knowledge of Spark architecture including Spark Core, Spark SQL, DataFrames, Spark Streaming, and Spark MLlib.
- Experience in using RDD caching for Spark streaming.
- Experience in using Spark SQL with data source Hive.
- Migrated code from Hive to Spark/Scala using Spark SQL and RDDs.
- Implemented Spark using Scala and Spark-SQL for faster testing and processing of data
- Converted MapReduce programs into Spark transformations using Spark RDDs in Scala.
- Developed Spark scripts by using Scala shell commands as per the requirement.
- Extensive experience in using Microsoft BI studio products like SSIS for implementation of ETL methodology in data extraction, transformation and loading.
- Expert in Data Warehouse development starting from inception to implementation and ongoing support, strong understanding of BI application design and development principles.
- Highly experienced in the full lifecycle of DTS/SQL Server Integration Services (SSIS 2005/2008) packages: developing, deploying, scheduling, troubleshooting, and monitoring for data transfers and ETL across different servers.
- Experience providing logging and error handling using event handlers and custom logging for SSIS packages.
- Experience in performance tuning of SSIS packages using row (non-blocking), semi-blocking, and blocking transformations.
- Scheduled and monitored ETL processes using the dtexec utility and batch files.
- Wrote Python and batch scripts to automate hourly runs of the ETL scripts.
- Developed ETL scripts in Python to read data from one database table and insert or update the results in another.
- Wrote Python scripts to parse XML documents and load the data into the database (second sketch below).
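As a rough illustration of the Python/SQLAlchemy ETL and incremental-load work summarized above, the sketch below extracts rows changed in the last hour from a source table and upserts them into a target table. The connection strings, table names, and columns are hypothetical placeholders, and it assumes SQLAlchemy 1.4+ with the cx_Oracle driver.

```python
from datetime import datetime, timedelta

from sqlalchemy import create_engine, text

# Illustrative connection strings -- real ones would come from config or environment.
SOURCE_URL = "oracle+cx_oracle://etl_user:***@source-db:1521/?service_name=SRC"
TARGET_URL = "oracle+cx_oracle://etl_user:***@target-db:1521/?service_name=TGT"


def extract_incremental(engine, since):
    """Pull only rows changed since the last run (timestamp-based change data capture)."""
    query = text(
        "SELECT member_id, claim_amt, status, last_updt_ts "
        "FROM src_claims WHERE last_updt_ts > :since"
    )
    with engine.connect() as conn:
        return conn.execute(query, {"since": since}).fetchall()


def upsert_rows(engine, rows):
    """Update existing keys in the target table, insert the rest."""
    update_stmt = text(
        "UPDATE tgt_claims SET claim_amt = :claim_amt, status = :status, "
        "last_updt_ts = :last_updt_ts WHERE member_id = :member_id"
    )
    insert_stmt = text(
        "INSERT INTO tgt_claims (member_id, claim_amt, status, last_updt_ts) "
        "VALUES (:member_id, :claim_amt, :status, :last_updt_ts)"
    )
    with engine.begin() as conn:  # one transaction for the whole batch
        for row in rows:
            params = dict(row._mapping)
            if conn.execute(update_stmt, params).rowcount == 0:
                conn.execute(insert_stmt, params)


if __name__ == "__main__":
    src = create_engine(SOURCE_URL)
    tgt = create_engine(TARGET_URL)
    # An hourly scheduler (cron or a batch script) would drive this window in practice.
    rows = extract_incremental(src, since=datetime.utcnow() - timedelta(hours=1))
    upsert_rows(tgt, rows)
```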
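Similarly, a minimal sketch of the XML-parse-and-load pattern mentioned in the last bullet; the element names and staging table are hypothetical.

```python
import xml.etree.ElementTree as ET

from sqlalchemy import create_engine, text

# Hypothetical layout: <members><member id="..."><name>...</name><plan>...</plan></member></members>
def load_members_from_xml(xml_path, engine):
    tree = ET.parse(xml_path)
    rows = [
        {
            "member_id": m.get("id"),
            "name": m.findtext("name"),
            "plan": m.findtext("plan"),
        }
        for m in tree.getroot().iter("member")
    ]
    insert_stmt = text(
        "INSERT INTO stg_members (member_id, name, plan) "
        "VALUES (:member_id, :name, :plan)"
    )
    if rows:
        with engine.begin() as conn:
            conn.execute(insert_stmt, rows)  # executemany-style batch insert


if __name__ == "__main__":
    engine = create_engine("oracle+cx_oracle://etl_user:***@target-db:1521/?service_name=TGT")
    load_members_from_xml("members.xml", engine)
```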
TECHNICAL SKILLS
ETL Tools: Informatica, DataStage, SSIS, Talend
Languages: SQL, PL/SQL, HQL, Sqoop, Scala, Spark
Reporting Tools: MicroStrategy
Scripting languages: Python, Shell, Sed & Awk, VBA.
Systems: Windows, UNIX, Linux
Databases: Oracle 11g/12c, Oracle Exadata, SQL Server 2012, Hive
PROFESSIONAL EXPERIENCE
Confidential
ETL Developer
Environment: DataStage, SSIS, Oracle, Unix, HDFS, Sqoop, Hive, Scala, Spark, Python
Responsibilities:
- Working as the technical lead for Altruista clinical data processing.
- Mentoring the team in development work and helping resolve technical issues.
- Developed the Altruista clinical data model structure in Hive.
- Loaded all new source data into Hive using Sqoop and Talend; Hive serves as the staging area, and ad-hoc reports are generated from the Hive server.
- Loaded data from Hive to SMART using Sqoop, UNIX scripts, and Talend.
- SFTP'd files from the Ingenix server to HDFS (Hadoop Distributed File System), copied them from HDFS into Hive for report generation, and exported data from Hive to SMART using Sqoop.
- Prepared UNIX shell scripts to prevent the Sqoop export process from terminating on bad records and to send the bad-record list back to the Ingenix team.
- Wrote Hive queries to generate load-statistics reports.
- Performed Customer Requirements Gathering, Requirements Analysis, Design, Development, Testing, End User Acceptance Presentations, Implementation and Post production support of BI Projects.
- Automated and improved existing manual processes; optimized the server platforms using caching, scheduling, and clustering.
- Worked with back-end database administrators to provide requirements for necessary back-end tables.
- Worked extensively in both ad-hoc and standard reporting environments and was involved in creating reports scalable to large data volumes.
- Managed projects through the entire systems development life cycle, including creating timely, thorough status reports and leading status meetings.
- Handled incidents and provided solutions.
- Migrated existing UNIX and SQL scripts to DataStage jobs.
- Developed new DataStage jobs to extract data from source systems, replacing existing Java code and SQL scripts.
- Improved the performance of existing SQL scripts using DataStage parallel processing.
- Replaced crontab-scheduled jobs with a combination of DataStage and TWS.
- Improved the performance of incremental processes by replacing SQL scripts with the Change Capture stage in DataStage.
- Developed various parallel and sequence jobs, created done files for completed jobs, and sent them to the MicroStrategy server to trigger reports via event triggers.
- Involved in various back end ETL development tasks and Production Support.
- Fixed data integrity issues on the ETL side and provided ad-hoc solutions.
- Developed technical components such as PL/SQL procedures, UNIX shell scripts, and SQL*Loader control files to meet exact business needs.
- Involved in writing procedures, functions, triggers, packages, cursors, and views, and in loading data with the SQL*Loader utility for new and enhanced requirements.
- Worked on ad-hoc requests such as analyzing data inconsistencies and incorrectly populated data, fixing code, and improving existing processes.
- Developed Spark programs using the Scala API to compare Spark's performance with Hive and SQL.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Designed and created Hive external tables using a shared metastore (instead of Derby) with partitioning, dynamic partitioning, and bucketing.
- Imported data from the Hive data lake into Spark RDDs.
- Used Spark SQL to load data-lake data and create schema RDDs, loaded them into the C&S tenant Hive tables, and handled structured data with Spark SQL.
- Loaded data into Spark RDDs and performed in-memory computation to generate output.
- Used Spark for interactive queries, streaming-data processing, and integration with popular NoSQL databases for large data volumes.
- Involved in converting Hive/SQL into Spark transformations using Spark RDDs and Scala.
- Analyzed the SQL scripts and designed the solution to implement using Scala/Spark
- Involved in converting MapReduce programs into Spark transformations using Spark RDD in Scala
- Developed Spark scripts by using Scala shell commands as per the need
- Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster data processing (a Python/PySpark analogue is sketched after this list).
- Proficient in usage of SSIS Control Flow items (For Loop, Execute package/SQL tasks, Script task, send mail task) and SSIS Data Flow items (Conditional Split, Data Conversion, Fuzzy lookup, Fuzzy Grouping, Pivot).
- Having hands on experience in DR Processes including Log Shipping and Database Mirroring.
- Experience building SSIS packages (.dtsx) involving ETL processes: extracting data from various flat files, Excel files, and legacy systems, and loading into SQL Server.
- Hands-on experience working with SSIS for ETL processes, ensuring proper implementation of event handlers, logging, checkpoints, transactions, and package configurations.
- Created SSIS/DTS packages to copy tables, schemas and views and to extract data from Excel and Oracle.
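The Spark work described above was implemented in Scala; purely as an illustration, here is a minimal PySpark (Python) sketch of the same pattern, reading from a Hive data-lake table, transforming with Spark SQL/DataFrames, and writing to a dynamically partitioned Hive table. The database, table, and column names (datalake.clinical_claims, smart_stage.member_monthly_claims, and so on) are hypothetical placeholders, not the actual project objects.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive support so Spark SQL uses the shared metastore rather than a local Derby one.
spark = (
    SparkSession.builder
    .appName("clinical-claims-load")
    .enableHiveSupport()
    .config("hive.exec.dynamic.partition", "true")
    .config("hive.exec.dynamic.partition.mode", "nonstrict")
    .getOrCreate()
)

# Read the data-lake source table registered in the Hive metastore.
claims = spark.table("datalake.clinical_claims")

# Spark SQL / DataFrame transformation: aggregate paid amounts per member per month.
monthly = (
    claims
    .filter(F.col("status") == "PAID")
    .withColumn("load_month", F.date_format(F.col("service_dt"), "yyyy-MM"))
    .groupBy("member_id", "load_month")
    .agg(F.sum("claim_amt").alias("paid_amt"))
)

# Write into a Hive table partitioned by load_month (dynamic partitioning).
(
    monthly.write
    .mode("overwrite")
    .partitionBy("load_month")
    .saveAsTable("smart_stage.member_monthly_claims")
)

spark.stop()
```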
Confidential
PL/SQL Developer, DBA
Environment: Oracle, Unix, Informatica, MicroStrategy 9.3, VBA
Responsibilities:
- Understood the business requirements through complete analysis and prepared possible approaches for each requirement.
- Developed technical components such as PL/SQL procedures, UNIX shell scripts, and SQL*Loader control files to meet exact business needs.
- Analyzed data quality, performed impact analysis, monitored data loads, and resolved production issues.
- Ran the monthly loads into the data warehouse, monitored them, and resolved errors when a load failed.
- Involved in writing procedures, functions, triggers, packages, cursors, and views, and in loading data with the SQL*Loader utility for new and enhanced requirements.
- Worked on ad-hoc requests such as analyzing data inconsistencies and incorrectly populated data, fixing code, and improving existing processes.
- Conducted knowledge-transfer sessions for new joiners and other members developing reports on the data warehouse, and helped team members with their issues.
- Reviewed deliverables and performed unit and integration testing.
- Created the SMART data dictionary using VBA and shell scripting and automated its refresh (a Python analogue is sketched after this list).
- Mentored the team in development work.
- Worked extensively with MicroStrategy Intelligence Server, MicroStrategy Web, and MicroStrategy Administrator, covering reporting basics with MicroStrategy Desktop and MicroStrategy Web; involved in troubleshooting MicroStrategy prompt, filter, template, consolidation, and custom group objects in an enterprise data warehouse team environment.
- Worked extensively on creating Application Objects like Metrics, Compound Metrics, Filters, Custom Groups and Consolidations for Complex Reports.
- Automated and improved existing manual processes; optimized the server platforms using caching, scheduling, and clustering.
- Worked with back-end database administrators to provide requirements for necessary back-end tables.
- Worked extensively in both ad-hoc and standard reporting environments and was involved in creating reports scalable to large data volumes.
- Managed projects through the entire systems development life cycle, including creating timely, thorough status reports and leading status meetings.
- Handled incidents and provided solutions.
- Involved in various back end ETL development tasks and Production Support.
- Fixed data integrity issues on the ETL side and provided ad-hoc solutions.
- Installed, configured and maintained Oracle 11GR2 Real Application Cluster (RAC) databases including a 4 Node RAC on EXADATA.
- Installed, configured and maintained Oracle Database 12c on Oracle Sun Solaris T4-1 SPARC servers and 11g on T3-1 SPARC servers.
- Set up physical standbys (Data Guard) for several databases running 11g and 12c (CDB and non-CDB).
- Configured Data Guard Broker between the primary and standby databases; performed switchover/failover and monitored Data Guard status via the command line and GUI.
- Involved in Availability testing of several Production databases before going Live which involved Node level failover testing, Listener/Scan level failover testing and database failover testing.
- Configured and built GoldenGate Extract/Replicat processes for uni-directional and bi-directional replication on Exadata and non-Exadata servers using DBFS and ACFS storage.
- Performed GoldenGate initial loads from source to target using expdp and impdp.
- Worked with the GoldenGate Logdump utility.
- Skilled at altering the RBA for Replicat while troubleshooting GoldenGate, based on transaction type, without missing any transactions.
- Monitored Exadata; set up alerts for database, listener, and tablespace monitoring and configured ZFS backups from OEM 12c.
- Migrating existing 10g/11g databases to 12.1.0.1 using transportable tablespace method. Upgraded supported versions to 12cR1 using DBUA (direct path upgrade)
- Unplugged pluggable databases to XML files and plugged them back in; created container and pluggable databases; migrated 11g databases to 12c pluggable databases.
- Knowledge on Oracle Support Process for Technical requests using Metalink and other support procedures.
- Participated in 24 X 7 on-call rotation, off-hour production problem resolution activities.
- Excellent communication skills, with strong troubleshooting and problem solving ability.
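The SMART data dictionary noted above was built with VBA and shell scripting; the sketch below is a hedged Python analogue of the same idea, querying the standard Oracle catalog views and writing a simple data dictionary to CSV. It assumes cx_Oracle connectivity, and the host, schema, credentials, and output file are placeholders.

```python
import csv

import cx_Oracle  # assumes the Oracle client libraries are installed

DSN = cx_Oracle.makedsn("db-host", 1521, service_name="SMART")

# Catalog query against standard Oracle dictionary views.
DICTIONARY_SQL = """
    SELECT c.table_name, c.column_name, c.data_type, c.data_length,
           c.nullable, cc.comments
      FROM all_tab_columns c
      LEFT JOIN all_col_comments cc
        ON cc.owner = c.owner
       AND cc.table_name = c.table_name
       AND cc.column_name = c.column_name
     WHERE c.owner = :owner
     ORDER BY c.table_name, c.column_id
"""


def export_data_dictionary(owner, out_path):
    """Dump one schema's column-level data dictionary to a CSV file."""
    with cx_Oracle.connect("dict_user", "***", DSN) as conn:
        cursor = conn.cursor()
        cursor.execute(DICTIONARY_SQL, owner=owner)
        with open(out_path, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["TABLE", "COLUMN", "TYPE", "LENGTH", "NULLABLE", "COMMENTS"])
            writer.writerows(cursor)  # cursor iterates row tuples


if __name__ == "__main__":
    export_data_dictionary("SMART", "smart_data_dictionary.csv")
```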
Confidential
PL/SQL Developer, DBA
Environment: Oracle 11g
Responsibilities:
- Coded application programs from specifications using SQL.
- Developed complex SQL queries using joins, subqueries, and correlated subqueries.
- Preparing analysis documents for the existing code.
- Wrote procedures, functions, triggers, packages, cursors, and views, and loaded data using the SQL*Loader utility for new and enhanced requirements.
- Improved the performance of existing processes.
- Involved in writing data-migration scripts to load data successfully from the client database to the phased database.
- Wrote UNIX shell scripts to FTP data files and run SQL*Loader scripts.
- Wrote SQL*Loader scripts for importing data from flat files, CSV files, and delimited files (an illustrative Python analogue follows this list).
- Developed test plans and participated in unit testing through production implementation.
- Planning and scheduling Backups using RMAN (hot, cold and incremental backups).
- Experience in taking logical backup of the database using DATAPUMP (Expdp/Impdp).
- Installation and configuration of Recovery Catalog for multiple Databases.
- Recovery of Databases (Recovering from Media Failure, Recovering tables dropped accidentally, Recovering data files, Recovering from Block Corruption etc.)
- Providing DBA support to multiple Cluster, Non Cluster and ASM Databases in production, development and Testing Servers in UNIX & Windows Environments.
- Experience in performing database related activities using Oracle Enterprise Manager Grid (OEM Grid).
- Implementation of Flashback Technology.
- Managed Database Structures, Converted Logical data to Physical data, Storage Allocation, Table/Index segments, Rollback and undo segments, Constraints, Database Access, Roles and Privileges and Database Auditing.
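The flat-file loads above were done with SQL*Loader control files; purely as an illustration of the same delimited-file-to-table pattern (not the original control files), here is a minimal Python/cx_Oracle sketch. The file layout, staging table, and credentials are hypothetical.

```python
import csv

import cx_Oracle

# Positional binds mirror the column order in the hypothetical staging table.
INSERT_SQL = """
    INSERT INTO stg_accounts (account_id, account_name, open_dt, balance)
    VALUES (:1, :2, TO_DATE(:3, 'YYYY-MM-DD'), :4)
"""


def load_delimited_file(csv_path, conn, batch_size=1000):
    """Batch-insert a pipe-delimited file, mirroring what a SQL*Loader control file would do."""
    cursor = conn.cursor()
    batch = []
    with open(csv_path, newline="") as fh:
        reader = csv.reader(fh, delimiter="|")
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:
                cursor.executemany(INSERT_SQL, batch)
                batch.clear()
    if batch:
        cursor.executemany(INSERT_SQL, batch)
    conn.commit()


if __name__ == "__main__":
    dsn = cx_Oracle.makedsn("db-host", 1521, service_name="ORCL")
    with cx_Oracle.connect("load_user", "***", dsn) as conn:
        load_delimited_file("accounts.dat", conn)
```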