Sr Etl/big Data Developer Resume
Detroit, MI
SUMMARY:
- Over 9 years of IT experience in data warehousing using tools such as Talend, ODI, and Informatica PowerCenter, across all phases of analysis, design, development, implementation, and support of data warehousing applications. Experience with multiple Hadoop distributions/platforms (Apache, Cloudera, Hortonworks).
- Extensive experience in Big Data analytics with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig.
- Excellent knowledge of Hadoop architecture, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems.
- Extensive experience with Informatica Power Center 9.x, 8.x and 7.x hosted on UNIX, Linux and Windows platforms.
- Hands-on experience in writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
- Experienced in securing Hadoop clusters with Kerberos.
- Experience in analyzing data using Hive QL, Pig Latin, and custom MapReduce programs in Java.
- Knowledge of implementing unified data platforms using Kafka producers/consumers.
- Knowledge of writing Spark SQL scripts (a brief sketch follows this summary).
- Expertise in resolving Informatica performance tuning issues at the source, target, mapping, transformation, and session levels to improve session performance.
- Good understanding of relational database management systems such as Oracle, DB2, and SQL Server; extensively worked on data integration using Informatica for the extraction, transformation, and loading of data from various database source systems.
- Expertise in SQL and performance tuning on large-scale Oracle, Teradata, and Microsoft SQL Server.
- Linked Amazon Web Services such as S3 to Splunk Cloud for log data ingestion.
- Installed and tested AppDynamics Java APM agents to determine USPTO application baseline activity and performance.
- Advised on Amazon Web Services best practices for successful CloudFormation deployments using nested templates.
- Highly proficient in creating SQL Server reports, handling sub-reports, and writing queries to perform roll-up and drill-down operations in SSRS.
- Experience with high volume datasets from various sources like Oracle, Flat files, SQL Server and XML.
- Experienced in handling SCDs (Slowly Changing Dimensions) using Informatica.
- Expertise in troubleshooting production issues: root cause analysis to identify the problem, impact analysis to determine dependencies, and providing the resolution.
- Extensive knowledge of various Performance Tuning Techniques on Sources, Targets, Mappings and Workflows using Partitions/Parallelization and eliminating Cache Intensive Transformations.
- Experienced in creating and scheduling SQL Server Agent jobs to run ETL processes and perform data massaging.
- Worked with scheduling tools such as Autosys and Tidal; well versed in writing UNIX shell scripts.
- Good at planning, organizing, and prioritizing multiple tasks.
- Interested in learning new technologies and willing to work in changing environments.
- Strong decision-making and interpersonal skills with a results-oriented dedication to goals.
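As a brief illustration of the Spark SQL scripting noted above, the sketch below shows a minimal Spark SQL job in Java (the dominant programming language elsewhere in this resume). The table and column names (sales.orders, order_date, order_amount, sales.daily_order_totals) are hypothetical placeholders, not drawn from any specific project described here.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DailyOrderTotals {
    public static void main(String[] args) {
        // Spark session with Hive support so Hive tables are visible to Spark SQL
        SparkSession spark = SparkSession.builder()
                .appName("DailyOrderTotals")
                .enableHiveSupport()
                .getOrCreate();

        // Aggregate order amounts per day from a (hypothetical) Hive table
        Dataset<Row> daily = spark.sql(
                "SELECT order_date, SUM(order_amount) AS total_amount " +
                "FROM sales.orders GROUP BY order_date");

        // Persist the result as a Hive table for downstream reporting
        daily.write().mode("overwrite").saveAsTable("sales.daily_order_totals");

        spark.stop();
    }
}
```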
TECHNICAL SKILLS:
ETL Tools: Informatica 9.5.1/8.x/7.x, MuleSoft, Talend
Cloud: Salesforce, AWS, S3 (Amazon Web Services); knowledge of SAP and MuleSoft
Databases: Oracle …, DB2, SQL Server 2005, Netezza, Teradata (FastLoad, MultiLoad, TPump and FastExport)
Data Modeling Tools: Erwin, Toad Data Modeler, MS Visio
Scripting Languages: UNIX Shell Scripting, Windows Batch scripting.
Operating Systems: Windows, LINUX, UNIX
Schedulers: Control-M, Autosys, Tidal
WORK EXPERIENCE:
Sr ETL/Big Data Developer
Confidential, Detroit, MI
Responsibilities:
- Worked closely with business analysts, architects, and business users.
- Responsible for coordinating with business analysts and users to understand business and functional needs and translate them into an ETL design document.
- Developed various Informatica mappings and mapplets to load data into ODS target tables using transformations such as Source Qualifier, Joiner, Router, Sorter, Aggregator, Connected and Unconnected Lookup, Update Strategy, Expression, Stored Procedure, Java, Normalizer, XML, and Sequence Generator.
- Used the Spark Streaming API to consume data from a Kafka source and processed the data with core Spark.
- Developed and deployed Hive UDFs written in Java for encrypting customer IDs, creating item image URLs, etc. (a brief sketch appears after this list).
- Imported and exported data into HDFS and Hive using Sqoop.
- Analyzed data with Hive and Pig.
- Working knowledge of RESTful APIs such as Elasticsearch.
- Wrote Pig scripts to process the data.
- Extracted data from MySQL and AWS Redshift into HDFS using Sqoop.
- Used HiveQL to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Worked on batch processing, scheduling, and shell scripting.
- Worked with the DBA team to fix performance issues in ETL programs.
- Used a wide variety of Talend components across job designs; a few to mention: tJava, tOracle, tXMLMap, delimited file components, tLogRow, and tLogback.
- Worked on Joblets (reusable code) and Java routines in Talend.
- Implemented a Talend POC to extract data from the Salesforce API as XML objects and .csv files and load it into a SQL Server database.
- Worked closely with the QA team during test cycles to resolve issues and fix bugs.
- Created mappings using Mapping Designer to load data from various sources such as Oracle, flat files, MS SQL Server, and XML.
- Created SSRS reports from various sources such as SQL Server R2 and Analysis Services cubes.
- Designed custom components and embedded them in Talend Studio.
- Provided on-call support when the project was deployed to subsequent phases.
- Used the Talend Admin Console Job Conductor to schedule ETL jobs on a daily, weekly, monthly, and yearly basis (cron triggers).
- Deployed SSRS reports across multiple environments, including test and production, and supported environments including Reporting Services Report Manager.
- Involved in Performance Tuning at various levels including Target, Source, and Mapping.
- Provided cluster coordination services using ZooKeeper.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Responsible for managing data coming from different sources.
- Monitored running MapReduce programs on the cluster.
- Responsible for loading data from UNIX file systems to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Performed ETL data cleansing, integration, and transformation of data from disparate sources using Pig.
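The Hive UDF work mentioned above can be illustrated with a minimal sketch. The package, class, and function names are hypothetical, and a SHA-256 digest stands in for whatever encryption routine was actually used, so this shows the UDF mechanism rather than any production logic.

```java
package com.example.hive; // hypothetical package

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Classic Hive UDF: extend UDF and expose an evaluate() method.
// Registered in Hive with something like:
//   ADD JAR mask_udf.jar;
//   CREATE TEMPORARY FUNCTION mask_customer_id AS 'com.example.hive.MaskCustomerId';
public class MaskCustomerId extends UDF {

    public Text evaluate(Text customerId) throws Exception {
        if (customerId == null) {
            return null; // preserve NULLs so joins and filters behave as expected
        }
        // Hex-encoded SHA-256 digest of the customer id (stand-in for encryption)
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(customerId.toString().getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return new Text(hex.toString());
    }
}
```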
Environment: Informatica Power Center 9.5.1, Oracle 11g & Teradata, SQL Developer, SSRS, Flat files, Control-M, UNIX, Windows.
ETL/ Big Data Consultant
Confidential, Austin, TX
Responsibilities:
- Involved in understanding business requirements, discussing them with business analysts, analyzing the requirements, and preparing business rules.
- Involved in importing and exporting data (MySQL, Oracle, CSV, and text files) from local/external file systems and RDBMS into HDFS (a brief sketch follows this list); loaded log data into HDFS using Flume.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Designed a data warehouse using Hive, created and managed Hive tables in Hadoop.
- Involved in requirements gathering, design, development, and testing.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and running multiple Hive and Pig jobs.
- Implemented Partitioning and Bucketing in Hive.
- Developed Hive UDFs (user-defined functions) to apply customized business logic in place of complex HiveQL.
- Created and maintained Technical documentation for executing Hive queries and Pig Scripts.
- Worked with NoSQL database Cassandra to create tables and store data.
- Supported MapReduce programs running on the cluster.
- Gained experience in setting up a Hadoop cluster.
- Involved in running loads to the data warehouse and data marts across different environments.
- Responsible for definition, development, and testing of processes/programs necessary to extract data from the client's operational databases, transform and cleanse the data, and load it into data marts.
- Used the update strategy to effectively migrate data from source to target. Created mappings and Mapplets using various transformations of Lookup, Aggregator, Expression, Stored procedure, Sequence Generator, Router, Filter, and Update Strategy.
- Extensively used PL/SQL programming in backend and front-end functions, procedures, packages to implement business rules.
- Created database partitions and materialized views to improve performance.
- Created, configured, and scheduled loads for sessions and batches of different mappings using Workflow Manager and UNIX scripts.
- Analyzed and debugged production issues and provided quick turnaround.
- Interacted with end users to identify key dimensions and quantitative metrics useful in deciding on business solutions.
- Monitored and analyzed SQL Server and application logs.
- Worked extensively on performance tuning of programs, ETL procedures, and processes.
- Interacted with the offshore team daily on tickets/issues and followed up with them.
- Organized data in reports using filters, sorting, and ranking, and highlighted data with alerts.
- Used UNIX commands and shell scripting to interact with the server, move flat files, and load files onto the server.
- Responsible for file archival using UNIX shell scripting.
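To make the HDFS loading bullets above a bit more concrete, here is a minimal sketch using the Hadoop FileSystem API in Java. The local and HDFS paths are hypothetical placeholders; in practice a step like this would typically be wrapped in a UNIX shell script and driven by the scheduler.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileLoader {
    public static void main(String[] args) throws Exception {
        // Hypothetical defaults; real runs would pass the paths in from a shell wrapper
        String localFile = args.length > 0 ? args[0] : "/data/incoming/orders.csv";
        String hdfsDir   = args.length > 1 ? args[1] : "/user/etl/landing/orders/";

        // Picks up core-site.xml / hdfs-site.xml from the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Copy the local flat file into the HDFS landing directory
        fs.copyFromLocalFile(new Path(localFile), new Path(hdfsDir));
        fs.close();
    }
}
```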
Environment: Informatica PowerCenter 9.1.0, TOAD, Oracle 10g/9i, PL/SQL, MS SQL Server, SSRS, UNIX.
Informatica/OBIEE Consultant
Confidential, Scranton, PA
Responsibilities:
- Participated in developing the ETL design using Informatica; used transformations such as Aggregator, Update Strategy, Filter, Lookup, and Sorter.
- Developed Aggregator, Joiner, Router, Lookup, and Update Strategy transformation logic.
- Designed and developed stored procedures, triggers, cursors, functions, and packages.
- Created test data, verified workflows and sessions, and monitored sessions using Informatica Workflow Monitor.
- Tuned Informatica mappings and sessions for optimum performance.
- Used Informatica features to implement Type I and Type II changes in slowly changing dimension (SCD) tables.
- Configured DAC to run customized workflows and created custom tasks to run IVR workflows.
- Used Debugger to test the data flow and fix the mappings.
- Worked closely with the business in developing Informatica mappings/mapplets and workflows with worklets and tasks, using various transformations (Update Strategy, Filter, Router, Expression, Stored Procedure, Lookup, and Aggregator) for ETL of data from multiple sources to the data warehouse.
- Extensively used almost all Informatica transformations, including complex Lookups, Stored Procedures, Update Strategy, and others.
- Customized pre-built mappings and designed new mappings and workflows on the ETL front using Informatica PowerCenter.
- Created workflows for full load and incremental loads to be used by DAC.
- Configured DAC and created containers, subject areas, and tasks.
- Used Informatica Power Center Workflow manager to create sessions, batches to run with the logic embedded in the mappings.
- Extensively used Lookups, Aggregators, Rank transformations, stored procedures/functions, SQL overrides in Lookups, and source filters in Source Qualifiers, and managed data flow into multiple targets using Routers.
- Worked on building the OBIEE repository across three layers: the Physical layer, the Business Model and Mapping layer, and the Presentation layer.
- Created Dashboards, Answers and iBots for various needs of Business.
- Set up physical joins in the Physical layer and logical joins in the Logical layer.
- Generated dimension hierarchies and level-based measures in the BMM layer based on the requirements.
- Created transformations and mappings using Informatica PowerCenter v9 to move data from multiple sources into targets.
- Was involved in Unit, Integration and System testing and performed data validation for the reports generated using Informatica and OBIEE.
- Created ETL transformations such as Lookup, Joiner, Rank, and Source Qualifier in the Informatica v9 Designer.
- Developed Calculated Measures and assigned Aggregation levels based on Dimension Hierarchies.
- Used the expression builder utility to create new logical columns in the BMM layer and created various calculation measures.
- Worked on the Time Series Wizard and implemented time comparison measures.
- Worked extensively on Oracle BI Publisher and provided the capability to download the reports in the user defined formats.
- Created iBots and Delivers to send Alert messages to subscribed committee contacts especially during Month end Reconciliation.
- Customized Liquidity, Financial structure, GL Balance, Cash flow OOB (Out of the Box) dashboard pages in General Ledger (GL).
- Designed the data model for the Project Cost Accounting (PCA) to Fixed Assets (FA) drill-down, which allows cross-functional reporting to meet the client's tax reporting needs.
- Customized Payments Due, Payment Performance, Customer Report dashboard pages in Account Receivables (AR).
- Customized Dashboard pages like AP Balance, Overview, Payments Due, Supplier Details and Invoice Details in Payables Dashboard.
- Customized Dashboard pages like GL Balance, Cash Flow and Balance Sheet in General Ledger Dashboard.
- Developed Interactive Dashboards using Reports with different Views (Drill-Down, guided navigation, Pivot Table, Chart, View Selector, Column Selector, dashboard and page prompts) using Oracle Presentation Services. Worked on Filters for the prompts.
- Involved in creating the Dimensional Hierarchies, Level based measures and creating the Navigations from one report to other pages.
- Involved in creating report and dashboard prompts.
- Managed Security privileges for each Subject area and Dashboards according to user requirements.
- Used the Catalog Manager and maintained the Analytics web catalog to manage dashboards & answers.
Environment: Oracle 10g/11g, Informatica Power Center 8.6, DAC 10g, OBIEE 10g/11g, BI Apps 7.9.6.1 (Oracle Service Analytics (Self Service, Contracts and Assets Modules), Oracle Financial Analytics (Fixed Assets, Accounts Receivable and General Ledger Modules)).
Business Objects Developer
Confidential
Responsibilities:
- Involved in analysis and development of the data warehouse.
- Translation of Business Processes into Informatica mappings for building Data Marts.
- Analyzed business needs, created and developed new functionality to meet real time data integration that facilitated decision making.
- Involved in gathering information from end users and analyzing the data to develop ER diagrams, data dictionaries, and flow charts.
- Actively participated in the requirement analysis, database design and modeling.
- Designed universes and classes, checked cardinalities, and detected loops using Business Objects Designer.
- Created and configured database for development and production environments in SQL Server.
- Created tables in SQL Server 2000.
- Developed stored procedures and database triggers using SQL Server.
- Performed database and log backups and restorations, defined backup strategies, and scheduled backups; backed up system and user databases and restored them when necessary.
- Loaded the SQL Server database using Bulk Export and Import.
- Created the database for the production environment in SQL Server.
Environment: SQL Server 2000, Business Objects, Crystal Reports, Windows NT.