
Sr. ETL/Big Data Developer/Support Resume


San Jose, CA

PROFESSIONAL SUMMARY:

  • 9 years of professional experience in IT in Analysis, Design, Development, Testing, Documentation, Deployment, Integration, and Maintenance of web-based and Client/Server applications using Java and Big Data technologies, including 6+ years of experience with the Informatica and Talend ETL data warehousing tools, covering mapping creation, design, analysis, implementation, and support of application software per client requirements.
  • Involved in full Life Cycle Development including System Analysis, Design, Data Modeling, Implementation, and Support of various OLTP, Data Warehousing, and OLAP applications.
  • 3+ years of hands-on experience working with Apache Hadoop ecosystem components such as MapReduce, Sqoop, Spark, Scala, Flume, NiFi, Pig, Hive, HBase, Oozie, and Impala.
  • Extensive experience with Informatica Power Center 9.x/8.x/7.x and Power Mart 6.x/5.x to carry out the Extraction, Transformation, and Loading process.
  • Excellent understanding/knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, Resource Manager, MapReduce, and Spark.
  • Good knowledge in using Apache NiFi to automate the data movement between different Hadoop systems.
  • Experience in managing and reviewing Hadoop log files.
  • Extensively worked on Informatica tools: Admin Console, Repository Manager, Designer, Workflow Manager, and Workflow Monitor.
  • Extensively used ETL methodology for Data Profiling, Data Migration, Extraction, Transformation, and Loading using Talend, and designed data conversions from a wide variety of source systems including Oracle, DB2, SQL Server, Teradata, and Hive, as well as non-relational sources such as flat files and XML files.
  • Experience in importing and exporting the data using Sqoop from HDFS to Relational Database systems and vice-versa.
  • Have clear understanding of Data Warehousing and BI concepts with emphasis on ETL and life cycle development using Power Center, Repository Manager, Designer, Workflow Manager and Workflow Monitor.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Good understanding of NoSQL databases and hands on experience with Apache HBase.
  • Experience in managing the Hadoop infrastructure with Cloudera Manager.
  • Data Modeler with strong Conceptual and Logical Data Modeling skills, with experience in JAD sessions for requirements gathering, creating data mapping documents, and writing functional specifications and queries.
  • Vast experience in designing and developing complex mappings using varied transformation logic such as Unconnected and Connected Lookup, Source Qualifier, Joiner, Update Strategy, Rank, Expression, Aggregator, and Sequence Generator transformations.
  • Strong skills in extracting data from SAP R/3 tables using Informatica Power Connect.
  • Used Informatica mapping variables/parameters and session variables where necessary.
  • Actively involved in Performance Tuning, Error Handling, and Production Support.
  • Experience in implementing update strategies, incremental loads and change data capture.
  • Experience in conversion scripts using SQL, PL/SQL, stored procedures, functions, and packages to migrate data from SQL Server databases to Oracle databases.
  • Expertise in OLAP/OLTP system study, analysis, and E-R Modeling, developing database schemas such as Star Schema and Snowflake Schema, and Conforming and Slowly Changing Dimensions used in relational, dimensional, and multidimensional modeling (a minimal SQL sketch of a Type 2 slowly changing dimension load appears after this list).
  • Strong in UNIX Shell scripting. Developed UNIX scripts using PMCMD utility and scheduled ETL load using utilities like TWS, Maestro, Autosys and Control M.
  • Experience in coding using Oracle 11g/10g/9i/8i, DB2, SQL Server 2008, SQL, and PL/SQL procedures/functions, triggers, and exceptions. Good experience with Relational Database Concepts and Entity-Relationship Diagrams.
  • Good experience as an Informatica administrator in creating domains, repositories, and folders.
  • Development experience across the business areas such as Finance, Retail, Travel, Insurance and Healthcare.
  • Excellent communication and interpersonal skills. Ability to work effectively as a team member as well as an individual.
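
As an illustration of the slowly changing dimension handling referenced above, here is a minimal SQL sketch of a Type 2 load; the table, column, and sequence names (customer_dim, customer_stg, customer_dim_seq) are hypothetical placeholders rather than objects from any specific engagement.

    -- Expire the current version of any dimension row whose tracked attributes changed in staging
    UPDATE customer_dim d
       SET d.current_flag = 'N',
           d.effective_end_date = TRUNC(SYSDATE) - 1
     WHERE d.current_flag = 'Y'
       AND EXISTS (SELECT 1
                     FROM customer_stg s
                    WHERE s.customer_id = d.customer_id
                      AND (s.customer_name <> d.customer_name
                           OR s.customer_segment <> d.customer_segment));

    -- Insert a new current version for changed and brand-new customers
    INSERT INTO customer_dim
           (customer_key, customer_id, customer_name, customer_segment,
            effective_start_date, effective_end_date, current_flag)
    SELECT customer_dim_seq.NEXTVAL, s.customer_id, s.customer_name, s.customer_segment,
           TRUNC(SYSDATE), DATE '9999-12-31', 'Y'
      FROM customer_stg s
     WHERE NOT EXISTS (SELECT 1
                         FROM customer_dim d
                        WHERE d.customer_id = s.customer_id
                          AND d.current_flag = 'Y'
                          AND d.customer_name = s.customer_name
                          AND d.customer_segment = s.customer_segment);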

TECHNICAL SKILLS:

BIG DATA TECHNOLOGIES: Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Kafka, Pulsar, Zookeeper, Oozie, Hue, HBase, Spark Core, Spark SQL, Scala, Cloudera, Impala, Presto, Elasticsearch, Apache NiFi, Data Highway

ETL: Talend Studio 6.2.1, Informatica Power Center 10.1/9.5/9.1/8.6.1/8.5.1/8.1/7.x (Repository Manager, Designer, Workflow Monitor, Workflow Manager), Power Mart 6.2/6.1/5.1.x/4.7, Oracle Data Integrator 10.1, Power Connect for DB2/Siebel/SAP, SSIS, Data Quality

Data Modeling: Physical Modeling, Logical Modeling, Relational Modeling, Dimensional Data Modeling, ER Diagrams, Erwin 4.0/3.5.2/2.x

Databases: Oracle 11g/10g/9i/8i, DB2, MS SQL Server 2000/2005/2008/2014, GreenPlum 4.3, Teradata V2R6, SQL*Loader

Scheduling Tools: Tidal, Autosys, Tivoli Workload Scheduler, Control M

Environment: Windows 95/98/2000/NT, UNIX, MS-DOS, SQL*Plus

Languages: SQL, PL/SQL, T-SQL, Shell scripts, Java, Scala

PROFESSIONAL EXPERIENCE:

Confidential, San Jose, CA

Sr. ETL/Big Data Developer/Support

Responsibilities:

  • Effectively handled various enhancements and resolved various production problems.
  • Involved in bug fixing and resolving issues.
  • Used the web-based ETL tools Subex ROC GUI and Moneta to process files and run jobs to load the data.
  • Imported data from different sources such as HDFS and HBase into Hive tables.
  • Performed real time analysis on the incoming data.
  • Performed procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
  • Developed Spark code using Spark-SQL/Streaming for faster testing and processing of data.
  • Automated the process for extraction of data from warehouses and weblogs by developing work-flows and coordinator jobs in OOZIE.
  • Worked on Big Data Integration and Analytics based on Hadoop, Spark, Kafka, NiFi ecosystem and web Methods technologies.
  • Worked on migrating MapReduce Java programs into Spark transformations using Spark and Scala.
  • Worked on NiFi data pipelines to process large data sets and configured lookups for data validation and integrity.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark-SQL, Data Frames, and Pair RDDs.
  • Processed HDFS data and created external tables using Hive and developed scripts to ingest and repair tables that can be reused across the project.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Worked with different file formats such as JSON, Avro, and Parquet and compression techniques such as Snappy within the NiFi ecosystem.
  • Created Hive tables on top of HBase using the HBase storage handler for effective OLAP analysis (a HiveQL sketch follows this list).
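
As a minimal illustration of the external-table and HBase storage handler work described above, the HiveQL sketch below uses hypothetical table, column, and path names (cdr_events, /data/raw/cdr_events); it shows the general approach rather than project code.

    -- External Hive table over raw data already ingested into HDFS (path and columns are placeholders)
    CREATE EXTERNAL TABLE IF NOT EXISTS cdr_events (
        event_id     STRING,
        subscriber   STRING,
        event_ts     TIMESTAMP,
        duration_sec INT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/raw/cdr_events';

    -- Hive table mapped onto an existing HBase table through the HBase storage handler
    CREATE EXTERNAL TABLE cdr_events_hbase (
        rowkey       STRING,
        subscriber   STRING,
        duration_sec INT
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:subscriber,cf:duration_sec')
    TBLPROPERTIES ('hbase.table.name' = 'cdr_events');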

Environment: Spark Core, Spark SQL, Scala, Hortonworks, Moneta, Subex ROC 5.0, ODI 10.1.3.6.2, Oracle 12c, Teradata 13, UNIX, HDFS, Hive, MapReduce, Sqoop, Hbase, Apache NiFi.

Confidential, Southlake TX

Sr. ETL/Big Data Developer

Responsibilities:

  • Analyzed the requirements, framed the business logic, and implemented it using Talend; involved in ETL design and documentation.
  • Worked on Talend components such as tReplace, tMap, tSort, tFilterColumn, tFilterRow, tJava, tJavaRow, tConvertType, tOracleInput, tOracleOutput, and many more.
  • Experience in development of ELT and ETL Mapping in Oracle Data Integrator.
  • Experience in importing and exporting terabytes of data between HDFS and relational database systems using Sqoop and ingesting them into HBase.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and Ganglia; NoSQL databases such as HBase, Cassandra, and BigTable; and administrative tasks such as installing Hadoop, commissioning and decommissioning nodes, and managing ecosystem components such as Flume, Oozie, Hive, and Pig.
  • Started using Apache NiFi to copy data from the local file system to HDFS.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Writing shell scripts to monitor the health check of Hadoop daemon services and responding accordingly to any warning or failure conditions.
  • Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Importing data from Oracle to HDFS & Hive for analytical purpose.
  • Loaded data from HDFS into the HP Vertica database using the COPY command with a clustered approach (a SQL sketch follows this list).
  • Created HBase tables to store various data formats of data coming from different sources.
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDD/MapReduce in Spark 1.6 for data aggregation and queries, and wrote data back into the OLTP system through Sqoop.
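
The Vertica load mentioned above could look roughly like the sketch below, assuming the Vertica HDFS connector is installed; the schema, table, URL, and options are hypothetical, and the exact SOURCE Hdfs() invocation varies by Vertica and connector version.

    -- Bulk-load delimited files from HDFS into Vertica via the HDFS connector (all names are placeholders)
    COPY edw.daily_usage
    SOURCE Hdfs(url = 'http://namenode:50070/webhdfs/v1/data/export/daily_usage/*',
                username = 'etl_user')
    DELIMITER '|'
    NULL ''
    DIRECT
    REJECTED DATA '/tmp/daily_usage_rejects.txt';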

Environment: Hadoop YARN, Spark Core, Spark SQL, Scala, Cloudera, Talend Studio 6.2.1, ODI 10.1.3.6.2, Oracle 12c, UNIX, HDFS, Hive, NiFi, MapReduce, Sqoop, Hbase, Vertica 6.1.3

Confidential, Philadelphia, PA

Sr. ETL/Big Data Developer/Support

Responsibilities:

  • Involved in creating Hive tables, loading the data, and writing Hive queries that run internally in MapReduce and Spark.
  • Responsible for developing data pipelines using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop.
  • Generated reports in Tableau through Impala with live connectivity.
  • Extensively used Informatica Workflow Manager to run the workflows/mappings and monitored the session logs in Informatica Workflow Monitor.
  • Verified session logs to identify errors that occurred during ETL execution.
  • Created test cases and a traceability matrix based on the mapping document and requirements.
  • Wrote several complex SQL queries for data verification and data quality checks (a sketch follows this list).
  • Reviewed the test cases written against the Change Request document; testing was performed based on Change Requests and Defect Requests.
  • Developed test plan with testing scenarios from the end user perspective for User Acceptance Testing (UAT).
  • Tested access privileges for several users based on their roles for web page logins.
  • Involved in testing PL/SQL programs, PL/SQL batch processes, UNIX Shell Scripts.
  • Prepared system test results after test case execution.
  • Used Ab Initio GDE and Co-op to run the graphs and monitor them.
  • Tested the ETL Informatica mappings and other ETL Processes (DW Testing).
  • Effectively coordinated with the development team for closing a defect.
  • Prepared Test Scenarios by creating Mock data based on the different test cases.
  • Performed defect tracking and reporting with a strong emphasis on root-cause analysis to determine where and why defects were introduced in the development process.
  • Debugging and Scheduling ETL Jobs/Mappings and monitoring error logs.
  • Tested reconciliation and incremental ETL loads for the project.
  • Worked on the production support team for EDW ETL execution in 24x7 shifts.
  • Wrote complex SQL queries for extracting data from multiple tables and multiple databases.
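
A minimal sketch of the kind of verification queries used for this ETL testing; stg.orders_src and dw.orders_fact are hypothetical source/target tables, and MINUS assumes an Oracle or Teradata style dialect.

    -- Row-count reconciliation between a source extract and its warehouse target
    SELECT 'SRC' AS side, COUNT(*) AS row_cnt FROM stg.orders_src
    UNION ALL
    SELECT 'TGT', COUNT(*) FROM dw.orders_fact;

    -- Column-level check: rows present in the source extract but missing from the target
    SELECT order_id, order_date, order_amount
      FROM stg.orders_src
    MINUS
    SELECT order_id, order_date, order_amount
      FROM dw.orders_fact;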

Environment: Informatica Power Center 10.1, Ab Initio GDE 3.2.1, Oracle 11g, Teradata 12, Flat Files, Power Exchange, UNIX, PL/SQL Developer, Control M 8.0.1, HDFS, Hive, Spark, Scala, MapReduce, Impala, Tableau

Confidential, Kansas City, MO

Sr. ETL Developer

Responsibilities:

  • Interacted with the users, Business Analysts for collecting, understanding the business requirements.
  • Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Experienced in managing and reviewing Hadoop log files.
  • Used Pig to do transformations, event joins, filter boot traffic and some pre-aggregations before storing the data onto HDFS.
  • Extracted and updated data in MongoDB using the MongoDB import and export command-line utilities. Designed and developed Informatica workflows/sessions to extract, transform, and load the data into targets.
  • Created Mapplets, reusable transformations and used them in different mappings.
  • Developed mapping parameters and variables to support SQL override.
  • Used Informatica Power Center 9.6 for extraction, transformation and load (ETL) of data in the data warehouse.
  • Extensively used Transformations like Router, Aggregator, Normalizer, Joiner, Expression and Lookup, Update strategy and Sequence generator and Stored Procedure.
  • Developed complex mappings in Informatica to load the data from various sources.
  • Implemented performance tuning logic on targets, sources, mappings, sessions to provide maximum efficiency and performance.
  • Parameterized the mappings and increased the re-usability.
  • Used Informatica Power Center Workflow manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
  • Created procedures to truncate data in the target before the session run.
  • Extensively used Toad utility for executing SQL scripts and worked on SQL for enhancing the performance of the conversion mapping.
  • Used PL/SQL procedures from Informatica mappings to truncate data in target tables at run time (a sketch follows this list).
  • Extensively used Informatica debugger to figure out the problems in mapping. Also involved in troubleshooting existing ETL bugs.
  • Created a list of the inconsistencies in the data load on the client side so as to review and correct the issues on their side.
  • Wrote documentation describing program development, logic, coding, testing, changes, and corrections.
  • Created Test cases for the mappings developed and then created integration Testing Document.
  • Followed Informatica recommendations, methodologies and best practices.
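
The pre-load truncate logic described above can be captured in a small PL/SQL procedure like the sketch below (the name prc_truncate_target is hypothetical), invoked before the session run; because TRUNCATE is DDL, it has to be issued through dynamic SQL.

    -- Truncate a named target table before the load; the procedure name and validation are illustrative
    CREATE OR REPLACE PROCEDURE prc_truncate_target (p_table_name IN VARCHAR2) AS
    BEGIN
        -- DBMS_ASSERT guards against malformed table names being concatenated into the DDL
        EXECUTE IMMEDIATE 'TRUNCATE TABLE ' || DBMS_ASSERT.SIMPLE_SQL_NAME(p_table_name);
    END prc_truncate_target;
    /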

Environment: Informatica Power Center 9.6.1, Oracle 11g, Flat Files, Power Exchange, UNIX, PL/SQL Developer, Pig, HBase, Oozie, Sqoop.

Confidential, Minneapolis, MN

Sr. ETL/Big Data Developer

Responsibilities:

  • Worked on analyzing Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase and Sqoop.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce and Spark jobs as well as Pig and Hive scripts for data cleaning and pre-processing.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Involved in creating Hive tables, loading data, and writing queries that run internally as MapReduce jobs.
  • Involved in processing ingested raw data using MapReduce, Apache Pig and HBase.
  • Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and data already existing in HDFS (an equivalent HiveQL sketch follows this list).
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Extensively used Informatica 9.6.1 to extract, transform and load data from multiple data sources, mainly involved in ETL design & development.
  • Designed and developed mappings, defined workflows and tasks, monitored sessions, exported and imported mappings and workflows, backups, and recovery.
  • Involved in detailed analysis of requirements and creation of requirement specifications and functional specifications.
  • Handling the user queries and tickets (ITSM) for Informatica applications.
  • Mainly involved in the preparation of low-level designs (mapping documents) by understanding the CPMS system.
  • Responsible for developing the ETL logic and the data maps for loading the tables.
  • Responsible for migrating from Development to staging and to Production (deployments).
  • Extracting and loading of data from flat file, SQL Server sources to SQL Server database (target warehouse) using transformations in the mappings.
  • Handled reloading of application runs and ensured the reload process completed.
  • Analyze functional requirements provided by Business Analysts for the code changes.
  • Create workflows based on Dependencies in Informatica for the code changes.
  • Unit test the data mappings and workflows.
  • Validate the data loaded into database.
  • Provide the status report for all the application monitoring, tracking status.
  • Execute the Test cases for the code changes.
  • Provided extensive support in UAT (user acceptance testing) and deployment of mappings.
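
The delta processing above was implemented in Pig; as a rough HiveQL equivalent of the same newly-arrived-versus-existing comparison, the sketch below uses hypothetical tables and columns (cust_incoming, cust_existing, cust_delta).

    -- Keep only records from the new batch that are brand new or changed relative to existing data
    INSERT OVERWRITE TABLE cust_delta
    SELECT n.cust_id, n.cust_name, n.cust_status, n.load_dt
      FROM cust_incoming n
      LEFT OUTER JOIN cust_existing e
        ON n.cust_id = e.cust_id
     WHERE e.cust_id IS NULL                -- record not seen before
        OR n.cust_name   <> e.cust_name     -- attribute changed
        OR n.cust_status <> e.cust_status;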

Environment: Hadoop, Hive, Spark, MapReduce, Pig, MongoDB, Oozie, Sqoop, Kafka, Cloudera, HBase, HDFS, Python, Solr, Zookeeper, Cassandra, DynamoDB, Informatica Power Center 9.6.1, SQL Server 2014, Flat Files, IDQ, GreenPlum 4.3, Quality Center, Autosys.

Confidential, Lindon, UT

Sr. ETL Developer

Responsibilities:

  • Responsible for the life-cycle development, implementation, deployment, quality assurance & operational integrity of multiple production projects and environments that support key business areas.
  • Experience in Migration as per change control, Configuration and creating domains, repositories and folders.
  • Participated in the Design of Data mart using Star Schema including fact tables.
  • Designed and developed the Informatica workflows/sessions to extract, transform, and load the data into targets.
  • Developed procedures to populate the customer data warehouse with transaction data, cycle and monthly summary data, and historical data. Worked through various Informatica tuning issues and fine-tuned transformations to make them more efficient in terms of performance.
  • Manage the Informatica and Oracle data transformation environments and processes including development, quality assurance, production and failover.
  • Identified process/procedure/control improvements and implemented them when/where appropriate. Ensured project testing and development quality assurance met approved standards. Maintained and fostered effective relationships with staff, clients, developers, and others.
  • Involved in writing ETL code using PL/SQL to meet requirements for extraction, transformation, cleansing, and loading of data from source to target data structures (a sketch follows this list).
  • Responsible for time estimation for projects and production support for Informatica jobs; took care of performance and connectivity issues.
  • Development of UNIX shell scripts for job scheduling and Informatica pre and post mapping session processes.
  • Development of Informatica mappings to load data to various Staging, Dimensions and other tables in EDW and three different data marts and Informatica administration and mappings code review.
  • Developed Workflow process for the complete automation of the GL process.
  • Responsible for setting up of production job scripts for ETL batch Scheduling using TWS.
  • Supported the application in the Production environment by monitoring the ETL process every day during the nightly loads.
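
As an illustration of the PL/SQL summary-load work described above, here is a minimal sketch of a monthly summary procedure; dw_monthly_summary, dw_transaction_fact, and their columns are hypothetical placeholders.

    -- Rebuild one month of summary data from the transaction fact (names are illustrative)
    CREATE OR REPLACE PROCEDURE prc_load_monthly_summary (p_period IN DATE) AS
    BEGIN
        DELETE FROM dw_monthly_summary
         WHERE summary_month = TRUNC(p_period, 'MM');

        INSERT INTO dw_monthly_summary (summary_month, account_id, txn_count, txn_amount)
        SELECT TRUNC(p_period, 'MM'), t.account_id, COUNT(*), SUM(t.txn_amount)
          FROM dw_transaction_fact t
         WHERE t.txn_date >= TRUNC(p_period, 'MM')
           AND t.txn_date <  ADD_MONTHS(TRUNC(p_period, 'MM'), 1)
         GROUP BY t.account_id;

        COMMIT;
    END prc_load_monthly_summary;
    /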

Environment: Informatica Power Center 9.5, Oracle 10g, DB2, Flat Files, Windows XP SP2, SQL Server 2008, PL/SQL, TWS 9.2.

Confidential, Detroit, MI

Sr. Informatica Developer

Responsibilities:

  • Developed code for data movement between MPS (Multifamily Processing System) and the target systems EDW (Enterprise Data Warehouse) and EDM (Enterprise Data Mart).
  • Used CSV files, XML/XSD files, and Teradata as sources and Oracle 11g as the target database.
  • Extensively performed XML shredding to move data from the XML files into the Oracle target database (a SQL sketch follows this list).
  • Used ETL (SSIS) to develop jobs for extracting, cleaning, transforming and loading data into data warehouse.
  • Extensively used SSIS transformations such as Lookup, Derived column, Data conversion, Aggregate, Conditional split, SQL task, Script task and Send Mail task etc.
  • Experience in developing extract, transform, and load (ETL) processes and in maintaining and supporting the enterprise data warehouse system and corresponding marts.
  • Prepared low-level technical design documents, participated in the build/review of BTEQ, FastExport, MultiLoad, and FastLoad scripts, and reviewed unit test plans and system test cases.
  • Worked on Source to Target Mapping document.
  • Extensively worked with Informatica components Source Analyzer, Target Designer, Repository Manager, Workflow Manager/Monitor, and Admin Console.
  • Responsible for Informatica repository backups.
  • Extensively worked on Autosys to schedule Informatica mappings.
  • Responsible for shell scripting to run Informatica mapping using pmcmd command.
  • Responsible for check in /check out scripts and maintaining versions using Clearcase.
  • Responsible for Unit testing and writing test cases for ETL mappings.
  • Involved in the performance tuning of source & target, transformation and session optimization.
  • Worked closely with Project Manager, Business users and testing team to strictly meet the deadline.
  • Involved in the performance tuning by determining bottlenecks at various points like targets, sources, mappings, sessions or system.
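
One common way to express the XML shredding described above in Oracle SQL is with XMLTABLE, as in the sketch below; the staging table, XPath, and columns (xml_inbound, loan_stage, /Loans/Loan) are hypothetical and stand in for the project's actual structures.

    -- Shred an XMLTYPE payload into relational rows (all names and paths are placeholders)
    INSERT INTO loan_stage (loan_id, property_state, loan_amount)
    SELECT x.loan_id, x.property_state, x.loan_amount
      FROM xml_inbound t,
           XMLTABLE('/Loans/Loan'
                    PASSING t.xml_payload
                    COLUMNS loan_id        VARCHAR2(20) PATH 'LoanId',
                            property_state VARCHAR2(2)  PATH 'PropertyState',
                            loan_amount    NUMBER       PATH 'LoanAmount') x;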

Environment: ETL Informatica Power Center 8.6.1, SQL Server Integration Services (SSIS), XML Spy 2012, Rapid SQL 7.7, Oracle SQL Developer 1.1.2, Oracle 11g, SQL Server 2008, SAP R/3, Windows XP, Red Hat Enterprise Linux Server 5.6.

Confidential, Omaha, NE

Informatica Developer

Responsibilities:

  • Assisted in gathering business requirements and worked closely with Business Analyst teams across various applications to develop the data model.
  • Worked on Informatica Power Center 9.1 tools: Source Analyzer, Target Designer, Mapping Designer, Transformation Developer, and Mapplet Designer.
  • Used most of the transformations such as Source Qualifier, Aggregator, Lookups, Filters, Sequence generator, Joiner, Expression as per the Business requirement.
  • Developed procedures to populate the customer data warehouse with transaction data, cycle and monthly summary data, and historical data.
  • Extensively used debugger to troubleshoot logical errors.
  • Worked with the Business analysts for requirements gathering, business analysis and testing.
  • Gathered requirements from the internal users and sent requirements to our source system vendor.
  • Used Teradata SQL Assistant to run the SQL queries and validate the data.
  • Used Teradata and Oracle as sources for building the datamarts to analyze the business individually.
  • Worked on Data Quality to proactively monitor and cleanse the data for all applications and maintain its clean state (a sketch of typical checks follows this list).
  • Responsible for creating Workflows and sessions using Informatica workflow manager and monitor the workflow run and statistic properties on Informatica Workflow Monitor.
  • Involved in unit testing to check whether the data extracted from the different source systems was loaded into the target according to the user requirements.
  • Worked in Agile methodologies and participated in daily scrum meetings, providing status reports to the team regarding project status, tasks, and issues.
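
A minimal sketch of the kind of data quality checks referred to above; dm_policy, policy_id, and effective_date are hypothetical names used only for illustration.

    -- Duplicate business keys that should be unique in the mart
    SELECT policy_id, COUNT(*) AS dup_cnt
      FROM dm_policy
     GROUP BY policy_id
    HAVING COUNT(*) > 1;

    -- Mandatory columns arriving NULL from the source feed
    SELECT COUNT(*) AS null_effective_dates
      FROM dm_policy
     WHERE effective_date IS NULL;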

Environment: Informatica Power Center 9.1, DB2, Flat Files, Teradata v2r6, Windows XP, SQL Server 2008, XML Files, Data Quality, Cognos 8.

Confidential, Raleigh, NC

ETL Developer

Responsibilities:

  • Interacted with business analysts and translated business requirements into technical specifications.
  • Using Informatica Designer, developed mappings, which populated the data into the target.
  • Used Source Analyzer and Warehouse Designer to import the source and target database schemas and the mapping designer to map the sources to the targets.
  • Responsibilities included designing and developing complex Informatica mappings including Type-II slowly changing dimensions.
  • Worked extensively on Workflow Manager, Workflow Monitor and Worklet Designer to create, edit and run workflows, tasks, shell scripts.
  • Developed complex mappings/sessions using Informatica Power Center for data loading.
  • Enhanced performance for Informatica session using large data files by using partitions, increasing block size, data cache size and target based commit interval.
  • Extensively used aggregators, lookup, update strategy, router and joiner transformations.
  • Developed the control files to load various sales data into the system via SQL*Loader (a sketch follows this list).
  • Extensively used TOAD to analyze data, fix errors, and for development work.
  • Involved in the design, development and testing of the PL/SQL stored procedures, packages for the ETL processes.
  • Developed UNIX Shell scripts to automate repetitive database processes and maintained shell scripts for data conversion.
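
A SQL*Loader control file for the kind of sales load described above might look like the sketch below; the file, table, and column names (sales_extract.dat, stg_sales) are hypothetical placeholders.

    -- sales_load.ctl: load delimited sales data into a staging table (names are illustrative)
    LOAD DATA
    INFILE 'sales_extract.dat'
    APPEND
    INTO TABLE stg_sales
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    TRAILING NULLCOLS
    (
      store_id,
      sale_date    DATE "YYYY-MM-DD",
      product_id,
      quantity,
      amount
    )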

Environment: Informatica Power center 8.6.1, Oracle 10g, SQL Server 2005, ERWIN 4.2, TOAD, UNIX(AIX), Windows XP, Tivoli

Confidential, Tampa, FL

Jr ETL Developer

Responsibilities:

  • Mapping Data Items from Source Systems to the Target System.
  • Used most of the transformations such as Source Qualifier, Aggregator, Lookups, Filters, Sequence generator, Joiner, Expression as per the Business requirement.
  • Worked on Informatica tools: Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, and Transformations.
  • Extensively used ETL to load data from flat files to ORACLE.
  • Involved in the development of Informatica mappings and followed guidelines to tune mappings.
  • Worked with transformations such as Expression, Filter, Router, Update Strategy, Union, Lookup, Rank, connected Stored Procedure, Aggregator, Sequence Generator, Sorter, and Joiner.
  • Created traces using SQL Server Profiler to find long-running queries and modified those queries as part of performance tuning operations (see the sketch after this list).
  • Developed ETL packages with different data sources (SQL Server, Flat Files, Excel source files, XML files etc) and loaded the data into target tables by performing different kinds of transformations using SQL Server Integration Services (SSIS).
  • Created SSIS packages to pull data from SQL Server and exported to Excel Spreadsheets and vice versa.
  • Involved in unit testing, systems testing, integrated testing and user acceptance testing.
  • Customized the existing dimensions and fact tables.
  • Actively involved in migration strategies between development, testing and production Repositories
  • Experience handling slowly changing dimensions using Type I and Type II.
  • Involved in preparing unit test case documents.
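
Profiler traces were the primary tool for the tuning work above; a complementary way to surface long-running statements is a query over SQL Server's standard dynamic management views, sketched below (the TOP count and millisecond conversion are just illustrative choices).

    -- Top statements by cumulative elapsed time, as candidates for tuning
    SELECT TOP (20)
           qs.total_elapsed_time / 1000 AS total_elapsed_ms,
           qs.execution_count,
           qs.total_elapsed_time / qs.execution_count / 1000 AS avg_elapsed_ms,
           SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                     ((CASE qs.statement_end_offset
                            WHEN -1 THEN DATALENGTH(st.text)
                            ELSE qs.statement_end_offset
                       END - qs.statement_start_offset) / 2) + 1) AS statement_text
      FROM sys.dm_exec_query_stats AS qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
     ORDER BY qs.total_elapsed_time DESC;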

Environment: Informatica Power Center 8.6.1, SQL Server Management Studio (SSMS), Business Objects, Oracle 10g, SQL Server 2008, Oracle Designer, Toad, PL/SQL, Linux, Erwin
