Senior Big Data Developer Resume
Irvine, CA
SUMMARY:
- 15+ years of industry experience in IT with a strong background in Hadoop, Informatica, Big Data, Teradata, Oracle database development, and data warehousing, including about 12 years in ETL processes using Hadoop, Informatica PowerCenter 9.5/8.6.1, ...
- Expertise in designing tables in Spark, HBase, Hive, and MySQL, and in importing and exporting data between these databases and HDFS using Sqoop.
- Expertise in creating MapReduce and Spark frameworks to produce structured files from unstructured files (a sketch follows this summary).
- Expertise in core Java concepts including collections, exception handling, serialization, and deserialization.
- Expert in Oracle Business Intelligence 10g/11g and Applications consulting, with a successful track record of gathering user requirements and designing, developing, and supporting business applications.
- Extensively worked on business intelligence and data warehousing using the Kimball and Inmon methodologies.
- Expert in installation and configuration of Financial Analytics, Supply Chain and Order Management Analytics, and Procurement and Spend Analytics.
- Expertise in YARN (MapReduce 2.0) architecture and components such as ResourceManager, NodeManager, Container, and ApplicationMaster, and in the execution of MapReduce jobs.
- Experienced with Analytics metadata objects and web catalog objects (dashboards, pages, folders, reports), and with scheduling iBots and DAC.
- Proficient in understanding business process requirements and translating them into technical requirements.
- Solid experience with Informatica PowerCenter mappings, mapplets, transformations, Workflow Manager, Workflow Monitor, and Repository Manager, as well as star and snowflake schemas, OLTP, OLAP, and data reports.
- Extensive experience with data extraction, transformation, and loading (ETL) from disparate sources, including relational databases such as Teradata, Oracle, and DB2 UDB; integrated data from flat files, CSV files, and XML files into a common reporting and analytical data model using Erwin.
- Extensively worked on ETL processes, data mining, and web reporting features for data warehouses using Business Objects.
- Designed and created Oracle database objects: tables, indexes, views, procedures, packages, and functions.
- Experience developing SQL and PL/SQL code, migrating it to test environments, and performing unit and integration testing.
- Experience tuning stored procedures and queries by changing join orders.
- Experience with Pig, Hive, Spark, Scala, Python, and SQL in the Hadoop ecosystem.
- Strong experience with UNIX Korn shell scripting.
- Extensively worked on job scheduling.
- Excellent understanding of client/server architecture, with strong logical, analytical, communication, and organizational skills.
- Excellent team player, able to work on both the development and maintenance phases of a project.
- Highly skilled in providing prompt, workable solutions to business problems.
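The following is a minimal sketch of the unstructured-to-structured pattern described above, using Spark with Scala; the log layout, regular expression, column names, and HDFS paths are illustrative assumptions rather than details from a specific engagement.

    import org.apache.spark.sql.SparkSession

    object LogStructurer {
      // Hypothetical record layout: "timestamp level message", whitespace-separated.
      private val LinePattern = """^(\S+)\s+(\S+)\s+(.*)$""".r

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("LogStructurer").getOrCreate()
        import spark.implicits._

        // Read the raw, unstructured file from HDFS as plain text lines.
        val raw = spark.sparkContext.textFile("hdfs:///data/raw/app.log")

        // Keep only lines matching the expected layout and split them into fields.
        val structured = raw.flatMap { line =>
          LinePattern.findFirstMatchIn(line)
            .map(m => (m.group(1), m.group(2), m.group(3)))
            .toSeq
        }.toDF("event_time", "level", "message")

        // Persist the structured result back to HDFS in a columnar format.
        structured.write.mode("overwrite").parquet("hdfs:///data/structured/app_logs")
        spark.stop()
      }
    }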
TECHNICAL SKILLS:
BI Tools: OBIEE 11g/10g, OBIEE Financial App, Sales App, Business Objects
ETL Tools: Informatica PowerCenter/PowerMart 9.x/8.x/7.x/6.x, Spark
Applications: Oracle R12, Siebel, SFDC, OBIEE Financial, Supply Chain, Procure/Spend Apps
Tools/Utilities: SQL*Loader, TOAD 7.x/8.x/9.x, Erwin 4.5
Languages: SQL, PL/SQL, UNIX shell scripting, Java 1.4, HTML 4+, CSS, JavaScript, C, C++, TSO, ISPF, COBOL, DB2, JCL, Pig, Hive, MapReduce, Python
Databases: Oracle 10g/9i/8i/7.3, Teradata V2R4/V2R3, DB2 UDB 7.1, MySQL, MS Access, VSAM, NoSQL, HBase
PROFESSIONAL EXPERIENCE:
Senior Big Data Developer
Confidential
- Analyzed and transformed stored data by writing Java MapReduce and Pig jobs based on business requirements.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
- Used Spark RDDs and DataFrames for transformations, event joins, bot-traffic filtering, and pre-aggregations before storing the data in HDFS (see the sketch at the end of this section).
- Participated in the requirements gathering and analysis phases of the project, documenting business requirements through workshops and meetings with business users.
- Drove and managed the development and application of data algorithms, models, and corresponding documentation.
- Developed several advanced MapReduce programs to process incoming data files.
- Used Sqoop to transfer data between Oracle and HDFS.
- Developed Pig Latin scripts for the analysis of semi-structured data.
- Developed Scala code to achieve multithreaded processing of real-time data.
- Created Hive and Spark SQL queries for faster data processing.
- Created DataFrames and RDDs from Parquet and JSON files.
- Created Shell scripts to dump the data from MySQL to HDFS.
- Created reports for the BI team using Sqoop to export data into HDFS and Hive.
- Wrote Java code for file reading and writing, with extensive use of the ArrayList and HashMap data structures.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Wrote Scala code for file reading and writing, with extensive use of ArrayList and HashMap data structures.
- Used SBT to package Scala programs and executed them through Eclipse on the Spark framework.
- Designed and modified ETL mappings in Informatica from scratch to load data from the source system (EBS) to the staging system (SDE mappings) and on to the target analytics warehouse.
- Designed and managed the execution of extract, transform, and load processes from source to target using Informatica.
- Created, applied, and validated appropriate data algorithms and models to produce clear, actionable business results.
- Created MapReduce jobs to produce structured data.
- Transferred data from HDFS to HBase using Pig, Hive, and MapReduce scripts and visualized the streaming data in Tableau dashboards; managed and reviewed Hadoop log files and developed Pig UDFs and Hive UDFs to pre-process the data for analysis.
- Automated the Business Process to schedule the Informatica Jobs.
- Created new folders and mappings in Informatica using universal adapters to bring in data from other source systems.
- Developed mappings using the corresponding sources, targets, and transformations such as Source Qualifier, Sequence Generator, Filter, Router, Joiner, Lookup, Expression, Update Strategy, and Aggregator.
- Performed tuning to improve data extraction, data processing, and load times in both the database and Informatica.
- Implemented time comparison and calculation measures in the business model using the Time Series wizard and modeled slowly changing dimension data.
- Created Teradata tables, views, and indexes to implement the new structure.
- Developed many complex full/incremental Informatica objects (workflows/sessions, mappings/mapplets) with various transformations.
- Developed AP Supplier Contact report that will provide detailed contact information about Vendors/Suppliers for communication purposes using Oracle Reports 10g.
Environment: Informatica 9.5, Linux, Hadoop 2.x, Hive, MapReduce, Pig, Spark, OBIEE 11g, Teradata, Oracle 11g
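As a hedged illustration of the Spark RDD/DataFrame work noted above (event joins, bot-traffic filtering, and pre-aggregation over Parquet and JSON inputs before writing to HDFS), the sketch below assumes hypothetical column names (user_agent_id, is_bot, site_id, event_time, session_id) and paths, and assumes event_time is a timestamp column.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object ClickstreamPreAgg {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("ClickstreamPreAgg").getOrCreate()

        // Hypothetical inputs: click events in Parquet, a user-agent lookup in JSON.
        val events = spark.read.parquet("hdfs:///data/clicks/events")
        val agents = spark.read.json("hdfs:///data/lookup/user_agents.json")

        // Join events to the lookup, filter out bot traffic, and pre-aggregate
        // page views and sessions per site and hour before writing back to HDFS.
        val preAgg = events
          .join(agents, Seq("user_agent_id"))
          .filter(col("is_bot") === false)
          .groupBy(col("site_id"), window(col("event_time"), "1 hour"))
          .agg(count(lit(1)).as("page_views"), countDistinct("session_id").as("sessions"))

        preAgg.write.mode("overwrite").parquet("hdfs:///data/clicks/hourly_agg")
        spark.stop()
      }
    }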
Big Data Developer
Confidential
- Interacted with SMEs, business analysts, and users to gather and analyze business report requirements.
- Gathered requirements from the client for gap analysis, translated them into technical design documents, and worked with team members on recommendations to close the gaps.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Configured the Hadoop environment using Cloudera Manager for options such as data directories, log configuration, and port configuration.
- Developed custom MapReduce programs and user-defined functions (UDFs) in Hive to transform large volumes of data per business requirements.
- Used Pig as an ETL tool for transformations, event joins, bot-traffic filtering, and pre-aggregations before storing the data in HDFS.
- Created star schemas, time series objects, hierarchies, level-based measures, and security using external table authentication, session variables, and initialization blocks.
- Designed and modified ETL mappings in Informatica from scratch to load data from the source system (EBS) to the staging system (SDE mappings) and on to the target analytics warehouse.
- Wrote MapReduce code to process and parse data from various sources, storing the parsed data in HBase and Hive using HBase-Hive integration.
- Wrote and used complex data types for storing and retrieving data with HQL in Hive.
- Created a MapReduce program to format unstructured XML files.
- Created HiveQL queries to load flat files into external tables.
- Wrote Hive UDFs to sort struct fields and return complex data types (a minimal UDF sketch follows this section).
- Implemented Spark with Scala and Spark SQL for fast testing and processing of data.
- Implemented analytical algorithms using Spark.
- Wrote Hive UDFs to extract data from staging tables and analyzed web log data using HiveQL.
- Imported and exported data with Sqoop to move data between Teradata and HDFS on a regular basis.
- Implemented the Oozie engine to chain multiple MapReduce and Hive jobs.
- Built big data solutions using HBase, handling millions of records across different data trends and exporting them to Hive.
- Used Catalog Manager to migrate the web catalog between instances.
- Worked as a software developer on Java application development, client/server applications, and Internet/intranet-based database applications.
- Created UDFs in Python and executed them through Hive.
- Loaded external data from the Linux file system into HDFS and managed data from multiple sources.
- Built reports using Answers, including drill-down objects, union-based reports, and formatted functions within the reports.
- Performed debugging to verify the prebuilt ETL mappings for Oracle BI Applications.
- Created Teradata tables, views, and indexes to implement the new structure.
- Created FastLoad and MultiLoad scripts in UNIX to load the Teradata database.
- Created BTEQ scripts to test the warehouse data.
- Developed many complex full/incremental Informatica objects (workflows/sessions, mappings/mapplets) with various transformations.
- Created new folders and mappings in Informatica using universal adapters to bring in data from other source systems.
- Developed mappings using the corresponding sources, targets, and transformations such as Source Qualifier, Sequence Generator, Filter, Router, Joiner, Lookup, Expression, Update Strategy, and Aggregator.
Environment: Informatica 9.5, Linux, Hadoop 2.x, Hive, MapReduce, Pig, Spark, OBIEE 11g, Oracle 11g, Python
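Below is a minimal sketch of a Hive UDF in the spirit of the ones mentioned above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name, the trim/lower-case logic, and the registration commands in the comments are hypothetical examples rather than the project's actual UDFs.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Minimal Hive UDF using the classic UDF API: normalizes a string column
    // by trimming whitespace and lower-casing it. Compile against hive-exec,
    // then register it from the Hive CLI, for example:
    //   ADD JAR hdfs:///user/hive/udfs/normalize-udf.jar;   -- hypothetical path
    //   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeStr';
    //   SELECT normalize_str(url) FROM weblogs_staging LIMIT 10;
    class NormalizeStr extends UDF {
      def evaluate(input: Text): Text = {
        if (input == null) null
        else new Text(input.toString.trim.toLowerCase)
      }
    }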
Confidential, Irvine, CA
BI Lead
- Full project life cycle implementation for Oracle Business Analytics Warehouse (OBAW)
- Analyzed the source system to understand the various attributes.
- Created high-level and low-level design documents for the Service Request dashboard.
- Designed various dimensions and facts for the sales performance and support data marts.
- Created the sales performance plan in the DAC scheduling tool and associated the workflows with it.
- Upgraded Informatica from 8.6 to 9.1.
- Managed and participated in the development, application, and enhancement of data algorithms and models.
- Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
- Wrote Pig scripts for data processing.
- Implemented Hive tables and HQL queries for the reports.
- Partitioned tables and modified long-running queries to improve ETL and dashboard performance.
- Architected the Struts framework to manage the project using the MVC pattern.
- Developed Oozie workflows for application execution.
- Wrote MapReduce code to process and parse data from various sources, storing the parsed data in HBase and Hive using HBase-Hive integration.
- Loaded unstructured user-click log data into HDFS using automated shell scripts.
- Developed cipher encryption/decryption routines to encrypt and decrypt data while loading to and from HBase (see the sketch at the end of this section).
- Implemented Java/J2EE design patterns such as DAO and Singleton, and created beans.
- Designed schemas/diagrams using fact, dimension, physical, logical, alias, and extension tables.
- Tuned SQL queries and mappings for better performance.
- Deployed the Informatica ETL mappings to production.
- Created the training document for users.
- Used the TPT connection for Teradata to perform incremental loading into the Teradata database.
- Supported the mappings and resolved issues in production.
- Responsible for installation, configuration, development, support, and maintenance of BI Apps.
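The sketch below illustrates the cipher encryption/decryption routine mentioned above for HBase loads, assuming AES in CBC mode through the standard javax.crypto API; the object name, the hard-coded key and IV, and the omission of key management and HBase wiring are simplifications for illustration only.

    import javax.crypto.Cipher
    import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}

    // Minimal AES/CBC helper for encrypting values before an HBase Put and
    // decrypting them after a Get. The hard-coded key and IV are placeholders;
    // a real deployment would pull them from a secured key store.
    object AesCodec {
      private val keySpec = new SecretKeySpec("0123456789abcdef".getBytes("UTF-8"), "AES")
      private val ivSpec  = new IvParameterSpec("abcdef9876543210".getBytes("UTF-8"))

      def encrypt(plain: Array[Byte]): Array[Byte] = {
        val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
        cipher.init(Cipher.ENCRYPT_MODE, keySpec, ivSpec)
        cipher.doFinal(plain)
      }

      def decrypt(encrypted: Array[Byte]): Array[Byte] = {
        val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
        cipher.init(Cipher.DECRYPT_MODE, keySpec, ivSpec)
        cipher.doFinal(encrypted)
      }
    }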
Confidential, Los Angeles, CA
Senior Informatica ETL/OBIEE Analyst
- Analyzed the current requirement and created the design and Specification document for SAP Integration Project to Finance system.
- Designed the Functional and Technical specification documents for the combined data users.
- Created the new mappings and modified existing mappings as per the requirement.
- Created Teradata tables, views, and indexes to implement the new structure.
- Designed the new SCD2 dimension structures and loaded them using Informatica 8.6.
- Created FastLoad and MultiLoad scripts to load tables based on specifications.
- Generated the SAP ABAP code from mapping to pull the data from SAP Systems to staging environment.
- Developed versatile reusable mapplets, transformations and tasks for data cleansing and audit purposes across the team.
- Analyzed, designed, and developed the OBIEE metadata repository (RPD), consisting of the physical layer, business model and mapping layer, and presentation layer.
- Developed custom reports/Ad-hoc queries using Answers and assigned them to application specific dashboards.
- Designed and developed reports and dashboards.
- Created a usage analytics subject area and reports on usage.
- Analyzed performance bottlenecks in OBIEE and implemented performance improvement techniques.
- Migrated the mappings, sessions, and workflows to the integration environment.
- Configured and created the OBIEE repository and modified the physical, BMM, and presentation layers of the metadata repository using the OBIEE Administration Tool.
- Wrote complex SQL queries to update the dimension and fact tables in post-session SQL.
- Created UNIX scripts to handle errors and send success and failure notifications to the support team; coordinated with external teams to set up the environment.
- Created the application support guide for the Level 2 team.
- Resolved system test and Production issues.
- Created proper documentation to describe program development, logic, coding, testing, changes and corrections.
Confidential, Los Angeles, CA
Sr. Programmer Analyst
- Designed the Data Warehousing ETL procedures for extracting the data from all source systems to the target system.
- Extensively used Transformations like Router, Lookup (connected and unconnected), Update Strategy, Source Qualifier, Joiner, Expression, Aggregator and Sequence generator Transformations.
- Worked extensively with dynamic cache with the connected lookup Transformations.
- Created, scheduled, and monitored workflow sessions, both on demand and at scheduled times, using Informatica PowerCenter Workflow Manager.
- Designed Siebel web components, including OBIEE Answers and Intelligence Dashboards.
- Used DAC client to load the custom mapping to the warehouse.
- Used workflow manager for session management, database connection management and scheduling of jobs.
- Configured sessions so that the PowerCenter Server sends an email when a session completes or fails.
- Debugged and sorted out the errors and problems encountered in the production environment.
- Identified various bottlenecks and eliminated them to a great extent.
- Worked extensively with PL/SQL, writing stored procedures to improve performance and tuning programs, ETL procedures, and processes.
- Created various BI reports for Canadian receipts and inventory using Business Objects.
- Wrote and modified Unix Korn Shell scripts to handle dependencies between workflows and log the failure information.
- Actively took part in the post implementation production support.
- Mentoring junior team members and fostering a learning environment.
- Loaded the Teradata tables using FastLoad and MultiLoad through mainframe JCL.
- Executed BTEQ scripts to update Teradata tables using JCL.
Confidential, Wilkesboro, NC
Senior ETL Informatica /Teradata Developer
- Involved in Analysis, Requirements Gathering and documenting Functional & Technical specifications.
- Analyzed the specifications and identified the source data that needed to be moved to the data warehouse.
- Partitioned many tables with frequent inserts, deletes, and updates to reduce contention and improve performance.
- Designed the Data Warehousing ETL procedures for extracting the data from all source systems to the target system.
- Extensively used Transformations like Router, Lookup (connected and unconnected), Update Strategy, Source Qualifier, Joiner, Expression, Aggregator and Sequence generator Transformations.
- Worked extensively with dynamic cache with the connected lookup Transformations.
- Designed and Optimized Power Center CDC and Load Mappings to load the data in slowly changing dimension.
- Created, scheduled, and monitored workflow sessions, both on demand and at scheduled times, using Informatica PowerCenter Workflow Manager.
- Used workflow manager for session management, database connection management and scheduling of jobs.
- Managed offshore team members to deliver tasks and scheduled training as needed to get the work done.
Confidential, Minneapolis, MN
ETL Developer
- Created Data models using Ralph Kimball's Star Schema strategies for Data warehouse development and maintenance.
- Responsible for Preparing ETL Strategies for Extracting Data from Different Data Sources like Oracle, SQL Server, Flat file.
- Created mappings to load source data into target tables using transformations such as Joiner, Expression, Filter, Lookup, Update Strategy, Rank, and Sorter.
- Designed and developed pre-session, post-session, and batch execution routines using Workflow Manager to run Informatica sessions.
- Migrated the existing mappings to the production environment.
- Designed and documented the metadata dictionary.
- Unit testing of individual mappings and their integration testing in the workflow.