Big Data Application Hadoop Developer Resume

PROFESSIONAL SUMMARY:

  • Twelve years of IT experience in System Analysis, Design, Development, Testing, Implementation and Production support of Data warehousing applications.
  • Expertise in Data Modeling, Data Analysis, Data Profiling, Data Extraction, Data Transformation and Data Loading.
  • Nine plus years of experience in ETL & Business Intelligence using IBM InfoSphere DataStage 9.1/7.x (Parallel Extender and Server), Informatica Power Center 9.x/8.x, IDQ and IBM Cognos Report Author.
  • Three years of experience in Big Data platform using Apache Hadoop and its ecosystem.
  • Expertise in ingestion, storage, querying, processing and analysis of Big data.
  • Experienced in using Pig, Hive, Sqoop, Oozie, Flume, HBase and Hcatalog.
  • Good experience with Hive query optimization and performance tuning.
  • Hands-on experience in writing Pig Latin scripts and custom implementations using UDFs.
  • Good experience with Sqoop for importing data from different RDBMS to HDFS and exporting data back to RDBMS systems for ad-hoc reporting.
  • Experienced in batch job workflow scheduling and monitoring tools like Oozie (a submission sketch follows this summary).
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Experienced in analyzing data using HiveQL, Pig Latin, and custom Map Reduce programs in Java.
  • Strong experience in building Dimension and Fact tables for Star Schema for various databases Oracle 11g/10g/9i/8i, Teradata v2r6 / v2r12, IBM DB2 UDB 9.1/9.7, MS SQL Server 8.0/9.0.
  • Hands on experience in managing and supporting large Data Warehouse applications, Developing and Tuning of PL/SQL scripts, complex Data Stage ETL routines, BTEQ, FASTLOAD, MLOAD scripts.
  • Experienced with various scheduling tools like Autosys, Control-M and Maestro Job Scheduler.
  • Experience in designing both logical and physical data models for large-scale data warehouse implementations using Data Modeling Tool - Erwin.
  • Established best ETL standards, Design ETL framework, reusable routines and validating data.
  • Expertise in all phases of Software development Life cycle (SDLC) - Project Requirement Analysis, Design, Development, Unit Testing, User Acceptance Testing, Implementation, Post implementation Support and Maintenance.
  • Strong functional knowledge of the Master Data Management (MDM), Data Quality, Data Profiling and Metadata Management life cycle.
  • Experience with UNIX and Linux scripting to automate jobs and ETL routines.
  • Developed ad-hoc Reports using IBM Cognos Report Author and Siebel Analytics.
  • Good working knowledge of writing Python scripts.
  • Worked as technical team lead/offshore coordinator, leading multiple projects to timely deliverables.
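
Illustrative only: a minimal sketch of the kind of Oozie workflow submission referenced above; the server URL, paths and job id are hypothetical placeholders, not taken from a specific project.

# Point the Oozie CLI at the server (hypothetical host/port)
export OOZIE_URL=http://oozie-host:11000/oozie

# Submit and start the workflow described by job.properties
oozie job -config /home/etl/jobs/daily_load/job.properties -run

# Check the status of the submitted workflow (job id is illustrative)
oozie job -info 0000123-200101010000001-oozie-oozi-W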

TECHNICAL SKILLS

Hadoop: HDFS, Map Reduce, Oozie, Hive, Pig, Sqoop, HCatalog, Flume and HBase

Data warehouse Tools: Data Stage 9.1/7.5 (Parallel and Server editions), Informatica Power Center 9.x/8.6.3/8.6.0/7.1.3/6.2/6.1, IDQ (Informatica Developer), IDE (Informatica Analyst), IBM Data Stage 7.3.1, IBM Cognos 8.4 BI (Report Author, Framework Manager), Informatica Metadata Manager (Administrator, Analyzer), SAP Analytics Web 7.8 (OBIEE)

Operating Environment: HP-UNIX, IBM AIX 5.2, IBM Mainframes, MS Windows 95/98/NT/2000/XP, Sun Solaris.

RDBMS Tools: TOAD 7.x/8.5, PL/SQL Developer

Databases: Oracle 11g/10g/9i/8i/7.x, Teradata V2R5, DB2, SQL Server 2000

Languages: SQL, PL/SQL, BTEQ, C, C++, SAS 8.0, ABAP 4.3C, Java

Scripting: Unix, Linux shell scripting

Scheduling Tools: Autosys, Tivoli (Maestro), Control-M

Data Modeling: Erwin 4.0 and Visio 2007

Tools/Utilities: SQL*Plus, TOAD, Teradata SQL Assistant 6.1, Multiload, Fastload, BTEQ Win, SQL*Loader

Other Tools: HP Quality Center, PVCS Serena Manager, Visio 2007

XML: XML, HTML, DTD, XML Schema

Methodologies: Agile, UML, Waterfall

PROFESSIONAL EXPERIENCE

Confidential

Big Data Application Hadoop Developer

Role and Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in requirement analysis, design, coding and implementation.
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Developed Sqoop scripts to import/export data from relational sources and handled incremental loading of the customer and transaction data by date (see the sketch after this list).
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Optimized Map Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Worked on partitioning Hive tables and running the scripts in parallel to reduce their run-time.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
  • Used Pig as an ETL tool to do transformations, event joins, bot-traffic filtering and some pre-aggregations before storing the data onto HDFS.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Developed Pig routines for data cleansing and preprocessing.
  • Used the MultipleOutputs class in Map Reduce jobs to name the output files.
  • Created sequence files to store data in binary format using a Map Reduce program.
  • Used different file formats like Text files, Sequence Files, JSON and Avro.
  • Worked on shell scripting.
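
Illustrative only: a minimal shell sketch of the Sqoop incremental import and partitioned Hive load pattern referenced in the bullets above. The connection string, table, column, directory and credential names are assumed placeholders, not actual project objects.

#!/bin/sh
# Nightly wrapper: pull the previous day's transactions from the source RDBMS
# into HDFS with Sqoop, then expose them as a date partition of a Hive table.
LOAD_DATE=$(date -d "yesterday" +%Y-%m-%d)   # previous day's date (GNU date)

# Incremental import: only rows stamped with the load date
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user --password-file /user/etl/.sqoop_pwd \
  --table TRANSACTIONS \
  --where "TXN_DATE = TO_DATE('${LOAD_DATE}','YYYY-MM-DD')" \
  --target-dir /data/staging/transactions/${LOAD_DATE} \
  --fields-terminated-by '\001' -m 4

# Register the imported directory as a new partition of the (external) Hive table
hive -e "
  ALTER TABLE transactions ADD IF NOT EXISTS
  PARTITION (txn_date='${LOAD_DATE}')
  LOCATION '/data/staging/transactions/${LOAD_DATE}';"

Keeping each load date in its own partition lets downstream Hive queries prune to a single day instead of scanning the whole table.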

Environment: Hadoop, PIG, HIVE, Sqoop, Flume, Oozie, Java (JDK 1.6), Eclipse, HBase, LINUX, UNIX Shell Scripting.

Confidential

ETL Lead/Data Stage Developer

Role and Responsibilities:

  • Involved in designing the ODS data model as per the ACORD model and the DW architecture using a Star Schema; identified the Fact, Dimension, Junk and Bridge dimension tables.
  • Involved in designing Logical and physical models using ERWIN.
  • Analyzed functional requirements. Designed and implemented ETL framework using Data Stage.
  • Designed and developed the Data Stage jobs, job sequences and Shared Containers to implement the business requirements and ETL data flow diagrams.
  • Implemented CDL (Change Detection Logic) & IL (Incremental Load) patterns by writing complex SQL scripts to balance the load between Oracle and the Data Stage server.
  • Led a team of 5 onsite and offshore developers by providing technical solutions and managing offshore developers' tasks on a daily basis.
  • Created reusable objects such as containers and routines to promote code reusability.
  • Re-designed the ETL load flow by eliminating intermediate steps in Data Stage, yielding significant ETL performance gains.
  • Involved in setting up ETL standards and helped the client adopt ETL best practices.
  • Designed Exception error logging and Audit logging framework using Data Stage routines.
  • Wrote ETL technical specification document, and developed Common ETL Project Templates.

Environment: Data Stage 9.1/7.5.1, Windows, Mainframe, Web Focus, Erwin, PL/SQL developer, Oracle 11g.

Confidential

Sr. Informatica Developer

Role and Responsibilities:

  • Worked closely with Business users and Project Manager to get Business Requirements.
  • Created, documented all project deliverables as per GE policies.
  • Created TDD (Technical Design document) and established best practices.
  • Designed, developed ETL routines to extract flat file data and load into Proficy Scheduler.
  • Performed Unit, Integration, System Test and UAT.
  • Developed Error handling routines to handle file transfer failure and data recovery from failure.
  • Responsible for completing code migration, production readiness review to ensure complete and accurate migration to various environments.

Environment: Informatica Power Center 8.1.1, HPUX 11.31, Proficy Scheduler, ERP.

Confidential

Sr. Informatica Developer

Role and Responsibility:

  • Involved in all Phases of SDLC including Requirement, Analysis, Design, Development and implementation phases.
  • Participated in design, code reviews and implementation of best ETL methodologies.
  • Analyzed Business Requirements and translated them into Technical Specifications and ETL data mapping documents.
  • Facilitated meetings with Business users and Business Analysts to resolve the functional gaps.
  • Designed and developed the ETL framework for loading the Data Warehouse.
  • Estimated data size and data growth and determined the space requirements for the DW.
  • Designed aggregation, indexing and partitioning strategies for the warehouse.
  • Involved in Data quality, Data profiling and recommended best data quality measures.
  • Developed ETL routines and UNIX shell wrappers to perform FTP and run ETL batch jobs (see the wrapper sketch after this list).
  • Facilitated design sessions with ETL governance team for implementing best practices.
  • Participated actively in end-to-end testing includes System Integration Testing, User acceptance Testing, Pre-production and Post-production activities.
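
Illustrative only: a hypothetical shape of the UNIX wrapper scripts referenced above, pairing a file transfer with a pmcmd workflow start; host, credentials, folder and workflow names are placeholders.

#!/bin/sh
# Scheduler-invoked wrapper: stage the source file, then run the Informatica workflow.
set -e
LANDING_DIR=/data/landing

# Pull the daily extract from the source host (placeholder credentials)
ftp -n ftp.example.com <<EOF
user etl_user etl_password
binary
cd /outbound
get daily_extract.dat $LANDING_DIR/daily_extract.dat
bye
EOF

# Start the workflow on the Integration Service and wait for it to finish
pmcmd startworkflow -sv INT_SVC -d DOMAIN_DEV -u etl_user -p etl_password \
  -f FINANCE_DW -wait wf_load_daily_extract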

Environment: Informatica Power Center 8.6.0, Oracle 10.3.0.2, HPUX 11.31, MS Office, HP Quality Center, PVCS Serena, Harvest CM Workbench 7.1.123, Control-M, CMFast, SAP-BO.

Confidential, Cincinnati, OH

Sr. Informatica Developer

Role and Responsibility:

  • Extensively worked on the migration strategy, execution, testing and implementation of the project.
  • Identified dependency Unix/Maestro scripts and their migration process.
  • Prepared migration plan for Unit Test, System Test, Performance Test and UAT.
  • Prepared inventory of Maestro jobs, converted and loaded into TIDAL scheduler.
  • Identified and converted FTP scripts to SFTP scripts (illustrated after this list).
  • Mentored a 3-member team of ETL developers and testers and conducted code reviews.
  • Interacted closely with the EES application support team to resolve the conflicts/technical gaps.
  • Prepared and validated Test Scenarios and Test cases with EES Support team.
  • Analyzed the current state of all EES applications in the AS-IS environment and planned the migration/upgrade of the current environment to a Linux environment with minimal changes.
  • Validated Development, QA and Production Environments as per functional requirements.
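
Illustrative only: a before/after sketch of the FTP-to-SFTP conversion referenced above; hosts, users, key and file names are hypothetical.

#!/bin/sh
RUN_DATE=$(date +%Y%m%d)

# Before: password-based ftp heredoc (credentials travel in cleartext)
ftp -n legacy-host <<EOF
user batch_user batch_password
binary
put /data/outbound/claims_${RUN_DATE}.dat /incoming/claims_${RUN_DATE}.dat
bye
EOF

# After: key-based sftp in batch mode, reading its commands from stdin
sftp -b - -i ~/.ssh/etl_id_rsa batch_user@legacy-host <<EOF
put /data/outbound/claims_${RUN_DATE}.dat /incoming/claims_${RUN_DATE}.dat
bye
EOF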

Environment: Informatica Power Center 7.1.3, SAP, Oracle 9i, SQL, Microsoft Visio, Unix-HP, Windows XP, HP Quality Center, PVCS Serena, Documentum, Tivoli job scheduling tool.

Confidential, Cincinnati, OH

Sr. Informatica Developer

Role and Responsibility:

  • Analyzed Business/Functional specifications and translated them into technical specifications.
  • Interacted closely with Business users to complete data mapping specs between source and target.
  • Established best practices for data movement, exception/error handling and data recovery between the source and target databases.
  • Identified numerous gaps between the AS-IS and TO-BE systems and helped users resolve them.
  • Designed, developed and implemented complex ETL routines using Informatica and Shell scripts.
  • Participated in testing procedures, test strategy, test plans for System and User acceptance testing.
  • Interacted with the QA team to resolve issues in a timely manner and meet deliverables.
  • Wrote UNIX shell scripts to automate batch load jobs in Maestro.

Environment: Informatica Power Center 7.1.1, SAP, Oracle 9i, Microsoft Visio, Unix-HP, Windows XP, HP Quality Center, PVCS Serena, Documentum, Tivoli job scheduling tool.

Confidential, CA

Datawarehouse Informatica Developer

Role and Responsibility:

  • Automated manually generated Ad-hoc Reports using Informatica Mappings.
  • Performed Administration tasks like Creating Repositories, Users, and Assigning privileges.
  • Created technical design documents, as required, used to define ETL mappings.
  • Designed, Developed and Tested Workflows/Worklets according to Business Process Flow.
  • Converted existing Stored Procedures to ETL Mappings.
  • Worked on Siebel Answers 7.7 to generate various ad-hoc reports.

Environment: Informatica 7.1.1, Siebel Analytics 7.7.2, SQL Server 2000, SQL, DTS, Windows XP.

Confidential, Santa Clara, CA

Encover Informatica Developer

Role and Responsibility:

  • Analyzed Functional specs and prepared Technical design docs.
  • Developed interface Mappings and Mapplets as per business logic.
  • Wrote ABAP programs in the Source Qualifier to retrieve data from the SAP system in real time.
  • Developed UNIX shell scripts to automate the ETL batch process using Maestro.

Environment: Informatica 6.1, PowerConnect, Oracle 9i, UNIX, Windows XP, SAP R/3, ABAP, Maestro.

Confidential, Akron, OH

Datawarehouse Informatica Developer

Role and Responsibility:

  • Designed and developed ETL routines, Mapplets, reusable transformations and Mappings.
  • Designed, developed, implemented, tested and validated complex Mappings/Workflows.
  • Extensively used ETL to load data from Oracle 9i, SAP R/3 and flat files to Teradata.
  • Developed, enhanced and validated BTEQ scripts to load the data into the Teradata EDW (see the sketch below).
  • Identified the bottlenecks, then tuned and optimized the ETL processes for better performance.
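
Illustrative only: a minimal shell wrapper for a BTEQ load step like the one referenced above; the logon string, table, column and file names are placeholders rather than actual EDW objects.

#!/bin/sh
# Run a BTEQ import: read a pipe-delimited extract and insert into the EDW table.
bteq <<EOF
.LOGON tdprod/etl_user,etl_password;
.IMPORT VARTEXT '|' FILE = /data/staging/daily_sales.txt;
.SET QUIET ON
.REPEAT *
USING (store_id VARCHAR(10), sale_dt VARCHAR(10), amount VARCHAR(18))
INSERT INTO edw.daily_sales (store_id, sale_dt, amount)
VALUES (:store_id, CAST(:sale_dt AS DATE FORMAT 'YYYY-MM-DD'), :amount);
.LOGOFF;
.QUIT;
EOF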

Environment: Informatica 6.2, Teradata 6.0, BTEQ, Oracle 9i, VSAM, UNIX, Windows XP, Autosys 4.0.
