
Hadoop Developer Resume


Chicago, IL

SUMMARY:

  • Over 10 years of experience in software systems development of client/server business systems, with strong data warehousing experience in Extraction, Transformation and Loading (ETL) processes using Big Data Hadoop, IBM Information Server 11.3/9.1/8.5, DataStage 7.5/7.1 and QualityStage.
  • Two years of experience with the Hadoop ecosystem, including HDFS, MapReduce, Impala, Hive, Sqoop, YARN, Cloudera and Pig.
  • Experienced in all phases of the software development lifecycle, including data cleansing, data conversion, performance tuning, unit testing, system testing and user acceptance testing.
  • Experience in analyzing Star Schema/Snowflake Schema designs and in dimensional data modeling.
  • Extensively worked with parallel processing, splitting bulk data into subsets and distributing it across all available processors to achieve the best job performance.
  • Strong in creating both Server and Parallel jobs in DataStage.
  • Performed data migration, synchronization, consolidation and cleansing of operational systems, such as legacy, ERP and CRM applications, to enable strategic, tactical and operational business intelligence.
  • Experience in evolving strategies and developing architecture for building OLAP systems from functional OLTP systems.
  • Experience in designing SQL queries, stored procedures, triggers, scripts and cursors, and in creating, maintaining, modifying and optimizing SQL Server databases.
  • Strong analytical and problem-solving skills with excellent communication and interpersonal skills.

TECHNICAL SKILLS:

Knowledge Domains: Insurance, Banking, Human Resources and Retail.

Hadoop Ecosystem: HDFS, Impala, Hive, Pig, Spark

Web Technologies: HTML, JavaScript

Platform/Technologies: IBM Information Server 11.3/9.1/8.1, DataStage 7.5/7.1, UNIX, DB2, Oracle 10g/11g

Tools/Packages: Control-M, Actimize 2.0.9/4.0, XML

PROFESSIONAL EXPERIENCE:

Confidential, Chicago, IL

Hadoop Developer

Responsibilities:

  • Processing the delta from the CDC subscription into the Operational Data Store (ODS) in HDFS using Impala.
  • Developing ETL using Hive and Impala based on the Source to Confidential mapping provided by the Data Governance team.
  • Providing data extracts in Parquet format to the Analytics team from the ODS using Impala (see the Impala sketch after this list).
  • Ensuring data security for EDW-sensitive data moving in and out of HDFS using HPE encryption during the Sqoop phase (see the Sqoop sketch after this list).
  • Developed a data migration procedure to import data from legacy systems (DB2, AS400) into HDFS.
  • Generating metadata code artifacts in HDFS for the source tables using Python.
  • Processing CDC data into HDFS for DB2 and AS400 source system tables using DataStage.
  • Automating incremental data loads to HDFS using Java-based Spring Batch workflow management and the Control-M scheduler.
  • Scheduling and maintaining batch job dependencies using Control-M.
  • Responsible for the Operational Data Store (ODS) layer feeding the EDW in Hadoop.
  • Supporting business-critical batch jobs in Hadoop to maintain SLAs.
  • Developed the Enterprise Data Platform support model and transitioned it to the application support team.
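
A minimal sketch of the kind of Sqoop incremental import behind the CDC loads above. The connection string, table, target directory and audit column are hypothetical placeholders, and the HPE encryption wrapping the transfer is assumed rather than shown.

```python
import subprocess

# Hypothetical connection details and table -- placeholders, not the actual source.
JDBC_URL = "jdbc:db2://db2host:50000/SRCDB"
SOURCE_TABLE = "CUSTOMER"
TARGET_DIR = "/data/ods/customer"

def run_incremental_import(last_value: str) -> None:
    """Sqoop a CDC-style delta of new/changed rows into HDFS as Parquet."""
    cmd = [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",            # assumed service account
        "--password-file", "/user/etl/.pw",  # keeps the password off the command line
        "--table", SOURCE_TABLE,
        "--target-dir", TARGET_DIR,
        "--as-parquetfile",                  # land the delta in Parquet
        "--incremental", "lastmodified",     # delta on a timestamp column
        "--check-column", "UPDT_TS",         # hypothetical audit column
        "--last-value", last_value,          # high-water mark from the previous run
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_incremental_import("2016-01-01 00:00:00")
```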
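
And a sketch of an Impala extract to Parquet for the analytics hand-off; the database, table and column names are hypothetical, and impala-shell is assumed to be on the path.

```python
import subprocess

# Hypothetical ODS table and analytics schema -- illustrative names only.
EXTRACT_SQL = """
CREATE TABLE analytics.policy_extract
STORED AS PARQUET
AS
SELECT policy_id, customer_id, premium_amt, effective_dt
FROM ods.policy
WHERE effective_dt >= '2016-01-01';
"""

def run_impala_extract(impalad_host: str = "impalad01") -> None:
    """Materialize an ODS extract as a Parquet table for the analytics team."""
    subprocess.run(["impala-shell", "-i", impalad_host, "-q", EXTRACT_SQL],
                   check=True)

if __name__ == "__main__":
    run_impala_extract()
```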

Environment: Oracle 11g, UNIX 5.1, Control-M, Impala, Cloudera, YARN

Confidential, Chicago, IL

Sr. ETL Developer

Responsibilities:

  • Gathered the project requirements and performed detailed analysis.
  • Developed a detailed development strategy for the entire application and developed various jobs consuming data from different sources into landing tables and transforming it into base tables.
  • Resolved issues with the business units and developers, and effectively coordinated development efforts with other areas and teams.
  • Ran the ETL process through the Control-M scheduler.
  • Effectively used DataStage Manager to import/export projects from the development server to the production server, and parameterized jobs for changing environments.
  • Performed data analysis, data modeling, database design, data migration and data acquisition using DataStage.
  • Developed jobs using DataStage 8.x to load the data from different sources, such as database tables and sequential files, to the Confidential database tables.
  • Involved in testing stored procedures and functions, unit and integration testing of DataStage jobs and batches, and fixing invalid mappings.
  • Wrote shell scripts to schedule the job sequences on the DataStage server (see the dsjob sketch after this list).
  • Implemented DataStage components to set up the repository using DataStage Manager, utilizing the Manager tools to import the source and Confidential database schemas.
  • Created and tested ETL processes composed of multiple DataStage jobs using the DataStage Job Sequencer.
  • Developed shell scripts to automate the processes, and developed shared containers reused across multiple jobs.
  • Created database tables and partitioned them to optimize data retrieval.
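
A hedged sketch of scheduling a job sequence from a script wrapper, written in Python for consistency with the other sketches; the project name, sequence name and RunDate parameter are hypothetical, while dsjob is the standard DataStage command-line interface.

```python
import subprocess

# Hypothetical project and sequence names -- placeholders for illustration.
PROJECT = "DW_PROJECT"
SEQUENCE = "SEQ_LOAD_BASE_TABLES"

def run_sequence(run_date: str) -> int:
    """Start a DataStage job sequence via the dsjob CLI and wait for its status."""
    cmd = [
        "dsjob", "-run",
        "-jobstatus",                      # block until the sequence finishes
        "-param", f"RunDate={run_date}",   # assumed job parameter
        PROJECT, SEQUENCE,
    ]
    return subprocess.run(cmd).returncode  # dsjob maps job status to the exit code

if __name__ == "__main__":
    print(f"dsjob exited with status {run_sequence('2015-06-30')}")
```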

Environment: Oracle 11g, UNIX 5.1, Control-M, DataStage 8.5/11.3

Confidential, New York, NY

Team Lead

Responsibilities:

  • Developed jobs using DataStage 8.x to load the data from different sources, such as database tables and sequential files, to the Confidential database tables.
  • Understood the client requirements by studying the approach document and the functional document, and prepared the technical specification.
  • Developed ETL jobs to load the data into the Confidential (HCM) tables.
  • Used DataStage Designer for Exporting and importing the jobs between development, testing and the production servers.
  • Created job sequencers to automate the ETL process.
  • Involved in performance tuning by creating indexes at the database level and modifying the ETL jobs to run in parallel wherever required.
  • Involved in Unit testing and supported System Integration testing, Quality Assurance Testing, User acceptance testing and Cut-Over Activities.
  • Involved in testing stored procedures and functions, unit and integration testing of DataStage jobs and batches, and fixing invalid mappings.
  • Involved in migrating the code from one environment to the other.
  • Replaced manual extraction with an automated process using DataStage.
  • Extensively wrote user-defined SQL to override the generated SQL queries in DataStage (see the SQL sketch after this list).
  • Worked with the development team on constant reviews of milestones and intermediate screens, and provided feedback on functionality and usability for both DataStage ETL and batch jobs.
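
An illustration of the shape of such a user-defined SQL override, held in a Python string only to keep one language across these sketches. The PeopleSoft-style tables and the effective-date filter are hypothetical examples, not the actual override.

```python
# Hypothetical user-defined SQL for an Oracle source stage in DataStage.
# The auto-generated SQL would pull whole tables; this override pushes the
# join and the effective-date filter down to the database instead.
USER_DEFINED_SQL = """
SELECT e.emplid,
       e.name,
       j.deptid,
       j.effdt
FROM   ps_personal_data e
JOIN   ps_job j
  ON   j.emplid = e.emplid
WHERE  j.effdt = (SELECT MAX(j2.effdt)          -- current effective-dated row
                  FROM   ps_job j2
                  WHERE  j2.emplid = j.emplid
                  AND    j2.effdt <= SYSDATE)
"""
```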

Environment: PeopleSoft EPM 8.9, Oracle 10g, UNIX, DataStage 8.1

Confidential, New York, NY

Sr. ETL Developer

Responsibilities:

  • Understood the client requirements by studying the approach document and the functional document, and prepared the technical specification.
  • Used DataStage Designer to develop processes for extracting, cleansing, transforming, integrating and loading data into the data warehouse database.
  • Extensively wrote user-defined SQL to override the auto-generated SQL queries in DataStage.
  • Developed ETL jobs to load the data into the Confidential tables (HCM).
  • Created job sequencers to automate the ETL process.
  • Involved in performance tuning by creating indexes at the database level and modifying the ETL jobs to run in parallel wherever required (see the tuning sketch after this list).
  • Involved in Unit testing and supported System Integration testing, Quality Assurance Testing, User acceptance testing and Cut-Over Activities.
  • Involved in migrating the code from one environment to the other.
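
A hedged sketch of the index and parallelism tuning applied during loads; the connection details and object names are hypothetical, and cx_Oracle is assumed as the driver.

```python
import cx_Oracle  # assumed driver; any Oracle client would do

# Hypothetical connection and object names -- placeholders for illustration.
DSN = cx_Oracle.makedsn("dbhost", 1521, service_name="DWPRD")

TUNING_DDL = [
    # Index the join/filter column the ETL jobs hit hardest.
    "CREATE INDEX ix_fact_policy_dt ON fact_policy (effective_dt)",
    # Let Oracle scan the large fact table in parallel during loads.
    "ALTER TABLE fact_policy PARALLEL 4",
]

def apply_tuning() -> None:
    conn = cx_Oracle.connect("etl_user", "secret", DSN)  # credentials are placeholders
    try:
        cur = conn.cursor()
        for ddl in TUNING_DDL:
            cur.execute(ddl)  # DDL statements auto-commit in Oracle
    finally:
        conn.close()

if __name__ == "__main__":
    apply_tuning()
```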

Environment: DataStage 9.1, Oracle 11g, UNIX 5.1, Qualify

Confidential, New York, NY

ETL Developer

Responsibilities:

  • Understood the requirement specifications.
  • Created all supporting documents, including process flows, LLDs, HLDs and the STM document.
  • Created Visio diagrams for the Control-M flow and for the high-level design.
  • Used DataStage Director to execute and monitor the jobs.
  • Reviewed jobs as per the LLD.
  • Developed Oracle queries to load data from source to foundation tables (see the load sketch after this list).
  • Extracted, cleansed, transformed, integrated and loaded data into the data warehouse.
  • Designed and developed the ETL parallel jobs in DataStage.
  • Used UNIX shell scripting for the ETL processes.
  • Involved in logging error messages and fixing the errors.
  • Prepared test scenarios, test cases and test data.
  • Coordinated with the onsite team on requirements.
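
A minimal sketch of the kind of source-to-foundation load with error logging described above; the staging and foundation tables are hypothetical, and cx_Oracle is assumed as the driver, as in the earlier tuning sketch.

```python
import logging
import cx_Oracle  # assumed driver, as in the tuning sketch above

logging.basicConfig(filename="foundation_load.log", level=logging.INFO)
log = logging.getLogger("foundation_load")

# Hypothetical staging and foundation tables -- illustrative names only.
LOAD_SQL = """
INSERT INTO fnd_employee (emplid, name, deptid, load_dt)
SELECT emplid, name, deptid, SYSDATE
FROM   stg_employee
WHERE  load_flag = 'N'
"""

def load_foundation(conn: "cx_Oracle.Connection") -> None:
    """Load new staging rows into the foundation table, logging any errors."""
    cur = conn.cursor()
    try:
        cur.execute(LOAD_SQL)
        conn.commit()
        log.info("Loaded %d rows into fnd_employee", cur.rowcount)
    except cx_Oracle.DatabaseError as exc:
        conn.rollback()
        log.error("Foundation load failed: %s", exc)
        raise

if __name__ == "__main__":
    connection = cx_Oracle.connect("etl_user", "secret", "dbhost/DWPRD")  # placeholders
    load_foundation(connection)
    connection.close()
```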

Environment: IBM InfoSphere DataStage Server 8.1, PeopleSoft EPM 9.0, PeopleTools 8.5, UNIX 5.1, Oracle 11g.
