Hadoop Developer Resume

Mt Laurel, NJ

PROFESSIONAL SUMMARY:

  • 9+ years of experience in the IT industry implementing various data warehousing projects
  • Strong experience in analysis, design, implementation and testing of Enterprise Data Warehouse and Business Intelligence applications
  • Worked with various data warehousing technologies such as Teradata and DataStage
  • Working as an ETL developer with Teradata and DataStage
  • Extensive experience with Teradata utilities such as BTEQ, FastLoad, MultiLoad, TPump, TPT and Teradata SQL Assistant
  • Extensive experience with performance tuning
  • Hands-on experience working with relational databases such as Oracle, MS SQL Server, MS Access and DB2
  • Strong understanding of data warehouse principles and methodologies (Ralph Kimball/Bill Inmon). Experienced in designing and implementing star and snowflake schemas
  • Strong knowledge & experience in UNIX shell scripting and Python scripting
  • Hands-on experience with the job scheduling tools Autosys, TWS, UC4 and Oozie.
  • Experienced in design and development of common reusable ETL components and Python scripts to define audit, job monitoring and error logging processes.
  • Experienced in using advanced SerDe JARs to handle XML, JSON and COBOL inputs.
  • Experienced in working in SOA environments and familiar with the MQ Connector stage
  • Familiar with service management and IT governance tools such as ServiceNow and JIRA.
  • Efficient in creating source-to-target mappings (ETL mappings) from various source systems into the target EDW (Enterprise Data Warehouse) and documenting design specifications.
  • Very good experience with Hadoop tools such as HDFS, Hive, Sqoop, Kafka, Spark and Scala.
  • Strong understanding of SDLC and Agile Methodologies
  • 5+ years of working experience in the health care domain (Confidential)
  • Over 2 years of working experience in the banking domain (Barclays Bank)

TECHNICAL SKILLS:

ETL Tools: Teradata 12, 13, 14 and 15 Utility Suite, IBM WebSphere DataStage v7.5, IBM InfoSphere Information Server 8.5, 8.7, 9.1.2, 11.1, Hive, Spark 2.x and Talend

Databases: Oracle 10g/9i, Teradata 12, 13, 14 and 15, DB2 v10.1, MS SQL Server 2000

Environment: IBM AIX 5.3, AIX 6.1, Linux, Windows XP/7

Others: SQL*Plus, SQL*Loader, TOAD, ServiceNow ticketing tool, HPSM, Autosys 4.5, Tivoli Workload Scheduler, HPQC

Hadoop: HDFS, Hive, Sqoop, Python, MongoDB, Spark, Scala and Kafka (real-time reads)

PROFESSIONAL EXPERIENCE:

Confidential, Mt. Laurel, NJ

Hadoop Developer

Responsibilities:

  • Built a replica of the data warehouse in the data lake.
  • Converted unformatted, variable-length COBOL files into fixed-width files using Python scripting.
  • Designed generic Python scripts to handle simple COBOL files according to copybook length specifications.
  • Created Podium entities (Hive tables) to load the data, which is split according to segments and identifiers.
  • Ingested COBOL files into Hive tables using Podium.
  • Used Sqoop to import data from RDBMS sources (Teradata and Oracle) into HDFS.
  • Built Oozie workflows to automate the complete data lifecycle on daily, weekly and monthly schedules.
  • Responsible for extracting user reports according to business rules and aggregations.
  • Created Hive ingestion, extraction and transformation processes using various Hive performance techniques.
  • Optimized Hive queries.

Confidential, Sunnyvale, CA

Data Engineer

Responsibilities:

  • Worked on multiple projects to prepare Teradata reports and send them to end customers.
  • Handled historical back-runs using TPT load.
  • Worked with the UC4 (Automic) scheduler and Hive database.
  • Led the offshore team to ensure on-time delivery of work.
  • Delivered an end-to-end Teradata solution and Hadoop migration.
  • Built complete end-to-end solutions on the data lake.
  • Designed Hive dynamic-partition tables that replicate the Teradata data warehouse.
  • Extracted data from heterogeneous sources such as databases, flat files and real-time tools like Kafka, and ingested it into HDFS as raw data that accumulates the complete history.
  • Extracted deltas and fed them into the Hive dynamic-partition tables (Bus Date).
  • Exploded nested data and flattened it so the business could understand it.
  • Created Kafka events for real-time streaming data, fed them into Hadoop, merged the data and loaded it into the Hive dynamic-partition tables.
  • Used the ORC file format and vectorization for fast query performance, bucketing to slice the data further for greater performance, and dynamic partitioning to avoid full file scans.

Confidential, Louis, Missouri

Sr. Teradata/Hadoop Developer

Responsibilities:

  • Extracted data from different data sources using ETL tools
  • Involved in the design of ETL jobs to apply transformation logic
  • Worked on complex mappings in DataStage
  • Used the FastLoad and MultiLoad utilities to load data into staging and target tables respectively
  • As an innovation initiative, developed an automated script that dynamically picks the suitable load utility based on data volume
  • Worked on the TPTLOAD & TPTSTREAM utilities
  • Used Teradata export utilities for reporting purposes
  • Developed TWS jobs to schedule the process based on the frequency of sources.
  • Created dependencies on the source system using hard dependencies rather than file watchers.
  • Involved in purging data based on retention period and developed an automated process to purge the data
  • Recently integrated with the Hadoop data lake, using Hive as a structured data layer on top of HDFS.
  • Successfully implemented Spark as the processing backbone for Hive queries and wrote Scala functions to refine unstructured and poor-quality data so the ETL layer can consume it.

Environment: Teradata TD14, DataStage 8.7, Unix/Linux, TWS, RCS, HPSM ticketing, TPTLOAD & TPTSTREAM, FastLoad and MultiLoad, Hive, Spark and Scala.

Confidential

Senior Teradata/ETL developer

Responsibilities:

  • Extensively used DataStage Designer and Teradata to build the ETL process, which pulls the data and applies grouping techniques in the job design.
  • Developed master jobs (job sequencing) for controlling the flow of both parallel and server jobs.
  • Parameterized variables rather than hardcoding them, and used Director widely to monitor job flow and processing speed.
  • Based on that monitoring, tuned performance to improve job processing speed.
  • Developed Autosys jobs for scheduling, including box jobs, command jobs and file watcher jobs, and created ITG requests.
  • Closely monitored schedules and investigated failures to complete all ETL/load processes within the SLA.
  • Designed and developed SQL scripts and extensively used Teradata utilities such as BTEQ, FastLoad and MultiLoad to perform bulk database loads and updates.
  • After completing ETL activities, sent the corresponding load file to the Cube team for building cubes.
  • Created spec docs for automating manual processes.
  • Worked closely with onshore and business teams to resolve critical issues that occurred during the load process.
  • Later, when all the UGAP jobs were moved from Autosys to TWS, took part in the end-to-end migration, which completed successfully

Environment: Teradata TD14, DataStage 8.5, Unix/Linux, DB2 v10.1, TWS Maestro, HPSM, Mercury ITG, SQL, BTEQ.

Confidential

Teradata Database Analyst

Responsibilities:

  • Took part in data modelling, using the Erwin tool to build the architecture.
  • Wrote job control routines to invoke one DS job from another. Also designed a custom BuildOp stage to suit project requirements, which was reused across the jobs designed.
  • Used the Administrator to set environment variables and user-defined variables for different environments and projects
  • Incorporated error handling in job designs using error files and error tables to log errors containing data discrepancies, so the data can be analyzed and reprocessed.
  • Designed wrapper Unix shell scripts to invoke the DS jobs from the command line
  • Developed TWS schedules and jobs to schedule job runs
  • Job parameters were used extensively to parameterize the jobs.
  • Troubleshot technical issues that arose during the load cycle
  • Used Cognos to provide reporting data to the business
  • Implemented enhancements to the current ETL programs based on the new requirements.
  • Involved in all phases of testing and effectively interacted with testing team.
  • Documented ETL high-level design and detail level design documents.

Environment: Teradata TD13, DataStage 8.5, Unix/Linux, DB2, Oracle 10g, TWS Maestro, HPSM, Mercury ITG.

Confidential

Teradata Developer

Responsibilities:

  • Wrote ad-hoc SQL queries involving subqueries, joins, functions, etc.
  • Extensively used BTEQ scripts to handle the errors and to load the target tables.
  • Extensively worked on query tuning to achieve better performance using Explain Plan.
  • Worked with FastLoad scripts to load the data into source tables.
  • Eliminated spool space issues by creating proper indexes and statistics.
  • Performed data reconciliation checks across the various source and target tables.
  • Implemented error handling in FastLoad as well.
  • Created unit test cases and recorded unit test results
  • Worked on Impact Analysis and Implementation Plan.

Environment: Teradata TD12, Mainframes, SQL, BTEQ, FastLoad.

Confidential

Teradata Developer

Responsibilities:

  • Kept the daily processes and schedules healthy and fixed issues in time.
  • Tracked the history of failed jobs as incidents and fixed them by identifying the root cause.
  • Resolved various user-related issues while maintaining SLAs
  • Sent reports to different clients on time during daily loads.
  • Continuously checked database health and prioritized schedules.
  • Maintained issue logs for future reference and to ease the process further.
  • Managed resources so that the process runs smoothly with no delays in critical streams
  • Maintained the knowledge repository with current materials and other documents.

Environment: Teradata TD12, HP Service Centre, Tivoli Workload Scheduler 8.4, UNIX, Mainframes.

Confidential

Database Developer

Responsibilities:

  • Migrating the code from Mainframe environment into UNIX environment
  • Analysis and Review of the specification (Design) documents.
  • Coding all members required for project development, i.e. BTEQ, MLOAD and FLOAD import scripts, as per Confidential standards.
  • Execution of the scripts.
  • Preparation of Unit Test Plan and Unit Test Cases.
  • Performing code review as per Barclays code standards.
  • Execution of the scripts and preparation of Unit Test Report.
  • Performing Reconciliation check between source and target.

Environment: Teradata TD12, IBM JSC 8.4, UNIX, Mainframes, Unit Test Plan and Unit Test Cases, BTEQ, MLOAD and FLOAD.
