
Hadoop Developer Resume


Columbus, OH

SUMMARY:

  • 10 years of total software development experience with the Hadoop ecosystem, Big Data and data science analytical platforms, and enterprise-level cloud-based computing and applications.
  • More than 6 years of experience working in Agile methodology.
  • Around 3 years of experience in design and implementation of Big Data applications using the Hadoop stack: Spark, Hive, Pig, Oozie, Sqoop, Flume, HBase and NoSQL databases.
  • Hands-on experience in writing complex Hive queries, building NiFi data flows and data modeling.
  • Experience creating batch-style distributed computing applications using Apache Spark and Flume.
  • Hands-on experience with Spark SQL and with the Hadoop architecture framework and its various components.
  • Experience and in-depth understanding of analyzing data using HiveQL and Pig.
  • Worked extensively with Hive DDL and Hive Query Language (HQL).
  • Good hands-on experience with Pivotal's HAWQ query processing engine.
  • In-depth understanding of NoSQL databases such as HBase.
  • Proficient knowledge and hands-on experience in writing shell scripts in Linux.
  • Adequate knowledge and working experience in Agile and Waterfall methodologies.
  • Experience in importing and exporting data using Sqoop between relational database systems and HDFS.
  • Fairly good understanding of Kafka.
  • Experienced in job workflow scheduling and monitoring tools like Oozie and ESP.
  • Experience using various Hadoop distributions (Cloudera, Hortonworks, etc.) to fully implement and leverage new Hadoop features.
  • Experienced in requirement analysis, application development, application migration and maintenance using the Software Development Lifecycle (SDLC).
  • Experience in ETL tools like Informatica PowerCenter (Repository Manager, Mapping Designer, Workflow Manager and Workflow Monitor).
  • Hands-on experience in reporting tools such as MicroStrategy and Tableau.
  • Hands-on experience working with schedulers like ESP, DAC, AutoSys, Control-M and SOS-Berlin.

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Oozie, Zookeeper

ETL Tools: Informatica 9.6, 9.5.1, 9.1, 8.6

Business Intelligence Tools: R Studio, Tableau 9.1, MicroStrategy, MS Excel - Analytical Solver

Databases & Tools: Teradata 13, Netezza, Oracle 11g

Scheduler: Database Administration Console 10 (DAC), AutoSys

Project Planning & Tracking: HP ALM, JIRA

Content management: Confluence

Release management: TFS, Tortoise SVN

Programming: R, Python, Spark, SQL

Database & Skills: Oracle 11g, Netezza

Operating Systems: Windows 2000, NT, XP, UNIX.

PROFESSIONAL EXPERIENCE:

Confidential - Columbus, OH

Hadoop Developer

Responsibilities:

  • Spearheaded the POC and the migration of the data warehouse from Informatica 9.1 to Hadoop using the Hadoop tools HDFS, Hive, Pig and NiFi.
  • Hands-on experience in data ingestion using NiFi.
  • Created reusable NiFi templates to load data from different source systems into the raw layer.
  • Experienced in loading and transforming large sets of structured and semi-structured data with Sqoop and placing it in HDFS for further processing.
  • Designed appropriate partitioning/bucketing schemas to allow faster data retrieval during analysis using Hive (see the DDL sketch after this list).
  • Involved in processing the data in Hive tables using high-performance, low-latency HQL queries.
  • Transferred the analyzed data from HDFS to relational databases using Sqoop, enabling the BI team to visualize the analytics.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying.
  • Managed and scheduled jobs on the Hadoop cluster using Airflow DAGs (see the DAG sketch after this list).
  • Involved in creating Hive tables, loading data and running Hive queries on that data.
  • Extensive working knowledge of partitioned-table performance tuning and compression-related properties in Hive.
  • Work with the Data Engineering Platform team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
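
A minimal sketch of the partitioned Hive layout referenced above, issued through Spark SQL; bucketing would be added in the Hive DDL with a CLUSTERED BY clause. The table and column names (claims_raw, claims_by_month, load_month) are illustrative placeholders, not the actual project schema.

```python
# Illustrative sketch only: table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partitioning-sketch")
         .enableHiveSupport()          # connect to the Hive metastore
         .getOrCreate())

# Partition by load month so queries filtering on a month scan only that
# partition's files instead of the whole table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS claims_by_month (
        claim_id     STRING,
        customer_id  STRING,
        claim_amount DOUBLE
    )
    PARTITIONED BY (load_month STRING)
    STORED AS ORC
""")

# Populate the partitions dynamically from a raw staging table.
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
spark.sql("""
    INSERT OVERWRITE TABLE claims_by_month PARTITION (load_month)
    SELECT claim_id, customer_id, claim_amount, load_month
    FROM claims_raw
""")

# Partition pruning: only the requested month's files are read.
spark.sql("""
    SELECT customer_id, SUM(claim_amount) AS total_amount
    FROM claims_by_month
    WHERE load_month = '2017-06'
    GROUP BY customer_id
""").show()
```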
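
A hedged sketch of the kind of Airflow scheduling mentioned above, written with Airflow 2-style imports; the DAG id, schedule and shell commands are placeholders rather than the production workflow.

```python
# Hypothetical Airflow DAG: dag id, schedule, paths and commands are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "hadoop",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_claims_load",
    default_args=default_args,
    start_date=datetime(2018, 1, 1),
    schedule_interval="0 2 * * *",   # run the pipeline at 2 AM every day
    catchup=False,
) as dag:

    ingest = BashOperator(
        task_id="sqoop_ingest",
        bash_command="sqoop import --connect jdbc:oracle:thin:@//dbhost:1521/ORCL "
                     "--table CLAIMS --target-dir /data/raw/claims/{{ ds }} -m 4",
    )

    transform = BashOperator(
        task_id="hive_transform",
        bash_command="hive -f /opt/etl/transform_claims.hql -hiveconf run_date={{ ds }}",
    )

    ingest >> transform   # run the Hive transformation only after ingestion succeeds
```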

Environment: Informatica, Teradata, Tableau, MicroStrategy 10.7, CDH 5.4.5, Hive 1.2.1, HBase 1.1.2, Flume 1.5.2, MapReduce, Sqoop 1.4.6, Spark, NiFi (standalone cluster), Nagios, Shell Script, Oozie 4.2.0, Zookeeper 3.4.6.

Confidential

Hadoop Developer

Responsibilities:

  • Working on a project involving migration of data from mainframes to an HDFS data lake and creating reports by performing transformations on the data placed in the Hadoop data lake.
  • Built a Python script to extract data from the HAWQ tables and generate a ".dat" file for the downstream application.
  • Built a generic framework in Python to parse fixed-length raw data; it takes a JSON layout describing the fixed positions of the fields and loads the data into HAWQ tables (see the parser sketch after this list).
  • Built a generic framework in Python that transforms two or more data sets in HDFS.
  • Built generic Sqoop/HAWQ frameworks in Python to load data from SQL Server to HDFS and from HDFS into HAWQ (see the Sqoop wrapper sketch after this list).
  • Performed extensive data validation and used HAWQ partitions for efficient data access.
  • Built a generic framework in Python that allows us to update the data in HAWQ tables.
  • Coordinated in all testing phases and worked closely with the performance testing team to create a baseline for the new application.
  • Created automated workflows in CA ESP that schedule daily data-loading and transformation jobs.
  • Developed functions using PL/Python for various use cases.
  • Wrote technical design documents and production support documents.
  • Wrote Python scripts to create automated workflows.
  • Technology platforms: PHD 2.0, HAWQ 1.2, Sqoop 1.4, Python 2.6, SQL
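
A minimal sketch, under assumptions, of the JSON-layout-driven fixed-length parser described above; the layout format, field names and file paths are illustrative guesses, not the actual framework.

```python
# Illustrative sketch: layout format, field names and file paths are hypothetical.
import json

def load_layout(layout_path):
    """Read a JSON layout listing each field's name, start position and length."""
    with open(layout_path) as f:
        # e.g. {"fields": [{"name": "acct_id", "start": 0, "length": 10}, ...]}
        return json.load(f)["fields"]

def parse_fixed_width(data_path, fields):
    """Yield one dict per fixed-length record, sliced by the layout positions."""
    with open(data_path) as f:
        for line in f:
            yield {fld["name"]: line[fld["start"]:fld["start"] + fld["length"]].strip()
                   for fld in fields}

def write_delimited(records, out_path, fields, delimiter="|"):
    """Write delimited output that a HAWQ external table or COPY can load."""
    with open(out_path, "w") as f:
        f.write(delimiter.join(fld["name"] for fld in fields) + "\n")
        for rec in records:
            f.write(delimiter.join(rec[fld["name"]] for fld in fields) + "\n")

if __name__ == "__main__":
    fields = load_layout("claims_layout.json")
    write_delimited(parse_fixed_width("claims_fixed.dat", fields), "claims.psv", fields)
```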
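
A sketch of the kind of Python wrapper around the Sqoop CLI suggested by the framework bullets above; the JDBC URL, credentials, table name and HDFS paths are placeholders.

```python
# Hypothetical wrapper: connection string, credentials, table and paths are placeholders.
import subprocess

def sqoop_import(jdbc_url, username, password_file, table, target_dir, mappers=4):
    """Build and run a Sqoop import from SQL Server into HDFS, raising on failure."""
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # HDFS path holding the password
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
        "--fields-terminated-by", "|",
    ]
    subprocess.check_call(cmd)

if __name__ == "__main__":
    sqoop_import(
        jdbc_url="jdbc:sqlserver://dbhost:1433;databaseName=claims",
        username="etl_user",
        password_file="/user/etl/.sqoop_pwd",
        table="CLAIMS",
        target_dir="/data/raw/claims",
    )
```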

Environment: Informatica, Netezza, Hadoop HDP 2.1, Oracle, SQL Server, Zookeeper 3.4.6, Oozie 4.1.0, MapReduce, YARN 2.6.1, HDFS, Sqoop 1.4.6, Hive 1.2.1, Pig 0.15.0.

Maersk Line - Charlotte, NC

ETL Lead

Responsibilities:

  • Analyzing content and quality of databases, recommending data management procedures, and developing extraction/ ETL processes.
  • Acted as the offshore coordinator, leading the offshore team by providing mapping documents and serving as the point of contact for the onsite team.
  • Documented user requirements, translated requirements into system solutions, and developed the implementation plan and schedule.
  • Responsible for migrating Informatica code from one environment to another by creating XML files using Informatica Repository Manager.
  • Developed Informatica mappings to load the data into dimension and fact tables.
  • Analyzed the business requirements and functional specifications.
  • Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
  • Used Informatica PowerCenter for extraction, transformation and load (ETL) of data into the data warehouse.
  • Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
  • Developed complex mappings in Informatica to load data from various sources.
  • Implemented performance tuning logic on targets, sources, mappings and sessions to provide maximum efficiency and performance.
  • Parameterized the mappings and increased their re-usability.
  • Used the Informatica PowerCenter Workflow Manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
  • Created procedures to truncate data in the target before the session run.
  • Extensively used the Toad utility for executing SQL scripts and worked on SQL to enhance the performance of the conversion mappings.
  • Created the ETL exception reports and validation reports after the data was loaded into the warehouse database.
  • Wrote documentation to describe program development, logic, coding, testing, changes and corrections.
  • Created test cases for the developed mappings and then created the integration testing document.
  • Followed Informatica recommendations, methodologies and best practices.

Environment: Informatica 9.6, Oracle, OGG, Teradata, HP ALM, Teradata GCFR Framework, Teradata BI Temporal Framework, IBM WS MQ, SAP BO XI

FICO - San Jose, CA

ETL Developer

Responsibilities:

  • Analyzed the business requirements and functional specifications.
  • Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
  • Used Informatica PowerCenter for extraction, transformation and load (ETL) of data into the data warehouse.
  • Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
  • Developed complex mappings in Informatica to load data from various sources.
  • Implemented performance tuning logic on targets, sources, mappings and sessions to provide maximum efficiency and performance.
  • Parameterized the mappings and increased their re-usability.
  • Used the Informatica PowerCenter Workflow Manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
  • Created test cases for the developed mappings and then created the integration testing document.

Environment: Traid Importer, X-Book, Query Center, Informatica 9.1, SQL Developer, Oracle 11g, MySQL

Confidential

ETL Developer

Responsibilities:

  • Analyzed the business requirements and functional specifications.
  • Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
  • Used Informatica PowerCenter for extraction, transformation and load (ETL) of data into the data warehouse.
  • Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
  • Developed complex mappings in Informatica to load data from various sources.
  • Implemented performance tuning logic on targets, sources, mappings and sessions to provide maximum efficiency and performance.
  • Parameterized the mappings and increased their re-usability.
  • Used the Informatica PowerCenter Workflow Manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
  • Created test cases for the developed mappings and then created the integration testing document.

Environment: Informatica 9.1, SQL Developer, Oracle 11g, SQL Server 2005, Netezza, Sybase, UNIX

Confidential

ETL Developer

Responsibilities:

  • Developed Informatica mappings to load the data into dimension and fact tables.
  • Analyzed the business requirements and functional specifications.
  • Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
  • Used Informatica PowerCenter for extraction, transformation and load (ETL) of data into the data warehouse.
  • Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy and Sequence Generator.
  • Parameterized the mappings and increased their re-usability.
  • Used the Informatica PowerCenter Workflow Manager to create sessions, workflows and batches to run with the logic embedded in the mappings.

Environment: Informatica 9.1, 8.6, SQL Developer, Oracle 11g, Netezza, QC, DAC, UNIX.

Confidential

ETL Developer

Responsibilities:

  • Used Informatica to populate data into the staging area, the warehouse and the operational data store.
  • Created transformations and mappings including Expression, Aggregator, Filter, Router, Joiner and Lookup.
  • Experience with Slowly Changing Dimensions.
  • Parameterized the mappings and increased their re-usability.
  • Wrote documentation to describe program development, logic, coding, testing, changes and corrections.
  • Created test cases for the developed mappings and then created the integration testing document. Followed Informatica recommendations, methodologies and best practices.
  • Extensively used the Toad utility for executing SQL scripts and worked on SQL to enhance the performance of the conversion mappings.
Environment: Informatica 8.6, Toad, Oracle 10g.
