Hadoop Developer Resume
Columbus, OH
SUMMARY:
- 10 years of total software development experience with the Hadoop ecosystem, Big Data and data science analytical platforms, and enterprise-level cloud-based computing and applications.
- More than 6 years of experience working in Agile methodology.
- Around 3 years of experience in design and implementation of Big Data applications using the Hadoop stack: Spark, Hive, Pig, Oozie, Sqoop, Flume, HBase and NoSQL databases.
- Hands-on experience in writing complex Hive queries, building NiFi data flows, and data modeling.
- Experience creating batch-style distributed computing applications using Apache Spark and Flume.
- Hands-on experience with Spark SQL and with the Hadoop architecture and its various frameworks and components.
- Experience and in-depth understanding of analyzing data using HiveQL and Pig.
- Worked extensively with Hive DDL and the Hive Query Language (HQL).
- Good hands-on experience with Pivotal's SQL-on-Hadoop query engine, HAWQ.
- In-depth understanding of NoSQL databases such as HBase.
- Proficient knowledge and hands-on experience in writing shell scripts in Linux.
- Adequate knowledge and working experience in Agile & Waterfall methodologies.
- Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
- Have a fairly good understanding of Kafka.
- Experienced in job workflow scheduling and monitoring tools like Oozie and ESP.
- Experience using various Hadoop distributions (Cloudera, Hortonworks, etc.) to fully implement and leverage new Hadoop features.
- Experienced in requirement analysis, application development, application migration and maintenance using Software Development Lifecycle (SDLC).
- Experience in ETL tools like Informatica Power Center (Repository Manager, Mapping Designer, Workflow Manager and Workflow Monitor).
- Hands-on experience in reporting tools such as MicroStrategy and Tableau.
- Hands-on experience working with schedulers like ESP, DAC, AutoSys, Control-M and SOS-Berlin.
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Oozie, ZooKeeper
ETL Tools: Informatica 9.6, 9.5.1, 9.1, 8.6
Business Intelligence Tools: R Studio, Tableau 9.1, MicroStrategy, MS Excel - Analytical Solver
Databases & Tools: Teradata 13, Netezza, Oracle 11g
Scheduler: Database Administration Console 10 (DAC), AutoSys.
Project Planning & Tracking: HP ALM, JIRA
Content management: Confluence
Release management: TFS, Tortoise SVN
Programming: R, Python, Spark, SQL
Database & Skills: Oracle 11g, Netezza
Operating Systems: Windows 2000, NT, XP, UNIX.
PROFESSIONAL EXPERIENCE:
Confidential - Columbus, OH
Hadoop Developer
Responsibilities:
- Spearheaded the POC and migration of the data warehouse from Informatica 9.1 to Hadoop using the Hadoop tools HDFS, Hive, Pig and NiFi.
- Hands-on experience in data ingestion using NiFi.
- Created reusable NiFi templates to load the data from different source systems into the Raw layer.
- Experienced in loading and transforming large sets of structured and semi-structured data ingested through Sqoop and placed in HDFS for further processing.
- Designed appropriate partitioning/bucketing schemas to allow faster data retrieval during analysis using Hive (a DDL sketch follows this entry).
- Involved in processing the data in the Hive tables using high-performance, low-latency HQL queries.
- Transferred the analyzed data from HDFS to relational databases using Sqoop, enabling the BI team to visualize the analytics.
- Developed custom aggregate functions using Spark SQL and performed interactive querying (an aggregate UDF sketch follows this entry).
- Managed and scheduled jobs on the Hadoop cluster using Airflow DAGs (a DAG sketch follows this entry).
- Involved in creating Hive tables, loading data and running Hive queries on that data.
- Extensive working knowledge of partitioned-table performance tuning and compression-related properties in Hive.
- Worked with the Data Engineering Platform team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
Environment: Informatica, Teradata, Tableau, MicroStrategy 10.7, CDH 5.4.5, Hive 1.2.1, HBase 1.1.2, Flume 1.5.2, MapReduce, Sqoop 1.4.6, Spark, NiFi (standalone cluster), Nagios, Shell Script, Oozie 4.2.0, ZooKeeper 3.4.6.
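A minimal sketch of the partitioned Hive DDL referenced above, assuming hypothetical database, table and column names and a Spark 2.x session with Hive support (on the older CDH/Spark stack listed above, HiveContext plays the same role); bucketing would be added with a CLUSTERED BY clause in the native Hive DDL.

    from pyspark.sql import SparkSession

    # Hypothetical names; the real schema is not part of this resume.
    spark = SparkSession.builder.appName("hive-partition-sketch").enableHiveSupport().getOrCreate()

    # Partition on the load date so analytical queries can prune whole partitions.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS raw_db.shipments (
            shipment_id STRING,
            customer_id STRING,
            amount      DOUBLE
        )
        PARTITIONED BY (load_date STRING)
        STORED AS ORC
    """)

    # A query that filters on the partition column scans only the matching partition.
    spark.sql("""
        SELECT customer_id, SUM(amount) AS total_amount
        FROM raw_db.shipments
        WHERE load_date = '2017-01-01'
        GROUP BY customer_id
    """).show()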
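A minimal sketch of a custom aggregate for Spark, with hypothetical table and column names, assuming a Spark release with pandas grouped-aggregate UDF support (2.4+), which is newer than the CDH stack listed above; the original aggregate logic is not reproduced here.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf, PandasUDFType

    spark = SparkSession.builder.appName("agg-udf-sketch").enableHiveSupport().getOrCreate()

    # Hypothetical weighted-average aggregate over placeholder columns.
    @pandas_udf("double", PandasUDFType.GROUPED_AGG)
    def weighted_avg(values, weights):
        return (values * weights).sum() / weights.sum()

    df = spark.table("raw_db.shipments_scored")  # hypothetical Hive table
    (df.groupBy("customer_id")
       .agg(weighted_avg(df["score"], df["weight"]).alias("weighted_score"))
       .show())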
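A minimal Airflow DAG sketch for the kind of job scheduling mentioned above, with hypothetical task names, paths and commands; it chains a daily HDFS ingestion step and a Hive transformation step.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    default_args = {
        "owner": "hadoop",
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
    }

    # Hypothetical DAG: land raw files in HDFS, then run a Hive transformation script.
    dag = DAG(
        dag_id="daily_raw_load",
        default_args=default_args,
        start_date=datetime(2017, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    )

    ingest = BashOperator(
        task_id="ingest_raw",
        bash_command="hdfs dfs -put -f /data/incoming/*.csv /raw/incoming/",
        dag=dag,
    )

    transform = BashOperator(
        task_id="hive_transform",
        bash_command="hive -f /etl/scripts/transform_raw.hql",
        dag=dag,
    )

    ingest >> transform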
Confidential
Hadoop Developer
Responsibilities:
- Working on a project involving migration of data from mainframes to an HDFS data lake and creating reports by performing transformations on the data landed in the Hadoop data lake.
- Built a Python script to extract the data from the HAWQ tables and generate a .dat file for the downstream application.
- Built a generic framework in Python to parse fixed-length raw data; it takes a JSON layout describing the fixed positions of the fields and loads the data into HAWQ tables (a parser sketch follows this entry).
- Built a generic framework in Python that transforms two or more data sets in HDFS.
- Built generic Sqoop/HAWQ frameworks in Python to load data from SQL Server to HDFS and from HDFS to HAWQ (a Sqoop wrapper sketch follows this entry).
- Performed extensive data validation, using HAWQ partitions for efficient data access.
- Built a generic framework in Python that allows us to update the data in HAWQ tables.
- Coordinated all testing phases and worked closely with the performance testing team to create a baseline for the new application.
- Created automated workflows in CA-ESP that schedule daily data-loading and transformation jobs.
- Developed functions using PL/Python for various use cases (a PL/Python sketch follows this entry).
- Prepared technical design documents and production support documents.
- Wrote Python scripts to create automated workflows.
- Technology Platforms: PHD 2.0, HAWQ 1.2, Sqoop 1.4, Python 2.6, SQL
Environment: Informatica, Netezza, Hadoop HDP 2.1, Oracle, SQL Server, ZooKeeper 3.4.6, Oozie 4.1.0, MapReduce, YARN 2.6.1, HDFS, Sqoop 1.4.6, Hive 1.2.1, Pig 0.15.0.
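A minimal sketch of the JSON-layout-driven fixed-length parser described above, with hypothetical field names and file paths; the HAWQ load step of the real framework is not reproduced here.

    import json

    def load_layout(layout_path):
        """Read a JSON layout listing each field's name, start position and length."""
        with open(layout_path) as f:
            return json.load(f)  # e.g. [{"name": "acct_id", "start": 0, "length": 10}, ...]

    def parse_fixed_width(data_path, layout):
        """Yield one dict per record, slicing each field out of the fixed-width line."""
        with open(data_path) as f:
            for line in f:
                yield dict(
                    (field["name"], line[field["start"]:field["start"] + field["length"]].strip())
                    for field in layout
                )

    if __name__ == "__main__":
        # Hypothetical paths; the real layouts described the mainframe extracts.
        layout = load_layout("layouts/accounts.json")
        for record in parse_fixed_width("data/accounts.dat", layout):
            print(record)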
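A minimal sketch of a Python-driven Sqoop load of the kind mentioned above, assuming hypothetical connection details, table and paths; it simply builds and runs a sqoop import command from SQL Server into HDFS.

    import subprocess

    def sqoop_import(jdbc_url, username, password_file, table, target_dir, num_mappers=4):
        """Build and run a sqoop import from an RDBMS table into an HDFS directory."""
        cmd = [
            "sqoop", "import",
            "--connect", jdbc_url,
            "--username", username,
            "--password-file", password_file,
            "--table", table,
            "--target-dir", target_dir,
            "--num-mappers", str(num_mappers),
        ]
        subprocess.check_call(cmd)

    if __name__ == "__main__":
        # Hypothetical SQL Server source and HDFS target directory.
        sqoop_import(
            jdbc_url="jdbc:sqlserver://dbhost:1433;databaseName=sales",
            username="etl_user",
            password_file="/user/etl_user/.sqoop.pwd",
            table="orders",
            target_dir="/raw/sales/orders",
        )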
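A minimal PL/Python sketch of the kind of function mentioned above, assuming hypothetical connection details and function logic, and assuming PL/Python is enabled in the HAWQ database (HAWQ speaks the PostgreSQL wire protocol, so psycopg2 is used here as the client).

    import psycopg2

    # Hypothetical connection parameters.
    conn = psycopg2.connect(host="hawq-master", port=5432, dbname="analytics", user="etl_user")
    cur = conn.cursor()

    # Hypothetical PL/Python function that normalizes account codes.
    cur.execute("""
        CREATE OR REPLACE FUNCTION clean_code(code text) RETURNS text AS $$
            return code.strip().upper() if code is not None else None
        $$ LANGUAGE plpythonu;
    """)
    conn.commit()

    cur.execute("SELECT clean_code(' ab123 ')")
    print(cur.fetchone()[0])  # prints: AB123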
Maersk Line - Charlotte, NC
ETL Lead
Responsibilities:
- Analyzed content and quality of databases, recommended data management procedures, and developed extraction/ETL processes.
- Acted as offshore coordinator, led the offshore team by providing them mapping documents, and acted as the point of contact for the onsite team.
- Documented user requirements, translated requirements into system solutions, and developed the implementation plan and schedule.
- Responsible for migrating the Informatica code from one environment to another by creating XML files using Informatica Repository Manager.
- Developed Informatica mappings to load the data into dimension and fact tables.
- Analyzed the business requirements and functional specifications.
- Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
- Used Informatica Power Center for extraction, transformation and load (ETL) of data in the data warehouse.
- Extensively used transformations like Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
- Developed complex mappings in Informatica to load the data from various sources.
- Implemented performance tuning logic on targets, sources, mappings, sessions to provide maximum efficiency and performance.
- Parameterized the mappings and increased the re-usability.
- Used Informatica Power Center Workflow manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
- Created procedures to truncate data in the target before the session run.
- Extensively used Toad utility for executing SQL scripts and worked on SQL for enhancing the performance of the conversion mapping.
- Created the ETL exception reports and validation reports after the data is loaded into the warehouse database.
- Wrote documentation to describe program development, logic, coding, testing, changes and corrections.
- Created test cases for the mappings developed and then created the integration testing document.
- Followed Informatica recommendations, methodologies and best practices.
Environment: Informatica 9.6, Oracle, OGG, Teradata, HP ALM, Teradata GCFR Framework, Teradata BI Temporal Framework, IBM WS MQ, SAP-BO-XI
FICO - San Jose, CA
ETL Developer
Responsibilities :
- Analyzed the business requirements and functional specifications.
- Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
- Used Informatica Power Center for extraction, transformation and load (ETL) of data in the data warehouse.
- Extensively used transformations like Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
- Developed complex mappings in Informatica to load the data from various sources.
- Implemented performance tuning logic on targets, sources, mappings, sessions to provide maximum efficiency and performance.
- Parameterized the mappings and increased the re-usability.
- Used Informatica Power Center Workflow manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
- Created test cases for the mappings developed and then created the integration testing document.
Environment: Traid Importer, X-Book, Query center, Informatica 9.1, SQL Developer, Oracle 11g, MySQL
Confidential
ETL Developer
Responsibilities:
- Analyzed the business requirements and functional specifications.
- Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
- Used Informatica Power Center for extraction, transformation and load (ETL) of data in the data warehouse.
- Extensively used transformations like Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
- Developed complex mappings in Informatica to load the data from various sources.
- Implemented performance tuning logic on targets, sources, mappings, sessions to provide maximum efficiency and performance.
- Parameterized the mappings and increased the re-usability.
- Used Informatica Power Center Workflow manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
- Created test cases for the mappings developed and then created the integration testing document.
Environment: Informatica 9.1, SQL Developer, Oracle 11g, SQL Server 2005, Netezza, Sybase, UNIX
Confidential
ETL Developer
Responsibilities:
- Developed Informatica mappings to load the data into dimension and fact tables.
- Analyzed the business requirements and functional specifications.
- Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database (warehouse).
- Used Informatica Power Center for extraction, transformation and load (ETL) of data in the data warehouse.
- Extensively used transformations like Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy and Sequence Generator.
- Parameterized the mappings and increased the re-usability.
- Used Informatica Power Center Workflow manager to create sessions, workflows and batches to run with the logic embedded in the mappings.
Environment: Informatica 9.1, 8.6, SQL Developer, Oracle 11g, Netezza, QC, DAC, UNIX.
Confidential
ETL Developer
Responsibilities:
- Used Informatica to populate data into the staging area, warehouse and operational data store.
- Created transformations and mappings including Expression, Aggregator, Filter, Router, Joiner and Lookup.
- Experience with Slowly Changing Dimensions.
- Parameterized the mappings and increased the re-usability.
- Wrote documentation to describe program development, logic, coding, testing, changes and corrections.
- Created test cases for the mappings developed and then created the integration testing document. Followed Informatica recommendations, methodologies and best practices.
- Extensively used Toad utility for executing SQL scripts and worked on SQL for enhancing the performance of the conversion mapping.
Environment: Informatica 8.6, Toad, Oracle 10g.