Senior Hadoop Developer Resume
Tampa, Florida
SUMMARY
- 8 years of extensive professional IT experience, including 2.5 years of Hadoop/Big Data experience, capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
- 2.5 years of experience in Big Data Analytics using HDFS, HIVE, FLUME, SQOOP, GPLOADER, HBase, HUE, Linux and Python automation scripting, and Informatica BDE.
- Over 2 years of Data warehousing and ETL experience using Informatica Power Center 9.6.1/9.5.1/9.1.0/8.6.1/8.1/7.1, Cloud Integration and IDQ as an Analyst.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop (see the sketch at the end of this summary).
- Responsible for performing extensive data validation using HIVE Dynamic Partitioning and Bucketing.
- Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Java into Pig Latin and HQL (HiveQL).
- Experience in streaming data to HDFS using Flume.
- Experienced in defining workflows with Oozie.
- Expertise in writing ETL Jobs for analyzing data using Pig.
- Expertise in Linux shell scripting (ksh, sh), Python scripting, DOS scripting, job scheduling in CRON, Control-M and IBM Workload Manager.
- Experience with FTP/SFTP/SCP for transferring files between various systems.
- Extensive database experience using Oracle 11g/10g/9i, Teradata and MS SQL Server.
- Responsible for all activities related to the development, implementation, administration and support of ETL processes for large data warehouses using Power Center.
- Experience using Informatica command-line utilities such as pmcmd and pmrep.
- Extensively worked on Informatica Designer components: Source Analyzer, Target Designer, Transformation Developer, Mapping Designer and Mapplet Designer.
- Extensively worked on Informatica Power Center Transformations such as Source Qualifier, Lookup, Filter, Expression, Router, Joiner, Update Strategy, Rank, Aggregator, Stored Procedure, Transaction Control, Java, SQL, Sorter, and Sequence Generator.
- Proficient in Data warehouse design based on Ralph Kimball and Bill Inmon methodologies.
- Extensively designed and developed Slowly Changing Dimension (SCD) Type 1, 2 and 3 mappings.
- Good experience in Informatica and SQL Performance Tuning.
- Strong hands-on experience using Teradata standalone loader utilities (FastExport, FastLoad, MultiLoad, TPump and TPT) and BTEQ scripts.
- Maintained outstanding relationships with Business Analysts and business users to identify information needs as per the business requirements.
- Followed waterfall and Agile methodologies with the Scrum process.
- Excellent written and verbal communication skills and analytical skills, with the ability to perform independently as well as in a team.
- Proven ability in defining goals, coordinating teams and achieving results.
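A minimal sketch of the Sqoop import and Hive dynamic-partition load pattern referenced above; the connection string, schema, table, column and path names are illustrative placeholders, not details from any specific engagement.

```sh
#!/bin/sh
# Sketch: pull a relational table into HDFS with Sqoop, then load it into a
# dynamically partitioned Hive table. All names and connection details are
# hypothetical.

sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user --password-file /user/etl/.ora_pwd \
  --table SALES.ORDERS \
  --target-dir /data/raw/orders \
  --fields-terminated-by '\t' -m 4

hive -e "
  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  INSERT OVERWRITE TABLE orders_part PARTITION (order_date)
  SELECT order_id, customer_id, amount, order_date
  FROM orders_staging;
"
```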
TECHNICAL SKILLS
Hadoop/Ecosystem: Cloudera CDH5, HDFS, HIVE, SQOOP, HUE, Flume, PIG, HBASE, Spark, HDFS File system commands.
LANGUAGES: SQL, PL/SQL, Java, C, C++
Database: Oracle, DB2, Teradata and MS SQL Server
Methodologies: Agile, waterfall.
BI TOOLS: Informatica 7.x, 8.x and 9.x, Informatica BDE
Operating Systems: Windows Server 2003/2000/XP, Windows 7/8, UNIX, Linux 6.5/6.7
Tools: MS Office, TOAD, SQL Developer, SQL Assistant, SharePoint, AutoSys, WIT, JIRA.
PROFESSIONAL EXPERIENCE
Confidential, Tampa, Florida
Senior Hadoop Developer
Responsibilities:
- Hands-on experience working on the Hadoop ecosystem using HDFS, HIVE, Sqoop, Flume, HBase, HUE, Spark, Storm and Linux/Python data movement and data validation scripts.
- Created Hive tables, loaded data and wrote Hive queries.
- Hands-on experience in writing complex queries in HiveQL and Greenplum.
- Built and maintained standard operational procedures for all needed Greenplum implementations.
- Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
- Created the Hive tables required as managed or external tables, defined with appropriate static and dynamic partitions for efficiency.
- Implemented Partitioning, Bucketing in Hive for better organization of the data.
- Developed Sqoop commands to pull data from Teradata and Oracle and export it into Greenplum.
- Created Linux shell and Python scripts for file validation, data movement and file archival (see the sketch following this role).
- Gathered requirements from users and through analysis of current systems.
- Prepared impact analysis documents and high-level designs for the requirements.
- Created probabilistic models for the classification of data.
- Coordinated deployment across SIT, pre-production and production environments.
- Worked with the business to prioritize and implement change requests.
Environment: Cloudera CDH5, HDFS, HIVE, HUE, SQOOP, Flume, PIG, HBase, Oozie workflows, AutoSys, JIRA, Informatica Big Data Edition, Teradata, Oracle, WLM and Linux.
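A minimal sketch of the kind of file validation and archival shell script mentioned in this role; the directory paths, the .ctl record-count convention and the HDFS landing directory are assumptions for illustration.

```sh
#!/bin/sh
# Sketch: validate inbound files, push them to HDFS and archive the originals.
# Paths and the control-file convention are hypothetical.

LANDING=/data/inbound
ARCHIVE=/data/archive
HDFS_DIR=/user/etl/landing

for f in "$LANDING"/*.dat; do
  [ -e "$f" ] || continue                      # no files to process
  if [ ! -s "$f" ]; then                       # reject zero-byte files
    echo "ERROR: empty file $f" >&2
    continue
  fi
  expected=$(cat "${f%.dat}.ctl" 2>/dev/null)  # expected row count from control file
  actual=$(wc -l < "$f")
  if [ -n "$expected" ] && [ "$expected" -ne "$actual" ]; then
    echo "ERROR: record count mismatch for $f (ctl=$expected, data=$actual)" >&2
    continue
  fi
  hdfs dfs -put -f "$f" "$HDFS_DIR"/ &&        # land in HDFS, then archive locally
    gzip -c "$f" > "$ARCHIVE/$(basename "$f").$(date +%Y%m%d).gz" &&
    rm -f "$f"
done
```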
Confidential, Texas
Sr. ETL Lead Developer
Responsibilities:
- Conducted interviews with various business users to identify and capture business requirements.
- Proposed and documented solutions to ensure all required objectives were met.
- Interpreted measurement definitions, performed data decomposition and performed gap analysis.
- Performed volumetric analysis, database sizing and data profiling activities.
- Prepared data flow models and high-level technical design documents.
- Worked with the data modeler in developing star schemas and snowflake schemas.
- Prepared coding standards documents in line with the enterprise architecture.
- Worked with the project manager to prepare and revise the work-hour estimates.
- Led and mentored a team of six offshore resources during the development and test phases.
- Assist the Data Stewards in updating Data Dictionary / Metadata for the implemented changes in Data Warehouse.
- Creation of complex Informatica mappings to meet functional and performance objectives.
- Extensively used push down optimization techniques to improve the performance.
- Designed anomaly handling logic to ensure bad records are reported and corrected.
- Created numerous materialized views for handling data replication in an effective manner.
- Created shell scripts for managing and scheduling the Informatica workflows (see the sketch following this role).
- Developed mappings to load into staging tables and then to dimensions and facts.
- Responsible for code migration across different environments during the lifecycle.
- Worked closely with the DBA on creating indexes, partitions, etc. to avoid performance bottlenecks.
Environment: Informatica Power center 8.6, Golden Gate, Teradata, Oracle, Control-M and UNIX.
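A minimal sketch of a pmcmd wrapper of the sort referenced above for running Informatica workflows under the scheduler; the integration service, domain, folder, workflow name and password file are placeholders.

```sh
#!/bin/ksh
# Sketch: start an Informatica workflow via pmcmd and propagate its exit code
# to the scheduler. Service, domain, folder and workflow names are hypothetical.

INFA_USER=etl_ops
INFA_PWD_FILE=/home/etl/.infa_pwd   # assumed password file readable only by the batch user

pmcmd startworkflow \
  -sv IS_DW_PROD -d Domain_DW \
  -u "$INFA_USER" -p "$(cat "$INFA_PWD_FILE")" \
  -f DW_LOAD -wait wf_load_sales_fact

rc=$?
if [ "$rc" -ne 0 ]; then
  echo "Workflow wf_load_sales_fact failed with return code $rc" >&2
fi
exit "$rc"
```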
Confidential, Richmond
ETL Developer
Responsibilities:
- Converted business requirements into technical design documents.
- Profiled various source systems and validated the mapping document.
- Created numerous ETL mappings using Informatica to load data into data warehouse system.
- Implemented Dynamic Lookup Transformation for change data capture.
- Designed reusable modules wherein data quality checks, such as numeric and date checks, are performed prior to load.
- Created Informatica workflows using various tasks like command, decision, email, control and File watcher.
- Created shell scripts to send email notifications for reconciliation reporting (see the sketch following this role).
- Used Debugger to test and determine the logical errors in the mappings.
- Implemented complex slowly changing dimensions like SCD2 and hybrid versions.
- Involved in performance tuning at the source, target, mapping, session and system levels.
- Involved in code reviews to ensure the compliance of coding standards.
- Worked closely with SIT and UAT testing teams for data validations.
- Involved in project releases, configuration management and migration activities.
- Created Oracle Stored Procedures, Packages to implement complex logics.
- Created Oracle Triggers to populate audit columns for tracking any DML operations.
- Created job schedules to perform initial loads to the new warehouse platform.
- Involved in base lining the code for migration to higher environments.
Environment: Informatica Power center 8.6, SQL Server, Oracle, AutoSys and Linux.
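A minimal sketch of a reconciliation notification script like the one described above; the table names, connection aliases, the DB_PWD environment variable and the recipient address are assumptions.

```sh
#!/bin/sh
# Sketch: compare staging and warehouse row counts and email on mismatch.
# Connection details, table names and the distribution list are hypothetical;
# DB_PWD is assumed to be exported by the calling job.

SRC_CNT=$(sqlplus -s "etl/${DB_PWD}@SRCDB" <<'EOF'
SET HEADING OFF FEEDBACK OFF
SELECT COUNT(*) FROM stg_orders;
EOF
)
TGT_CNT=$(sqlplus -s "etl/${DB_PWD}@DWHDB" <<'EOF'
SET HEADING OFF FEEDBACK OFF
SELECT COUNT(*) FROM dw_orders;
EOF
)

SRC_CNT=$(echo $SRC_CNT)   # trim whitespace returned by sqlplus
TGT_CNT=$(echo $TGT_CNT)

if [ "$SRC_CNT" != "$TGT_CNT" ]; then
  echo "Source count=$SRC_CNT, target count=$TGT_CNT" |
    mailx -s "Reconciliation mismatch: orders load" dw-support@example.com
fi
```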
Confidential
Support Analyst
Responsibilities:
- Perform deep-level troubleshooting on escalation from Level 1 and Level 2.
- Perform root cause analysis on recurring issues.
- Document job run logs, dependencies and their schedules.
- Perform SQL score carding, optimization and performance tuning.
- Perform monthly and quarterly maintenance releases.
- Find and fix data issues to support other vendor requests.
- Create job aids and solution scripts.
- Attend the production handover calls to validate deliverables.
- Responsible for implementing changes and enhancements to the applications.
- Assist data governance and compliance teams.
Environment: Informatica Power center 7.1, AbInitio, Oracle, CRON, Windows and Linux.
Confidential
ETL Analyst
Responsibilities:
- Extensively used ETL to load data from flat files, XML and Oracle sources into Oracle 8i.
- Involved in designing the data model for the data warehouse.
- Involved in requirement gathering and business analysis.
- Developed data mappings between source systems and warehouse components using Mapping Designer.
- Worked extensively on different types of transformations like source qualifier, expression, filter, aggregator, rank, update strategy, lookup, stored procedure, sequence generator, joiner, XML.
- Involved in performance tuning of the Informatica mappings, stored procedures and the SQL queries inside the Source Qualifier.
- Involved in performance tuning of the database and Informatica; improved performance by identifying and rectifying performance bottlenecks.
- Used Server Manager to schedule sessions and batches.
- Involved in creating Business Objects Universes and appropriate reports.
- Wrote PL/SQL packages and stored procedures to implement business rules and validations (see the sketch following this section).
Environment: Informatica 7.1.3, Oracle 10g, UNIX, Windows NT 4.0, UNIX Shell Programming, PL/SQL, TOAD (Quest Software)
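A minimal sketch of invoking one of the PL/SQL validation procedures mentioned above from a batch shell script; the package, procedure, parameter and connection details are illustrative assumptions.

```sh
#!/bin/sh
# Sketch: run a hypothetical PL/SQL validation procedure via sqlplus and fail
# the batch step if it raises an error. ORA_PWD is assumed to be exported by
# the calling job; package and procedure names are placeholders.

sqlplus -s "etl_user/${ORA_PWD}@DWH" <<'EOF'
WHENEVER SQLERROR EXIT FAILURE
BEGIN
  -- hypothetical package applying business-rule validations to a staged batch
  pkg_order_validation.validate_staged_orders(p_batch_id => 1001);
  COMMIT;
END;
/
EXIT SUCCESS
EOF
```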