
Senior Hadoop Developer Resume


Tampa, Florida

SUMMARY

  • 8 years of extensive professional IT experience, including 2.5 years of Hadoop/Big Data experience processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • 2.5 years of experience in Big Data Analytics using HDFS, HIVE, FLUME, SQOOP, GPLOADER, HBase, HUE, Linux and Python automation scripting, and Informatica BDE.
  • Over 2 years of Data warehousing and ETL experience using Informatica Power Center 9.6.1/9.5.1/9.1.0/8.6.1/8.1/7.1, Cloud Integration and IDQ as an Analyst.
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop (a brief Sqoop sketch follows this list).
  • Responsible for performing extensive data validation using Hive dynamic partitioning and bucketing.
  • Experience developing custom UDFs for Pig and Hive to incorporate Java methods and functionality into Pig Latin and HiveQL.
  • Experience streaming data to HDFS using Flume.
  • Experienced in defining workflows with Oozie.
  • Expertise in writing ETL Jobs for analyzing data using Pig.
  • Expertise in Linux shell scripting (ksh, sh), Python scripting, DOS scripting, job scheduling in CRON, Control-M and IBM Workload Manager.
  • Experience with FTP/SFTP/SCP for transferring files between various systems.
  • Extensive database experience using Oracle 11g/10g/9i, Teradata and MS SQL Server.
  • Responsible for all activities related to the development, implementation, administration and support of ETL processes for large data warehouses using Power Center.
  • Experience using Informatica command-line utilities such as pmcmd and pmrep.
  • Extensively worked on Informatica Designer components Source Analyzer, Target Designer, Transformation Developer, Mapping Designer and Mapplet Designer.
  • Extensively worked on Informatica Power Center Transformations such as Source Qualifier, Lookup, Filter, Expression, Router, Joiner, Update Strategy, Rank, Aggregator, Stored Procedure, Transaction Control, Java, SQL, Sorter, and Sequence Generator.
  • Proficient in Data warehouse design based on Ralph Kimball and Bill Inmon methodologies.
  • Extensively designed and developed Slowly Changing Dimension (SCD) Type 1, 2 and 3 mappings.
  • Good experience in Informatica and SQL Performance Tuning.
  • Strong hands-on experience using Teradata standalone loader utilities (FastExport, FastLoad, MultiLoad, TPump and TPT) and BTEQ scripts.
  • Maintained outstanding relationships with business analysts and business users to identify information needs per business requirements.
  • Followed Waterfall and Agile methodologies with the Scrum process.
  • Excellent written and verbal communication skills and analytical skills, with the ability to perform independently as well as in a team.
  • Proven ability in defining goals, coordinating teams and achieving results.
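
As a minimal illustration of the Sqoop work referenced in the bullets above, the sketch below shows a typical import and export pair; the connection string, credentials file, schemas, tables and HDFS paths are hypothetical placeholders rather than details from any engagement described here.

    #!/bin/sh
    # Hypothetical example: pull a source table from Oracle into HDFS, then
    # push an aggregated result set back to the relational system with Sqoop.

    # Import: Oracle -> HDFS (all names and paths are placeholders)
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user --password-file /user/etl/.ora_pwd \
      --table SALES.TRANSACTIONS \
      --target-dir /data/raw/transactions \
      --fields-terminated-by '\001' \
      --num-mappers 4

    # Export: HDFS -> Oracle (the target table must already exist)
    sqoop export \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user --password-file /user/etl/.ora_pwd \
      --table SALES.TXN_SUMMARY \
      --export-dir /data/curated/txn_summary \
      --input-fields-terminated-by '\001' \
      --num-mappers 4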

TECHNICAL SKILLS

Hadoop/Ecosystem: Cloudera CDH5, HDFS, HIVE, SQOOP, HUE, Flume, PIG, HBASE, Spark, HDFS File system commands.

LANGUAGES: SQL, PL/SQL, Java, C, C++

Database: Oracle, DB2, Teradata and MS SQL Server

Methodologies: Agile, waterfall.

BI TOOLS: Informatica 7.x, 8.x and 9.x, Informatica BDE

Operating Systems: Windows Server 2003/2000/XP, Windows 7/8, UNIX, Linux 6.5/6.7

Tools: MS Office, TOAD, SQL Developer, SQL Assistant, SharePoint, AutoSys, WIT, JIRA.

PROFESSIONAL EXPERIENCE

Confidential, Tampa, Florida

Senior Hadoop Developer

Responsibilities:

  • Hands-on experience working on the Hadoop ecosystem using HDFS, Hive, Sqoop, Flume, HBase, HUE, Spark, Storm and Linux/Python data movement and data validation scripts.
  • Created Hive tables, loaded data and wrote Hive queries.
  • Hands-on experience writing complex queries in HiveQL and Greenplum.
  • Built and maintained standard operational procedures for all needed Greenplum implementations.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Created managed and external Hive tables per requirements, defined with appropriate static and dynamic partitions for efficiency.
  • Implemented partitioning and bucketing in Hive for better organization of the data (a brief Hive sketch follows this list).
  • Developed Sqoop commands to pull data from Teradata and Oracle and export it into Greenplum.
  • Created Linux and Python scripts for file validation, data movement and file archival.
  • Gathered requirements from users and through analysis of current systems.
  • Prepared impact analysis documents and high-level designs for the requirements.
  • Created probabilistic models for the classification of data.
  • Coordinated deployments across the SIT, Preproduction and Production environments.
  • Worked with the business to prioritize and implement change requests.
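
The sketch below gives a condensed picture of the partitioned, bucketed Hive tables and dynamic-partition loads described above, driven from the shell as these jobs were; the database, table and column names are hypothetical placeholders.

    #!/bin/sh
    # Hypothetical example: create an external, partitioned and bucketed Hive
    # table and load it from a raw staging table using dynamic partitioning.
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS edw.transactions (
        txn_id     BIGINT,
        account_id BIGINT,
        amount     DECIMAL(18,2)
      )
      PARTITIONED BY (txn_date STRING)
      CLUSTERED BY (account_id) INTO 32 BUCKETS
      STORED AS ORC
      LOCATION '/data/curated/transactions';

      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;
      SET hive.enforce.bucketing=true;

      -- the partition column comes last so Hive can derive it dynamically
      INSERT OVERWRITE TABLE edw.transactions PARTITION (txn_date)
      SELECT txn_id, account_id, amount, txn_date
      FROM staging.transactions_raw;
    "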

Environment: Cloudera CDH5, HDFS, HIVE, HUE, SQOOP, Flume, PIG, HBase, Oozie workflows, Autosys, JIRA, Informatica Big Data Edition, Teradata, Oracle, WLM and Linux.

Confidential, Texas

Sr. ETL Lead Developer

Responsibilities:

  • Conducted interviews with various business users to identify and capture business requirements.
  • Proposed and documented solutions to ensure all required objectives were met.
  • Interpreted measurement definitions, performed data decomposition and performed gap analysis.
  • Performed volumetric analysis, database sizing and data profiling activities.
  • Prepared data flow models and high-level technical design documents.
  • Worked with the data modeler in developing star schemas and snowflake schemas.
  • Prepared coding standards documents in line with the enterprise architecture.
  • Worked with the project manager to prepare and revise work-hour estimates.
  • Led and mentored a team of 6 offshore resources during the development and test phases.
  • Assisted the Data Stewards in updating the Data Dictionary / metadata for changes implemented in the Data Warehouse.
  • Created complex Informatica mappings to meet functional and performance objectives.
  • Extensively used pushdown optimization techniques to improve performance.
  • Designed anomaly handling logic to ensure bad records are reported and corrected.
  • Created numerous materialized views for handling data replication in an effective manner.
  • Created shell scripts for managing and scheduling the Informatica workflows (a pmcmd wrapper sketch follows this list).
  • Developed mappings to load into staging tables and then into dimensions and facts.
  • Responsible for code migration across different environments during the lifecycle.
  • Worked closely with DBAs to create indexes, partitions, etc. to avoid performance bottlenecks.
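
A small sketch of the kind of wrapper script used to manage the Informatica workflows noted above; the domain, Integration Service, folder, workflow and credential variable names are hypothetical placeholders.

    #!/bin/sh
    # Hypothetical wrapper: start an Informatica workflow with pmcmd and fail
    # the scheduler job if the workflow does not complete successfully.
    DOMAIN=Domain_ETL
    INT_SVC=IS_ETL_PROD
    FOLDER=FIN_DW
    WORKFLOW=wf_load_daily_sales

    # PMUSER and PMPASS are environment variables holding the credentials
    pmcmd startworkflow \
      -sv "$INT_SVC" -d "$DOMAIN" \
      -uv PMUSER -pv PMPASS \
      -f "$FOLDER" -wait "$WORKFLOW"

    RC=$?
    if [ "$RC" -ne 0 ]; then
      echo "Workflow $WORKFLOW failed with return code $RC" >&2
      exit "$RC"
    fi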

Environment: Informatica Power Center 8.6, GoldenGate, Teradata, Oracle, Control-M and UNIX.

Confidential, Richmond

ETL Developer

Responsibilities:

  • Converted business requirements into technical design documents.
  • Profiled various source systems and validated the mapping document.
  • Created numerous ETL mappings using Informatica to load data into data warehouse system.
  • Implemented Dynamic Lookup transformations for change data capture.
  • Designed reusable modules wherein data quality checks such as numeric and date checks are performed prior to load.
  • Created Informatica workflows using various tasks like command, decision, email, control and File watcher.
  • Created shell scripts to send email notifications for reconciliation reporting (a brief sketch follows this list).
  • Used Debugger to test and determine the logical errors in the mappings.
  • Implemented complex slowly changing dimensions like SCD2 and hybrid versions.
  • Involved in performance tuning at the source, target, mapping, session and system levels.
  • Involved in code reviews to ensure compliance with coding standards.
  • Worked closely with SIT and UAT testing teams on data validations.
  • Involved in project releases, configuration management and migration activities.
  • Created Oracle Stored Procedures, Packages to implement complex logics.
  • Created Oracle Triggers to populate audit columns for tracking any DML operations.
  • Created job schedules to perform initial loads to the new warehouse platform.
  • Involved in base lining the code for migration to higher environments.
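
A minimal sketch of the reconciliation notification scripts mentioned above; the connection strings, credential environment variables, table names and distribution list are hypothetical placeholders.

    #!/bin/sh
    # Hypothetical example: compare source and target row counts after a load
    # and mail the result for reconciliation reporting.
    SRC_CNT=$(printf "SET HEADING OFF FEEDBACK OFF PAGESIZE 0\nSELECT COUNT(*) FROM stg.orders;\nEXIT;\n" \
      | sqlplus -s "etl_user/${SRC_PWD}@SRCDB" | tr -d '[:space:]')
    TGT_CNT=$(printf "SET HEADING OFF FEEDBACK OFF PAGESIZE 0\nSELECT COUNT(*) FROM dw.fact_orders;\nEXIT;\n" \
      | sqlplus -s "etl_user/${TGT_PWD}@DWHDB" | tr -d '[:space:]')

    if [ "$SRC_CNT" = "$TGT_CNT" ]; then
      STATUS="MATCHED"
    else
      STATUS="MISMATCH"
    fi

    echo "Source rows: $SRC_CNT  Target rows: $TGT_CNT" \
      | mailx -s "Orders load reconciliation: $STATUS" dw-support@example.com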

Environment: Informatica Power Center 8.6, SQL Server, Oracle, AutoSys and Linux.

Confidential

Support Analyst

Responsibilities:

  • Perform deep-level troubleshooting on escalation from Level 1 and Level 2.
  • Perform root cause analysis on recurring issues.
  • Document job run logs, dependencies and their schedules.
  • Perform SQL scorecarding, optimization and performance tuning.
  • Perform monthly and quarterly maintenance releases.
  • Find and fix data issues to support other vendor requests.
  • Create job aids and solution scripts.
  • Attend the production handover calls to validate deliverables.
  • Responsible for implementing changes and enhancements to the applications.
  • Assist data governance and compliance teams.

Environment: Informatica Power Center 7.1, Ab Initio, Oracle, cron, Windows and Linux.

Confidential

ETL Analyst

Responsibilities:
  • Extensively used ETL to load data from flat files, XML and Oracle sources into Oracle 8i.
  • Involved in designing the data model for the data warehouse.
  • Involved in requirement gathering and business analysis.
  • Developed data mappings between source systems and warehouse components using Mapping Designer.
  • Worked extensively on different types of transformations like source qualifier, expression, filter, aggregator, rank, update strategy, lookup, stored procedure, sequence generator, joiner, XML.
  • Involved in performance tuning of the Informatica mappings, stored procedures and the SQL queries inside the Source Qualifier.
  • Involved in performance tuning of the database and Informatica; improved performance by identifying and rectifying performance bottlenecks.
  • Used Server Manager to schedule sessions and batches.
  • Involved in creating Business Objects universes and appropriate reports.
  • Wrote PL/SQL packages and stored procedures to implement business rules and validations (a brief sketch follows this list).
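
A brief sketch of the kind of PL/SQL validation logic referenced above, wrapped in a shell call since these jobs were driven from UNIX scripts; the schema, table, column, procedure and credential variable names are hypothetical placeholders.

    #!/bin/sh
    # Hypothetical example: create a small validation procedure that flags
    # staging rows violating a simple business rule before the warehouse load.
    # DB_PWD is an environment variable holding the database password.
    echo "
    CREATE OR REPLACE PROCEDURE validate_stg_orders AS
    BEGIN
      UPDATE stg_orders
         SET load_status   = 'REJECTED',
             reject_reason = 'Non-positive order amount'
       WHERE order_amount <= 0;
      COMMIT;
    END validate_stg_orders;
    /
    EXIT;
    " | sqlplus -s "etl_user/${DB_PWD}@DWHDB"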

Environment: Informatica 7.1.3, Oracle 10g, UNIX, Windows NT 4.0, UNIX Shell Programming, PL/SQL, TOAD (Quest Software)
