Hadoop Consultant Resume

Richardson, Texas

SUMMARY

  • 10+ years of experience in Data Analysis, Database Design, Data Modeling, Data Warehousing and Database Development.
  • 3+ years of experience with Hadoop, HDFS, MapReduce, Kafka, Pig, Hive, Impala and HBase.
  • Experience in data management and implementation of Big Data applications using Hadoop frameworks.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop (see the sketch after this list).
  • Experience in working with the Cloudera and Hortonworks Hadoop distributions.
  • Good experience using Spark SQL and Scala.
  • Developed ETL processes to load data from multiple data sources to HDFS using Sqoop, perform structural modifications using Pig and Hive, and analyze the loaded data.
  • Hands-on experience in Oracle, JDK, J2EE, XML, JDBC.
  • Hands-on experience with Data Warehousing/ETL processes.
  • Used Oracle PL/SQL, Netezza, Teradata and ETL tools such as Informatica 10.0/9.6 and Informatica Big Data Edition.
  • Experienced in translating data access, transformation, and movement requirements into Functional Requirements and Mapping Designs.
  • Proficient with back end database programming using PL/SQL, SQL, Writing Packages, Triggers, Stored Procedures, Materialized Views, Partitioning and performance tuning.
  • Experience working on Performance Tuning and Optimization of applications by identifying bottlenecks in Informatica as well as the database.
  • Strong in Unix / Linux Shell Scripting.
  • Skilled programmer, expert designer and accomplished team leader.
  • Experienced in using agile approaches including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Excellent communication skills have helped me work in large teams, mentor peers and gather user requirements.
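
A minimal sketch of the kind of Sqoop/Hive ingestion described above, assuming a hypothetical Oracle source, HDFS paths and table names; the connection strings, schemas and columns are illustrative only, not the actual project objects.

    #!/bin/bash
    # Illustrative only: hosts, credentials, tables and paths are hypothetical.
    # Pull a relational table into HDFS with Sqoop.
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user -P \
      --table SALES.ORDERS \
      --target-dir /data/raw/orders \
      --num-mappers 4 \
      --fields-terminated-by ','

    # Expose the imported files to analysts as an external Hive table.
    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS raw.orders (
      order_id BIGINT, customer_id BIGINT, order_dt STRING, amount DOUBLE)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/raw/orders';"

    # Push aggregated results back to the relational target (the reverse direction).
    sqoop export \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user -P \
      --table SALES.ORDER_SUMMARY \
      --export-dir /data/curated/order_summary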

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop - HDFS, Hive, Impala, Kafka, Sqoop, Pig, HBase, Oozie, YARN, Zookeeper, Scala, Spark SQL, Phoenix, Nifi

Programming/Scripting: Java, Javascript, C, SQL, PLSQL, Pro*C, Shell Scripting, Scala, Python

BI/ETL: Informatica Big Data Edition, Informatica PowerCenter 9.6, PowerCenter Real Time Edition, B2B Data Transformation Studio

Databases: Oracle 11g, DB2, SQL Server 2008/2005, Netezza, Teradata

Tools: IntelliJ, Eclipse, SQL Developer, Aginity Workbench, Management Studio, TOAD, DT Studio, DB Visualizer

Methodologies: Agile, Scrum, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, Richardson, Texas

Hadoop Consultant

Responsibilities:

  • Worked closely with the business and analytics teams in gathering the system requirements.
  • Provided design recommendations and leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Used Kafka for real-time streaming along with Apache NiFi, configuring the NiFi processors.
  • Used HBase to persist the streaming data received from Kafka.
  • Used Apache Phoenix to execute SQL queries on HBase (see the sketch after this list).
  • Created Hive tables for batch data loading and analyzed the data using Hive queries.
  • Created complex Hive queries to help business users analyze and spot emerging trends by comparing fresh data with historical metrics.
  • Managed and reviewed Hadoop log files.
  • Involved in analytics on claims and rejected-claims processing in the data lake.
  • Tested raw data and executed performance scripts.
  • Actively involved in code review and bug fixing for improving the performance.
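
A minimal sketch of how the Kafka-fed HBase data might be spot-checked and queried through Phoenix, assuming a hypothetical topic name, a hypothetical CLAIMS_STREAM Phoenix table and ZooKeeper quorum; all hosts and object names are illustrative.

    #!/bin/bash
    # Illustrative only: brokers, topic, ZooKeeper quorum and table names are hypothetical.
    # Peek at the raw stream that NiFi is publishing to Kafka.
    kafka-console-consumer.sh --bootstrap-server broker1:9092 \
      --topic claims_stream --from-beginning --max-messages 10

    # Query the Phoenix table that maps onto the HBase rows persisted from the stream.
    cat > /tmp/claims_check.sql <<'SQL'
    SELECT claim_id, claim_status, received_ts
    FROM CLAIMS_STREAM
    ORDER BY received_ts DESC
    LIMIT 20;
    SQL
    sqlline.py zk1,zk2,zk3:2181 /tmp/claims_check.sql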

Environment: Unix Shell Scripts, HDFS, Hive, Apache Kafka, Apache Nifi, Apache Phoenix, Teradata

Confidential, Irving, Texas

ETL / Hadoop Consultant

Responsibilities:

  • Worked closely with the business and analytics teams in gathering the system requirements.
  • Provided design recommendations and leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Used Sqoop to bulk-load data from DB2 to HDFS for the initial load.
  • Used Spark RDDs, DataFrames and Datasets to perform validations and aggregations on the provider and claims files.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Created Hive tables for data loading and analyzed the data using Hive queries.
  • Created complex Hive queries to help business users analyze and spot emerging trends by comparing fresh data with historical metrics.
  • Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs in Spark for data aggregation, queries and writing data back into the OLTP system through Sqoop (see the spark-submit sketch after this list).
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Actively involved in code review and bug fixing for improving the performance.
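
A minimal sketch of how the initial DB2 pull and the Spark job launch on YARN might look, assuming a hypothetical DB2 connection, assembly jar, class name and HDFS paths; every name below is illustrative.

    #!/bin/bash
    # Illustrative only: hosts, credentials, jar, class and paths are hypothetical.
    # Initial bulk load of a DB2 table into HDFS with Sqoop.
    sqoop import \
      --connect jdbc:db2://db2host:50000/CLAIMSDB \
      --username etl_user -P \
      --table CLAIMS.PROVIDER \
      --target-dir /data/raw/provider \
      -m 8

    # Run the Scala validation/aggregation job over the loaded files on YARN.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.claims.ProviderValidation \
      --num-executors 10 --executor-memory 4g --executor-cores 2 \
      claims-etl-assembly.jar /data/raw/provider /data/curated/provider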

Environment: DB2, Spark, Scala, Unix Shell Scripts, HDFS, Hive, Sqoop, Eclipse

Confidential, Round Rock, Texas

ETL / Hadoop Consultant

Responsibilities:

  • Worked with cross-functional Business and IT teams of Business Analysts, Data Analysts, Data Modelers, Solution Architects, DBAs, Developers and Project Managers.
  • Created data mappings from the Enterprise Data Warehouse system and Customer MDM to the dashboard-related data.
  • Used Kafka for streaming the orders-related data into HDFS.
  • Used Sqoop scripts to move the required data from Teradata into HDFS.
  • Used Pig scripts to transform the data into the 360 JSON format, which was then ingested into AllSight (see the sketch after this list).
  • Created Hive queries to compare the raw data with EDW reference tables and perform aggregations.
  • The composite view was indexed in Elastic Search.
  • Developed Complex Hive queries for the analysts.
  • Provided cluster coordination services through ZooKeeper.
  • Performed end-to-end testing of the entire system.
  • Conducted demos for the end users.
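
A minimal sketch of the Teradata pull and the Pig transformation to JSON, assuming a hypothetical Teradata database, table, HDFS paths and Pig script name; the actual 360 schema and the AllSight ingestion step are not shown.

    #!/bin/bash
    # Illustrative only: host, database, table, paths and script names are hypothetical.
    # Pull the orders reference data from Teradata into HDFS.
    sqoop import \
      --connect jdbc:teradata://tdhost/DATABASE=EDW \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl_user -P \
      --table ORDERS_REF \
      --target-dir /data/raw/orders_ref \
      -m 4

    # Run the Pig script (not shown) that reshapes the rows into the JSON layout
    # expected downstream, e.g. via Pig's built-in JsonStorage.
    pig -param INPUT=/data/raw/orders_ref \
        -param OUTPUT=/data/curated/orders_json \
        -f orders_to_json.pig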

Environment: Oracle, Teradata, Unix Shell Scripts, Java, HDFS, Pig, Hive, Sqoop, HBase, Kafka, Allsight

Confidential

ETL/Hadoop Consultant

Responsibilities:

  • Worked with cross-functional Business and IT teams of Business Analysts, Data Analysts, Data Modelers, Solution Architects, DBAs, Developers and Project Managers.
  • Involved in the Big Data implementation for Mercury Insurance
  • Migrated from Informatica 9.6 to Informatica Big Data Edition.
  • Migrated data from Oracle and Netezza into HDFS using Sqoop and Pig scripts.
  • Used UDFs for specific transformation logic.
  • Experienced in loading and transforming large sets of structured and semi-structured data from Oracle through Sqoop, placing it in HDFS for further processing.
  • Monitored Hadoop scripts which take the input from HDFS and load the data into Hive.
  • Designed and developed partitioned and bucketed Hive tables and queries for data analysis (see the sketch after this list).
  • Used Informatica BDE for Data Profiling.
  • Implemented test scripts to support test-driven development and continuous integration.
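
A minimal sketch of the kind of partitioned, bucketed Hive table and query referenced above; the database, table and column names are hypothetical.

    # Illustrative only: database, table and column names are hypothetical.
    hive -e "
    CREATE TABLE IF NOT EXISTS curated.policy_claims (
      claim_id      BIGINT,
      policy_id     BIGINT,
      claim_amount  DOUBLE,
      claim_status  STRING)
    PARTITIONED BY (claim_year INT, claim_month INT)
    CLUSTERED BY (policy_id) INTO 32 BUCKETS
    STORED AS ORC;

    -- Query that prunes partitions and benefits from bucketing on policy_id.
    SELECT claim_status, COUNT(*) AS cnt, SUM(claim_amount) AS total_amount
    FROM curated.policy_claims
    WHERE claim_year = 2016 AND claim_month = 6
    GROUP BY claim_status;"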

Environment: Netezza 6, UNIX Shell Programming, HDFS, Pig, Python, Hive, Impala, Sqoop, HBase, Informatica PowerCenter 9.6 / Big Data Edition

Confidential, Tampa, Florida

ETL Consultant

Responsibilities:

  • Worked with cross-functional Business and IT teams of Business Analysts, Data Analysts, Data Modelers, Solution Architects, DBAs, Developers and Project Managers.
  • Converted functional requirements into technical specifications, mapping documents and Interface Control Documents.
  • Assisted the team in the development of design standards and codes for effective ETL procedure development and implementation.
  • Created complex mappings using Connected/Unconnected Lookup, Normalizer and Union transformations.
  • Enhanced Informatica session performance for sources such as large data files by using partitions, increasing the block size, data cache size and target-based commit interval, and applying pushdown optimization.
  • Designed workflows with decision, assignment task, event wait, and event raise tasks.
  • Tested the ETL components (see the workflow execution sketch after this list).
  • Mentored the offshore team.
  • Implemented the data mart into production and provided support.
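
A minimal sketch of a shell wrapper for running and checking one of the Informatica workflows with pmcmd, assuming hypothetical domain, integration service, folder and workflow names; credentials and connection details would come from the actual environment.

    #!/bin/bash
    # Illustrative only: domain, service, folder and workflow names are hypothetical.
    DOMAIN=Domain_ETL
    INT_SVC=IS_ETL
    FOLDER=DASHBOARD_MART
    WORKFLOW=wf_load_claims_mart

    # Start the workflow and wait for completion; pmcmd returns non-zero on failure.
    pmcmd startworkflow -sv "$INT_SVC" -d "$DOMAIN" \
      -u "$INFA_USER" -p "$INFA_PASS" \
      -f "$FOLDER" -wait "$WORKFLOW"

    if [ $? -ne 0 ]; then
      echo "Workflow $WORKFLOW failed" >&2
      exit 1
    fi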

Environment: Oracle 11g, PL/SQL, UNIX Shell Programming, Java, Informatica PowerCenter 9.1, MS-Visio, Control-M

Confidential, Malta, New York

ETL Consultant

Responsibilities:

  • Analyzed legacy application data and determined conversion rules for migration to the data warehouse.
  • Performed data analysis and data mapping, including validating data quality and data consistency, to arrive at a gap analysis (see the reconciliation sketch after this list).
  • Developed detailed programming specifications for ETL, data migration, and data scrubbing processes.
  • Ensured proper data movement to achieve the business objectives for application development and analytical reporting.
  • Used Informatica Designer to create several mappings, transformations using source as flat files and databases to move data to a target Data Warehouse.
  • Performed performance tuning of the ETL processes.
  • Provided post-implementation knowledge transfer to the client and peers.
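
A minimal sketch of the kind of source-to-target row-count reconciliation used during data quality validation, assuming a hypothetical Teradata TDP id, credentials and staging/target table names.

    #!/bin/bash
    # Illustrative only: TDP id, credentials and table names are hypothetical.
    # Compare row counts between the staged source extract and the warehouse target.
    bteq <<'EOF'
    .LOGON tdprod/etl_user,etl_password;
    SELECT 'STG' AS data_layer, COUNT(*) AS row_cnt FROM STG_DB.CUSTOMER_STG;
    SELECT 'EDW' AS data_layer, COUNT(*) AS row_cnt FROM EDW_DB.DIM_CUSTOMER;
    .LOGOFF;
    .QUIT;
    EOF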

Environment: Teradata, Informatica PowerCenter 8.6.1, Java, UNIX Shell Programming, ERStudio, Control-M

Confidential, New York

ETL Onsite Lead

Responsibilities:

  • Gathered and documented business requirements into technical specifications.
  • Analyzed the source system architecture to gain a deeper understanding of business rules and data integration checks.
  • Designed the fact and dimension tables for the star schema using ER Studio.
  • Developed complex database structures for data validations and coded complex PL/SQL procedures.
  • Designed and constructed complex MViews, Partitions, Stored Procedures and Packages for implementing the ETL process
  • Developed mappings for ETL process.
  • Used shell scripting to automate the batch jobs and reduce manual intervention (see the sketch after this list).
  • Performed tuning and performance optimization to considerably reduce the total time for ETL processes and reports.
  • Tested the entire ETL process.
  • Supported and maintained the data warehouse.
  • Coordinated with application teams, business analysts and system administrators for day-to-day maintenance and implementations.
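
A minimal sketch of a batch-automation script that runs the PL/SQL ETL package through SQL*Plus and fails the job on any error, assuming a hypothetical connection alias and package/procedure name.

    #!/bin/bash
    # Illustrative only: connection alias and package/procedure names are hypothetical.
    set -e

    sqlplus -s etl_user/"$ETL_PASS"@DWPROD <<'EOF'
    WHENEVER SQLERROR EXIT SQL.SQLCODE
    SET SERVEROUTPUT ON
    -- Run the nightly load procedure from the ETL package.
    EXEC etl_pkg.load_sales_fact(p_run_date => TRUNC(SYSDATE));
    EXIT;
    EOF

    echo "Nightly ETL load completed: $(date)"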

Environment: Oracle 11g, PL/SQL, Java, UNIX Shell Programming, Informatica 8.6, ERStudio

Confidential, San Francisco, CA

ETL Developer

Responsibilities:

  • Gathering and documenting business requirements into Technical Specifications.
  • Analyzed the central data warehouse and determined conversion rules for migration.
  • Developed detailed programming specifications for ETL, data migration, and data scrubbing processes.
  • Coded and tested the ETL using packages and shell scripts.
  • Performed performance optimization of Oracle PL/SQL.
  • Coordinated with application teams, business analysts and system administrators for day-to-day maintenance and implementations.

Environment: Oracle 10g, PL/SQL, Linux, Java, ERStudio, MS SQL Server 2000, UNIX Shell Programming

Confidential

Software Engineer

Responsibilities:

  • Prepared the detailed design document from the functional spec provided by the client.
  • Designed, documented and developed complex solutions from requirements.
  • Automated the process of loading of data warehouse feeds.
  • Worked as Team Lead for a team of 7 members.
  • Worked with a large database containing millions of rows for high-net-worth customers based in the US.
  • Automated processes such as the daily loading of data from other systems using SQL*Loader (see the sketch after this list).
  • Effectively reduced the overall processing time by implementing packages, triggers, IOTs, partitioned tables, indexes and materialized views.
  • Extensively used TKPROF and SQL Trace to monitor the system and fine-tuned it for better performance, removing bottlenecks during processing.
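
A minimal sketch of the SQL*Loader automation and the TKPROF usage described above, assuming hypothetical control-file, data-file, connection and trace-file names.

    #!/bin/bash
    # Illustrative only: file names, connection string and trace paths are hypothetical.
    # Nightly feed load: SQL*Loader pushes the flat file into the staging table
    # described by customers.ctl (control file not shown).
    sqlldr userid=stage_user/"$STAGE_PASS"@ORCL \
      control=customers.ctl \
      data=/feeds/incoming/customers_$(date +%Y%m%d).dat \
      log=/feeds/logs/customers_load.log \
      bad=/feeds/logs/customers_load.bad

    # Format a raw SQL trace captured during a slow run so the costly
    # statements can be identified and tuned.
    tkprof /u01/app/oracle/diag/trace/orcl_ora_12345.trc \
      /tmp/orcl_ora_12345_report.txt sys=no sort=exeela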

Environment: Oracle 9i, PL/SQL, Forms 9i, Java, JSP, Reports, SQL* Loader, TOAD, ERStudio

Confidential

Software Engineer

Responsibilities:

  • Led a team of 10 members in the capacity of Team Lead.
  • Prepared the detailed design document from functional spec provided by the client.
  • Designed the logical and physical database in keeping with the Business Rules.
  • Designed and Constructed complex Views, Triggers, Stored Procedures and Packages
  • Coordinated the Review Meeting.
  • Briefed the Management with the Project Status Report every week.
  • Performed performance tuning and optimization of a large database.

Environment: Oracle 9i, Developer 2000, Forms 5.0, PL/SQL, Java, JSP, UNIX shell scripts, TOAD, Pro*C
