Hadoop Consultant Resume
Richardson, Texas
SUMMARY
- 10+ years of experience in Data Analysis, Database Design, Data Modeling, Data Warehousing, and Database Development.
- 3+ years of experience with Hadoop, HDFS, MapReduce, Kafka, Pig, Hive, Impala, and HBase.
- Experience in data management and implementation of Big Data applications using Hadoop frameworks.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Experience working with the Cloudera and Hortonworks Hadoop distributions.
- Good experience using Spark SQL and Scala (a brief Spark SQL sketch in Scala follows this summary).
- Developed ETL processes to load data from multiple sources into HDFS using Sqoop, performed structural modifications using Pig and Hive, and analyzed the loaded data.
- Hands-on experience with Oracle, JDK, J2EE, XML, and JDBC.
- Hands-on experience with data warehousing and ETL processes.
- Used Oracle PL/SQL, Netezza, Teradata, and ETL tools including Informatica 10.0/9.6 and Informatica Big Data Edition.
- Experienced in translating data access, transformation, and movement requirements into functional requirements and mapping designs.
- Proficient in back-end database programming using PL/SQL and SQL: packages, triggers, stored procedures, materialized views, partitioning, and performance tuning.
- Experienced in performance tuning and optimization of applications, identifying bottlenecks in both Informatica and the database.
- Strong in Unix/Linux shell scripting.
- Skilled programmer, expert designer, and accomplished team leader.
- Experienced in agile approaches including Extreme Programming, Test-Driven Development, and Scrum.
- Excellent communication skills, applied to working in large teams, mentoring peers, and gathering user requirements.
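To make the Spark SQL and Sqoop-to-HDFS experience above concrete, here is a minimal sketch in Scala rather than code from any specific engagement: it reads a Sqoop-landed delimited extract from HDFS, registers it as a temporary view, and profiles it with a Spark SQL aggregation. The path, schema, and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object LandingZoneProfile {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("landing-zone-profile").getOrCreate()

    // Delimited files landed in HDFS by a Sqoop import (path and schema are hypothetical)
    val accounts = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///landing/accounts")

    // Register the extract as a view and profile it with Spark SQL before the Hive load
    accounts.createOrReplaceTempView("accounts_landing")
    spark.sql(
      """SELECT account_type, COUNT(*) AS row_count
        |FROM accounts_landing
        |GROUP BY account_type
        |ORDER BY row_count DESC""".stripMargin
    ).show(truncate = false)

    spark.stop()
  }
}
```

A job like this would typically be packaged and submitted to the cluster with spark-submit.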
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, Hive, Impala, Kafka, Sqoop, Pig, HBase, Oozie, YARN, ZooKeeper, Scala, Spark SQL, Phoenix, NiFi
Programming/Scripting: Java, JavaScript, C, SQL, PL/SQL, Pro*C, Shell Scripting, Scala, Python
BI/ETL: Informatica Big Data Edition, Informatica PowerCenter 9.6, PowerCenter Real-Time Edition, B2B Data Transformation Studio
Databases: Oracle 11g, DB2, SQL Server 2008/2005, Netezza, Teradata
Tools: IntelliJ, Eclipse, SQL Developer, Aginity Workbench, SQL Server Management Studio, TOAD, DT Studio, DbVisualizer
Methodologies: Agile, Scrum, Waterfall
PROFESSIONAL EXPERIENCE
Confidential, Richardson, Texas
Hadoop Consultant
Responsibilities:
- Worked closely with the business and analytics teams to gather system requirements.
- Provided design recommendations and leadership to sponsors/stakeholders, improving review processes and resolving technical problems.
- Used Kafka for real-time streaming, with Apache NiFi processors configured for the ingestion flow.
- Used HBase to persist the streaming data received from Kafka.
- Used Apache Phoenix to execute queries against HBase (see the sketch following this project).
- Created Hive tables for batch data loading and analyzed the data using Hive queries.
- Created complex Hive queries to help business users analyze and spot emerging trends by comparing fresh data with historical metrics.
- Managed and reviewed Hadoop log files.
- Performed analytics on claims and rejected-claims processing in the data lake.
- Tested raw data and executed performance scripts.
- Actively involved in code review and bug fixing to improve performance.
Environment: Unix Shell Scripts, HDFS, Hive, Apache Kafka, Apache NiFi, Apache Phoenix, Teradata
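A simplified sketch of the streaming persistence path above, assuming a plain Kafka consumer and the Phoenix JDBC driver rather than the exact pipeline used on the project; the broker address, topic, ZooKeeper quorum, and CLAIM_EVENTS table are hypothetical stand-ins.

```scala
import java.sql.DriverManager
import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer

object ClaimEventSink {
  def main(args: Array[String]): Unit = {
    // Kafka consumer configuration; broker, topic, and group id are hypothetical
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("group.id", "claim-event-sink")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(Collections.singletonList("claim-events"))

    // Phoenix JDBC connection over the HBase cluster's ZooKeeper quorum (hypothetical hosts)
    val conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181")
    val upsert = conn.prepareStatement(
      "UPSERT INTO CLAIM_EVENTS (CLAIM_ID, PAYLOAD) VALUES (?, ?)")

    // Poll Kafka and persist each record into HBase through Phoenix
    while (true) {
      val it = consumer.poll(Duration.ofSeconds(1)).iterator()
      while (it.hasNext) {
        val rec = it.next()
        upsert.setString(1, rec.key())
        upsert.setString(2, rec.value())
        upsert.executeUpdate()
      }
      conn.commit() // Phoenix buffers UPSERTs client-side until commit
    }
  }
}
```

Once the rows are in HBase, the same Phoenix connection can serve ad hoc SQL queries over the table.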
Confidential, Irving,Texas
ETL / Hadoop Consultant
Responsibilities:
- Worked closely with the business and analytics teams to gather system requirements.
- Provided design recommendations and leadership to sponsors/stakeholders, improving review processes and resolving technical problems.
- Used Sqoop to bulk-load data from DB2 into HDFS for the initial load.
- Used Spark RDDs, DataFrames, and Datasets to perform validations and aggregations on the provider and claims files (a Scala sketch follows this project's environment line).
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Created Hive tables for data loading and analyzed the data using Hive queries.
- Created complex Hive queries to help business users analyze and spot emerging trends by comparing fresh data with historical metrics.
- Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce for data aggregation and queries, writing results back into the OLTP system through Sqoop.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Actively involved in code review and bug fixing to improve performance.
Environment: DB2, Spark, Scala, Unix Shell Scripts, HDFS, Hive, Sqoop, Eclipse
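A hedged sketch, in Scala, of the DataFrame validation and aggregation flow described above; the staging.claims table, its columns, and the DB2 connection details are hypothetical placeholders rather than project artifacts.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ProviderClaimSummary {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so the Sqoop-landed staging tables are visible
    val spark = SparkSession.builder()
      .appName("provider-claim-summary")
      .enableHiveSupport()
      .getOrCreate()

    // Claims landed in Hive by the initial Sqoop load (table and column names are hypothetical)
    val claims = spark.table("staging.claims")

    // Validation: keep only rows with a provider id and a positive claim amount
    val valid = claims.filter(col("provider_id").isNotNull && col("claim_amount") > 0)

    // Aggregation: total paid and claim counts per provider
    val byProvider = valid.groupBy("provider_id")
      .agg(sum("claim_amount").as("total_paid"), count(lit(1)).as("claim_count"))

    // Write the summary back to the relational side over JDBC (connection details are placeholders)
    byProvider.write
      .format("jdbc")
      .option("url", "jdbc:db2://db2host:50000/CLAIMSDB")
      .option("dbtable", "DW.PROVIDER_SUMMARY")
      .option("user", sys.env.getOrElse("DB_USER", "etl"))
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .mode("overwrite")
      .save()

    spark.stop()
  }
}
```

In practice the write-back could equally be a Sqoop export of the summary files; the JDBC writer is shown here only to keep the sketch self-contained.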
Confidential, Round Rock, Texas
ETL / Hadoop Consultant
Responsibilities:
- Worked with cross-functional business and IT teams of business analysts, data analysts, data modelers, solution architects, DBAs, developers, and project managers.
- Created data mappings from the Enterprise Data Warehouse system and Customer MDM to the dashboard-related data.
- Used Kafka to stream orders-related data into HDFS.
- Used Sqoop scripts to move the required data from Teradata into HDFS.
- Used Pig scripts to transform the data into the 360 JSON format in which it was ingested into AllSight (a Spark analogue of this step is sketched after this project).
- Created Hive queries to compare the raw data with EDW reference tables and perform aggregations.
- The composite view was indexed in Elasticsearch.
- Developed complex Hive queries for the analysts.
- Provided cluster coordination services through ZooKeeper.
- Performed end-to-end testing of the entire system.
- Conducted demos for the end users.
Environment: Oracle, Teradata, Unix Shell Scripts, Java, HDFS, Pig, Hive, Sqoop, HBase, Kafka, AllSight
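The Pig scripts themselves are not reproduced here; the sketch below is a Spark/Scala analogue of the same transform-to-JSON step, joining raw order data with EDW customer reference data and emitting one JSON document per record for downstream ingestion. All paths, table names, and columns are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object OrdersToJson {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-to-json")
      .enableHiveSupport()
      .getOrCreate()

    // Raw orders streamed into HDFS and customer reference data from the EDW
    // (paths, tables, and columns are hypothetical)
    val orders = spark.read.parquet("hdfs:///data/raw/orders")
    val customers = spark.table("edw.customer_reference")

    // Enrich each order with its customer attributes
    val enriched = orders.join(customers, Seq("customer_id"), "left")
      .select("customer_id", "order_id", "order_date", "order_total", "customer_segment")

    // Emit one JSON document per record for downstream ingestion
    enriched.toJSON.write.mode("overwrite").text("hdfs:///data/curated/orders_json")

    spark.stop()
  }
}
```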
Confidential
ETL/Hadoop Consultant
Responsibilities:
- Worked with cross-functional business and IT teams of business analysts, data analysts, data modelers, solution architects, DBAs, developers, and project managers.
- Involved in the Big Data implementation for Mercury Insurance.
- Migrated from Informatica 9.6 to Informatica Big Data Edition.
- Migrated the data from Oracle, Netezza into HDFS using Sqoop and Pig scripts.
- Used UDFs for specific transformation logic.
- Loaded and transformed large sets of structured and semi-structured data from Oracle through Sqoop, placing it in HDFS for further processing.
- Monitored Hadoop scripts that take their input from HDFS and load the data into Hive.
- Designed and developed Hive queries using partitions and buckets for data analysis (see the sketch after this project's environment line).
- Used Informatica BDE for Data Profiling.
- Implemented test scripts to support test-driven development and continuous integration.
Environment: Netezza 6, UNIX Shell Programming, HDFS, Pig, Python, Hive, Impala, Sqoop, HBase, Informatica PowerCenter 9.6/ Big Data Edition.
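A sketch of the partition/bucket-oriented Hive analysis mentioned above, driven from Scala through Spark SQL; the table layout appears in the comment, and every table, column, and partition name is hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object PolicyClaimsAnalysis {
  // The underlying Hive table (created from the Hive CLI; names are hypothetical) was declared as:
  //   CREATE TABLE analytics.policy_claims (policy_id STRING, claim_id STRING, claim_amount DOUBLE)
  //   PARTITIONED BY (load_date STRING)
  //   CLUSTERED BY (policy_id) INTO 32 BUCKETS
  //   STORED AS ORC;
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("policy-claims-analysis")
      .enableHiveSupport()
      .getOrCreate()

    // A date-bounded query only scans the matching load_date partitions
    spark.sql(
      """SELECT policy_id, SUM(claim_amount) AS total_paid
        |FROM analytics.policy_claims
        |WHERE load_date >= '2016-01-01'
        |GROUP BY policy_id
        |ORDER BY total_paid DESC""".stripMargin
    ).show(50, truncate = false)

    spark.stop()
  }
}
```

Partitioning on load_date lets date-bounded queries skip unrelated partitions, while bucketing on policy_id keeps rows for a given policy co-located for joins.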
Confidential, Tampa, Florida
ETL Consultant
Responsibilities:
- Worked with cross-functional business and IT teams of business analysts, data analysts, data modelers, solution architects, DBAs, developers, and project managers.
- Converted functional requirements into technical specifications, mapping documents, and Interface Control Documents.
- Assisted the team in the development of design and code standards for effective ETL procedure development and implementation.
- Created complex mappings using Connected/Unconnected Lookups, Normalizer, and Union transformations.
- Enhanced Informatica session performance for sources such as large data files by using partitioning, increased block and data-cache sizes, target-based commit intervals, and pushdown optimization.
- Designed workflows with decision, assignment task, event wait, and event raise tasks.
- Tested the ETL components.
- Mentored the offshore team.
- Implemented the data mart in production and provided support.
Environment: Oracle 11g, PL/SQL, UNIX Shell Programming, Java, Informatica PowerCenter 9.1, MS-Visio, Control-M
Confidential, Malta, New York
ETL Consultant
Responsibilities:
- Analyzed legacy application data and determined conversion rules for migration to the data warehouse.
- Performed data analysis and data mapping, including validation of data quality and consistency, to arrive at a gap analysis.
- Developed detailed programming specifications for ETL, data migration, and data scrubbing processes.
- Ensured the data movement needed to achieve the business objectives for application development and analytical reporting was available.
- Used Informatica Designer to create mappings and transformations, with flat files and databases as sources, to move data into the target data warehouse.
- Performance Tuning.
- Provided post-implementation knowledge transfer to the client and peers.
Environment: Teradata, Informatica PowerCenter 8.6.1, Java, UNIX Shell Programming, ERStudio, Control-M
Confidential, New York
ETL Onsite Lead
Responsibilities:
- Gathered and documented business requirements into technical specifications.
- Analyzed the source system architecture to gain a deeper understanding of business rules and data integration checks.
- Designed the fact and dimension tables for the star schema using ER Studio.
- Developed complex database structures for data validations and coded complex PL/SQL procedures.
- Designed and constructed complex materialized views, partitions, stored procedures, and packages to implement the ETL process.
- Developed mappings for ETL process.
- Used shell scripting to automate batch jobs and reduce manual intervention.
- Performed tuning and optimization, considerably reducing the total run time of ETL processes and reports.
- Tested the entire ETL process.
- Supported and maintained the data warehouse.
- Coordinated with application teams, business analysts, and system administrators for day-to-day maintenance and implementations.
Environment: Oracle 11g, PL/SQL, Java, UNIX Shell Programming, Informatica 8.6, ERStudio
Confidential, San Francisco, CA
ETL Developer
Responsibilities:
- Gathered and documented business requirements into technical specifications.
- Analyzed the central data warehouse and determined conversion rules for migration.
- Developed detailed programming specifications for ETL, data migration, and data scrubbing processes.
- Coded and tested the ETL using packages and shell scripts.
- Performed performance optimization of Oracle PL/SQL.
- Coordinated with application teams, business analysts, and system administrators for day-to-day maintenance and implementations.
Environment: Oracle 10g, PL/SQL, Linux, Java, ERStudio, MS SQL Server 2000, UNIX Shell Programming
Confidential
Software Engineer
Responsibilities:
- Prepared the detailed design document from the functional specification provided by the client.
- Designed, documented, and developed complex solutions from requirements.
- Automated the process of loading of data warehouse feeds.
- Worked as Team Lead for a team of 7 members.
- Worked with a large database containing millions of rows for high-net-worth customers based in the US.
- Automated processes such as the daily loading of data from other systems using SQL*Loader.
- Effectively reduced overall processing time by implementing packages, triggers, IOTs, partitioned tables, indexes, and materialized views.
- Extensively used TKPROF and SQL Trace to monitor the system and fine-tune it for better performance, removing bottlenecks during processing.
Environment: Oracle 9i, PL/SQL, Forms 9i, Java, JSP, Reports, SQL* Loader, TOAD, ERStudio
Confidential
Software Engineer
Responsibilities:
- Led a team of 10 members in the capacity of Team Lead.
- Prepared the detailed design document from the functional specification provided by the client.
- Designed the logical and physical database in keeping with the business rules.
- Designed and constructed complex views, triggers, stored procedures, and packages.
- Coordinated the review meetings.
- Briefed management on the project status every week.
- Performed performance tuning and optimization of large databases.
Environment: Oracle 9i, Developer 2000, Forms 5.0, PL/SQL, Java, JSP, UNIX shell scripts, TOAD, Pro*C