
Hadoop Consultant Resume


Fort Lee

SUMMARY

  • 10+ years of comprehensive and in-depth experience in Information Technology with a strong background in Hadoop/Big Data technologies
  • 5+ years of experience in Hadoop/Big Data technologies
  • 5+ years of experience in Java
  • 3+ years of experience working onsite with US clients
  • Delivered continuous service improvements across projects
  • As an Idea Champion, proposed ideas to improve process and performance
  • Extensive knowledge of Hadoop and Spark architecture and core components
  • Experienced in writing Spark scripts in Scala and using Spark SQL to access Hive tables for faster data processing (a minimal sketch follows this list)
  • Experienced with the Apache Spark Streaming API on Big Data distributions in an active cluster environment
  • Proficient in Scala programming for writing Apache Spark applications
  • Performed performance tuning in Spark
  • Expertise in Hadoop, HDFS, Spark, Hive, HBase, Pig, Flume, Kafka, Oozie
  • Implemented real-time processing using Kafka and Spark Streaming
  • Experienced in Impala for massively parallel processing (MPP) of data
  • In-depth understanding of Hadoop architecture and its components such as HDFS, Name Node, Data Node, Job Tracker, Task Tracker and MapReduce
  • Knowledge of administrative tasks such as cluster setup, Hadoop installation and configuration, and installation of ecosystem components such as HBase, Pig and Flume
  • Good knowledge of the HDFS Balancer daemon for redistributing blocks
  • Experienced in high-volume ingestion of event-based data into Hadoop using Flume
  • Expertise in writing Hadoop jobs to analyze large datasets using MapReduce, Pig and Hive
  • Work experience in writing HBase queries
  • Experience in importing and exporting data using SQOOP from HDFS to Relational Database Systems and vice-versa
  • Experienced with Oozie for job workflow scheduling and monitoring
  • Created MapReduce programs in Java to meet project requirements
  • Strong Data Warehousing experience in application development and Quality Assurance testing using Informatica Power Center 9.6.1/8.6 (Source Analyzer, Data Warehousing Designer, Mapping Designer, Mapplet Designer, Transformation Developer)
  • Experienced in redesigning Informatica Analytics applications into Big Data Analytics applications
  • Experience in redesigning other applications into Informatica Analytics applications
  • Experience in designing complex ETL/ELT solutions for Data Warehouse Programs
  • Extensive Experience in identifying bottlenecks and performance tuning in Informatica
  • Hands-on experience with industry-standard methodologies, Waterfall and Agile Scrum, within the Software Development Life Cycle
  • Expertise in working independently as an individual contributor (IC)
  • Proposed and implemented process changes within the project to provide better service to the client
  • Defined and proposed Agile Scrum standards for the project and trained project teams in Agile Scrum
  • Good domain experience in Banking, Insurance, Automobile and Retail
  • Experienced on UNIX, Windows and Mainframe platforms
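A minimal Scala sketch of the Spark SQL on Hive pattern referenced above. The SparkSession/enableHiveSupport API is standard Spark 2.x; the database, table and column names (bank.transactions, account_id, txn_amount, txn_date) are hypothetical placeholders, not details from any engagement listed here.

```scala
import org.apache.spark.sql.SparkSession

object HiveQueryExample {
  def main(args: Array[String]): Unit = {
    // Build a SparkSession with Hive support so existing Hive tables are visible
    val spark = SparkSession.builder()
      .appName("HiveQueryExample")
      .enableHiveSupport()
      .getOrCreate()

    // Placeholder table and columns; a real job would use the project's schema
    val transactions = spark.sql(
      "SELECT account_id, txn_amount FROM bank.transactions WHERE txn_date = '2017-01-01'")

    // Simple aggregation expressed in Spark SQL instead of hand-written MapReduce
    transactions.groupBy("account_id").sum("txn_amount").show(20)

    spark.stop()
  }
}
```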

TECHNICAL SKILLS

Big Data Technologies: Hadoop, Spark, Spark SQL, Spark Streaming, Hive, HBase, Pig, SQOOP, Flume, Oozie, Kafka, HDFS, Impala, Cloudera, Avro, Parquet, Hortonworks, Phoenix, Ranger, Atlas, Ambari

Web Technologies: Java, HTML, XML, JSON

ETL Technologies: Informatica Power Center 10.x/9.x/8.x

IDE/Build Tools: Eclipse, IntelliJ, SBT, Maven

Languages: Java, Python, Scala, COBOL, SQL, CICS

Databases: Oracle, DB2, VSAM, MSSQL Server

DB Tools: TOAD, SQL Developer

Operating System: Windows, Unix, Z/800

Job Scheduler: Informatica Scheduler, JCL

Version Control: Git

Methodology: Agile Scrum

Other Tools: WinSCP, Notepad++, MS Visio 2010, PuTTY, Facets, SQuirreL SQL

PROFESSIONAL EXPERIENCE

Confidential, Fort Lee

Hadoop Consultant

Responsibilities:

  • Worked on the Big Data proposal on the Hortonworks distribution
  • Coded Spark SQL to access HBase tables and load the results into MSSQL Server after applying business rules (a sketch follows this list)
  • Accessed HBase tables using Phoenix
  • Implemented Hive on HBase (Hive tables backed by HBase)
  • Configured Kafka to send data streams to Spark Streaming and store the results in HBase tables
  • Imported Data from Databases using Spark and JDBC
  • Performed real-time data processing using Spark Streaming and Kafka connectors
  • Performed performance tuning of Spark scripts
  • Created reports using Spark SQL and Scala
  • Performed real-time analytics using Spark Streaming and Kafka
  • Imported and exported data using SQOOP between HDFS and relational database systems
  • Used Impala to query the data and performed performance tuning
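A hedged sketch of the HBase-via-Phoenix to MSSQL Server load described above, assuming Spark 2.x with the Phoenix and SQL Server JDBC drivers on the classpath; the hostnames, table names, filter rule and credentials are placeholders, not details from the engagement.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object PhoenixToSqlServer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("PhoenixToSqlServer").getOrCreate()

    // Read an HBase table exposed through Phoenix over JDBC (placeholder host/table)
    val events = spark.read.format("jdbc")
      .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
      .option("url", "jdbc:phoenix:zk-host:2181")
      .option("dbtable", "EVENTS")
      .load()

    // Example business rule: keep only completed events
    val completed = events.filter(events("STATUS") === "COMPLETED")

    // Load the filtered result into MS SQL Server over JDBC (placeholder target)
    completed.write.format("jdbc")
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .option("url", "jdbc:sqlserver://mssql-host:1433;databaseName=reporting")
      .option("dbtable", "dbo.CompletedEvents")
      .option("user", "etl_user")
      .option("password", "*****")
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}
```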

Environment: Hadoop 2.0, Hortonworks, Kafka, Spark, Spark Streaming, Spark SQL, SBT, IntelliJ, Impala, MSSQL Server, CentOS, WinSCP, HBase, Hive, SQOOP, Unix Shell Scripts, Windows 10, Notepad++, Atlas, Ranger, Phoenix, Ambari

Confidential

Hadoop/ Spark / Informatica Big Data Management (BDM) Developer

Responsibilities:

  • Worked on the Big Data proposal on the Cloudera distribution
  • Worked on application improvement proposals
  • Involved in the complete project life cycle: requirement gathering, estimation, design, coding, testing and production support
  • Worked on redesigning applications and solutions, and prepared the related presentations
  • Prepared estimates for the requirements
  • Performed analysis and prepared design documents based on client requirements
  • Coded Spark scripts in Scala and used Spark SQL to access Hive tables for faster data processing
  • Configured Kafka to send data streams to Spark Streaming and store the results in HDFS (a sketch follows this list)
  • Coded Spark scripts using various transformations and actions to cleanse the data
  • Imported Data from Databases using Spark and JDBC
  • Performed real-time data processing using Spark Streaming and Kafka connectors
  • Performed performance tuning of Spark scripts
  • Created reports by using Spark SQL and Scala
  • Performed real-time analytics using Spark Streaming and Scala
  • Performed cluster setup, Hadoop installation, and configuration and installation of Hadoop ecosystem components on the Cloudera distribution
  • Imported and exported healthcare subscriber, plan, benefit, product and claims data from various applications using SQOOP between HDFS and relational database systems
  • Created NoSQL tables in HBase for processing XML data from healthcare member portals
  • Coded Pig scripts and Hive queries for analyzing the structured and semi-structured data used in the project
  • Used Impala to query the data and performed performance tuning
  • Configured Oozie for scheduling and monitoring the Hadoop jobs
  • Installed and configured Flume for collecting and replicating data from subscribers and claims information from web portals of healthcare providers
  • Involved in testing large data volumes on a multi-node Hadoop cluster in the Cloudera distribution
  • Involved in daily scrum meetings as part of the Scrum methodology followed in the project
  • Managed a team of 10
  • Reviewed team members' design documents, code, unit testing and unit test cases (UTC)
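A sketch of the Kafka to Spark Streaming to HDFS flow described above, using the standard spark-streaming-kafka-0-10 direct stream API; the broker address, topic, consumer group and HDFS path are placeholders rather than project details.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaToHdfs")
    val ssc = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

    // Placeholder broker, consumer group and deserializers
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafka-broker:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "claims-ingest",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // Direct stream from a placeholder topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("claims-events"), kafkaParams)
    )

    // Persist each batch of message values under an HDFS path prefix
    stream.map(_.value).saveAsTextFiles("hdfs:///data/claims/raw/batch")

    ssc.start()
    ssc.awaitTermination()
  }
}
```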

Environment: Hadoop 2.0, Cloudera, Spark, Spark Streaming, Spark SQL, SBT, Eclipse, Impala, Informatica BDM 10.1.0, Oracle, Unix, WinSCP, SQL Developer, HBase, Hive, SQOOP, Flume, Pig, Unix Shell Scripts, Windows 7, Notepad++, MS Visio

Confidential, O'Fallon, MO

Hadoop Developer

Responsibilities:

  • Involved in business analysis, requirement gathering, and designing logical and physical data models
  • Worked with business and technology teams to redesign ETL Analytics applications as Big Data Analytics applications
  • Performed analysis and created design documents based on client requirements
  • Designed and developed Spark applications using Scala in Cloudera distribution
  • Coded in Spark SQL against various data sources such as JSON, Parquet and Hive (a sketch follows this list)
  • Coded real-time data extracts using Kafka and Spark Streaming, storing the results in HDFS
  • Developed Spark scripts to perform Transformations and Actions in Spark RDDs
  • Performed performance tuning of Spark scripts
  • Configured the Hadoop cluster to add new nodes on the Cloudera distribution
  • Imported and exported data between Oracle 10g and Hive using Sqoop
  • Created Sqoop jobs with incremental load to populate Hive external tables
  • Loaded streaming of log and XML data from various web servers into HDFS using Flume
  • Developed queries in HBase for loading and querying semi structured data
  • Involved in loading bank promotion details from Unix file system to HDFS
  • Developed Oozie workflows for scheduling
  • Created Pig scripts for join queries from multiple tables
  • Created partitioned tables in Hive for the fast-growing bank transactions table
  • Worked on Informatica Power Center tool - Source Analyzer, Data warehousing designer, Mapping & Mapplet Designer and Transformation Developer
  • Developed complex mappings to load data from multiple source systems
  • Performed performance tuning at the session, mapping and system levels
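A sketch of reading JSON and Parquet sources with Spark SQL and writing a partitioned Hive table, combining the data-source and partitioned-table bullets above; the paths, column names and join key (account_id, txn_date) are hypothetical.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object PromotionsToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PromotionsToHive")
      .enableHiveSupport()
      .getOrCreate()

    // Placeholder landing paths; JSON and Parquet are built-in Spark SQL sources
    val promotions = spark.read.json("hdfs:///data/landing/promotions/*.json")
    val transactions = spark.read.parquet("hdfs:///data/landing/transactions/")

    // Join the two sources on a placeholder key and write a Hive table
    // partitioned by transaction date, mirroring the partitioning described above
    promotions.join(transactions, Seq("account_id"))
      .write
      .mode(SaveMode.Overwrite)
      .partitionBy("txn_date")
      .saveAsTable("bank.promotion_transactions")

    spark.stop()
  }
}
```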

Environment: Java, Hadoop, Cloudera, Spark, Spark SQL, SBT, Eclipse, Hive, HBase, Pig, SQOOP, Flume, Oozie, Windows 8, Red Hat Linux 6, Amazon AWS, Informatica 9.6.1, Unix, Oracle 10g, Microsoft Visio 2010

Confidential, Michigan, MI

Lead Informatica Developer

Responsibilities:

  • Worked with Business and technology teams for redesigning ETL Analytics applications to Big Data Analytics applications
  • Worked with the business analysts in requirement analysis to implement the ETL process.
  • Designed the scripts for constraints, triggers and stored procedures.
  • Created design documents for creating maps to load the data from the ODS to warehouse system.
  • Used loading techniques like Slowly Changing Dimensions and Incremental Loading using parameter files and mapping variables.
  • Extensively worked with Slowly Changing Dimensions Type1, Type2, and Type3 for Data Loads.
  • Developed batch file to automate the task of executing the different workflows and sessions associated with the mappings.
  • Created workflows using Workflow manager for different tasks like sending email notifications, timer that triggers when an event occurs, and sessions to run a mapping.
  • Used designer debugger to test the data flow and fix the mappings.
  • Performed performance tuning at the mapping, session, source, target and database levels
  • Created re-usable transformations and mapplets.
  • Created Pig scripts for join queries from multiple tables
  • Imported and exported data between Oracle 10g and Hive using Sqoop
  • Created Sqoop jobs with incremental load to populate Hive external tables
  • Used Spark for in-memory processing
  • Used HBase for loading data for further processing
  • Responsible for identifying bottlenecks and fixing them through performance tuning
  • Involved in Unit Testing, User Acceptance Testing (UAT) and integration testing

Environment: Java, Hadoop, Spark, Hive, HBase, Pig, SQOOP, Flume, Oozie, Windows 8, Red Hat Linux 6, Amazon AWS, Informatica 9.6.1, Oracle 10g, Eclipse, SBT

Confidential, Providence, RI

Informatica Onsite Lead

Responsibilities:

  • Closely worked with Business users, Project manager & Stakeholder for gathering requirements
  • Performed impact analysis, design, coding, test-plan preparation and testing
  • Participated in the full life cycle for multiple projects: analysis, design, documentation, development and testing
  • Used session parameters and mapping variables/parameters, and created parameter files to enable flexible workflow runs based on changing variable values
  • Implemented reusable mappings & sessions for the Operational Audit Process
  • Migrated code from the development environment to test using deployment groups
  • Performance tuning of sources, targets, mappings and SQL queries in the transformations

Environment: Informatica Power Center 8.6.1, Oracle 11g/10g, SQL Server, Flat files, SQL, TOAD, Windows XP, UNIX Shell Scripts, Microsoft Visio

Confidential

SME (Subject Matter Expert)

Responsibilities:

  • Interacted with user and gathered information related to the production issues and enhancements
  • As part of enhancements, involved in estimation, analysis, design and coding of Windows and mainframe applications
  • As part of production support, maintained 13 applications, including 2 Windows-based applications
  • Performed analysis and prepared design documents based on client requirements
  • Developed and maintained ETL/ELT applications
  • Performed performance tuning of the applications
  • Used scripts to import and export data to and from the database
  • Developed UNIX Shell scripts for File Manipulation, FTP, Executing DB2 SQLs and archiving log files
  • Implemented various Performance Tuning techniques on Sources, Targets, Mappings, Sessions, Workflows and database

Environment: Informatica Power Center 8.6.1, Oracle 11g/10g, SQL Server, Flat files, SQL, TOAD, Windows XP, UNIX Shell Scripts, Microsoft Visio, Z/800, COBOL, JCL, CICS, DB2, SPUFI, QMF, File Aid, CA-7
