Hadoop Developer Resume
Plainsboro, NJ
SUMMARY
- 3.5 years of experience in Big Data as a Hadoop Developer
- Good experience with the parallel processing framework Spark, using Python/Scala
- Good experience working with tools such as Hive, Pig, Python, Oozie, Sqoop and Hue
- Worked with Spark SQL for SQL query execution in the Hadoop environment
- Worked with DataFrames in Spark to process complex queries and analyze data (a minimal sketch follows this list)
- Extensively worked on the Hive data warehouse and implemented ETL
- Knowledge of the R programming language for machine learning
- Experience using Python to implement data warehouse solutions and data ingestion
- Developed an ETL solution on Hive for Teradata offload
- Worked on Excel-Hive business intelligence integration
- Developed UDFs in core Java to support the Pig and Hive data warehouse
- Excellent understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager and NodeManager, and of the Spark programming paradigm
- Hands-on experience with major components of the Hadoop ecosystem such as Hive, Pig, MapReduce, Sqoop, HBase and HBase-Hive integration, and good knowledge of the Mapper/Reducer/HDFS framework and YARN
- Exposure to the Cloudera development environment and its management using Hue
- Experience in data management and implementation of Big Data applications using Hadoop frameworks
- Knowledge of manipulating and analyzing large datasets and stored data to find patterns and insights based on requirements
- Experience in importing and exporting data in different formats between RDBMS databases and HDFS/HBase
- Experience in extending Hive and Pig core functionality by writing custom UDFs using Java
- Experience in importing and exporting data between RDBMS and HDFS using Sqoop and Python
- Hands-on experience setting up workflows with the Apache Oozie workflow engine for managing and scheduling Hadoop jobs
- Excellent technical, logical, code-debugging and problem-solving capabilities, with careful attention to the evolving environment and the probable activities of competitors and customers
- Good team player with strong analytical and communication skills
- 2+ years of experience with the business intelligence reporting tool Tableau
- Hands-on experience creating various views and dashboards in Tableau
- Good knowledge of working with joins and custom SQL
- Experience in creating aggregates, hierarchies, formatting, sorting and grouping
- Experience in working with filters, quick filters, context filters and parameters
- Good expertise in working with multiple measures, blended axes and dual axes
- Good knowledge of working with string, date and table calculations and calculated measures
- Good knowledge of creating actions such as filter, highlight and URL actions
- Expertise in working with maps and good knowledge of custom geocoding
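Purely as an illustration of the Spark SQL and DataFrame work noted above, a minimal PySpark sketch; the session setup, table and column names are hypothetical and not taken from any specific project listed here:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical Hive-backed Spark session and sales table
spark = SparkSession.builder.appName("sales-analysis").enableHiveSupport().getOrCreate()
sales = spark.table("warehouse.sales")

# DataFrame API: revenue and units sold per product per day
daily = (sales
         .groupBy("product_id", "sale_date")
         .agg(F.sum("amount").alias("revenue"),
              F.count("*").alias("units_sold")))

# Equivalent Spark SQL query through a temporary view
sales.createOrReplaceTempView("sales_v")
top_products = spark.sql("""
    SELECT product_id, SUM(amount) AS revenue
    FROM sales_v
    GROUP BY product_id
    ORDER BY revenue DESC
    LIMIT 10
""")
top_products.show()
```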
TECHNICAL SKILLS
Big Data/BI Tools: Apache Spark, Tableau 8.x, Hadoop, Apache Pig, Hive, HBase, Sqoop, Python, Oozie
Databases: Oracle 10g, MS SQL Server 2005, Teradata
Languages: SQL, Core Java, Python, UNIX Shell
Versioning: Tortoise SVN, CVS
PROFESSIONAL EXPERIENCE
Confidential, Plainsboro, NJ
Hadoop Developer
Responsibilities
- Wrote Python scripts to automate the data ingestion process
- Implemented auto-ingestion for different input data formats and sources
- Wrote Python scripts to clean data from input data files
- Wrote Python scripts to load ingested data into the respective HDFS locations (see the sketch after this list)
- Implemented Oozie jobs for the Python auto-ingestion process
- Normalized data and applied business logic using Hive ETL
- Developed Pig and Hive scripts and UDFs to study user sessions and member behavior
- Wrote Pig scripts to perform transformation procedures on the data in HDFS
- Processed HDFS data and created external tables using Hive to analyze products sold per day, locations and vendors
- Extended the Pig and Hive frameworks with custom UDFs to meet requirements
- Developed multiple Pig scripts for clustering and grouping user sessions
- Developed Oozie workflows to process the Hadoop jobs
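A simplified sketch of the Python auto-ingestion step described above, assuming input files are cleaned locally and pushed to per-source HDFS locations via the `hdfs dfs` CLI; the source names, paths and cleaning rule are hypothetical:

```python
import csv
import subprocess
from pathlib import Path

# Hypothetical mapping of input sources to HDFS landing directories
HDFS_TARGETS = {
    "orders": "/data/raw/orders",
    "members": "/data/raw/members",
}

def clean_file(src: Path, dst: Path) -> None:
    """Drop empty or malformed rows before ingestion (illustrative rule only)."""
    with src.open(newline="") as fin, dst.open("w", newline="") as fout:
        reader, writer = csv.reader(fin), csv.writer(fout)
        for row in reader:
            if row and all(field.strip() for field in row):
                writer.writerow(row)

def ingest(source: str, local_dir: str) -> None:
    """Clean each input file and put it into the source's HDFS location."""
    target = HDFS_TARGETS[source]
    for src in Path(local_dir).glob("*.csv"):
        cleaned = src.with_suffix(".cleaned.csv")
        clean_file(src, cleaned)
        subprocess.run(["hdfs", "dfs", "-put", "-f", str(cleaned), target], check=True)

if __name__ == "__main__":
    ingest("orders", "/landing/orders")
```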
Environment: Hadoop, Python 2.7, HDFS, Hive 0.12.1, Java, Cloudera Hadoop distribution, Pig 0.11.1, Linux, Sqoop 1.4.4, Oozie 3.3.0, Tableau, Notepad++
Confidential, San Jose, CA
Hadoop Developer
Responsibilities
- Implemented Hive ETL solution for Teradata Offload
- Wrote Spark scripts to process and analyze large data sets
- Ingested data from various tables by performing Sqoop imports
- Applied Confidential supply chain business logic to the source data
- Normalized data and applied business logic using Hive ETL
- Tuned Hive performance to process 2-billion-record data sets
- Used the Teradata TPump utility to export data into Teradata tables
- Involved in creating Hive tables and in loading and analyzing data using Hive queries
- Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames and pair RDDs
- Implemented Spark jobs in Python, using the DataFrame and Spark SQL APIs for faster data processing (see the sketch after this list)
- Created RDDs, DataFrames and Datasets
- Used ORC and Parquet file formats for storing the data
- Used Java code to execute SQL queries and to retrieve SQL queries from text files
- Used the Log4j framework for logging debug, info and error data
- Created Hive external and managed tables
- Designed and maintained Tez workflows to manage the flow of jobs in the cluster
- Loaded Spark RDDs and performed in-memory computation to generate the output response
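A minimal PySpark sketch of the DataFrame/Spark SQL processing and Parquet storage pattern described above; the table names, columns and business rule are hypothetical placeholders, not the actual Confidential logic:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("supply-chain-offload")
         .enableHiveSupport()
         .getOrCreate())

# Read the raw source table ingested via Sqoop (hypothetical name)
orders = spark.table("staging.supply_chain_orders")

# Apply an illustrative business rule and normalize columns
normalized = (orders
              .withColumn("order_date", F.to_date("order_ts"))
              .withColumn("net_amount", F.col("gross_amount") - F.col("discount"))
              .filter(F.col("status") == "SHIPPED"))

# Persist as Parquet and expose as a Hive table for downstream queries
(normalized.write
 .mode("overwrite")
 .format("parquet")
 .saveAsTable("warehouse.supply_chain_shipped"))

# The same aggregation expressed through Spark SQL
normalized.createOrReplaceTempView("shipped_v")
spark.sql("SELECT order_date, SUM(net_amount) AS net FROM shipped_v GROUP BY order_date").show()
```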
Environment: Hadoop, HDFS, PySpark, Teradata, TPump, Hive 0.11.1, Java, MapR, Pig 0.11.1, Linux, Sqoop 1.4.4, Oozie 3.3.0, Notepad++
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology
- Imported data from SQL Server and landed it on HDFS using Sqoop import (see the sketch after this list)
- Developed a data pipeline using Sqoop, Pig and Java MapReduce to ingest customer behavioral data and purchase histories into HDFS prior to analysis
- Used Hive for data analysis and computation
- Used Pig for various data joins and data enrichment
- Optimized MapReduce code and Pig scripts, and performed user interface analysis, performance tuning and analysis
- Loaded the aggregated data back into SQL Server using Sqoop export for reporting on the dashboard
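A hedged sketch of how the Sqoop import/export steps above could be driven from Python via `subprocess`; the connection string, credentials, table and directory names are hypothetical, and the real jobs may equally have been run from the shell or Oozie:

```python
import subprocess

# Hypothetical SQL Server connection details
JDBC_URL = "jdbc:sqlserver://dbhost:1433;databaseName=retail"

def sqoop_import(table: str, target_dir: str) -> None:
    """Pull a SQL Server table into HDFS."""
    subprocess.run([
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqlserver.pwd",
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", "4",
    ], check=True)

def sqoop_export(table: str, export_dir: str) -> None:
    """Push aggregated HDFS data back into a SQL Server reporting table."""
    subprocess.run([
        "sqoop", "export",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqlserver.pwd",
        "--table", table,
        "--export-dir", export_dir,
    ], check=True)

if __name__ == "__main__":
    sqoop_import("customer_purchases", "/data/raw/customer_purchases")
    sqoop_export("daily_sales_summary", "/data/agg/daily_sales_summary")
```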
Environment: Hadoop, HDFS, Hive 0.12.1, Java, Cloudera Hadoop distribution, Pig 0.11.1, Linux, Sqoop, Microsoft Excel Reporting, Notepad++
Confidential
Business Analyst
Responsibilities
- Understanding and analyzing the business requirements
- Designed and developed Tableau reports, documents, dashboards and scorecards per specified requirements and timelines
- Extracted data from various sources and performed data blending
- Created various interactive dashboards
- Developed various reports as per customer requirements
- Experience with KPIs (key performance indicators)
- Created customized and interactive dashboards using data sources and custom objects
- Created quick filters, table calculations, calculated fields and parameters
Environment: Tableau, SQL Server 2008, Microsoft Excel
Confidential
Automation and Database Testing
Responsibilities
- Understanding and analyzing the business requirements
- Designed a hybrid (modular and data-driven) framework as per discussions with the client and on-site team (see the sketch after this list)
- Created library files for reusable operations, runtime settings and logging results in Excel
- Handled client status calls and sent status mails on a daily basis
- Presented a demonstration of framework execution to the client
- Gave knowledge transfer (KT) sessions to newly added resources on the ongoing automation framework and approach
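The hybrid framework itself was built in HPE UFT/VBScript; purely as an illustration of the data-driven pattern described above (reading test data from a sheet and logging results back to Excel), a minimal Python sketch using `openpyxl` with hypothetical workbook, sheet and column names:

```python
from openpyxl import load_workbook

# Hypothetical workbook: one row per test case with input and expected columns
WORKBOOK = "test_data.xlsx"

def run_case(username: str, expected_status: str) -> str:
    """Placeholder for a reusable operation the framework would drive (e.g. a login step)."""
    # Illustrative check only; the real framework executed UI/database actions here
    return "PASS" if username and expected_status == "ACTIVE" else "FAIL"

def run_suite() -> None:
    wb = load_workbook(WORKBOOK)
    sheet = wb["TestCases"]
    # Assumed headers in row 1: username, expected_status, result
    for row in sheet.iter_rows(min_row=2):
        username, expected = row[0].value, row[1].value
        row[2].value = run_case(username, expected)  # log the result next to the inputs
    wb.save(WORKBOOK)

if __name__ == "__main__":
    run_suite()
```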
Environment: SQL Server 2008, HPE UFT, VBScript, Microsoft Excel
Confidential
Database and Automation Tester
Responsibilities
- Wrote test cases for integration and end-to-end testing
- Involved in documenting the automation test plan and automation strategy
- Involved in documenting the approach and discussing framework design with the client and on-site team
- Identified the reusable operations in the scenarios to be automated
- Worked with SOAP UI and XML files to automate web services (see the sketch after this list)
- Designed a hybrid (modular and data-driven) framework as per discussions with the client and on-site team
- Created library files for reusable operations, runtime settings and logging results in Excel
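The web service automation above was done in SOAP UI with XML request files; as an illustration of the same idea in Python (posting a SOAP envelope and asserting on the response), a small sketch with a hypothetical endpoint, namespace and operation:

```python
import requests

# Hypothetical endpoint and SOAP action
ENDPOINT = "https://example.com/services/AccountService"
SOAP_ACTION = "getAccountStatus"

ENVELOPE = """<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:acc="http://example.com/account">
  <soapenv:Body>
    <acc:getAccountStatus>
      <acc:accountId>12345</acc:accountId>
    </acc:getAccountStatus>
  </soapenv:Body>
</soapenv:Envelope>"""

def call_service() -> None:
    response = requests.post(
        ENDPOINT,
        data=ENVELOPE,
        headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": SOAP_ACTION},
        timeout=30,
    )
    # Basic assertions comparable to SOAP UI test steps
    assert response.status_code == 200, response.text
    assert "<acc:status>ACTIVE</acc:status>" in response.text

if __name__ == "__main__":
    call_service()
```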
Environment: SQL Server 2008, TestComplete, JScript, Microsoft Excel, SOAP UI
