Hadoop Developer Resume

Plainsboro, NJ

SUMMARY

  • 3.5 years of experience in Big Data as a Hadoop Developer
  • Good experience with the parallel-processing framework Spark, using Python/Scala
  • Good experience working with tools such as Hive, Pig, Python, Oozie, Sqoop, and Hue
  • Used the Spark SQL utility to execute SQL queries in the Hadoop environment
  • Worked with DataFrames in Spark to process complex queries and analyze data
  • Extensively worked on the Hive data warehouse and implemented ETL
  • Knowledge of the R programming language for machine learning
  • Experience using Python to implement data-warehouse solutions and data ingestion
  • Developed a Hive ETL solution for a Teradata offload
  • Worked on Excel-Hive business intelligence integration
  • Developed UDFs in core Java supporting the Pig and Hive data warehouse
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, and NodeManager, and of the Spark programming paradigm
  • Hands-on experience with major components of the Hadoop ecosystem such as Hive, Pig, MapReduce, Sqoop, HBase, and HBase-Hive integration, and good knowledge of the Mapper/Reducer/HDFS framework and YARN
  • Exposure to the Cloudera development environment and management using Hue
  • Experience in data management and implementation of Big Data applications using Hadoop frameworks
  • Knowledge of manipulating and analyzing large datasets and stored data, and of finding patterns and insights in them based on the requirement
  • Experience in importing and exporting different formats of data between HDFS/HBase and various RDBMS databases
  • Experience in extending Hive and Pig core functionality by writing custom UDFs in Java
  • Experience in importing and exporting data between RDBMS and HDFS using Sqoop and Python
  • Hands-on experience setting up workflows using the Apache Oozie workflow engine for managing and scheduling Hadoop jobs
  • Excellent technical, logical, code-debugging, and problem-solving capabilities, with the ability to closely track the evolving environment and anticipate competitor and customer activity
  • Good team player with strong analytical and communication skills
  • 2+ years of experience with the business intelligence reporting tool Tableau
  • Hands-on experience creating various views and dashboards in Tableau
  • Good knowledge of working with joins and custom SQL
  • Experience in creating aggregates, hierarchies, formatting, sorting, and grouping
  • Experience working with filters, quick filters, context filters, and parameters
  • Good expertise working with multiple measures, blended axes, and dual axes
  • Good knowledge of string, date, and table calculations and calculated measures
  • Good knowledge of creating actions such as Filter, Highlight, and URL
  • Expertise in working with maps and good knowledge of custom geocoding
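The UDF work above was done in Java, but Hive can also stream rows through an external Python script via its TRANSFORM clause. The sketch below illustrates that pattern; the (user_id, email) column layout and the normalization rule are hypothetical examples, not details from this resume.

```python
#!/usr/bin/env python
# Illustrative Hive TRANSFORM script: reads tab-separated rows on stdin,
# normalizes a hypothetical email column, writes tab-separated rows to stdout.
# In Hive:  ADD FILE normalize.py;
#           SELECT TRANSFORM(user_id, email) USING 'python normalize.py'
#           AS (user_id, email) FROM users;
import sys


def normalize(email):
    # Lowercase and trim so joins and GROUP BYs on email are consistent
    return email.strip().lower()


def process(lines, out):
    for line in lines:
        user_id, email = line.rstrip("\n").split("\t")
        out.write("%s\t%s\n" % (user_id, normalize(email)))


if __name__ == "__main__":
    process(sys.stdin, sys.stdout)
```

Hive launches one copy of the script per task and pipes rows through it, so the script only needs to handle plain stdin/stdout.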

TECHNICAL SKILLS

Big Data/BI Tools: Apache Spark, Tableau 8.x, Hadoop, Apache Pig, Hive, HBase, Sqoop, Python, Oozie

Databases: Oracle 10g, MS SQL Server 2005, Teradata

Languages: SQL, Core Java, Python, UNIX Shell

Versioning: Tortoise SVN, CVS

PROFESSIONAL EXPERIENCE

Confidential, Plainsboro, NJ

Hadoop Developer

Responsibilities

  • Developed Python scripts to automate the data-ingestion process
  • Implemented auto-ingestion for different input data formats and sources
  • Wrote Python scripts to clean data from input data files
  • Wrote Python scripts to load ingested data into the respective HDFS locations
  • Implemented Oozie jobs for the Python auto-ingestion process
  • Normalized data and applied business logic using Hive ETL
  • Developed Pig and Hive scripts and UDFs to study user sessions and member behavior
  • Wrote Pig scripts to perform transformation procedures on the data in HDFS
  • Processed HDFS data and created external tables using Hive in order to analyze products sold per day, by location and vendor
  • Extended the Pig and Hive frameworks with custom UDFs to meet the requirements
  • Developed multiple Pig scripts for clustering and grouping user sessions
  • Developed Oozie workflows to process the Hadoop jobs
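The ingestion steps above (clean input files, then land them in HDFS) can be sketched in Python roughly as follows; the file paths, the cleaning rule, and the function names are illustrative assumptions, not the original scripts.

```python
# Sketch of an auto-ingestion flow: clean an input file, then land it in
# HDFS with the `hdfs dfs` CLI. Paths and the cleaning rule are assumed.
import subprocess


def clean_line(line):
    # Strip stray carriage returns and trailing whitespace from source extracts
    return line.replace("\r", "").rstrip()


def clean_file(src_path, dst_path):
    # Write a cleaned copy of src_path, dropping blank rows
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            cleaned = clean_line(line)
            if cleaned:
                dst.write(cleaned + "\n")


def put_to_hdfs(local_path, hdfs_dir):
    # `hdfs dfs -put -f` overwrites any existing file at the destination
    subprocess.check_call(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir])


if __name__ == "__main__":
    clean_file("input/orders.csv", "staging/orders.csv")
    put_to_hdfs("staging/orders.csv", "/data/raw/orders/")
```

A script shaped like this is easy to wrap in an Oozie shell action, which is one common way to schedule per-source ingestion jobs.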

Environment: Hadoop, Python 2.7, HDFS, Hive 0.12.1, Java, Cloudera distribution of Hadoop, Pig 0.11.1, Linux, Sqoop 1.4.4, Oozie 3.3.0, Tableau, Notepad++

Confidential, San Jose, CA

Hadoop Developer

Responsibilities

  • Implemented a Hive ETL solution for a Teradata offload
  • Wrote Spark scripts to process and analyze large data sets
  • Ingested data from various tables and performed Sqoop imports
  • Applied Confidential supply-chain business logic on source data
  • Normalized data and applied business logic using Hive ETL
  • Tuned Hive performance to process 2-billion-record data sets
  • Used the Teradata TPump utility to export data into Teradata tables
  • Involved in creating Hive tables, and in loading and analyzing data using Hive queries
  • Optimized existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs
  • Implemented Spark jobs in Python, utilizing DataFrames and the Spark SQL API for faster data processing
  • Created RDDs, DataFrames, and Datasets
  • Used the ORC and Parquet file formats for storing data
  • Used Java code to run SQL queries, including code to retrieve the SQL queries from a text file
  • Used the Log4j framework for logging debug, info, and error messages
  • Created Hive external and managed tables
  • Designed and maintained Tez workflows to manage the flow of jobs in the cluster
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response
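The pair-RDD work above follows Spark's map/reduceByKey pattern. Below is a minimal pure-Python sketch of that aggregation logic, shown without a Spark runtime; the (sku, quantity) record shape is a hypothetical example, not data from this project.

```python
# Pure-Python sketch of Spark's reduceByKey semantics; in PySpark the same
# step would be sc.parallelize(records).reduceByKey(lambda a, b: a + b).

def reduce_by_key(pairs, reduce_fn):
    # Combine all values sharing a key, exactly as rdd.reduceByKey does
    totals = {}
    for key, value in pairs:
        totals[key] = reduce_fn(totals[key], value) if key in totals else value
    return totals


# Hypothetical (sku, quantity-sold) records
records = [("sku-1", 2), ("sku-2", 5), ("sku-1", 3)]
totals = reduce_by_key(records, lambda a, b: a + b)  # {"sku-1": 5, "sku-2": 5}
```

Because the reduce function is associative and commutative here, Spark can apply it map-side before the shuffle, which is what makes reduceByKey cheaper than groupByKey on large record sets.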

Environment: Hadoop, HDFS, PySpark, Teradata, TPump, Hive 0.11.1, Java, MapR, Pig 0.11.1, Linux, Sqoop 1.4.4, Oozie 3.3.0, Notepad++

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities

  • Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology
  • Imported the data from SQL Server and landed it on to HDFS Using Sqoop import
  • Developed a data pipeline using Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and purchase histories into HDFS prior to analysis
  • Used Hive to analyze and compute over the data
  • Used Pig for various data joins and data enrichment
  • Optimized MapReduce code and Pig scripts, and performed user-interface analysis, performance tuning, and analysis
  • Loaded the aggregated data back into SQL Server using Sqoop export for reporting on the dashboard
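The MapReduce jobs above follow the classic mapper/reducer shape. The Hadoop-Streaming-style Python sketch below illustrates that shape (the actual jobs were Java MapReduce, and the "customer_id,amount" CSV layout of the purchase records is a hypothetical assumption).

```python
# Hadoop-Streaming-style sketch of a purchase-aggregation job: the mapper
# emits (customer_id, amount) pairs, the reducer sums amounts per customer.

def mapper(lines):
    # Emit (customer_id, amount) pairs from CSV purchase records
    for line in lines:
        customer_id, amount = line.rstrip("\n").split(",")
        yield customer_id, float(amount)


def reducer(pairs):
    # Hadoop delivers pairs grouped by key; sum the amounts per customer
    totals = {}
    for customer_id, amount in pairs:
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals
```

The reducer's per-customer totals are exactly the kind of aggregate that the Sqoop export step then loads back into SQL Server for dashboard reporting.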

Environment: Hadoop, HDFS, Hive 0.12.1, Java, Cloudera distribution of Hadoop, Pig 0.11.1, Linux, Sqoop, Microsoft Excel Reporting, Notepad++

Confidential

Business Analyst

Responsibilities

  • Understanding and analyzing the business requirements
  • Designing and developing Tableau reports, documents, dashboards, and scorecards per specified requirements and timelines
  • Extracted data from various sources and performed data blending
  • Created various interactive dashboards
  • Developed various reports per customer requirements
  • Experience with KPIs (key performance indicators)
  • Created customized, interactive dashboards using data sources and custom objects
  • Created quick filters, table calculations, calculated fields and parameters

Environment: Tableau, SQL Server 2008, Microsoft Excel

Confidential

Automation and Database Testing

Responsibilities

  • Understanding and analyzing the business requirements
  • Extracted data from various sources and performed data blending
  • Created various interactive dashboards
  • Designed a hybrid (modular and data-driven) framework per discussions with the client and the on-site team
  • Created library files for reusable operations, run-time settings, and logging results to Excel
  • Handled client status calls and sent status mails on a daily basis
  • Presented a demonstration of the framework's execution to the client
  • Gave knowledge-transfer (KT) sessions to newly added resources on the ongoing automation framework and approach

Environment: SQL Server 2008, HPE UFT, VBScript, Microsoft Excel

Confidential

Database and Automation Tester

Responsibilities

  • Wrote test cases for integration and end-to-end testing
  • Involved in documenting the automation test plan and automation strategy
  • Involved in documenting the approach and in discussions about designing the framework with the client and the on-site team
  • Identified the reusable operations in the scenarios to be automated
  • Worked with SOAP UI and XML files to automate web services
  • Designed a hybrid (modular and data-driven) framework per discussions with the client and the on-site team
  • Created library files for reusable operations, run-time settings, and logging results to Excel

Environment: SQL Server 2008, TestComplete, JScript, Microsoft Excel, SOAP UI
