Senior Hadoop Developer Resume
OBJECTIVE:
To be instrumental in achieving challenging goals for the organization by putting my technical knowledge in Big Data Analytics and Data Warehousing, and my functional know-how in the Retail and Manufacturing domains, to relevant use.
SUMMARY:
- 7 years of progressive IT experience, including 4 years as a Senior Big Data Developer on a consulting assignment with Wal-Mart; expertise in Hadoop ecosystem technologies (Apache Spark, Hive, Oozie, Sqoop, Java MapReduce) and supporting technologies (Oracle PL/SQL, Greenplum DB, Pig, Unix shell scripting), with exposure to Impala, ActiveMQ, Cassandra, etc.
- Experience working on Hadoop distributions from Cloudera (5.4, 5.5, 5.7), Pivotal HD (2.0 and 3.0) and Hortonworks (2.0).
- Hadoop Developer and Big Data Analyst with experience in design, development, deployment and supporting large scale distributed systems.
- Worked primarily in the Retail and Manufacturing domains.
- Participated in hackathon events (Hackathon Charlotte MMXVI, Walmart Datathon 2015); imported data with Pentaho and presented analytics with Tableau.
- Proficient in working with Hive, Oozie, Sqoop and many other Hadoop Eco System components for data storage and analysis.
- Hands-on benchmarking and performance tuning of Hive queries using partitions, bucketing and map-side joins; enhanced performance with Tez (see the sketch after this summary).
- Expertise in handling file formats (SequenceFile, RC, ORC, Text/CSV, Avro, Parquet) and analyzing them with HiveQL.
- Optimized Hive queries by rewriting them as Spark-Scala programs, thereby reducing the run time of capabilities.
- Experience in troubleshooting errors in Shell, Hive and MapReduce.
- Imported and exported data between HDFS and relational systems such as MySQL, Oracle, DB2 and Teradata using Sqoop; handled ad-hoc data loads with SSIS and Pentaho.
- Designed and implemented a generic data export to Greenplum using the GPLOAD utility, via both local-file and named-pipe transfer.
- Experienced in building Oozie workflows that automate parallel job execution.
- Experience in writing Pig Latin scripts to group, join and filter the data.
- Led the team's agile transformation using Kanban and redefined the support model for IT operations, resulting in more effective data delivery to the customer. Strong skills in agile environments using Kanban and Scrum.
- Maintained code using GitHub, Tortoise SVN, MS VSS and Team Forge.
- Monitored and Followed-Up tasks using JIRA, Confluence and SharePoint.
- Good experience in generating statistics, extracts and reports from Hadoop.
- Used Kanban, Waterfall and Scaled Agile (Scrum) software development methodologies in several projects.
- Experienced in developing custom UDFs for Hive to incorporate Java methods and functionality into HiveQL.
- Also worked as a Release Engineer supporting code releases to Production.
- For the initial 2 years, worked on a Confidential internal project, Confidential, gaining in-depth hands-on Oracle PL/SQL experience along with software design, development, testing, deployment and maintenance of a web application framework on SABA using Core Java, WDK, SQL Server and Oracle.
- Primary technologies used here were Oracle PL/SQL and Java.
- Proficient in working with UNIX servers, WebLogic Server system administration and WDK page development (an XML-based framework).
- Extensive experience in developing applications using Core Java.
- Experience working with RDBMS ORACLE Database.
- Analyzed performance of database objects and recommended indexes, schema statistics gathering, partitioning, explain plans and TKPROF analysis to the DBA.
- Implemented PL/SQL for application security and batch job scheduling; wrote UNIX shell scripts for data file handling, FTP and running SQL*Loader.
- Created email and file I/O utilities using stored procedures. Performed thorough unit, system and user acceptance testing in the given environments, delivering quality work to functional users.
- Experienced in identifying improvement areas for system stability and providing end-to-end high-availability architectural solutions.
- Good in negotiation, bug fixing and developing complex algorithms.
- Determined, committed and hardworking individual with strong communication, interpersonal and organizational skills.
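The Hive tuning summarized above can be illustrated with a minimal sketch. Table and column names (sales_part, store_dim, sale_dt, store_nbr, region) are hypothetical, and the queries are shown through a Hive-enabled Spark session rather than the original Hive scripts; bucketing for sort-merge bucket joins is noted only in a comment, since that DDL would be run on the Hive side.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object HiveTuningSketch {
  def main(args: Array[String]): Unit = {
    // Hive-enabled Spark session; assumes a configured Hive metastore.
    val spark = SparkSession.builder()
      .appName("hive-tuning-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Partitioned ORC fact table: partition pruning on sale_dt.
    // On the Hive side the table can additionally be bucketed, e.g.
    // CLUSTERED BY (store_nbr) INTO 32 BUCKETS, to enable sort-merge bucket joins.
    spark.sql(
      """CREATE TABLE IF NOT EXISTS sales_part (
        |  store_nbr INT,
        |  item_nbr  INT,
        |  sales_amt DOUBLE)
        |PARTITIONED BY (sale_dt STRING)
        |STORED AS ORC""".stripMargin)

    // Map-side join equivalent: broadcast the small dimension table so the
    // join avoids shuffling the large fact table.
    val sales  = spark.table("sales_part").where("sale_dt = '2015-11-01'")
    val stores = spark.table("store_dim")
    sales.join(broadcast(stores), Seq("store_nbr"))
      .groupBy("region")
      .sum("sales_amt")
      .show()

    spark.stop()
  }
}
```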
TECHNICAL SKILLS:
Hadoop Platforms: Cloudera, Pivotal HD, Hortonworks
Big Data Ecosystem: Apache SPARK: Scala, Hadoop, MapReduce, YARN, HDFS, Hive, Pig, Sqoop, Oozie.
File formats: Sequence Files, RC Files, ORC Files, Text/CSV, Avro, Parquet
Databases: Hive, Impala, Greenplum, Oracle9i, Oracle10g, SQL Server 2005, MySQL, NoSQL (Cassandra)
Languages: Hadoop Technologies, Scala, Python, Java, J2EE, PL/SQL, Unix Shell Scripting.
Open Frameworks: Hadoop, WDK
IDE: Eclipse Kepler, PL/SQL Developer, pgAdmin III (for Greenplum), TOAD
Version control: GIT, MS VSS, Tortoise SVN, Team Forge
Project Tracking: JIRA, Confluence, SharePoint
Server Access: WinSCP, PuTTY, Reflection FTP
Build Tools: Maven, Jenkins
Web Technologies: XML, XSLT, Java Script, HTML
Operating Systems: Unix, Linux, Windows, Mac OS X
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Platform: Pivotal HD, Hortonworks
Senior Hadoop Developer
Responsibilities:
- Provide business continuity if one of the Hadoop clusters goes down.
- Seamlessly handle migrations and other cluster downtimes.
- Load balance based on resource availability (memory, CPU) in the future.
- Load balance based on data availability in different clusters as an enhancement to the policy.
- First team to be on the Hortonworks distribution, working out unexplored issues in 2 months.
- Participated in Hackathon Charlotte MMXVI and suggested a new analytical model to raise funds for an NGO through donations of money and items.
Technologies: Spark Scala, Hive QL, Oozie, Sqoop, Pentaho, Oracle SQL, Unix Shell Scripting, Tableau
Confidential, Bentonville
Hadoop Platform: Pivotal HD, Hortonworks (supported from Pivotal HD 3.0)
Senior Hadoop Developer
Responsibilities:
- Interacted with business analysts and prospective application managers to gather requirements and guide implementations and production rollouts for ETL batch and real-time applications.
- Created the Base Data Layer module, a set of derived common tables that can be reused across capabilities in the Assortment Discipline tool.
- Developed the Store Clustering module across 1,000 demographic variables, later fed into an R program to produce reclassified store clusters.
- Developed a substitutability model for determining the best substitute item from the distance calculated using the two-point distance formula.
- Developed and implemented a Yule's Q model by deriving household counts on visits and other aggregated metric values (see the sketch after this list).
- Analyzed and developed item loyalty with household based on the visits and items purchased.
- Performed data processing using HIVE.
- Built customer analytical attributes using HIVE.
- Enhanced Hive query performance using Tez.
- Involved in loading data from UNIX file system to HDFS.
- Performance management and monitoring of ETL applications to track environment health and proactively address potential issues.
- Analyzed customer patterns based on the attributes.
- Understood the business needs and led the team accordingly toward deliverables.
- Appreciated for delivering the Assortment Discipline tool in a short time without compromising on quality.
- Participated in the Datathon and suggested a new analytical model with the available data, and was recognized for it.
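A minimal sketch of the Yule's Q computation referenced above, assuming a pre-aggregated Hive table item_pair_hh_counts with hypothetical columns item_x, item_y, both_hh (households buying both items), x_hh and y_hh (households buying each item individually); the actual model and table layout may differ.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object YulesQSketch {
  // Yule's Q from a 2x2 household contingency table per item pair:
  //   a = both items, b = item X only, c = item Y only, d = neither.
  //   Q = (a*d - b*c) / (a*d + b*c), ranging from -1 to 1.
  def withYulesQ(pairs: DataFrame, totalHouseholds: Long): DataFrame = {
    val a = col("both_hh")
    val b = col("x_hh") - col("both_hh")
    val c = col("y_hh") - col("both_hh")
    val d = lit(totalHouseholds) - col("x_hh") - col("y_hh") + col("both_hh")
    pairs.withColumn("yules_q", (a * d - b * c) / (a * d + b * c))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("yules-q-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Pre-aggregated household counts per item pair (hypothetical Hive tables).
    val pairs = spark.table("item_pair_hh_counts")
    val totalHouseholds = spark.table("household_dim").count()

    withYulesQ(pairs, totalHouseholds)
      .orderBy(desc("yules_q"))
      .show(20)

    spark.stop()
  }
}
```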
Technologies: Spark-Scala, Java-ActiveMQ, Hive QL, Oracle SQL, Oozie, Sqoop, Pentaho, Unix Shell Scripting, Tableau
Confidential, Bentonville
Delivery Model: Scaled Agile
Hadoop Platform: Pivotal HD
Senior Hadoop Developer
Responsibilities:
- Transferred and loaded datasets from Hadoop tables to Greenplum.
- Developed and delivered the Demand Transference module to identify items performing poorly in stores and analyze the most suitable substitutes along with the prospective dollars at risk.
- Built and Optimized HIVE queries for Customer Attribution datasets.
- Orchestrated Hive queries and shell scripts using Oozie workflows.
- Developed Hive queries to process the data for visualizing and reporting.
- Managed Hadoop cluster using Pivotal HD
- Appreciated for developing a generic GPLOAD module from Hadoop to Greenplum, which was adopted by all teams as a horizontal tool and commended for the versatility and flexibility of the code (see the sketch below).
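The Hadoop-to-Greenplum export can be sketched as follows; only the extract side is shown, and the table name, delimiter and landing directory are assumptions. In the actual module the resulting files (or a named pipe fed from them) are consumed by a separately configured GPLOAD job.

```scala
import org.apache.spark.sql.SparkSession

object GreenplumExportSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("gp-export-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical arguments: source Hive table and the landing directory
    // that the gpload YAML job (configured separately) points its INPUT SOURCE at.
    val Array(sourceTable, landingDir) = args

    // Extract the Hive table and write pipe-delimited text files.
    // gpload then streams these files (or a named pipe fed by them) into Greenplum.
    spark.table(sourceTable)
      .coalesce(8)                       // keep the number of output files manageable
      .write
      .option("sep", "|")
      .option("nullValue", "")
      .mode("overwrite")
      .csv(landingDir)

    spark.stop()
  }
}
```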
Technologies: HiveQL, Greenplum, Oozie, Sqoop, Shell scripting
Confidential
Delivery Model: Scaled Agile
Hadoop Platform: Cloudera, Pivotal HD
Developer
Responsibilities:
- Implemented partitioning, dynamic partitions and sort-merge bucket joins (SMBJ) in Hive for efficient data access.
- Optimized Hive queries and modified the Oozie workflow design, reducing overall run time from several days to hours.
- Automated the Flat CTM module script to refresh data on the required time frame (full, delta, partial).
- Developed a generic Drop Partitions module that prevents duplication of data in any run (see the sketch after this list).
- Processed Market Basket transaction data for Walmart customers
- Orchestrated Hive queries and shell scripts using Oozie workflows.
- Managed Hadoop cluster using Cloudera
- Appreciated by all levels of management from Confidential and Walmart for handling a huge volume of data without compromising on data quality.
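A minimal sketch of the drop-partition-then-reload pattern behind the generic Drop Partitions module; the target table, staging table and partition column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object DropPartitionSketch {
  /** Drop a date partition if it exists, then reload it, so a re-run of the
    * same time frame never duplicates data. Table and column names here
    * (sales_part, stg_daily_sales, sale_dt) are hypothetical.
    */
  def reloadPartition(spark: SparkSession, table: String, saleDt: String): Unit = {
    spark.sql(s"ALTER TABLE $table DROP IF EXISTS PARTITION (sale_dt = '$saleDt')")
    spark.sql(
      s"""INSERT INTO TABLE $table PARTITION (sale_dt = '$saleDt')
         |SELECT store_nbr, item_nbr, sales_amt
         |FROM   stg_daily_sales
         |WHERE  sale_dt = '$saleDt'""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("drop-partition-sketch")
      .enableHiveSupport()
      .getOrCreate()
    reloadPartition(spark, "sales_part", args(0))
    spark.stop()
  }
}
```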
Technologies: Java MR, HiveQL, Oozie, Sqoop, Unix Shell Scripting
Confidential
Delivery Model:Scaled Agile
Hadoop Platform: Cloudera
Release Engineer
Responsibilities:
- Delivered CKP Version 1.0, the first of its kind in Hadoop Big Data from both the Confidential and Walmart accounts.
- Worked on automating import and export jobs into HDFS and Hive using Sqoop from relational databases such as Oracle and Teradata, across all channels (Walmart BM, Sams BM, Walmart.com, Sams.com, Layaway, TLE, etc.).
- Designed and developed automated scripts for the SVN project structure, build and distribution as part of the release management process.
- Analyzed and developed automated, reusable scripts to resolve the critical issue of validating results.
- Created UDFs and UDAFs (see the sketch after this list), and worked on automating import and export jobs into HDFS and Hive using Sqoop from relational databases such as Oracle and Teradata.
- Created MapReduce programs for handling large datasets.
- Knowledge in performance troubleshooting and tuning Hadoop clusters in Cloudera.
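A minimal sketch of a custom Hive UDF as referenced above; the original UDFs would typically be written in Java, but the same pattern is shown in Scala here to keep one language across these sketches, and the function, class and table names are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

/** Hypothetical Hive UDF: normalizes a free-text channel name
  * (e.g. " walmart.COM " -> "WALMART.COM"). Hive discovers the
  * evaluate() method by reflection, exactly as with a Java UDF.
  *
  * Usage from HiveQL (assumed jar and table names):
  *   ADD JAR hive-udfs.jar;
  *   CREATE TEMPORARY FUNCTION normalize_channel AS 'NormalizeChannel';
  *   SELECT normalize_channel(channel_nm) FROM sales_by_channel;
  */
class NormalizeChannel extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase)
  }
}
```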
Technologies: Impala, Java, HiveQL, Oozie, Sqoop, Shell scripting
Confidential
Delivery Model: Waterfall
System Administrator
Responsibilities:
- Designed and developed an interface with the Career Management System, guiding unallocated associates to maintain their learning curve and stay competent for a new project.
- Suggested, developed and implemented a process improvement on batch logs for better management of associates' issues, contributing to reduced tickets against the application.
- Designed and developed automated solution of batch monitoring with front end process killing.
- Generated various reports in PL/SQL with output in Excel/PDF/HTML formats.
- Created procedures to support complex data transformations for the data warehouse.
Technologies: PL/SQL, Java, Shell scripting, SABA, WDK, XML
Special Software:
- Oracle 11g database
- Weblogic 8.1.4 & 8.1.6
- Java, J2EE
