Sr. Big Data/ Hadoop Developer Resume Philadelphia, PA - Hire IT People

PROFESSIONAL SUMMARY:

Having 9+ years of IT experience and expertise in Hadoop, HDFS, HBase, Hive, Sqoop, Oozie, SQL, PL/SQL, Teradata, Netezza and SQL Server, with hands-on project experience in various vertical applications, including Banking, Financial Services, Department of Health & Education, and eSales.
Highly dedicated and results-oriented Hadoop Developer with 8+ years of strong end-to-end experience in Hadoop development, with varying levels of expertise across different Big Data Hadoop projects.
Expertise in core Hadoop and the Hadoop technology stack, which includes HDFS, MapReduce, Oozie, Hive, Sqoop, Pig, Flume, HBase, Spark, Storm, Kafka and ZooKeeper.
Hands-on experience in installing and deploying Hadoop ecosystem components like Hadoop MapReduce, YARN, HDFS, NoSQL, HBase, Oozie, Hive, Tableau, Sqoop, Pig, ZooKeeper and Flume.
Well versed in installation, configuration, supporting and managing of Big Data and underlying infrastructure of Hadoop Cluster.
Experienced in implementing complex algorithms on semi-structured and unstructured data using MapReduce programs.
Experience with the Big Data Hadoop ecosystem in ingestion, storage, querying, processing and analysis of big data.
Explored Spark, Kafka, and Storm along with other open source projects to create a POC.
Hands-on experience in developing MapReduce programs using Apache Hadoop for analyzing Big Data.
Experience in importing and exporting data from RDBMS to HDFS, Hive tables, HBase by using Sqoop.
Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing and internal/external tables.
Experienced in migrating ETL operations using Pig transformations, operations and UDFs.
Good knowledge of Python.
Excellent Working Knowledge in Spark Core, Spark SQL, Spark Streaming.
Developed a fan-out workflow using Flume for ingesting data from various data sources like web servers and REST APIs by using different sources, and ingested the data into Hadoop with an HDFS sink.
Experience in importing and exporting data using Sqoop from HDFS to relational database systems (MySQL, SQL Server) and vice versa.
Actively involved in coding using Core Java and collection APIs such as Lists, Sets and Maps.
Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
Experience on different operating systems like UNIX, Linux and Windows.
Hands-on experience in Web Services using XML, HTML, JSON, jQuery and Ajax.
Strong knowledge of Agile and waterfall development methodologies to minimize customer impact.
Expertise in middle-tier design and development of various web and enterprise applications using technologies like JSP, Servlets, Struts, Hibernate, Spring, JDBC, shell scripts, XML, AJAX, and Web Services.
Good understanding of all aspects of Testing such as Unit, Regression, Agile, White-box, Black-box.
Ability to effectively manage deadlines. Self-motivated, highly organized and able to multitask.

TECHNICAL SKILLS:

Big Data Platforms: Cloudera, Big Data, Hadoop, YARN, MapReduce, Pig, Hive, Storm, Kafka, Oozie, Impala, Ignite, Flume and Spark

Languages: Java, C++, Python

Databases: Oracle, MySQL, SQL Server

NoSQL Databases: HBase, Cassandra, MongoDB, Accumulo

Job Scheduling Framework: AutoSys, Quartz Scheduler

Operating Systems: Linux, Unix, Windows 7, Windows 8, Windows XP, Windows Vista

Hadoop Distribution: Cloudera, Hortonworks, AWS

Web Technologies: HTML, XHTML, JavaScript

Data Modelling tools: MS Visio, Rational Rose

Work Environments: Eclipse

PROFESSIONAL EXPERIENCE:

Confidential, Philadelphia, PA

Sr. Big Data/ Hadoop Developer

Responsibilities:

Extracted the data from Teradata/MySQL into HDFS using Sqoop export/import.
Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
Expertise in using data organization design patterns in MapReduce to convert business data into custom formats.
Worked extensively on importing data using Sqoop.
Implemented custom joins using Spark SQL to create tables holding item records.
Expertise in optimization of MapReduce algorithms using Combiners, Partitioners and Distributed Cache to deliver best results.
Experienced with handling data from different sources at a time in the reducer using ObjectWritable in MapReduce programs.
Experienced with RESTful APIs such as Elasticsearch.
Loaded log data into HDFS using Flume and Kafka.
Experienced with data processing and pipelining using Apache Crunch.
Analyzed the data by performing Hive queries and running Pig scripts.
Created and worked on Sqoop jobs with incremental load to populate Hive external tables.
Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis.
Involved in writing UNIX shell scripts for the Informatica ETL tool to run the sessions.
Developed UDFs in Java as and when necessary to use in Hive queries.
Developed Oozie workflow for scheduling and orchestrating the ETL process.
Implemented authentication using Kerberos and authorization using Apache Sentry.
Deployed an Apache Solr search engine server to help speed up the search of the government cultural asset.
Developed and implemented a migration path from multiple Play instances to a clustered Akka actor system, using Scala capped collections as an event bus.
Performed iterative algorithms using Apache Spark on top of Hadoop YARN.

Environment: Hadoop, HDFS, Flume, Sqoop, Spark, Pig, Hive, MapReduce, Elasticsearch, HBase, Oozie, MRUnit, Maven, Avro, Scala, Linux, SVN, MySQL, Kafka.

Confidential, Albany, NY

Sr. Big Data/ Hadoop Developer

Responsibilities:

Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing, analyzing and testing the classifier using MapReduce, Pig and Hive jobs.
Built real-time data pipelines by leveraging open-source tools such as Apache Kafka and Spark.
Worked with Kafka for the proof of concept for carrying out log processing on a distributed system.
Analyzed the data by performing Hive queries and running Pig scripts.
Played a role in understanding the user requirements for the Regional Office and how they relate to the existing NYSE-CON project.
Played a role in developing the application using PL/SQL.
Involved in the complete SDLC of the big data project, including requirement analysis, design, coding, testing and production.
Developed scripts and AutoSys jobs to schedule a bundle (group of coordinators), which consists of various Hadoop programs, using Oozie. Worked with the Database Specialist and Technical Architect on the design work of the application.
Created Hive tables defined with appropriate static and dynamic partitions for efficiency, and worked on them using HiveQL.
Used Sqoop to import data from RDBMS into Hive tables.
Managed and reviewed Hadoop logs.
Responsible for moving the source code to Production.
Involved in gathering, documenting and reviewing requirements from the work streams and performance teams.
Involved in creating Visio diagrams for the complete flow of this application.
Involved in the mock-up design work with the Java Architect and Analyst for the UI.
Responsible for moving the source code to UAT.
Responsible for installation of Oracle software on Windows.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Linux, XML, Cloudera, CDH3/4 Distribution, Oracle 11i, MySQL, Flume, Oozie, HBase

Confidential, Long Island, NY

Sr. Big Data/ Hadoop Developer

Responsibilities:

Involved in writing MapReduce jobs.
Used Hive to do transformations, event joins, bot traffic filtering and some pre-aggregations before storing the data onto HDFS.
Involved in developing Hive queries and UDFs for functionality that is not available out of the box in Apache Hive.
Involved in using Sqoop for importing and exporting data into HDFS and Hive.
Involved in extracting user data from various data sources into Hadoop HDFS.
Implemented Commissioning and Decommissioning of new nodes to existing cluster.
Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Sqoop as well as system specific jobs.
Used Avro and Parquet in MapReduce jobs with Hadoop, Sqoop, Hive and Impala.
Collected and aggregated large amounts of log data and staged it in HDFS for further analysis.
Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
Participated in evaluation and selection of new technologies to support system efficiency.
Participated in development and execution of system and disaster recovery processes.
Implemented Spark scripts using Scala and Spark SQL to load Hive tables into Spark for faster data processing.
Actively contributed to developing a POC on streaming data using Apache Kafka and Spark Streaming.
Involved in preparing the Proof of Concept and the Presentations to demonstrate the solution to the business users on Data Integration.
Worked with Agile Scrum methodologies.
Analyzed new opportunities for the group, including daily interaction with the team to understand the business flow and analyze how technology could improve time efficiency in the business workflow.

Environment: Hadoop, Hive 1.2, Oozie, Spark, Kafka, SQL Developer, TOAD, Oracle, Data Point, Agile - Version One, Windows 8, Unix, Teradata SQL Assistant, Agility, SQL Server.

Confidential, New Jersey

Hadoop Developer

Responsibilities:

Wrote MapReduce programs and Pig scripts to specify the conditions for separating fraudulent claims.
Good knowledge and understanding of the REST architectural style and its application to well-performing web sites for global usage.
Worked on the Cloudera distribution of Hadoop.
Worked on optimizing the shuffle and sort phase of MapReduce.
Experience in writing business logic using Hive UDFs to perform ad hoc queries on structured data.
Experience with Hive DDL and Hive Query Language (HQL).
Worked on dashboards that internally use Hive queries to perform analytics on structured, Avro and JSON data.
Implemented the data interface to get customer information using a REST API, pre-processed the data using MapReduce and stored it in HDFS.
Experience with Sqoop to import/export data between RDBMS and HDFS.
Configured Oozie workflows to automate data flow, preprocessing and cleaning tasks using Hadoop actions.
Implemented GenericWritable to incorporate multiple data sources into the reducer to implement recommendation-based reports using MapReduce programs.
Implemented MapReduce programs to find the top failure locations of ATMs using different tracking devices.
Used Cassandra CQL with Java APIs to retrieve data from Cassandra tables.
Implemented optimized joins to perform analysis on different data sets using MapReduce programs.
Experienced in handling Avro and JSON data in Hive using Hive SerDes.

Environment: Hadoop, MapReduce, Yarn, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Cassandra, Eclipse

Confidential, NY

Hadoop Developer/ Admin

Responsibilities:

Involved in requirement analysis, design, coding and implementation.
Responsible for building scalable distributed data solutions using Hadoop Cloudera.
Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud; exported and imported data to and from S3.
Processed data into HDFS by developing solutions, and analyzed the data using MapReduce, Pig and Hive to produce summary results from Hadoop for downstream systems.
Used Sqoop to export the data from Hadoop Distributed File System (HDFS) to RDBMS.
Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
Participated in Solr schema design and ingested data into Solr for indexing.
Extensive experience in designing and implementing Data Flow pipeline from RDBMS to Hadoop.
Worked on S3 buckets on AWS to store CloudFormation templates.
Worked on AWS to create EC2 instances.
Worked on various performance optimizations like using distributed cache for small datasets, partitioning, bucketing and map-side joins.
Involved in creating Hive tables and applying HQL to them for data validation.
Responsible for installation and configuration of Hive, Pig, HBase and Sqoop on the Hadoop cluster.
Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
Used ZooKeeper to manage coordination among the clusters.
Worked with Impala to pull the data from Hive tables.
Installed Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
Hands-on experience with NoSQL databases like MongoDB and Cassandra for a POC (proof of concept) on storing URLs, images, products and supplements information in real time.

Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, Oozie, MongoDB, Java 6/7, Oracle 10g, Subversion, Toad, UNIX Shell Scripting, SOAP, REST services, Agile Methodology, JIRA, AutoSys.

Confidential, GA

Java Developer/ Hadoop Developer

Responsibilities:

Participated in use case meetings to understand and analyze the requirements; coded as per the prototype.
Developed various UI (User Interface) components using Struts (MVC), JSP, and HTML.
Developed Controllers, created JSPs and configured them in the struts-config.xml and web.xml files.
Implemented MVC architecture and the Business Delegate, Service Locator, Session Facade, Data Access Object and Singleton patterns.
Involved in writing all client-side validations using JavaScript and JSON.
Involved in the complete development, testing and maintenance process of the application.
Used Hibernate as the ORM tool to communicate with the database.
Designed and created a web-based test client using Struts upon the client's request, which is used to test the different parts of the application.
Involved in writing the test cases for the application using JUnit.
Used JSP, HTML, and CSS extensively to develop the presentation layer and make it more user friendly.
Involved in different Testing phases like Unit Test, Integration Test and Regression Test.
Involved in the development process, with knowledge of tracker tools like JIRA.
Involved in RESTful web services with jQuery using the Jackson API.
Involved in web services (SOAP, RESTful) testing using the Infor EAM Web Service toolkit.

Environment: Core Java, JSP, Servlets, Struts, EJB2.0, Ext JS, XML, Oracle 11g, PostgreSQL, JavaScript, Web Service, SQL Server 2008R2, Eclipse, TOAD, JIRA, SVN, Tortoise, Log4j.
