Sr. Big Data/ Hadoop Developer Resume

Philadelphia, PA

PROFESSIONAL SUMMARY:

  • 9+ years of IT experience with expertise in Hadoop, HDFS, HBase, Hive, Sqoop, Oozie, SQL, PL/SQL, Teradata, Netezza, and SQL Server, with hands-on project experience across verticals including Banking, Financial Services, Department of Health & Education, and eSales.
  • Highly dedicated, results-oriented Hadoop Developer with 8+ years of strong end-to-end experience in Hadoop development, with varying levels of expertise across different Big Data Hadoop projects.
  • Expertise in core Hadoop and the Hadoop technology stack, including HDFS, MapReduce, Oozie, Hive, Sqoop, Pig, Flume, HBase, Spark, Storm, Kafka, and ZooKeeper.
  • Hands-on experience installing and deploying Hadoop ecosystem components such as Hadoop MapReduce, YARN, HDFS, NoSQL, HBase, Oozie, Hive, Tableau, Sqoop, Pig, ZooKeeper, and Flume.
  • Well versed in installation, configuration, supporting and managing of Big Data and underlying infrastructure of Hadoop Cluster.
  • Experienced in implementing complex algorithms on semi-structured and unstructured data using MapReduce programs.
  • Experience across the Big Data Hadoop ecosystem in ingestion, storage, querying, processing, and analysis of big data.
  • Explored Spark, Kafka, and Storm along with other open source projects to create a POC.
  • Hands-on experience developing MapReduce programs using Apache Hadoop for analyzing big data.
  • Experience importing and exporting data between RDBMS and HDFS, Hive tables, and HBase using Sqoop.
  • Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing, and internal/external tables.
  • Experienced in migrating ETL operations using Pig transformations, operations, and UDFs.
  • Good knowledge of Python.
  • Excellent working knowledge of Spark Core, Spark SQL, and Spark Streaming.
  • Developed a fan-out workflow using Flume to ingest data from various sources such as web servers and REST APIs, using different Flume sources and an HDFS sink to land the data in Hadoop.
  • Experience importing and exporting data using Sqoop between HDFS and relational database systems (MySQL, SQL Server).
  • Actively involved in coding using Core Java and Collections API types such as Lists, Sets, and Maps.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience with different operating systems, including UNIX, Linux, and Windows.
  • Hands-on experience in web services using XML, HTML, JSON, jQuery, and Ajax.
  • Strong knowledge of Agile and Waterfall development methodologies to minimize customer impact.
  • Expertise in middle-tier design and development of various web and enterprise applications using technologies such as JSP, Servlets, Struts, Hibernate, Spring, JDBC, shell scripting, XML, AJAX, and web services.
  • Good understanding of all aspects of testing, such as Unit, Regression, Agile, White-box, and Black-box.
  • Ability to effectively manage deadlines. Self-motivated, highly organized, and able to multitask.

TECHNICAL SKILLS:

Big Data Platforms: Cloudera, Big Data, Hadoop, YARN, MapReduce, Pig, Hive, Storm, Kafka, Oozie, Impala, Ignite, Flume, and Spark

Languages: Java, C++, Python

Databases: Oracle, MySQL, SQL Server

NoSQL Databases: HBase, Cassandra, MongoDB, Accumulo

Job Scheduling Framework: AutoSys, Quartz Scheduler

Operating Systems: Linux, Unix, Windows 7, Windows 8, Windows XP, Windows Vista

Hadoop Distribution: Cloudera, Hortonworks, AWS

Web Technologies: HTML, XHTML, JavaScript

Data Modelling tools: MS Visio, Rational Rose

Work Environments: Eclipse

PROFESSIONAL EXPERIENCE:

Confidential, Philadelphia, PA

Sr. Big Data/ Hadoop Developer

Responsibilities:

  • Extracted the data from Teradata/MySQL into HDFS using Sqoop export/import.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
  • Expertise in using data organization design patterns in MapReduce to convert business data into custom formats.
  • Worked extensively on importing data using Sqoop.
  • Implemented custom joins in Spark SQL to create tables containing item records.
  • Expertise in optimizing MapReduce algorithms using Combiners, Partitioners, and Distributed Cache to deliver the best results.
  • Experienced in passing data from multiple sources to a reducer at the same time using ObjectWritable in MapReduce programs.
  • Working knowledge of RESTful APIs such as Elasticsearch.
  • Loaded log data into HDFS using Flume and Kafka.
  • Experienced with data processing and pipelining using Apache Crunch.
  • Analyzed the data by performing Hive queries and running Pig scripts. Created and worked Sqoop jobs with incremental load to populate Hive External tables. Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis.
  • Involved in writing UNIX shell scripts for the Informatica ETL tool to run the sessions.
  • Developed UDFs in Java as needed for use in Hive queries.
  • Developed Oozie workflow for scheduling and orchestrating the ETL process.
  • Implemented authentication using Kerberos and authorization using Apache Sentry.
  • Deployed an Apache Solr search engine server to speed up searches of government cultural assets.
  • Developed and implemented a migration path from multiple Play instances to a clustered Akka actor system, using Scala capped collections as an event bus.
  • Performed iterative algorithms using Apache Spark on top of Hadoop YARN.

Environment: Hadoop, HDFS, Flume, Sqoop, Spark, Pig, Hive, MapReduce, Elasticsearch, HBase, Oozie, MRUnit, Maven, Avro, Scala, Linux, SVN, MySQL, Kafka.

Confidential, Albany, NY

Sr. Big Data/ Hadoop Developer

Responsibilities:

  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing, analyzing and testing the classifier using MapReduce, Pig and Hive jobs.
  • Built real-time data pipelines by leveraging open-source tools such as Apache Kafka and Spark.
  • Worked with Kafka for the proof of concept for carrying out log processing on a distributed system.
  • Analyzed the data by performing Hive queries and running Pig scripts.
  • Helped analyze the user requirements for the Regional Office and how they relate to the existing NYSE-CON project.
  • Contributed to developing the application using PL/SQL.
  • Involved in the complete SDLC of a big data project, including requirement analysis, design, coding, testing, and production.
  • Developed scripts and AutoSys jobs to schedule an Oozie bundle (a group of coordinators) consisting of various Hadoop programs. Worked with the Database Specialist and Technical Architect on the design of the application.
  • Created Hive tables with appropriate static and dynamic partitions for efficiency and worked on them using HiveQL.
  • Used Sqoop to import data from RDBMS into Hive tables.
  • Managed and reviewed Hadoop logs.
  • Responsible for moving the source code to Production.
  • Involved in gathering, documenting, and reviewing requirements with the work streams and performance teams.
  • Involved in creating Visio diagrams for the complete flow of the application.
  • Involved in the mock-up design work with the Java Architect and Analyst for the UI.
  • Responsible for moving the source code to UAT.
  • Responsible for installation of Oracle software on Windows.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Linux, XML, Cloudera, CDH3/4 Distribution, Oracle 11i, MySQL, Flume, Oozie, HBase

Confidential, Long Island, NY

Sr. Big Data/ Hadoop Developer

Responsibilities:

  • Involved in writing MapReduce jobs.
  • Used Hive for transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data in HDFS.
  • Involved in developing Hive queries and UDFs for functionality not available out of the box in Apache Hive.
  • Involved in using Sqoop for importing and exporting data into HDFS and Hive.
  • Involved in extracting user data from various data sources into Hadoop HDFS.
  • Commissioned and decommissioned nodes on the existing cluster.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Sqoop as well as system specific jobs.
  • Used Avro and Parquet in MapReduce jobs with Hadoop, Sqoop, Hive, and Impala.
  • Collected and aggregated large amounts of log data and staged it in HDFS for further analysis.
  • Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
  • Participated in evaluation and selection of new technologies to support system efficiency.
  • Participated in development and execution of system and disaster recovery processes.
  • Implemented Spark scripts using Scala and Spark SQL to load Hive tables into Spark for faster data processing.
  • Actively contributed to a POC on streaming data using Apache Kafka and Spark Streaming.
  • Involved in preparing the Proof of Concept and the Presentations to demonstrate the solution to the business users on Data Integration.
  • Worked with Agile Scrum methodologies.
  • Analyzed new opportunities for the group, including daily interaction with the team to understand the business flow and apply technology to improve time efficiency in business workflows.

Environment: Hadoop, Hive 1.2, Oozie, Spark, Kafka, SQL Developer, TOAD, Oracle, Data Point, Agile - Version One, Windows 8, Unix, Teradata SQL Assistant, Agility, SQL Server.

Confidential, New Jersey

Hadoop Developer

Responsibilities:

  • Wrote MapReduce programs and Pig scripts specifying the conditions used to separate fraudulent claims.
  • Good knowledge and understanding of the REST architectural style and its application to well-performing websites for global usage.
  • Worked on the Cloudera distribution of Hadoop.
  • Worked on optimizing the shuffle and sort phase of MapReduce jobs.
  • Experience in writing business logic using Hive UDFs to perform ad hoc queries on structured data.
  • Experience with Hive DDL and Hive Query Language (HQL).
  • Worked on dashboards that internally use Hive queries to perform analytics on structured, Avro, and JSON data.
  • Implemented a data interface to retrieve customer information via a REST API, pre-process the data using MapReduce, and store it in HDFS.
  • Experience with Sqoop to import and export data between RDBMS and HDFS.
  • Configured Oozie workflows to automate data flow, preprocessing, and cleaning tasks using Hadoop actions.
  • Implemented GenericWritable to incorporate multiple data sources into the reducer for recommendation-based reports using MapReduce programs.
  • Implemented MapReduce programs to find the top failure locations of ATMs using different tracking devices.
  • Used Cassandra CQL with Java APIs to retrieve data from Cassandra tables.
  • Implemented optimized joins to perform analysis on different data sets using MapReduce programs.
  • Experienced in handling Avro and JSON data in Hive using Hive SerDes.

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Cassandra, Eclipse

Confidential, NY

Hadoop Developer/ Admin

Responsibilities:

  • Involved in requirement analysis, design, coding and implementation.
  • Responsible for building scalable distributed data solutions using Hadoop Cloudera.
  • Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud; performed export and import of data to and from S3.
  • Developed solutions to load data into HDFS and analyzed it using MapReduce, Pig, and Hive to produce summary results from Hadoop for downstream systems.
  • Used Sqoop to export data from the Hadoop Distributed File System (HDFS) to RDBMS.
  • Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
  • Participated in Solr schema work and ingested data into Solr for indexing.
  • Extensive experience in designing and implementing Data Flow pipeline from RDBMS to Hadoop.
  • Worked on S3 buckets on AWS to store CloudFormation templates.
  • Worked on AWS to create EC2 instances.
  • Worked on various performance optimizations such as using distributed cache for small datasets, partitioning, bucketing, and map-side joins.
  • Involved in creating Hive tables and running HQL queries on them for data validation.
  • Responsible for installation and configuration of Hive, Pig, HBase, and Sqoop on the Hadoop cluster.
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Used ZooKeeper to manage coordination among the clusters.
  • Worked with Impala to pull the data from Hive tables.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
  • Hands-on experience with NoSQL databases such as MongoDB and Cassandra for a POC (proof of concept) storing URLs, images, products, and supplements information in real time.

Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, Oozie, MongoDB, Java 6/7, Oracle 10g, Subversion, Toad, UNIX Shell Scripting, SOAP, REST services, Agile Methodology, JIRA, AutoSys.

Confidential, GA

Java Developer/ Hadoop Developer

Responsibilities:

  • Participated in use case meetings to understand and analyze the requirements; coded as per the prototype.
  • Developed various UI (User Interface) components using Struts (MVC), JSP, and HTML.
  • Developed controllers, created JSPs, and configured them in the struts-config.xml and web.xml files.
  • Implemented MVC architecture and the Business Delegate, Service Locator, Session Facade, Data Access Object, and Singleton patterns.
  • Involved in writing all client-side validations using JavaScript and JSON.
  • Involved in the complete development, testing and maintenance process of the application.
  • Used Hibernate as the ORM tool to communicate with the database.
  • Designed and created a web-based test client using Struts at the client’s request, which is used to test the different parts of the application.
  • Involved in writing the test cases for the application using JUnit.
  • Used JSP, HTML, and CSS extensively to develop the presentation layer and make it more user-friendly.
  • Involved in different Testing phases like Unit Test, Integration Test and Regression Test.
  • Involved in the development process, with knowledge of tracking tools such as JIRA.
  • Involved in RESTful web services with jQuery using the Jackson API.
  • Involved in web services (SOAP, RESTful) testing using the Infor EAM Web Service toolkit.

Environment: Core Java, JSP, Servlets, Struts, EJB 2.0, Ext JS, XML, Oracle 11g, PostgreSQL, JavaScript, Web Service, SQL Server 2008 R2, Eclipse, TOAD, JIRA, SVN, Tortoise, Log4j.
