Sr. Hadoop Developer Resume

Plantation, FL

SUMMARY

  • A motivated and results-driven professional with 8+ years of experience in software development, including 3+ years of heavy exposure to Big Data/Hadoop. Actively involved in analysis and implementation of various trending technologies in Big Data ecosystems and NoSQL technologies across verticals such as Finance, Healthcare and Insurance.
  • 5 years of exposure to the full development life cycle of Java/J2EE application and web development.
  • Proficient in processing large sets of structured, semi-structured and unstructured data for data mining using optimized Map Reduce programs, PIG scripts and HIVE queries.
  • Responsible for creating complex Map Reduce programs by customizing framework at various levels.
  • Good knowledge of writing customized UDFs, UDAFs and UDTFs to extend Hive and Pig Latin functionality.
  • Experience in loading Log data and unstructured data from multiple sources to HDFS using Flume.
  • Performed real-time event processing of data from multiple servers in the organization using Apache Storm by integrating with Apache Kafka.
  • Extensive experience in Spark Streaming and implementing Spark machine learning libraries in Scala.
  • Expertise in NoSQL databases like HBase, Cassandra and MongoDB.
  • Experience with Cassandra in optimizing it for writes and pre-computing aggregations to perform various statistics.
  • Involved in installing and maintaining a 36-node MongoDB cluster with replication and sharding enabled.
  • Involved in data modelling and designing indexing model for MongoDB.
  • Performed data modeling to connect data stored in Cassandra Database to the data processing layers and wrote queries in CQL.
  • Experience in using Sqoop to import and export data between HDFS, HBase, Hive and relational database systems.
  • Extensive experience in Oozie for designing, monitoring and scheduling both time driven and data driven automated job workflows.
  • Hands-on experience with Puppet for automating Hadoop installations and for configuring and maintaining the clusters.
  • Experience in using Cloudera Manager, Apache Ambari, Ganglia and Nagios for monitoring jobs running on cluster.
  • Expert in implementing advanced procedures like text analytics and processing using the in-memory computing capabilities of Apache Spark written in Scala.
  • Experienced in Spark Streaming in order to ingest data from multiple data sources into HDFS.
  • Experience with Flume and Apache Kafka to create data pipeline to ingest browsing data into HBase/HDFS for analysis.
  • Working experience in creating, configuring and monitoring Hadoop clusters on EC2, VM, and Hortonworks Data Platform 2.1 & 2.2, CDH3, and CDH4 using Cloudera Manager.
  • Good knowledge of AWS concepts like EMR and EC2 web services, which provide fast and efficient processing of Big Data.
  • Worked on all phases of data warehouse development lifecycle, ETL design and implementation, and support of new and existing applications.
  • Extensive experience in designing and developing the enterprise applications using Java, J2EE Technologies, JavaScript, Struts, Hibernate, EJB and Spring Framework.
  • Widely used different Web/Application servers like WebLogic, WebSphere 6.x, JBoss and Tomcat for deployment of builds, server configuration and performance tuning, including troubleshooting and maintenance.
  • Extensive experience with RDBMS integration with enterprise applications, writing SQL queries, stored procedures, functions and triggers using Oracle 9i/10g/11g, IBM DB2 and MySQL.
  • Strong understanding of Agile (Scrum) and Waterfall SDLC methodologies.
  • Experience in developing web-based user interfaces using ExtJS, JavaScript, jQuery, CSS, HTML, HTML5 and XHTML.

TECHNICAL SKILLS

Big Data/Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Storm, Spark, Scala, Avro, MRUnit, Solr.

Java / J2EE Technologies: Core Java, Servlets, JSP, JDBC, XML, REST, SOAP, WSDL

Programming Languages: C, C++, Java, Scala, SQL, PL/SQL, Linux shell scripts.

NoSQL Databases: MongoDB, Cassandra, HBase

Database: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, Teradata.

Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, SOAP, WSDL

Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2.

Tools Used: Eclipse, IntelliJ, PuTTY, WinSCP

Operating System: RedHat, Windows 7/8, Windows Server 2008/2003, Mac OS.

ETL Tools: Informatica, Pentaho.

Testing: Hadoop Testing (MRUnit, Mockito), Hive Testing, Quality Center (QC)

Application/Web Servers: IBM WebSphere 5.1.2/5.0/4.0/3.5, WebLogic 5.1/7.0, JDeveloper, Apache Tomcat, JBoss.

Monitoring and Reporting tools: Ganglia, Nagios, Custom Shell scripts.

Version control: SVN, CVS, Git

PROFESSIONAL EXPERIENCE

Confidential, Plantation, FL

Sr. Hadoop Developer

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.7), Pig, Flume, HBase, Sqoop, Oozie, DB2, Teradata, Apache Spark, Apache Kafka, Scala, Storm, Solr, REST, Jersey, Linux, XML.

Responsibilities:

  • Researched and recommended a suitable technology stack for Hadoop migration considering the current enterprise architecture.
  • Involved in installation and configuration of HDFS and Hadoop MapReduce, and developed several MapReduce operations in Java for data preprocessing.
  • Involved in complete Implementation lifecycle, specialized in writing custom MapReduce, Pig and Hive programs.
  • Experience in using Spark SQL to implement custom joins to create tables containing the records of items.
  • Designed and implemented Spark-based large-scale parallel relation-learning system.
  • Experienced in bulk loading of data into HBase using MapReduce by directly creating HFiles and loading them.
  • Experienced in creating custom sources and sinks in Flume to support client data APIs.
  • Collected and aggregated huge amounts of log data from multiple sources and integrated it into HDFS using Flume.
  • Integrated data taken from multiple databases like DB2 and Teradata into the Hadoop cluster and used Hive-HBase integration for analyzing the data.
  • Involved in developing REST web services implemented in Java, using the HBase native API and Jersey to query data from HBase.
  • Used HBase as a real time data storage and analytics platform and the reports generated from HBase are used as feedback for the production system.
  • Experience in developing data pipelines using Kafka and Storm to store data into HDFS.
  • Experienced in collecting real-time data from Kafka using Spark Streaming, performing transformations and aggregations on the fly to build the common learner data model, and persisting the data into HBase.
  • Experienced in implementing POCs to migrate iterative MapReduce programs into Spark transformations using Spark and Scala.
  • Involved in installation and configuration of Hive and wrote various Hive User Defined Functions for categorization.
  • Used Hive as the core database for the data warehouse where it is used to track and analyze all the data usage across our network.
  • Experienced with Solr for indexing and search operations and configuring Solr by modifying schema.xml file as per our requirements.
  • Experienced in using Oozie to coordinate and automate the flow of jobs in the cluster accordingly.
  • Experienced in managing and reviewing Hadoop log files.
  • Worked on different file formats like text files, Sequence Files, Avro and Record Columnar (RC) files.
  • Explored and used Hadoop ecosystem features and architectures.

Confidential, Austin, TX

Hadoop Developer

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Java, Pig, MongoDB, Cassandra, JSON, XML, HBase, Sqoop, Oozie, Shell Scripts, Apache Crunch, Apache Spark, Apache Storm, MRUnit, Mockito, Netcat, HTTP, Linux.

Responsibilities:

  • Implemented performance optimizations using distributed cache, partitioning, bucketing and map-side joins in Hive.
  • Experience in automating processing of Hive data using UNIX shell scripts.
  • Implemented Hive for data mining, internal log analysis and ad hoc queries.
  • Implemented Pig Latin scripts to describe structural and semantic conversions between data contexts.
  • Experience in using Pig loader for parsing JSON and XML files and used Regex in Pig to extract useful information from Pig Relations.
  • Experienced in using Apache Flume for log file aggregation and processing.
  • Experienced in designing and configuring Flume agents to collect data from the network proxy servers and store it in HDFS.
  • Used Flume to extract files from Netcat and HTTP sources and place them in HDFS and process them.
  • Experience in developing applications using find queries and aggregations in MongoDB.
  • Experience in using the MongoDB MapReduce connector to run MapReduce programs on data residing in MongoDB for some user stories.
  • Expertise in developing MapReduce programs implementing various data processing logics by customizing the framework at various levels.
  • Experience configuring spouts and bolts in various Storm topologies and validating data in the bolts.
  • Integrated bulk data into Cassandra file system using MapReduce programs.
  • Worked on connecting to a 5-node Cassandra cluster from Java using the DataStax Java Driver and developed a web application used for searching.
  • Involved in configuring a 36-node MongoDB cluster with data replication and hash-based sharding.
  • Expert in MRUnit and Mockito for implementing test class for MapReduce programs.
  • Involved in Hive testing using custom written shell scripts.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slots configuration.
  • Experienced in using Apache Crunch for data cleaning and processing.

Confidential, Leesburg, GA

Hadoop Developer

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Java, Pig, Flume, HBase, Sqoop, Oozie, Shell Scripts, Cron, Linux, XML.

Responsibilities:

  • As a Big Data Developer, implemented solutions for ingesting data from various sources and processing the Data-at-Rest by utilizing Big Data technologies such as Hadoop, MapReduce Frameworks, HBase, Hive, Oozie, Flume and Sqoop.
  • Real time experience in designing and implementing Big Data processing to enable real-time analytics, event detection and notification for Data-in-Motion.
  • Involved in developing various MapReduce programs in order to implement various transformations and filtrations according to various user stories.
  • Experienced in developing applications to process, cleanse, and report on data utilizing various analytics platforms like Hadoop and various NoSQL databases.
  • Experienced in processing server, application and user log files using Hive in combination with Pig.
  • Experience in using Pig to sort and prep our data before it is handed off to our Java Map/Reduce jobs.
  • Implemented Hive queries to pre-process and analyze streaming data by granting read only structure.
  • Experience in using Oozie workflows to organize/arrange many Hive queries.
  • Responsible for migrating ETL scripts into the Hadoop framework by using Hive, Pig and MapReduce programs wherever necessary.
  • Experience in automating migrated ETL applications using Oozie workflows and error handling using shell scripts.
  • Experienced in collecting and aggregating large amounts of log data using Apache Flume and using HDFS as a staging layer for further analysis.
  • Involved in developing shell scripts and automated them using the cron job scheduler.
  • Involved in commissioning and decommissioning Hadoop nodes, monitoring and troubleshooting the cluster, and managing and reviewing data backups and Hadoop log files.
  • Experience in developing scripts for Sqoop ingestion and Hadoop Copy Merge.
  • Reviewed the HDFS usage and system design for future scalability and fault-tolerance.

Confidential, Mayfield, OH

Sr. Java/J2EE Developer

Environment: J2EE, Spring framework, Spring MVC, Hibernate, JSP, Servlets, JDBC, AJAX, jQuery, JavaScript, Oracle 10g, IBM RAD, Tomcat 7, CVS, JUnit.

Responsibilities:

  • Played key role in design and development of enterprise application using J2EE technologies and Spring framework using Service Oriented Architecture (SOA).
  • Implemented Spring Beans using IOC and Transaction management features to handle the transactions and business logic.
  • Participated in Production deployment and change management process.
  • Worked in all the modules of the application, which involved front-end presentation logic developed using Tiles, JSP, JSTL and JavaScript, business objects developed using POJOs, and a data access layer using the Hibernate framework.
  • Created and injected Spring services, Spring controllers and DAOs to achieve dependency injection and to wire objects of business classes.
  • Used Apache Axis as the Web Service framework for creating and deploying Web Service clients using SOAP and WSDL.
  • Developed various generic JavaScript functions used for validations.
  • Used AJAX extensively to implement front end /user interface features in the application.
  • Designed and developed different PL/SQL blocks and stored procedures in the DB2 database.
  • Focused on Test Driven Development; thereby creating detailed JUnit tests for every single piece of functionality before actually writing the functionality.
  • Developed and implemented several test cases using the JUnit framework.
  • Used Ant scripts to build and deploy the applications in Tomcat Server.
  • Used Log4j utility to generate run-time logs.
  • CVS was used for project management and version management.
  • Involved in troubleshooting technical issues, conducting code reviews, and enforcing best practices.

Confidential, Pittsburgh, PA

Sr. Java/J2EE Developer

Environment: J2EE, EJB, Struts framework, JSP, Servlets, REST, JDBC, AJAX, jQuery, JavaScript, PL/SQL, Oracle 10g, WebSphere, Ant, JUnit.

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development and unit testing.
  • Involved in developing UI pages using JSP, JavaScript, HTML/DHTML and Ajax.
  • Developed Dispatch Actions, Action Forms and Custom tag libs in Struts framework.
  • Loaded external data using a RESTful web service and managed the XML data.
  • Extensively applied various design patterns such as MVC-2, Front Controller, Factory, Singleton, Business Delegate, Session Façade, Service Locator, DAO etc. throughout the application for a clear and manageable distribution of roles.
  • Developed different interfaces using EJB Session Beans (Stateless) and Message Driven Beans for both synchronous and asynchronous communication.
  • Developed different Components and Adapters of the integration framework using Stateless Session EJB.
  • Actively used the configuration management tool CVS to manage the code.
  • Involved in initial design and in creating use case diagrams, sequence diagrams and class diagrams using the Rational Rose tool.
  • Set up the WebSphere application server and used the Ant tool to build the application and deploy it in WebSphere.
  • Wrote PL/SQL queries to access data from Oracle database.

Confidential

Java/J2EE Developer

Environment: J2EE, EJB, Struts framework, Hibernate, JSP, Servlets, REST, JDBC, AJAX, jQuery, JavaScript, XML, SAX, DOM, PL/SQL, Oracle 10g, WebLogic, Maven, JUnit.

Responsibilities:

  • Applied MVC Design Pattern with JSP as view, Struts Action Servlets as controller and EJB session beans as model, deployed it on WebLogic server.
  • Developed the business logic in the Java back end using Struts Framework.
  • Used Hibernate to fetch data from Oracle database.
  • Used WSAD/Eclipse development environment for building Enterprise JavaBeans.
  • Worked in Linux environment to run batch jobs and used Maven to build the application.
  • Used JavaScript for Client side validation.
  • Parsed the data which is in XML format using SAX and DOM parsers.
  • Created UML diagrams (use case, class, sequence, and collaboration) based on the business requirements.
  • Implemented the back-end business logic involved in registering new users and managing user-related functionalities.
  • Used CVS for version control.
  • Used Log4j and JUnit to log and unit test the functionality.
