Sr. Hadoop Developer Resume, Plantation, FL

SUMMARY

A motivated, results-driven professional with 8+ years of experience in software development, including 3+ years of heavy exposure to Big Data/Hadoop. Actively involved in the analysis and implementation of various trending technologies in Big Data ecosystems and NoSQL technologies across verticals such as Finance, Healthcare and Insurance.
5 years of exposure to the full development life cycle of Java/J2EE application and web development.
Proficient in processing large sets of structured, semi-structured and unstructured data for data mining using optimized MapReduce programs, Pig scripts and Hive queries.
Responsible for creating complex MapReduce programs by customizing the framework at various levels.
Good knowledge of writing customized UDFs, UDAFs and UDTFs to extend Hive and Pig Latin functionality.
Experience in loading log data and unstructured data from multiple sources into HDFS using Flume.
Performed real-time event processing of data from multiple servers in the organization using Apache Storm integrated with Apache Kafka.
Extensive experience in Spark Streaming and in implementing Spark machine learning libraries in Scala.
Expertise in NoSQL databases like HBase, Cassandra and MongoDB.
Experience in optimizing Cassandra for writes and pre-computing aggregations for various statistics.
Involved in installing and maintaining a 36-node MongoDB cluster with replication and sharding enabled.
Involved in data modeling and designing the indexing model for MongoDB.
Performed data modeling to connect data stored in the Cassandra database to the data processing layers and wrote queries in CQL.
Experience in using Sqoop for importing and exporting data between HDFS, HBase, Hive and relational database systems.
Extensive experience in Oozie for designing, monitoring and scheduling both time-driven and data-driven automated job workflows.
Hands-on experience with Puppet for automating Hadoop installations and for configuring and maintaining clusters.
Experience in using Cloudera Manager, Apache Ambari, Ganglia and Nagios for monitoring jobs running on the cluster.
Expert in implementing advanced procedures like text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
Experienced in using Spark Streaming to ingest data from multiple data sources into HDFS.
Experience with Flume and Apache Kafka in creating data pipelines to ingest browsing data into HBase/HDFS for analysis.
Working experience in creating, configuring and monitoring Hadoop clusters on EC2, VMs, and Hortonworks Data Platform 2.1 & 2.2, CDH3, and CDH4 using Cloudera Manager.
Good knowledge of Amazon AWS concepts like the EMR and EC2 web services, which provide fast and efficient processing of Big Data.
Worked on all phases of the data warehouse development lifecycle, ETL design and implementation, and support of new and existing applications.
Extensive experience in designing and developing enterprise applications using Java, J2EE technologies, JavaScript, Struts, Hibernate, EJB and the Spring Framework.
Widely used different Web/Application servers like WebLogic, WebSphere 6.x, JBoss and Tomcat for deployment of builds, server configuration and performance tuning, including troubleshooting and maintenance.
Extensive experience with RDBMS integration with enterprise applications, writing SQL queries, stored procedures, functions and triggers using Oracle 9i/10g/11g, IBM DB2 and MySQL.
Strong understanding of Agile (Scrum) and Waterfall SDLC methodologies.
Experience in developing web-based user interfaces using ExtJS, JavaScript, jQuery, CSS, HTML, HTML5 and XHTML.

TECHNICAL SKILLS

Big Data/Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Storm, Spark, Scala, Avro, MRUnit, Solr.

Java / J2EE Technologies: Core Java, Servlets, JSP, JDBC, XML, REST, SOAP, WSDL

Programming Languages: C, C++, Java, Scala, SQL, PL/SQL, Linux shell scripts.

NoSQL Databases: MongoDB, Cassandra, HBase

Database: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, Teradata.

Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, SOAP, WSDL

Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2.

Tools Used: Eclipse, IntelliJ, PuTTY, WinSCP

Operating System: RedHat, Windows 7/8, Windows Server 2008/2003, Mac OS.

ETL Tools: Informatica, Pentaho.

Testing: Hadoop Testing (MRUnit, Mockito), Hive Testing, Quality Center (QC)

Application/Web Servers: IBM WebSphere 5.1.2/5.0/4.0/3.5, WebLogic 5.1/7.0, JDeveloper, Apache Tomcat, JBoss.

Monitoring and Reporting tools: Ganglia, Nagios, Custom Shell scripts.

Version control: SVN, CVS, Git

PROFESSIONAL EXPERIENCE

Confidential, Plantation, FL

Sr. Hadoop Developer

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.7), Pig, Flume, HBase, Sqoop, Oozie, DB2, Teradata, Apache Spark Environment, Apache Kafka, Scala, Storm, Solr, REST, Jersey, Linux, XML.

Responsibilities:

Researched and recommended a suitable technology stack for Hadoop migration, considering the current enterprise architecture.
Involved in installation and configuration of HDFS and Hadoop MapReduce, and developed several MapReduce operations in Java for data preprocessing.
Involved in the complete implementation lifecycle; specialized in writing custom MapReduce, Pig and Hive programs.
Experience in using Spark SQL to implement custom joins to create tables containing the records of items.
Designed and implemented a Spark-based, large-scale parallel relation-learning system.
Experienced in bulk loading data into HBase using MapReduce by directly creating HFiles and loading them.
Experienced in creating custom sources and sinks in Flume to support client data APIs.
Collected and aggregated huge amounts of log data from multiple sources and integrated it into HDFS using Flume.
Integrated data from multiple databases like DB2 and Teradata into the Hadoop cluster and used Hive-HBase integration for analyzing the data.
Involved in developing REST web services implemented in Java, using the HBase native API and Jersey to query data from HBase.
Used HBase as a real-time data storage and analytics platform; the reports generated from HBase were used as feedback for the production system.
Experience in developing data pipelines using Kafka and Storm to store data in HDFS.
Experienced in collecting real-time data from Kafka using Spark Streaming, performing transformations and aggregations on the fly to build the common learner data model, and persisting the data into HBase.
Experienced in implementing POCs to migrate iterative MapReduce programs into Spark transformations using Spark and Scala.
Involved in installation and configuration of Hive and wrote various Hive User Defined Functions for categorization.
Used Hive as the core database for the data warehouse, where it is used to track and analyze all data usage across our network.
Experienced with Solr for indexing and search operations and with configuring Solr by modifying the schema.xml file as per our requirements.
Experienced in using Oozie to coordinate and automate the flow of jobs in the cluster.
Experienced in managing and reviewing Hadoop log files.
Worked on different file formats like text files, Sequence Files, Avro and Record Columnar (RC) files.
Explored and used Hadoop ecosystem features and architectures.

Confidential, Austin, TX

Hadoop Developer

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Java, Pig, MongoDB, Cassandra, JSON, XML, HBase, Sqoop, Oozie, Shell Scripts, Apache Crunch, Apache Spark Environment, Apache Storm, MRUnit, Mockito, Netcat, HTTP, Linux.

Responsibilities:

Implemented performance optimizations using distributed cache, partitioning, bucketing and map-side joins in Hive.
Experience in automating tasks on Hive data using UNIX shell scripts.
Implemented Hive for data mining, internal log analysis and ad hoc queries.
Implemented Pig Latin scripts to describe structural and semantic conversions between data contexts.
Experience in using Pig loaders for parsing JSON and XML files and in using regular expressions in Pig to extract useful information from Pig relations.
Experienced in using Apache Flume for log file aggregation and processing.
Experienced in designing and configuring Flume agents to collect data from the network proxy servers and store it in HDFS.
Used Flume to extract files from Netcat and HTTP sources, place them in HDFS and process them.
Experience in developing applications using find queries and aggregations in MongoDB.
Experience in using the MongoDB MapReduce connector to run MapReduce programs on the data residing in MongoDB for some user stories.
Expertise in developing MapReduce programs implementing various data processing logic by customizing the framework at various levels.
Experience configuring spouts and bolts in various Storm topologies and validating data in the bolts.
Integrated bulk data into the Cassandra file system using MapReduce programs.
Worked on connecting to a 5-node Cassandra cluster from Java using the DataStax Java Driver and developed a web application used for searching.
Involved in configuring a 36-node MongoDB cluster with data replication and hash-based sharding.
Expert in using MRUnit and Mockito to implement test classes for MapReduce programs.
Involved in Hive testing using custom-written shell scripts.
Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slots configuration.
Experienced in using Apache Crunch for data cleaning and processing.

Confidential, Leesburg, GA

Hadoop Developer

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Java, Pig, Flume, HBase, Sqoop, Oozie, Shell Scripts, Cron, Linux, XML.

Responsibilities:

As a Big Data Developer, implemented solutions for ingesting data from various sources and processing Data-at-Rest using Big Data technologies such as Hadoop, MapReduce frameworks, HBase, Hive, Oozie, Flume and Sqoop.
Practical experience in designing and implementing Big Data processing to enable real-time analytics, event detection and notification for Data-in-Motion.
Involved in developing various MapReduce programs to implement transformations and filtering according to various user stories.
Experienced in developing applications to process, cleanse, and report on data using various analytics platforms like Hadoop and various NoSQL databases.
Experienced in processing server, application and user log files using Hive in combination with Pig.
Experience in using Pig to sort and prepare our data before it is handed off to our Java MapReduce jobs.
Implemented Hive queries to pre-process and analyze streaming data by granting a read-only structure.
Experience in using Oozie workflows to organize many Hive queries.
Responsible for migrating ETL scripts into the Hadoop framework using Hive, Pig and MapReduce programs wherever necessary.
Experience in automating migrated ETL applications using Oozie workflows and in error handling using shell scripts.
Experienced in collecting and aggregating large amounts of log data using Apache Flume and using HDFS as a staging layer for further analysis.
Involved in developing shell scripts and automated them using the cron job scheduler.
Involved in commissioning and decommissioning Hadoop nodes, monitoring and troubleshooting the cluster, and managing and reviewing data backups and Hadoop log files.
Experience in developing scripts for Sqoop ingestion and Hadoop Copy Merge.
Reviewed the HDFS usage and system design for future scalability and fault-tolerance.

Confidential, Mayfield, OH

Sr. Java/J2EE Developer

Environment: J2EE, Spring framework, Spring MVC, Hibernate, JSP, Servlets, JDBC, AJAX, jQuery, JavaScript, Oracle 10g, IBM RAD, Tomcat 7, CVS, JUnit.

Responsibilities:

Played a key role in the design and development of an enterprise application using J2EE technologies and the Spring framework with Service Oriented Architecture (SOA).
Implemented Spring Beans using IoC and transaction management features to handle the transactions and business logic.
Participated in production deployment and the change management process.
Worked in all the modules of the application, which involved front-end presentation logic developed using Tiles, JSP, JSTL and JavaScript, business objects developed using POJOs, and a data access layer using the Hibernate framework.
Created and injected Spring services, Spring controllers and DAOs to achieve dependency injection and to wire objects of business classes.
Used Apache Axis as the web service framework for creating and deploying web service clients using SOAP and WSDL.
Developed various generic JavaScript functions used for validations.
Used AJAX extensively to implement front-end/user interface features in the application.
Designed and developed different PL/SQL blocks and stored procedures in the DB2 database.
Focused on Test Driven Development, creating detailed JUnit tests for every single piece of functionality before actually writing the functionality.
Developed and implemented several test cases using the JUnit framework.
Used Ant scripts to build and deploy the applications in Tomcat Server.
Used the Log4j utility to generate run-time logs.
CVS was used for project management and version management.
Involved in troubleshooting technical issues, conducting code reviews, and enforcing best practices.

Confidential, Pittsburgh, PA

Sr. Java/J2EE Developer

Environment: J2EE, EJB, Struts framework, JSP, Servlets, REST, JDBC, AJAX, jQuery, JavaScript, PL/SQL, Oracle 10g, WebSphere, Ant, JUnit.

Responsibilities:

Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development and unit testing.
Involved in developing UI pages using JSP, JavaScript, HTML/DHTML and Ajax.
Developed Dispatch Actions, Action Forms and custom tag libs in the Struts framework.
Loaded external data using a RESTful web service and managed the XML data.
Extensively applied various design patterns such as MVC-2, Front Controller, Factory, Singleton, Business Delegate, Session Façade, Service Locator, DAO etc. throughout the application for a clear and manageable distribution of roles.
Developed different interfaces using EJB Session Beans (Stateless) and Message Driven Beans for both synchronous and asynchronous communication.
Developed different components and adapters of the integration framework using Stateless Session EJB.
Actively used the configuration management tool CVS to manage the code.
Involved in initial design and in creating use case diagrams, sequence diagrams and class diagrams using the Rational Rose tool.
Set up application servers like WebSphere and used the Ant tool to build the application and deploy it to WebSphere.
Wrote PL/SQL queries to access data from the Oracle database.

Confidential

Java/J2EE Developer

Environment: J2EE, EJB, Struts framework, Hibernate, JSP, Servlets, REST, JDBC, AJAX, jQuery, JavaScript, XML, SAX, DOM, PL/SQL, Oracle 10g, WebLogic, Maven, JUnit.

Responsibilities:

Applied the MVC design pattern with JSP as the view, Struts Action Servlets as the controller and EJB session beans as the model, and deployed it on WebLogic server.
Developed the business logic in the Java back-end using the Struts Framework.
Used Hibernate to fetch data from the Oracle database.
Used the WSAD/Eclipse development environment for building Enterprise JavaBeans.
Worked in a Linux environment to run batch jobs and used Maven to build the application.
Used JavaScript for client-side validation.
Parsed XML data using SAX and DOM parsers.
Created UML diagrams (use case, class, sequence, and collaboration) based on the business requirements.
Implemented the back-end business logic involved in registering new users and managing user-related functionality.
Used CVS for version control.
Used Log4j and JUnit to log and unit test the functionality.
