Sr. Hadoop Developer Resume

Boston, MA

PROFESSIONAL SUMMARY

  • 7+ years of expertise in data analytics, data modeling, data integration, object-oriented programming, and advanced analytics.
  • Around 4 years of experience working with Big Data using Hadoop ecosystem components (HDFS, MapReduce (MR version 1/YARN), Pig, Hive, HBase, Sqoop, Flume, Oozie, ZooKeeper).
  • Worked on systems programming, requirements gathering, technical documentation, and extensive design and development of Big Data solutions for enterprise application systems across multiple domains.
  • Experience in full project development including Software Design, Analysis, Coding, Development, Testing, Implementation, Maintenance and Support using Java and J2EE technologies.
  • Good knowledge of Java, J2EE, SQL & related technologies.
  • Excellent understanding of Object Oriented Programming and Core Java concepts such as multi-threading, exception handling, generics, collections, serialization and I/O.
  • In-depth understanding of Data Structures and the Design and Analysis of Algorithms.
  • Worked with data from different sources like Databases, Log files, Flat files and XML files.
  • Hands-on experience in the installation, configuration, management, and deployment of Big Data solutions and the underlying Hadoop cluster infrastructure using Cloudera and Hortonworks distributions.
  • Hands on experience in using Hive and Pig scripts to perform data analysis, data cleaning and data transformation.
  • Hands-on experience in capturing and importing data with Sqoop from existing relational databases (Oracle, MySQL, SQL Server) with the help of connectors and fast loaders.
  • Solid experience in storage, querying, processing, and analysis of Big Data using the Hadoop framework.
  • Good Experience in managing and reviewing Hadoop log files.
  • Experienced in developing MapReduce programs using the Hadoop Java API.
  • Developed simple to complex MapReduce jobs using Hive scripts and Pig Latin scripts to handle files in multiple formats (JSON, Text, XML).
  • Improved the performance of MapReduce jobs by using Combiners, Partitioners, and the Distributed Cache (see the driver sketch following this summary).
  • Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
  • Experienced in loading unstructured data into HDFS using Flume.
  • Good hands on experience in NoSQL databases such as HBase and MongoDB.
  • Experienced in developing custom UDFs for Pig and Hive using Java by extending core functionality.
  • Expertise in real-time data ingestion into HBase and Hive using Storm.
  • Hands on experience in dealing with Compression Codecs like Snappy, Gzip.
  • Proficient in using various IDEs such as Eclipse, NetBeans, Visual Studio 2013, JDeveloper, and Android Studio.
  • Experience with Enterprise Java Beans (EJB) components and demonstrated high standards of skill in J2EE and MVC frameworks.
  • Hands on experience in creating Hive Scripts, HIVE tables, UDFs, Partitioning and Bucketing.
  • Strong knowledge of Hive storage formats (SequenceFile, Avro, RCFile, ORC) and their compression options.
  • Experience in importing streaming data into HDFS using Flume sources and sinks, transforming data with Flume interceptors, and building custom Flume interceptors and serializers.
  • Proficient in Big Data ingestion tools such as Sqoop and Flume, and developed customized MapReduce jobs using Apache Hadoop.
  • Extensive experience in data modeling, database design, and development using relational databases (Oracle 10g, MS SQL Server 2003/2005) and NoSQL databases (HBase, MongoDB).
  • Excellent understanding of Hadoop architecture and the daemons of a Hadoop cluster, including the JobTracker, TaskTracker, NameNode, and DataNode, as well as the YARN daemons (ResourceManager, NodeManager, and the JobHistoryServer).
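The MapReduce performance bullet above mentions Combiners, Partitioners, and the Distributed Cache. The driver below is an illustrative sketch only, not project code; the class names, the cached stop-word file path, and the digit-based partitioning rule are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TunedJobDriver {

    // Mapper: emits (word, 1) for each token in the input line
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reducer, also reused as the Combiner for map-side pre-aggregation
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }

    // Custom Partitioner: route keys starting with a digit to reducer 0, hash the rest
    public static class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            if (Character.isDigit(key.toString().charAt(0))) return 0;
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "tuned-word-count");
        job.setJarByClass(TunedJobDriver.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);            // combiner cuts shuffle volume
        job.setReducerClass(SumReducer.class);
        job.setPartitionerClass(FirstCharPartitioner.class);
        // Distributed Cache: ship a small side file to every task node (path is hypothetical)
        job.addCacheFile(new URI("/lookup/stopwords.txt#stopwords"));
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Reusing the reducer as the combiner is valid here because the summation is associative and commutative; the cached side file would typically be read once per task in the mapper's setup() method.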

TECHNICAL EXPERIENCE

Hadoop Ecosystem: Apache Hadoop 1.x/2.x (YARN), HDFS, MapReduce, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, Cloudera QuickStart VM, Hortonworks Ambari, JSON.

Web Technologies: Core Java, J2EE, Servlets, JSP, JDBC, XML.

Modeling Languages: UML (use case, sequence, deployment, and component diagrams), Design Patterns (Core Java and J2EE).

Programming Languages: C, C++, Java, XML, Unix Shell scripting, HTML.

NoSQL Databases: HBase, MongoDB

Methodologies: Agile/Scrum, Waterfall

Databases: Oracle 11g/10g, MS-SQL Server 2003/2005, MySQL Server, MS-Access

Web/Application Servers: WebLogic, Apache Tomcat

Monitoring & Reporting tools: Ganglia, Nagios, Custom Shell scripts, Cloudera Manager.

Operating Systems: MS DOS, Windows XP/Vista/7/8.x/10, Ubuntu 14.04, Debian, Red Hat Linux, Backtrack 5 R3 Linux.

Other Tools: Eclipse, NetBeans, Android Studio, GitHub

PROFESSIONAL EXPERIENCE

Confidential, Boston, MA

Sr. Hadoop Developer

Responsibilities:

  • Worked on a scalable distributed data system using the Hadoop ecosystem.
  • Developed simple to complex MapReduce and streaming jobs using Java as well as Hive and Pig.
  • Used various compression mechanisms to optimize MapReduce jobs to use HDFS efficiently.
  • Transformed the imported data using Hive and MapReduce.
  • Used Sqoop to extract data from MySQL and load it into HDFS.
  • Wrote Hive queries and Pig scripts to study customer behavior by analyzing the data.
  • Used Cloudera Manager for continuous monitoring and managing the Hadoop cluster.
  • Loaded data into Hive tables from Hadoop Distributed File System (HDFS) to provide SQL-like access on Hadoop data.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Wrote Python scripts to process semi-structured data in formats like JSON.
  • Worked closely with the data modelers to model the new incoming data sets.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Developed Oozie workflow for MapReduce and HiveQL jobs.
  • Responsible for data analysis and data cleaning using Spark SQL queries.
  • Involved with the team in updating the cluster configurations.
  • Installed the Ganglia monitoring tool to generate Hadoop cluster reports (running CPUs, hosts up/down, etc.) used in maintaining the cluster.
  • Helped the administration team bring the Hadoop clusters up and running with high availability.
  • Installed ZooKeeper to maintain the high availability of the NameNode.
  • Troubleshot and identified bugs in the Hadoop applications, working with the testing team to clear them.
  • Optimized complex joins in Pig using techniques such as skewed joins and hash-based aggregations.
  • Developed User Defined Functions (UDFs) in the scripts to achieve the specific functionality defined by the business use cases (a Hive UDF sketch in Java follows this list).
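The UDF bullet above can be pictured with the following minimal sketch of a custom Hive UDF in Java using the classic org.apache.hadoop.hive.ql.exec.UDF API; the function and its normalization rule are hypothetical, not the project's actual UDFs.

```java
// Hypothetical Hive UDF: strips non-digit characters from a phone-number column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizePhoneUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;  // pass NULLs through unchanged
        }
        String digits = input.toString().replaceAll("[^0-9]", "");
        return new Text(digits);
    }
}
```

After packaging into a jar, such a function would be registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.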

Environment: Hadoop - Pig, Hive, Apache Sqoop, Spark, Python, Oozie, HBase, ZooKeeper, Cloudera Manager, 45-node cluster on Ubuntu Linux, Ganglia.

Confidential, Boston, MA

Jr. Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed Hadoop in fully distributed and pseudo-distributed modes for POCs in the early stages of the project.
  • Installed and configured a Hadoop cluster using the Cloudera Distribution and configured other ecosystem components such as Hive, Sqoop, Flume, Pig, and Oozie.
  • Designed and developed multiple MapReduce jobs in Java for complex analysis; imported and exported data between HDFS and relational database systems using Sqoop.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Job duties involved the design, development of various modules in Hadoop Big Data Platform and processing data using MapReduce, Hive, Pig, Sqoop, and Oozie.
  • Used Oozie to automate the data loading into Hadoop Distributed File System.
  • Configured Flume to transport web server logs into HDFS. Also used Kite logging module to upload webserver logs into HDFS.
  • Designed and implemented multiple MapReduce jobs in Java to support distributed data cleaning and preprocessing (a data-cleaning mapper sketch follows this list).
  • Developed Hive queries to join clickstream data with the relational data to determine how search guests interact with the website.
  • Developed JUnit tests for the MapReduce code and performed testing with small sample data sets.
  • Created MapReduce jobs which were used in processing survey data and log data stored in HDFS.
  • Used Pig Scripts for data cleaning and data preprocessing.
  • Transformed and aggregated data for analysis by implementing work flow management of Sqoop, Hive and Pig scripts.
  • Used Pig to create dataflow models to process data.
  • Developed Unit Test Cases for Mapper and Reducer classes.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Performed cluster maintenance, including the creation and removal of nodes, using tools such as Ganglia, Nagios, and Cloudera Manager Enterprise.
  • Collaborated with the application development team to provide updates, patches, and version upgrades when required.
  • Collaborated with the administration team to tune performance issues and to test the cluster environment.
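As a companion to the data-cleaning MapReduce bullet above, here is a minimal mapper sketch; the pipe-delimited layout and five-field record format are assumptions made for illustration, not the project's actual schema.

```java
// Illustrative data-cleaning mapper: drops malformed records and trims the rest.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleanRecordsMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private static final int EXPECTED_FIELDS = 5;  // assumed record width

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split("\\|", -1);
        // drop malformed or empty records; count them for later inspection
        if (fields.length != EXPECTED_FIELDS || fields[0].trim().isEmpty()) {
            context.getCounter("cleaning", "bad_records").increment(1);
            return;
        }
        // re-emit the record with trimmed fields
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) cleaned.append('|');
            cleaned.append(fields[i].trim());
        }
        context.write(new Text(cleaned.toString()), NullWritable.get());
    }
}
```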

Environment: HDFS, Pig, MapReduce, Hive, Oozie, Sqoop, Cloudera Manager, Ganglia, Nagios, and Flume.

Confidential, Los Angeles, CA

Jr. Hadoop Developer

Responsibilities:

  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Utilized Agile Scrum Methodology to help manage and organize a team of 3 developers with regular code review sessions.
  • Hands-on experience installing and configuring nodes on a CDH4 Hadoop cluster on CentOS.
  • Handled data coming from different sources and of different formats.
  • Involved in the loading of structured and unstructured data into HDFS.
  • Responsible for batch processing and real-time processing in HDFS and NoSQL databases.
  • Loaded data from MySQL to HDFS on a regular basis using Sqoop.
  • Handled the documentation of data transfers into HDFS from various sources (Sqoop, Flume).
  • Involved in managing and reviewing Hadoop log files to check any errors or warnings that might affect the cluster.
  • Scheduled various Hadoop tasks to run using Scripts and Batch Jobs.
  • Provided cluster coordination services through ZooKeeper.
  • Experience with streaming workflow operations and running Hadoop jobs using Oozie workflows.
  • Performed data analysis to meet the business requirements using Hive, creating Hive tables with HiveQL.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from other sources (a Pig FilterFunc sketch follows this list).
  • Managed the file system and monitored it for security issues.
  • Performance tuning of Hadoop clusters and MapReduce routines.
  • Helped in writing high-speed queries using Hive partitioning and bucketing.
  • Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
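The Pig UDF bullet above can be illustrated with the sketch below; the business rule (keep only records with a positive amount) and the class name are hypothetical.

```java
// Hypothetical Pig filter UDF: keeps tuples whose first field parses to a positive number.
import java.io.IOException;
import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

public class IsPositiveAmount extends FilterFunc {
    @Override
    public Boolean exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return false;
        }
        try {
            return Double.parseDouble(input.get(0).toString()) > 0;
        } catch (NumberFormatException e) {
            return false;  // treat unparseable amounts as filtered out
        }
    }
}
```

In a Pig script the jar would be REGISTERed and the function applied as FILTER data BY IsPositiveAmount(amount);.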

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, MySQL, Java (jdk1.6), Oozie, Zookeeper.

Confidential, Sterling, VA

Jr. Hadoop Developer

Responsibilities:

  • Worked closely with EJBs to code reusable components and business logic using Java Beans.
  • Used JSP, Servlets and JDBC to develop web components.
  • Created templates and screens in HTML and JavaScript.
  • Handled data coming from different web servers.
  • Installed and configured HDFS, Hadoop MapReduce and developed multiple Map Reduce jobs in java for data cleaning and pre-processing.
  • Used Map Reduce to perform analytics on data present in HDFS.
  • Worked in loading data from UNIX file system to HDFS.
  • Loaded and transformed large datasets into HDFS using Hadoop fs commands (a Java FileSystem API sketch follows this list).
  • Supported setting up and updating configurations for implementing Pig and Sqoop scripts.
  • Migrated existing SQL queries to HiveQL queries to move to the big data analytical platform.
  • Designed the logical and physical data models and wrote DML scripts for the Oracle 10g database.
  • Involved in creating templates and screens in HTML and JavaScript.
  • Experience with installing the cluster and with commissioning and decommissioning of the NameNode and DataNodes.
  • Wrote configuration files with sources, channels, and sinks to transfer log files from different servers to HDFS.
  • Worked on capacity planning, node recovery, and slot configuration.
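The HDFS loading bullet above used hadoop fs shell commands; the sketch below shows a programmatic equivalent with the HDFS FileSystem Java API. The local and HDFS paths are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsLoader {
    public static void main(String[] args) throws Exception {
        // picks up core-site.xml / hdfs-site.xml from the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        try {
            Path localDir = new Path("/var/log/app/archive");  // placeholder local path
            Path hdfsDir = new Path("/data/raw/logs");          // placeholder HDFS path
            if (!fs.exists(hdfsDir)) {
                fs.mkdirs(hdfsDir);
            }
            // delSrc = false (keep the local copy), overwrite = true
            fs.copyFromLocalFile(false, true, localDir, hdfsDir);
        } finally {
            fs.close();
        }
    }
}
```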

Environment: JDK, J2EE, Servlet, JSP, JDBC, JavaScript, MVC, XML, Tomcat, Eclipse, CDH, Hadoop, HDFS, Pig, MySQL and MapReduce.

Confidential, Pittsburgh, PA

Java/J2EE Developer

Responsibilities:

  • Responsible for gathering Business Requirements and User Specifications from Business Analyst.
  • Involved in the development, testing, and deployment of the application using Web Services.
  • Worked on Load Builder Module for Region services using RESTful Web services.
  • Worked with Servlets and JSP to design the user interface.
  • Used JSP, JavaScript, HTML5, and CSS for validating and customizing error messages to the User Interface.
  • Used Eclipse IDE for code development and deployed in Apache Tomcat Server.
  • Used UML to create use case and sequence diagrams.
  • Wrote complex SQL queries and stored procedures.
  • Actively involved in the system testing.
  • Implemented Hibernate in the data access object layer to access and update information in the database.
  • CSV flat files were loaded via SFTP to test the integrations.
  • Tested REST APIs using the Advanced REST Client browser extension to load master data into the server and ensure there were no schema changes in the XML payload.
  • Implemented design patterns such as Data Access Object (DAO), Value Object/Data Transfer Object (DTO), and Singleton.
  • As customers were having trouble downloading attachments from the transactions API, developed an automated Java program to download any type of attachment (audio, video, image, and XML) from the instance.
  • The program makes GET calls to the REST API and parses the response (a minimal download sketch follows this list).
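A simplified sketch of the attachment-download program described in the last two bullets: a GET call against a REST endpoint whose body is streamed to a local file. The URL, Accept header, and output path are placeholders, not the actual transactions API.

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class AttachmentDownloader {
    public static void download(String attachmentUrl, String outputFile) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(attachmentUrl).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/octet-stream");  // placeholder header
        try {
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                throw new IllegalStateException("GET failed with HTTP " + conn.getResponseCode());
            }
            // stream the attachment body to disk in 8 KB chunks
            try (InputStream in = conn.getInputStream();
                 OutputStream out = new FileOutputStream(outputFile)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
        } finally {
            conn.disconnect();
        }
    }
}
```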

Environment: Java 6, J2EE, MVC, Apache Tomcat Application Server, SFTP, JSP, Servlets, JavaScript, HTML5, CSS, Web Services, Oracle 10g, Eclipse IDE, XML.

Confidential, Carlsbad, CA

Jr. Java Developer

Responsibilities:

  • Gathered business requirements and user specifications from the Business Analyst.
  • Implemented the Model-View-Controller (MVC) design pattern with Servlets, HTML, JavaScript, and CSS to control the flow of the application across the presentation layer, the application/business layer (JDBC), and the data layer (Oracle 10g).
  • Developed web components using JSP, Servlets, and JDBC.
  • Used EJBs to develop business logic and coded reusable components in Java Beans.
  • Created use case and sequence diagrams using UML.
  • Used EJB entity and session beans to implement business logic and session handling and transactions.
  • Implemented Hibernate in the data access object layer to access and update information in the database.
  • Actively involved in system testing and developed DAO objects using Hibernate support (see the DAO sketch after this list).
  • Loaded CSV flat files to test inbound integrations via SFTP.
  • Developed unit and functional test cases for testing web applications.
  • Delivered code within the timeline and logged bugs/fixes in the Tech online tracking system.
  • Implemented the presentation layer with HTML and JavaScript.
  • Developed JavaScript validations on order submission forms.
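The Hibernate DAO work mentioned above can be pictured with the sketch below, using the classic Hibernate Session API; the Order entity, its customerId property, and the HibernateUtil session-factory helper are hypothetical placeholders.

```java
// Hypothetical DAO built on the classic Hibernate Session API.
import java.util.List;
import org.hibernate.Session;
import org.hibernate.Transaction;

public class OrderDao {

    public void save(Order order) {  // Order is a placeholder mapped entity
        Session session = HibernateUtil.getSessionFactory().openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(order);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }

    @SuppressWarnings("unchecked")
    public List<Order> findByCustomer(Long customerId) {
        Session session = HibernateUtil.getSessionFactory().openSession();
        try {
            // HQL against the placeholder Order entity
            return session.createQuery("from Order o where o.customerId = :cid")
                          .setParameter("cid", customerId)
                          .list();
        } finally {
            session.close();
        }
    }
}
```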

Environment: Java 6, J2EE, MVC, Apache Tomcat Application Server, SFTP, JSP, Servlets, JavaScript, HTML5, CSS, Web Services, Oracle 10g, Eclipse IDE, UML.

Confidential

Jr. Java Developer

Responsibilities:

  • Prepared the installation, customer guide, and configuration documents that were delivered to the customer along with the product.
  • Worked on JSP, Servlets and JDBC in creating web components.
  • Responsible for creating work model using HTML and JavaScript to understand the flow of the web application and created class diagrams.
  • Participated in the daily stand-up Scrum meetings as part of the Agile process to report day-to-day development progress.
  • Used J2EE to develop the application based on the MVC architecture.
  • Used HTML, XHTML, and JavaScript to improve the interactive front end.
  • Designed, Implemented, Tested and Deployed Enterprise Java Beans using Apache Tomcat as Application Server.
  • Designed the database tables and indexes used for the project.
  • Developed stored procedures, packages and database triggers to enforce data integrity.
  • Used the JDBC API with Statements and PreparedStatements to interact with the database using SQL (a JDBC sketch follows this list).
  • Performed data analysis and created reports for user requirements.
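The JDBC bullet above is illustrated by the sketch below, showing a plain Statement query and a parameterized PreparedStatement update; the connection URL, credentials, and the orders table are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";  // placeholder URL and credentials
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret")) {

            // plain Statement for a fixed query
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM orders")) {
                if (rs.next()) {
                    System.out.println("order count: " + rs.getLong(1));
                }
            }

            // PreparedStatement with bind variables for a parameterized, injection-safe update
            String sql = "UPDATE orders SET status = ? WHERE order_id = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "SHIPPED");
                ps.setLong(2, 12345L);
                System.out.println("rows updated: " + ps.executeUpdate());
            }
        }
    }
}
```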

Environment: Windows NT/2000/2003, XP, and Windows 7, C, Java, JSP, Servlets, JDBC, XML.
