Hadoop Developer/Admin Resume
Piscataway, NJ
SUMMARY
- About 8 years of experience, with an emphasis on Big Data technologies and the development and design of Java-based enterprise applications.
- Excellent understanding/knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions and on Amazon Web Services (AWS).
- Hands-on experience with major components of the Hadoop ecosystem, including Hive, HBase, HBase-Hive integration, Pig, Sqoop, and Flume, plus knowledge of the MapReduce/HDFS framework.
- Set up standards and processes for Hadoop based application design and implementation.
- Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
- Good experience in data analysis using Pig and Hive, and an understanding of Sqoop and Puppet.
- Expertise in database performance tuning & data modeling.
- Developed automated UNIX shell scripts for performing RUNSTATS, REORG, REBIND, COPY, LOAD, BACKUP, IMPORT, EXPORT, and other database maintenance activities.
- Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
- Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
- Expertise in working with databases such as Oracle, MS SQL Server, PostgreSQL, and MS Access 2000, along with exposure to Hibernate for mapping an object-oriented domain model to a traditional relational database.
- Extensive experience in data analysis using tools like Syncsort and HZ, along with shell scripting on UNIX.
- Involved in log file management: logs older than 7 days were removed from the log folder, loaded into HDFS, and retained there for 3 months.
- Experienced in installing, configuring, and administrating Hadoop cluster of major Hadoop distributions.
- Expertise in development support activities including installation, configuration and successful deployment of changes across all environments.
- Experience in creating a design and framework for generic Ab Initio code to handle the multiple and ever-expanding list of data files coming from source systems.
- Familiarity and experience with data warehousing and ETL tools.
- Good working knowledge of OOA & OOD using UML and designing use cases.
- Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
- Experience in production support and application support by fixing bugs.
- Used HP Quality Center for logging test cases and defects.
- Major strengths: familiarity with multiple software systems; the ability to learn new technologies quickly and adapt to new environments; and being a self-motivated, focused team player and quick learner with excellent interpersonal, technical, and communication skills.
TECHNICAL SKILLS
Big Data: Hadoop, Hive, Sqoop, Pig, Puppet, Ambari, HBase, MongoDB, Cassandra, PowerPivot, Datameer, Pentaho, Spark, Flume, SolrCloud
Operating Systems: Windows, Ubuntu, Red Hat Linux, Linux, UNIX
Project Management: Plan View, MS-Project
Programming or Scripting Languages: Java, SQL, Unix Shell Scripting, C, Python
Modeling Tools: UML, Rational Rose
IDE/GUI: Eclipse
Framework: Struts, Hibernate
Database: MS-SQL, Oracle, MS-Access
Middleware: WebSphere, TIBCO
ETL: Informatica, Pentaho, Netezza
Business Intelligence: OBIEE, Business Objects
Testing: Quality Center, Win Runner, Load Runner, QTP
PROFESSIONAL EXPERIENCE
Confidential, Piscataway, NJ
Hadoop Developer/Admin
Responsibilities:
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop for analysis, visualization, and report generation.
- Developed multiple MapReduce jobs in Java for data cleaning.
- Developed Hive UDF to parse the staged raw data to get the Hit Times of the claims from a specific branch for a particular insurance type code.
- Scheduled these jobs with the Oozie workflow engine; actions can be performed both sequentially and in parallel using Oozie.
- Built wrapper shell scripts to invoke this Oozie workflow.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Involved in creating Hadoop streaming jobs using Python.
- Used Ganglia to monitor the cluster and Nagios to send alerts about it around the clock.
- Provided ad-hoc queries and data metrics to the Business Users using Hive, Pig.
- Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
- Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
- Worked on MapReduce Joins in querying multiple semi-structured data as per analytic needs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Created many Java UDFs and UDAFs in Hive for functions not built into Hive, such as rank and cumulative sum.
- Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
- Developed POC for Apache Kafka.
- Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Stored and loaded data from HDFS to Amazon S3, and backed up the namespace data to NFS filers.
- Enabled concurrent access for Hive tables with shared and exclusive locking, supported by the ZooKeeper implementation in the cluster.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Familiarity with NoSQL databases including HBase and MongoDB.
- Wrote shell scripts to automate rolling day-to-day processes.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
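The Hadoop streaming jobs mentioned above were written in Python; a minimal sketch of such a job follows. The tab-separated record layout and the event-type column are assumptions for illustration only. Hadoop streaming simply feeds input lines on stdin and reads tab-separated key/value lines from stdout, with the reducer receiving mapper output sorted by key:

```python
import sys
from itertools import groupby

def mapper(lines):
    """Map step: emit one (event_type, 1) pair per input record.
    Assumes tab-separated records with the event type in column 2
    (a hypothetical layout)."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 1:
            yield fields[1], 1

def reducer(pairs):
    """Reduce step: sum the counts for each key. Relies on the input
    being sorted by key, as Hadoop streaming guarantees."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)

if __name__ == "__main__":
    # A real job would invoke this script twice via the streaming jar:
    #   hadoop jar hadoop-streaming.jar -mapper 'job.py map' \
    #       -reducer 'job.py reduce' -input ... -output ...
    mode = sys.argv[1] if len(sys.argv) > 1 else "map"
    if mode == "map":
        for key, count in mapper(sys.stdin):
            print(f"{key}\t{count}")
    else:
        pairs = (line.rstrip("\n").split("\t") for line in sys.stdin)
        for key, total in reducer((k, int(v)) for k, v in pairs):
            print(f"{key}\t{total}")
```

Keeping the map and reduce logic in plain functions makes the job unit-testable without a cluster.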
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL, Unix/Linux, Kafka, Amazon Web Services.
Confidential, Englewood, CO
Hadoop Developer / Admin
Responsibilities:
- Experience with the Cloudera and Hortonworks distributions of Hadoop.
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
- Worked on Hive to expose data for further analysis and to transform files from different analytical formats into text files.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Managing and scheduling Jobs on a Hadoop cluster.
- Deployed the Hadoop cluster in the standard modes (standalone, pseudo-distributed, and fully distributed).
- Developed multiple MapReduce jobs in Java for data cleaning and access.
- Managed Hadoop clusters: setup, monitoring, and maintenance.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Implemented NameNode backup using NFS for high availability.
- Worked on importing and exporting data from Oracle and DB2 into HDFS using Sqoop.
- Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
- Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Created Hive External tables and loaded the data in to tables and query data using HQL.
- Wrote shell scripts to automate document indexing to SolrCloud in production.
- Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
- Used CGI scripts to access the Dbccdb database.
- Analyzed the web log data using Hive to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website.
- Converted the Oracle table components to Teradata table components in Ab Initio graphs.
- Used Ambari to manage, provision, and monitor the Hadoop cluster.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
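One of the web-log metrics listed above, unique visitors per day, can be computed from raw access logs with a short Python function. It assumes Apache common log format (client IP first, timestamp in square brackets), which is an assumption about the actual log layout rather than a detail from this project:

```python
from collections import defaultdict

def unique_visitors_per_day(log_lines):
    """Count distinct client IPs per day from web server access logs.

    Assumes Apache common log format, e.g.:
    127.0.0.1 - - [10/Oct/2014:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326
    """
    visitors = defaultdict(set)
    for line in log_lines:
        try:
            ip = line.split(" ", 1)[0]                     # client IP
            day = line.split("[", 1)[1].split(":", 1)[0]   # e.g. 10/Oct/2014
        except IndexError:
            continue  # skip malformed lines
        visitors[day].add(ip)
    return {day: len(ips) for day, ips in visitors.items()}
```

The same grouping logic maps directly onto a Hive `COUNT(DISTINCT ip) ... GROUP BY day` query over an external table of the logs.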
Environment: Hadoop 1.x, Hive, Pig, HBase, Sqoop, Flume, Zookeeper, HDFS, Ambari, Oracle, CDH3.
Confidential - Cambridge, MA
Java/Hadoop Developer
Responsibilities:
- Exported data from DB2 to HDFS using Sqoop.
- Developed MapReduce jobs using Java API.
- Installed and configured Pig and also wrote Pig Latin scripts.
- Wrote MapReduce jobs using Pig Latin.
- Developed workflow using Oozie for running MapReduce jobs and Hive Queries.
- Worked on Cluster coordination services through Zookeeper.
- Worked on loading log data directly into HDFS using Flume.
- Involved in loading data from LINUX file system to HDFS.
- Responsible for managing data coming from multiple sources.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Implemented JMS for asynchronous auditing purposes.
- Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
- Experience with the CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters.
- Experience in defining, designing, and developing Java applications, especially using Hadoop MapReduce and leveraging frameworks such as Cascading and Hive.
- Experience developing monitoring and performance metrics for Hadoop clusters.
- Experience documenting designs and procedures for building and managing Hadoop clusters.
- Strong experience in troubleshooting the operating system, cluster issues, and Java-related bugs.
- Experienced in importing/exporting data into HDFS/Hive from relational databases and Teradata using Sqoop.
- Successfully loaded files into Hive and HDFS from MongoDB and Solr.
- Experience automating deployment, management, and self-serve troubleshooting of applications.
- Defined and evolved the existing architecture to scale with growing data volume, users, and usage.
- Designed and developed a Java API (Commerce API) that provides functionality to connect to Cassandra through Java services.
- Installed and configured Hive and wrote Hive UDFs.
- Experience in managing the CVS and migrating into Subversion.
- Experience in managing development time, bug tracking, project releases, development velocity, release forecasting, scheduling, and more.
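The streaming jobs above processed terabytes of XML; a common pattern there is reassembling multi-line XML records from the line-oriented stream before parsing them. The `record` tag name and `id` field below are hypothetical, chosen only to illustrate the technique:

```python
import xml.etree.ElementTree as ET

def parse_records(lines, record_tag="record"):
    """Reassemble multi-line XML records from a stream of lines and
    yield each one as a parsed Element.

    Hadoop streaming delivers input one line at a time, so records
    spanning several lines must be buffered until the closing tag.
    """
    buf = []
    inside = False
    open_tag = f"<{record_tag}"
    close_tag = f"</{record_tag}>"
    for line in lines:
        if open_tag in line:
            inside = True
        if inside:
            buf.append(line)
        if close_tag in line:
            inside = False
            yield ET.fromstring("".join(buf))
            buf = []
```

A mapper would iterate `parse_records(sys.stdin)` and emit one key/value pair per record; Hadoop's `StreamXmlRecordReader` offers a built-in alternative to this manual buffering.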
Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, PIG, Eclipse, MySQL and Ubuntu, Zookeeper, Java (JDK 1.6)
Confidential, Chicago, IL
Java Developer
Responsibilities:
- Gathered user requirements followed by analysis and design. Evaluated various technologies for the client.
- Developed HTML and JSP to present the client-side GUI.
- Involved in developing JavaScript code for client-side validations.
- Designed and developed the HTML-based web pages for displaying the reports.
- Developed java classes and JSP files.
- Extensively used JSF framework.
- Extensively used XML documents with XSLT and CSS to translate the content into HTML for presentation in the GUI.
- Developed dynamic content for the presentation layer using JSP.
- Developed user-defined tags using XML.
- Used JavaMail for automatic emailing and JNDI to interact with the knowledge server.
- Used Struts Framework to implement J2EE design patterns (MVC).
- Developed, Tested and Debugged the Java, JSP and EJB components using Eclipse.
- Developed Enterprise JavaBeans: entity beans, session beans (both stateless and stateful), and message-driven beans.
Environment: Java, J2EE, EJB 2.1, JSP 2.0, Servlets 2.4, JNDI 1.2, JavaMail 1.2, JDBC 3.0, Struts, HTML, XML, CORBA, XSLT, JavaScript, Eclipse 3.2, Oracle 10g, WebLogic 8.1, Windows 2003.
Confidential
Java Developer
Responsibilities:
- Created the Database, User, Environment, Activity, and Class diagram for the project (UML).
- Implemented the database using the Oracle database engine.
- Designed and developed a fully functional, generic n-tiered J2EE application platform; the environment was Oracle-technology driven.
- The entire infrastructure application was developed using Oracle JDeveloper in conjunction with Oracle ADF-BC and Oracle ADF Rich Faces.
- Created entity objects (business rules and policies, validation logic, default-value logic, security).
- Created View objects, View Links, Association Objects, Application modules with data validation rules (Exposing Linked Views in an Application Module), LOV, dropdown, value defaulting, transaction management features.
- Web application development using J2EE: JSP, Servlets, JDBC, Java Beans, Struts, Ajax, JSF, JSTL, Custom Tags, EJB, JNDI, Hibernate, ANT, JUnit and Apache Log4J, Web Services, Message Queue (MQ).
- Designed GUI prototypes using ADF 11g GUI components before finalizing them for development.
- Created reusable components (ADF Libraries and ADF Task Flows).
- Experience using Version controls such as CVS, PVCS, and Rational Clear Case.
- Created modules using bounded and unbounded task flows.
- Generated WSDL (web services) and created workflows using BPEL.
- Handled AJAX functions (partial trigger, partial submit, auto submit).
- Created the Skin for the layout.
Environment: Core Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, web services using Oracle SOA (BPEL), Oracle WebLogic.