
Hadoop Developer/Admin Resume


Piscataway, NJ

SUMMARY

  • About 8 years of experience with emphasis on Big Data technologies and the development and design of Java-based enterprise applications.
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
  • Experience in installing, configuring, supporting and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions and on Amazon Web Services (AWS).
  • Hands-on experience with major components of the Hadoop ecosystem, including Hive, HBase, HBase-Hive integration, Pig, Sqoop and Flume, along with knowledge of the MapReduce/HDFS framework.
  • Set up standards and processes for Hadoop-based application design and implementation.
  • Worked on NoSQL databases including HBase, Cassandra and MongoDB.
  • Good experience in data analysis using Pig and Hive, and understanding of Sqoop and Puppet.
  • Expertise in database performance tuning & data modeling.
  • Developed automated Unix shell scripts for performing RUNSTATS, REORG, REBIND, COPY, LOAD, BACKUP, IMPORT, EXPORT and other database-related activities.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
  • Expertise in working with different databases such as Oracle, MS SQL Server, PostgreSQL and MS Access 2000, along with exposure to Hibernate for mapping an object-oriented domain model to a relational database.
  • Extensive experience in data analysis using tools like Syncsort and HZ, along with shell scripting and UNIX.
  • Involved in log file management in which logs older than 7 days were removed from the log folder, loaded into HDFS and retained for 3 months.
  • Experienced in installing, configuring and administering Hadoop clusters of the major Hadoop distributions.
  • Expertise in development support activities including installation, configuration and successful deployment of changes across all environments.
  • Experience in creating a design and framework for generic Ab Initio code to handle the multiple and ever-expanding list of data files coming from source systems.
  • Familiarity and experience with data warehousing and ETL tools.
  • Good working knowledge of OOA & OOD using UML and designing use cases.
  • Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
  • Experience in production support and application support by fixing bugs.
  • Used HP Quality Center for logging test cases and defects.
  • Major strengths include familiarity with multiple software systems, the ability to quickly learn new technologies and adapt to new environments, and being a self-motivated, focused team player and quick learner with excellent interpersonal, technical and communication skills.

TECHNICAL SKILLS

Big Data: Hadoop, Hive, Sqoop, Pig, Puppet, Ambari, HBase, MongoDB, Cassandra, Power Pivot, Datameer, Pentaho, Spark, Flume, SolrCloud

Operating Systems: Windows, Ubuntu, Red Hat Linux, Linux, UNIX

Project Management: Planview, MS Project

Programming or Scripting Languages: Java, SQL, Unix Shell Scripting, C, Python

Modeling Tools: UML, Rational Rose

IDE/GUI: Eclipse

Framework: Struts, Hibernate

Database: MS-SQL, Oracle, MS-Access

Middleware: WebSphere, TIBCO

ETL: Informatica, Pentaho, Netezza

Business Intelligence: OBIEE, Business Objects

Testing: Quality Center, WinRunner, LoadRunner, QTP

PROFESSIONAL EXPERIENCE

Confidential, Piscataway, NJ

Hadoop Developer/Admin

Responsibilities:

  • Developed data pipelines using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop for analysis, visualization and report generation.
  • Developed multiple MapReduce jobs in Java for data cleaning.
  • Developed Hive UDFs to parse the staged raw data and extract the hit times of claims from a specific branch for a particular insurance type code (a minimal UDF sketch follows this list).
  • Scheduled these jobs with the Oozie workflow engine, which can run actions both sequentially and in parallel.
  • Built wrapper shell scripts around the Oozie workflow.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Involved in creating Hadoop streaming jobs using Python.
  • Used Ganglia for monitoring and Nagios for round-the-clock cluster alerting.
  • Provided ad-hoc queries and data metrics to business users using Hive and Pig.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Used Pig as an ETL tool for transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Worked on MapReduce Joins in querying multiple semi-structured data as per analytic needs.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Created many Java UDFs and UDAFs in Hive for functions not available out of the box, such as rank and cumulative sum.
  • Created Hive tables and was involved in data loading and in writing Hive UDFs.
  • Developed POC for Apache Kafka.
  • Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Stored and loaded data from HDFS to Amazon S3 and backed up the namespace data to NFS filers.
  • Enabled concurrent access to Hive tables with shared and exclusive locking, supported by the ZooKeeper implementation in the cluster.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Familiarity with NoSQL databases including HBase and MongoDB.
  • Wrote and automated shell scripts for rolling day-to-day processes.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
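The Hive UDFs mentioned above follow the standard pattern of extending Hive's UDF class and exposing an evaluate() method. The following is a minimal sketch only; the class name, the pipe-delimited record layout and the field positions are assumptions for illustration, not the actual claim format.

    // Illustrative Hive UDF sketch (assumed record layout, not the real claim format)
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class HitTimeUDF extends UDF {

        // Returns the hit-time field of a raw claim record when the record matches
        // the requested insurance type code; returns null otherwise.
        public Text evaluate(Text rawRecord, Text insuranceTypeCode) {
            if (rawRecord == null || insuranceTypeCode == null) {
                return null;
            }
            String[] fields = rawRecord.toString().split("\\|", -1);
            // Assumed layout: fields[2] = insurance type code, fields[5] = hit time
            if (fields.length > 5 && fields[2].equals(insuranceTypeCode.toString())) {
                return new Text(fields[5]);
            }
            return null;
        }
    }

Packaged in a JAR, a function like this would typically be registered per Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.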

Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL, Unix/Linux, Kafka, Amazon Web Services.

Confidential, Englewood, CO

Hadoop Developer / Admin

Responsibilities:

  • Experience with Cloudera and Hortonworks distributions of Hadoop.
  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Involved in analyzing system failures, identifying root causes and recommending courses of action.
  • Worked on Hive to expose data for further analysis and to transform files from different analytical formats into text files.
  • Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Deployed the Hadoop cluster in standalone, pseudo-distributed and fully distributed modes.
  • Developed multiple MapReduce jobs in Java for data cleaning and access (see the mapper sketch after this list).
  • Managed Hadoop clusters: setup, monitoring and maintenance.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in defining job flows.
  • Implemented NameNode metadata backup to NFS for high availability.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS using Sqoop.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Created Hive external tables, loaded data into them and queried the data using HiveQL.
  • Wrote shell scripts to automate document indexing to SolrCloud in production.
  • Used Flume to collect, aggregate and store web log data from different sources such as web servers, mobile and network devices, and pushed it to HDFS.
  • Used CGI scripts to access the Dbccdb database.
  • Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most purchased products on the website.
  • Converted Oracle table components to Teradata table components in Ab Initio graphs.
  • Used Ambari to provision, manage and monitor the Hadoop cluster.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among the users' MapReduce jobs.
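The data-cleaning MapReduce jobs referenced above can be illustrated with a minimal map-only sketch; the comma-delimited layout, the expected column count and the class name are assumptions, not the project's actual record format.

    // Illustrative data-cleaning mapper sketch (assumed CSV layout and column count)
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CleanRecordsMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 8; // assumed column count

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            // Drop empty lines and records with a missing or extra column.
            if (line.isEmpty() || line.split(",", -1).length != EXPECTED_FIELDS) {
                context.getCounter("Cleaning", "BadRecords").increment(1);
                return;
            }
            context.write(NullWritable.get(), value);
        }
    }

Run with zero reducers, a mapper like this passes through only well-formed records and counts the rejects via a job counter.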

Environment: Hadoop 1.x, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, HDFS, Ambari, Oracle, CDH3.

Confidential - Cambridge, MA

Java/Hadoop Developer

Responsibilities:

  • Exported data from DB2 to HDFS using Sqoop.
  • Developed MapReduce jobs using the Java API (a driver sketch follows this list).
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Wrote MapReduce jobs using Pig Latin.
  • Developed workflow using Oozie for running MapReduce jobs and Hive Queries.
  • Worked on Cluster coordination services through Zookeeper.
  • Worked on loading log data directly into HDFS using Flume.
  • Involved in loading data from LINUX file system to HDFS.
  • Responsible for managing data from multiple sources.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Implemented JMS for asynchronous auditing purposes.
  • Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
  • Experience with CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters
  • Experience in defining, designing and developing Java applications, especially using Hadoop MapReduce and leveraging frameworks such as Cascading and Hive.
  • Experience in developing monitoring and performance metrics for Hadoop clusters.
  • Experience in documenting designs and procedures for building and managing Hadoop clusters.
  • Strong experience in troubleshooting operating system and cluster issues as well as Java-related bugs.
  • Experienced in importing/exporting data between HDFS/Hive and relational databases, including Teradata, using Sqoop.
  • Successfully loaded files into Hive and HDFS from MongoDB and Solr.
  • Experience in automating deployment, management and self-serve troubleshooting of applications.
  • Defined and evolved the existing architecture to scale with growing data volume, users and usage.
  • Designed and developed a Java API (Commerce API) that provides connectivity to Cassandra through Java services.
  • Installed and configured Hive and wrote Hive UDFs.
  • Experience in managing CVS and migrating to Subversion.
  • Experience in managing development time, bug tracking, project releases, development speed, release forecasting and scheduling.
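For the MapReduce jobs developed against the Java API above, a minimal driver sketch is shown below. It reuses the hypothetical CleanRecordsMapper from the earlier section; the job name and the input/output paths are placeholders.

    // Illustrative MapReduce driver sketch (Hadoop 1.x-era API; paths are placeholders)
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "clean-records"); // Job.getInstance(conf, ...) on newer releases
            job.setJarByClass(CleanRecordsDriver.class);
            job.setMapperClass(CleanRecordsMapper.class);
            job.setNumReduceTasks(0);                 // map-only cleaning job
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }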

Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, PIG, Eclipse, MySQL and Ubuntu, Zookeeper, Java (JDK 1.6)

Confidential, Chicago, IL

Java Developer

Responsibilities:

  • Gathered user requirements followed by analysis and design. Evaluated various technologies for the client.
  • Developed HTML and JSP pages to present the client-side GUI.
  • Involved in development of JavaScript code for client-side validations.
  • Designed and developed the HTML-based web pages for displaying the reports.
  • Developed Java classes and JSP files.
  • Extensively used JSF framework.
  • Extensively used XML documents with XSLT and CSS to translate content into HTML for presentation in the GUI.
  • Developed dynamic content of presentation layer using JSP.
  • Developed user-defined tags using XML.
  • Used JavaMail for automated emailing (a brief sketch follows this list) and JNDI to interact with the knowledge server.
  • Used Struts Framework to implement J2EE design patterns (MVC).
  • Developed, Tested and Debugged the Java, JSP and EJB components using Eclipse.
  • Developed Enterprise JavaBeans including entity beans, session beans (both stateless and stateful) and message-driven beans.
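The automated emailing mentioned above was built on the JavaMail API; a minimal sketch follows. The SMTP host, addresses and subject are placeholders, not the project's actual configuration.

    // Illustrative JavaMail sketch (placeholder host and addresses)
    import java.util.Properties;
    import javax.mail.Message;
    import javax.mail.Session;
    import javax.mail.Transport;
    import javax.mail.internet.InternetAddress;
    import javax.mail.internet.MimeMessage;

    public class ReportMailer {

        public void sendReportNotification(String body) throws Exception {
            Properties props = new Properties();
            props.put("mail.smtp.host", "smtp.example.com"); // placeholder SMTP host
            Session session = Session.getInstance(props);

            Message message = new MimeMessage(session);
            message.setFrom(new InternetAddress("reports@example.com"));
            message.setRecipient(Message.RecipientType.TO,
                    new InternetAddress("team@example.com"));
            message.setSubject("Daily report generated");
            message.setText(body);

            Transport.send(message);
        }
    }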

Environment: Java, J2EE, EJB 2.1, JSP 2.0, Servlets 2.4, JNDI 1.2, JavaMail 1.2, JDBC 3.0, Struts, HTML, XML, CORBA, XSLT, JavaScript, Eclipse 3.2, Oracle 10g, WebLogic 8.1, Windows 2003.

Confidential

Java Developer

Responsibilities:

  • Created the database, user, environment, activity and class diagrams for the project (UML).
  • Implemented the database using the Oracle database engine.
  • Designed and developed a fully functional, generic n-tier J2EE application platform in an Oracle technology-driven environment.
  • The entire infrastructure application was developed using Oracle JDeveloper in conjunction with Oracle ADF-BC and Oracle ADF Rich Faces.
  • Created entity objects (business rules and policies, validation logic, default value logic, security).
  • Created view objects, view links, association objects and application modules with data validation rules (exposing linked views in an application module), LOVs, dropdowns, value defaulting and transaction management features.
  • Web application development using J2EE: JSP, Servlets, JDBC, JavaBeans, Struts, Ajax, JSF, JSTL, custom tags, EJB, JNDI, Hibernate, Ant, JUnit, Apache Log4j, Web Services and Message Queue (MQ).
  • Designed GUI prototypes using ADF 11g GUI components before finalizing them for development.
  • Created reusable components (ADF libraries and ADF task flows).
  • Experience using version control systems such as CVS, PVCS and Rational ClearCase.
  • Created modules using both bounded and unbounded task flows.
  • Generated WSDLs (Web Services) and created workflows using BPEL.
  • Handled the AJAX functions (partial trigger, partial submit, auto submit); a backing-bean sketch follows this list.
  • Created the skin for the layout.
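The AJAX behavior above (autoSubmit plus partialTriggers) is typically backed by a JSF managed bean; the sketch below is illustrative only, with hypothetical bean and property names.

    // Illustrative JSF backing-bean sketch (hypothetical names)
    import javax.faces.event.ValueChangeEvent;

    public class OrderFormBean {

        private String region;

        // Bound to a selectOneChoice with autoSubmit="true"; dependent components
        // list that component in their partialTriggers so only they re-render.
        public void regionChanged(ValueChangeEvent event) {
            this.region = (String) event.getNewValue();
            // Recompute any values that depend on the newly selected region here.
        }

        public String getRegion() {
            return region;
        }

        public void setRegion(String region) {
            this.region = region;
        }
    }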

Environment: Core Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, web services using Oracle SOA (BPEL), Oracle WebLogic.
