
Hadoop Developer Resume


Alpharetta, GA

SUMMARY:

  • Over 7 years of experience spread across Hadoop, Java, and ETL, including extensive experience in Big Data technologies and in the development of standalone and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase, Pig, Sqoop, J2EE technologies (Spring, Hibernate), Oracle, HTML, and JavaScript.
  • Extensive experience in Big Data analytics, with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig.
  • Excellent knowledge of Hadoop architecture, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems.
  • Involved in developing solutions to analyze large data sets efficiently.
  • Excellent hands-on experience importing and exporting data between relational database systems such as MySQL and Oracle and HDFS/Hive, and vice versa, using Sqoop.
  • Hands-on experience in writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Experience with web-based UI development using jQuery, ExtJS, CSS, HTML, HTML5, XHTML, and JavaScript.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Experience with databases like DB2, Oracle 9i, Oracle 10g, MySQL, SQL Server, and MS Access.
  • Experience in creating complex SQL queries and tuning SQL, and in writing PL/SQL blocks such as stored procedures, functions, cursors, indexes, triggers, and packages.
  • Very good understanding of NoSQL databases like MongoDB and HBase.
  • Good knowledge of ETL and hands-on experience with Informatica ETL.
  • Extensive experience in creating class diagrams, activity diagrams, and sequence diagrams using the Unified Modeling Language (UML).
  • Experienced in SDLC, Agile (Scrum) methodology, and iterative Waterfall.
  • Experience in developing test cases and performing unit and integration testing; QA experience with test methodologies and skills for manual/automated testing using tools like WinRunner and JUnit.
  • Experience with various version control systems: ClearCase, CVS, SVN.
  • Expertise in extending Hive and Pig core functionality by writing custom UDFs (see the Hive UDF sketch after this list).
  • Development experience with all aspects of software engineering and the development life cycle.
  • Strong desire to work in a fast-paced, flexible environment.
  • Proactive problem-solving mentality that thrives in an agile work environment.
  • Good experience with the SDLC (Software Development Life Cycle).
  • Exceptional ability to learn new technologies and deliver outputs on short deadlines.
  • Worked with developers, DBAs, and systems support personnel in promoting and automating successful code to production.
  • Strong written, oral, interpersonal, and presentation communication skills.
  • Ability to perform at a high level, meet deadlines, and adapt to ever-changing priorities.
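
As an illustration of the custom UDF work noted above, here is a minimal sketch of a Hive UDF in Java; the class name and lower-casing behavior are illustrative assumptions, not a specific production UDF.

```java
// Minimal sketch of a custom Hive UDF; name and behavior are illustrative.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class LowerCaseUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;        // pass NULLs through unchanged
        }
        return new Text(input.toString().toLowerCase());
    }
}
```

Once packaged in a jar, such a UDF is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, then called like any built-in function in HiveQL.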

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop, MapReduce, Sqoop, Spark, Hive, Oozie, Pig, HDFS, ZooKeeper, Flume.

NoSQL Databases: HBase, Cassandra, MongoDB.

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI, Java Beans.

Languages: C, C++, Java, SQL, PL/SQL, Pig Latin, HiveQL, Unix shell scripting.

Frameworks: MVC, Spring, Hibernate, Struts 1/2, EJB, JMS, JUnit, MRUnit.

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server.

Application Servers: Apache Tomcat, JBoss, IBM WebSphere, WebLogic.

Web Services: WSDL, SOAP, Apache CXF, Apache Axis, REST.

Methodologies: Scrum, Agile, Waterfall.

PROFESSIONAL EXPERIENCE:

Confidential, Alpharetta, GA

Hadoop Developer

Responsibilities:

  • Used Sqoop to import data from relational databases into HDFS for processing.
  • Configured Flume to capture news feeds from various sources for testing the classifier.
  • Wrote extensive MapReduce jobs in Java to train the classifier (see the sketch after this list).
  • Wrote MapReduce jobs using various input and output formats, including custom formats where necessary.
  • Developed Oozie workflows to automate loading data into HDFS and pre-processing, analyzing, and training the classifier using MapReduce, Pig, and Hive jobs.
  • Used OpenNLP for stop-word removal and stemming.
  • Used Pig and Hive in the analysis of data.
  • Involved in the design and development of a Hadoop cluster using Apache Hadoop for a POC and sample data analysis.
  • Imported and exported data into HDFS and Hive.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
  • Implemented modules using core Java APIs, Java collections, and threads, and integrated the modules.
  • Supported MapReduce programs running on the cluster.
  • Managed and reviewed Hadoop log files to identify issues when jobs failed.
  • Developed Pig UDFs for preprocessing the data for analysis.
  • Wrote shell scripts for scheduling and automating tasks.
  • Worked on Hive for further analysis and for transforming files from different analytical formats to text files.
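
A minimal sketch of the kind of Java MapReduce preprocessing referenced above (filtering stop words out of raw text before classifier training); the stop-word list and class name are illustrative assumptions.

```java
// Mapper that drops stop words from each input line; word list illustrative.
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StopWordFilterMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the", "of", "and"));

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringBuilder kept = new StringBuilder();
        for (String token : value.toString().toLowerCase().split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.append(token).append(' ');
            }
        }
        context.write(new Text(kept.toString().trim()), NullWritable.get());
    }
}
```

In the actual pipeline OpenNLP handled tokenization and stemming; the set-based filter here only stands in for that step.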

Environment: Java 6 (JDK 1.6), Eclipse, Subversion, Hadoop (Hortonworks and Cloudera distributions), MapReduce, HDFS, Hive, HBase, Linux, DataStax, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT.

Confidential, San Ramon, CA

Hadoop Developer

Responsibilities:

  • Imported data from relational data stores to Hadoop using Sqoop.
  • Created various MapReduce jobs for performing ETL transformations on the transactional and application-specific data sources.
  • Wrote and executed Pig scripts using the Grunt shell.
  • Performed big data analysis using Pig and user-defined functions (UDFs); a Pig UDF sketch follows this list.
  • The system was initially developed in Java; the Java filtering program was restructured so the business rule engine lives in a jar callable from both Java and Hadoop.
  • Created reports and dashboards using structured and unstructured data.
  • Performed joins, group-bys, and other operations in MapReduce using Java and Pig.
  • Processed the output from Pig and Hive and formatted it before sending it to the Hadoop output files.
  • Used Hive definitions to map the output files to tables.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Wrote data ingesters and MapReduce programs.
  • Reviewed HDFS usage and system design for future scalability and fault tolerance.
  • Wrote MapReduce/HBase jobs.
  • Worked with the HBase NoSQL database.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
  • Implemented Storm topologies to perform cleansing operations before moving data into Cassandra.
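
A minimal sketch of a Pig eval UDF of the kind used in the analysis above; the class name and trimming behavior are illustrative assumptions.

```java
// Pig UDF that trims whitespace from a chararray field; behavior illustrative.
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class TrimUDF extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;        // propagate nulls rather than fail the task
        }
        return ((String) input.get(0)).trim();
    }
}
```

In Pig Latin the jar is registered with REGISTER and the function bound with DEFINE before being used in a FOREACH ... GENERATE.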

Environment: Hadoop, Java 1.4 and Java 1.5, IBM AIX 6.1, UNIX, shell scripting, XML, XSLT, HDFS, HBase, Cassandra, NoSQL, MapReduce, Hive, Pig.

Confidential, Miami, FL

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Developed simple to complex MapReduce jobs, including jobs driven through Hive.
  • Streamed data in real time using Spark and Kafka (see the sketch after this list).
  • Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
  • Created partitioned tables in Hive.
  • Administered and supported the Hortonworks distribution.
  • Wrote Korn shell, Bash, and Perl scripts to automate most DB maintenance tasks.
  • Installed and configured Hadoop 0.22.0 (MapReduce, HDFS) and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported data into HDFS using Spark and Kafka.
  • Used Maven for continuous build integration and deployment.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Developed and tested scripts in Python.
  • Responsible for managing data coming from different sources.
  • Monitored the running MapReduce programs on the cluster.
  • Responsible for loading data from UNIX file systems to HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
  • Implemented workflows using the Apache Oozie framework to automate tasks.
  • Developed scripts and automated data management end to end, including synchronization between all the clusters.
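
A minimal sketch of the Spark Streaming plus Kafka ingestion described above, using the Spark 1.x receiver-based Kafka connector; the topic name, ZooKeeper quorum, and HDFS path are illustrative assumptions.

```java
// Consumes a Kafka topic and lands raw messages in HDFS; names illustrative.
import java.util.Collections;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import scala.Tuple2;

public class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(10));

        // One receiver thread on the (hypothetical) "events" topic.
        Map<String, Integer> topics = Collections.singletonMap("events", 1);
        JavaPairReceiverInputDStream<String, String> stream =
                KafkaUtils.createStream(jssc, "zk-host:2181", "ingest-group", topics);

        // Keep only the message payload and write time-bucketed files to HDFS.
        JavaDStream<String> messages = stream.map(
                new Function<Tuple2<String, String>, String>() {
                    public String call(Tuple2<String, String> kv) {
                        return kv._2();
                    }
                });
        messages.dstream().saveAsTextFiles("hdfs:///data/raw/events", "txt");

        jssc.start();
        jssc.awaitTermination();
    }
}
```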

Environment: Apache Hadoop, Java (JDK 1.6), Bash, Korn shell, Spark, Kafka, Hortonworks, deployment tools, Python, DataStax, flat files, Oracle 11g/10g, MySQL, Toad 9.6, Windows NT, UNIX, Sqoop, Hive, Oozie.

Confidential, Chicago, IL

Hadoop Developer

Responsibilities:

  • Responsible for architecting Hadoop clusters.
  • Performed data analysis using Hive and Pig.
  • Monitored the Hadoop cluster using tools like Nagios, Ganglia, and Cloudera Manager.
  • Wrote automation scripts to monitor HDFS and HBase through cron jobs (see the sketch after this list).
  • Worked on a file system plugin that allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
  • Used the Spring AOP and Spring IoC frameworks extensively during development.
  • Designed and implemented the MVC2, DAO, and DTO design patterns.
  • Implemented Java/J2EE design patterns such as Factory, DAO, Session Façade, and Singleton.
  • Planned, designed, and implemented the processing of massive amounts of marketing information, complete with information enrichment, text analytics, and natural language processing.
  • Prepared a multi-cluster test harness to exercise the system for performance and failover.
  • Loaded log data into HDFS using Flume and Kafka.
  • Developed a high-performance cache, making the site stable and improving its performance.
  • Created a complete processing engine, based on Cloudera's distribution, tuned for performance.
  • Built and supported standards-based infrastructure capable of supporting tens of thousands of computers in multiple locations.
  • Negotiated and managed projects related to designing and deploying a tiered architecture.
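
A minimal sketch of the kind of HDFS capacity check such a cron job could run; the NameNode URI and alert threshold are illustrative assumptions.

```java
// Reports HDFS usage and exits non-zero past a threshold, so a cron wrapper
// (e.g. feeding Nagios) can raise an alert. Host and threshold illustrative.
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class HdfsCapacityCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

        FsStatus status = fs.getStatus();
        double usedPct = 100.0 * status.getUsed() / status.getCapacity();
        System.out.printf("HDFS used: %.1f%%%n", usedPct);

        System.exit(usedPct > 85.0 ? 2 : 0);
    }
}
```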

Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Java, JDBC, JNDI, Struts, Maven, Trace, Subversion, JUnit, SQL, XML, Altova XMLSpy, PuTTY, and Eclipse.

Confidential, Midland, MI

Hadoop Developer

Responsibilities:

  • Involved in the review of functional and non-functional requirements.
  • Facilitated knowledge transfer sessions.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Gained good experience with NoSQL databases.
  • Involved in loading data from UNIX file systems to HDFS.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Designed various Hive and Pig scripts.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms (see the sketch after this list).
  • Analyzed customer behavior by performing clickstream analysis, using Flume to ingest the data.
  • Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
  • Installed and configured Hive and wrote Hive UDFs.
  • Developed a custom file system plugin for Hadoop so it can access files on the Data Platform.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
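
A minimal sketch of enabling compression in a MapReduce driver, as referenced above; the codec choice and property names (newer org.apache.hadoop.mapreduce API) are illustrative assumptions.

```java
// Enables map-output and job-output compression; codec choice illustrative.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressedJobDriver {
    public static Job configure(Configuration conf) throws Exception {
        // Compress intermediate map output to cut shuffle I/O.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed-job");

        // Compress the final output written to HDFS.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
        return job;
    }
}
```

Compressing intermediate output mainly reduces shuffle I/O, while output compression saves HDFS space; a splittable codec matters when the output feeds further MapReduce jobs.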

Environment: Hadoop, HDFS, MapReduce (MR1), Pig, Hive, Sqoop, Oozie, Mahout, Java, Spring, Hibernate, JUnit, Oracle, Linux/UNIX shell scripting, Big Data.

Confidential

Java/J2ee Developer

Responsibilities:

  • Developed lightweight business components and integrated applications using Struts 1.2.
  • Designed and developed front-end, middleware, and back-end applications.
  • Optimized server-side and client-side validation.
  • Worked with the team on the transition from Oracle to DB2.
  • Developed the global logging module used across all modules, built on Log4J components (see the sketch after this list).
  • Developed the presentation layer for the credit enhancement module in JSP.
  • Struts 1.2 was used to implement the Model-View-Controller (MVC) architecture.
  • Validations were done on the client side as well as the server side.
  • Involved in configuration management using ClearCase.
  • Detected and resolved errors/defects in the quality control environment.
  • Used iBATIS for mapping Java classes to the database.
  • Involved in code review and integration testing.
  • Used static-analysis tools such as PMD, FindBugs, and Checkstyle.
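
A minimal sketch of the kind of shared Log4J logging helper described above; the class names are illustrative assumptions.

```java
// Central factory so every module obtains Log4J loggers the same way.
import org.apache.log4j.Logger;

public final class AppLogger {
    private AppLogger() {}

    public static Logger get(Class<?> clazz) {
        return Logger.getLogger(clazz);
    }
}

// Example usage from a (hypothetical) module:
class CreditEnhancementService {
    private static final Logger LOG =
            AppLogger.get(CreditEnhancementService.class);

    void process() {
        LOG.info("Processing credit enhancement request");
    }
}
```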

Environment: Java v1.6, J2EE 6, Struts 1.2, iBATIS, XML, JSP, CSS, HTML, JavaScript, jQuery, Oracle 10g, DB2, Unix, RAD, ClearCase, WebSphere v8.0 (beta).
