
Hadoop Developer Resume


Dallas, TX

SUMMARY:

  • 5+ years of IT experience in all phases of the SDLC, including requirements gathering and analysis, system requirements specification, development, testing, and deployment, with Agile software development across a variety of technologies and environments.
  • 3 years of hands-on experience in Big Data analysis across multiple projects involving Hadoop MapReduce, Apache Spark, HDFS, Pig, Hive, and Sqoop.
  • Excellent knowledge of Hadoop architecture and Hadoop ecosystem components.
  • Developed custom Apache Spark programs in Scala to analyze and transform unstructured data.
  • Developed Spark Streaming jobs to process incoming data streams and store the results in Hive tables.
  • Experienced in ingesting data from traditional database systems into the Hadoop data lake using Sqoop for analysis.
  • Analyzed large data sets using Hive queries for structured data and Pig commands/scripts for unstructured and semi-structured data.
  • Experienced in relational database systems (RDBMS) such as MySQL, Oracle.
  • Extensive experience working with IDEs such as Eclipse and NetBeans.
  • In-depth understanding of Data Structures and Algorithms.
  • Excellent knowledge of Java and SQL in application development and deployment.
  • Experience working with web technologies such as JDBC, JavaScript, jQuery, PHP, HTML, and CSS.
  • Strong knowledge of programming languages such as C and C++.
  • Experienced in Linux administration and operation.
  • Exceptional ability to learn new technologies and to deliver outputs in short deadlines.
  • Excellent technical communication, analytical, problem-solving, and troubleshooting skills, with the ability to work well with people from cross-cultural backgrounds.

TECHNICAL SKILLS:

Big Data: Apache Hadoop, Apache Spark, Hive, Pig, Sqoop

Hadoop Distributions: Cloudera, Hortonworks

Languages: Core Java, J2EE, C, C++, Scala

Web Services & Technologies: SOAP, REST, PHP, JavaScript, jQuery, XML, HTML, CSS, JSON

Databases: SQL Server, MySQL, Oracle

Web/Application Servers: Apache Tomcat, WebSphere

IDE: Eclipse, NetBeans

Operating Systems: Linux, Windows XP/7/8/10

Version Control: GitHub

PROFESSIONAL EXPERIENCE:

Confidential, Dallas, TX

Hadoop Developer

Responsibilities:

  • Worked on a 200-node cluster with 2 PB of storage capacity.
  • Upgraded the Hadoop system from CDH4 to CDH5 for better performance.
  • Ingested transactional data from Oracle into HDFS using Sqoop.
  • Performed data profiling and quality validation in Hive using transient tables; once all validation steps were complete, the data was loaded into the staging tables.
  • Developed custom Apache Spark programs for data validation to filter unwanted data and cleanse the data.
  • Developed Spark Streaming jobs to process incoming streams of data from Kafka sources.
  • Used Pig to analyze subscription data and determine the percentage of customers subscribing to additional data.
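As an illustration of the kind of validation and cleansing logic described above, here is a minimal sketch in plain Java; the record format and rules are hypothetical, not taken from the actual project, which ran this logic as Spark jobs:

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Minimal sketch of record-level validation and cleansing.
// Hypothetical input format: "customerId,amount,date"
class RecordCleanser {

    // Returns a cleansed record, or empty if the record is invalid.
    static Optional<String> cleanse(String raw) {
        if (raw == null) return Optional.empty();
        String[] fields = raw.trim().split(",");
        if (fields.length != 3) return Optional.empty();  // drop malformed rows
        String id = fields[0].trim();
        String amount = fields[1].trim();
        if (id.isEmpty()) return Optional.empty();        // require a customer id
        try {
            Double.parseDouble(amount);                   // amount must be numeric
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
        return Optional.of(id + "," + amount + "," + fields[2].trim());
    }

    // Keep only the records that survive validation.
    static List<String> cleanseAll(List<String> raw) {
        return raw.stream()
                  .map(RecordCleanser::cleanse)
                  .filter(Optional::isPresent)
                  .map(Optional::get)
                  .collect(Collectors.toList());
    }
}
```

In a Spark job the same per-record function would typically be applied inside a `map`/`filter` over the input RDD or Dataset.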

Environment: Hadoop MapReduce, HDFS, Apache Spark, Hive, Sqoop, Kafka, Linux, Cloudera CDH 5, Pig, Teradata, Tableau and Oracle.

Confidential, Apple Valley, MN

Hadoop Developer

Responsibilities:

  • Worked on a 56-node Hadoop cluster with 896 TB of capacity.
  • Developed custom Apache Spark programs for data validation to filter unwanted data and cleanse the data.
  • Loaded files into HDFS and wrote Hive queries to process the required data.
  • Managed and reviewed Hadoop log files.
  • Ingested data from an existing SQL Server database into HDFS using Sqoop.
  • Tuned and optimized Hadoop cluster configurations to achieve high performance.
  • Managed test data coming from multiple sources.
  • Used Pig to process small samples of transformed data for product purchase prediction, feeding an efficient product recommendation system.
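The co-purchase counting that underlies this kind of recommendation can be sketched in plain Java; the basket data and names below are purely illustrative, and the actual work used Pig over sampled data:

```java
import java.util.*;

// Sketch of a co-purchase based recommendation: for a given product,
// recommend the product most often bought in the same basket.
class CoPurchase {

    // Count how often each other product appears alongside `target`.
    static Map<String, Integer> coCounts(List<List<String>> baskets, String target) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> basket : baskets) {
            if (!basket.contains(target)) continue;
            for (String p : basket) {
                if (!p.equals(target)) counts.merge(p, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Most frequently co-purchased product, or empty when none exists.
    static Optional<String> recommend(List<List<String>> baskets, String target) {
        return coCounts(baskets, target).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

In Pig the equivalent pipeline would be a GROUP BY over basket pairs followed by a COUNT and an ORDER BY; the Java version just makes the counting explicit.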

Environment: Hadoop, MapReduce, Hive 0.10.1, Cloudera Hadoop distribution, Pig 0.11.1, Linux, Sqoop, Oozie 3.3.0, Java.

Confidential

Hadoop Developer

Responsibilities:

  • Set up and configured a 25-node cluster running the Hadoop ecosystem.
  • Troubleshot various configuration issues between different components in the ecosystem to ensure seamless performance.
  • Ingested data using Sqoop into Hadoop data lake from traditional RDBMS.
  • Experimented with various Pig commands and Pig Latin scripts on the data and analyzed the results from a business perspective.
  • Developed custom MapReduce programs in Java to transform loaded data and analyzed the results for better business insights.
  • Created Hive tables and implemented partitioning technique to improve query performance.
  • Moved large amounts of archived historical data from existing systems into the Hadoop data lake for future analysis.
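The map and reduce phases behind such custom programs can be illustrated in a single JVM with no Hadoop dependency; this is a toy word-count-style sketch, not the production jobs, which ran on the cluster:

```java
import java.util.*;

// Toy single-process illustration of the MapReduce pattern:
// a map phase emitting (word, 1) pairs and a reduce phase summing per key.
class WordCount {

    // Map phase: emit one (token, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
        }
        return pairs;
    }

    // Shuffle + reduce phase: group by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            result.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return result;
    }
}
```

In a real Hadoop job these two functions would live in `Mapper` and `Reducer` subclasses, with the framework handling the shuffle between them.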

Environment: Hadoop MapReduce, HDFS, Hive, Sqoop, Pig, Linux and MySQL

Confidential

Java/ J2EE Developer

Responsibilities:

  • Implemented Servlets, JSP, and Ajax to design the user interface
  • Used Servlets, JSP, JavaScript, HTML5, CSS, and RESTful services for input manipulation, validation, customization, and error messaging in the user interface
  • Implemented Object-relation mapping in the persistence layer using Hibernate framework
  • Built presentation components in JSP pages using ICEfaces tag libraries
  • Used ICEfaces libraries in all presentation pages, such as Search/Inquiry and data-collection pages
  • Wrote all business logic across the modules in core Java
  • Wrote SOAP web services for sending data to and retrieving data from the external interface
  • Used XSL/XSLT for transforming and displaying reports
  • Developed schemas for XML
  • Involved in writing the ANT scripts to build and deploy the application
  • Developed a web-based reporting for monitoring system with HTML and Tiles using Struts framework
  • Used design patterns such as Business Delegate, Service Locator, Model-View-Controller (MVC), Session, and DAO
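A minimal sketch of the DAO pattern mentioned above; the interface and class names are illustrative, and the real persistence layer was backed by Hibernate rather than the in-memory stand-in shown here:

```java
import java.util.*;

// DAO pattern sketch: the service layer depends only on the interface,
// so the persistence mechanism (Hibernate, JDBC, in-memory) can be swapped.
interface UserDao {
    Optional<String> findName(int id);
    void save(int id, String name);
}

// In-memory stand-in for a Hibernate-backed implementation.
class InMemoryUserDao implements UserDao {
    private final Map<Integer, String> store = new HashMap<>();

    public Optional<String> findName(int id) {
        return Optional.ofNullable(store.get(id));
    }

    public void save(int id, String name) {
        store.put(id, name);
    }
}
```

The same interface lets unit tests run against the in-memory version while production wires in the Hibernate implementation, which is the main payoff of the pattern.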

Environment: Java (multithreading, collections), J2EE, EJB, UML, SQL, PHP, Sybase, Eclipse, JavaScript, WebSphere, JBoss, HTML5, DHTML, CSS, XML, ANT, JUnit, JSP, Servlets, Hibernate.
