Hadoop Developer Resume

Little Rock, AR

SUMMARY

  • 8+ years of IT experience, including 3+ years of extensive experience in Big Data technologies and web applications in multi-tiered environments using Java, Hadoop, Spark, Hive, HBase, Pig, Sqoop, Kafka, Cassandra, AWS, J2EE (JSP, Servlets), JDBC, HTML, CSS, and JavaScript.
  • Over 1 year of experience in Spark SQL and Spark Streaming.
  • Gained an in-depth understanding of Hadoop architecture and its various components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the Hadoop MapReduce paradigm.
  • Hands-on experience in developing and deploying enterprise applications using major components of the Hadoop ecosystem, including Hadoop 2.x, YARN, Hive, Pig, MapReduce, HBase, Flume, Sqoop, Spark, Storm, Kafka, Oozie, and ZooKeeper.
  • Good administrative skills and experience in installing, configuring, and maintaining Hadoop clusters and ecosystem components such as Hive, Pig, and Sqoop.
  • Good knowledge of installing Hadoop using Apache Ambari.
  • Experience in converting Hive/SQL queries into Spark transformations using Java.
  • Good experience writing Hadoop jobs in Hive (HiveQL) and Pig (Pig Latin), and custom MapReduce programs in Java.
  • Knowledge of cloud technologies such as AWS.
  • Wrote Hive queries and Pig scripts for data analysis to meet the requirements.
  • Experience in writing UDFs (User Defined Functions) to extend the functionality of Hive and Pig; a brief UDF sketch follows this summary.
  • Expert in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing Hive queries.
  • Good experience using Pig Latin operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH ... GENERATE, GROUP, COGROUP, ORDER, LIMIT, UNION, and SPLIT to extract data from data files and load it into HDFS.
  • Experience in importing and exporting data between HDFS and RDBMSs using Sqoop.
  • Wrote multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, CSV, and various compressed formats.
  • Extracted data from log files and pushed it into HDFS using Flume.
  • Familiar with Ubuntu, Linux, and CentOS operating systems, with good experience in shell scripting and shell commands, and a solid understanding of OOP and data structures.
  • Experience in managing and reviewing Hadoop Log files.
  • Good experience with SQL and database concepts such as ERDs, EERDs, and normalization techniques.
  • Worked on the Oozie workflow engine, which schedules jobs as DAGs (Directed Acyclic Graphs).
  • Able to work both independently and in collaborative team environments.
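
Below is a minimal sketch of the kind of Hive UDF referenced above, using the classic org.apache.hadoop.hive.ql.exec.UDF style; the email-masking rule, class name, and behavior are illustrative assumptions, not a specific project's UDF.

    // Hypothetical Hive UDF: masks the local part of an e-mail address.
    // Hive resolves evaluate() by reflection for this classic UDF style.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class MaskEmail extends UDF {
        public Text evaluate(Text email) {
            if (email == null) {
                return null;
            }
            String s = email.toString();
            int at = s.indexOf('@');
            if (at <= 1) {
                return email; // leave malformed addresses untouched
            }
            return new Text(s.charAt(0) + "****" + s.substring(at));
        }
    }

Once packaged into a jar, such a UDF is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then called like any built-in function.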

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Sqoop, Spark, Storm, Kafka, Cassandra, Oozie

Java / J2EE Technologies: Java 6.0, J2EE, Servlets, JSP, JDBC, XML, AJAX

Methodologies: Agile, SDLC, Waterfall

Enterprise Frameworks: Ajax, MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2

Programming Languages: Java, XML, Unix Shell scripting, SQL and PL/SQL

Web Technologies: HTML, DHTML, XML, XSLT, JavaScript, CSS

Web Servers: WebLogic, WebSphere, Apache Tomcat, JBoss

Databases: Oracle 8.x/9.x/10g/11g, DB2, MS SQL Server 2005/2008/2012, MySQL 5.0/4.1, MS Access, Teradata; editors: SQL Navigator, Toad

Operating Systems: Windows 9x/NT/XP, UNIX, Linux

PROFESSIONAL EXPERIENCE

Hadoop Developer

Confidential - Little Rock, AR

Responsibilities:

  • Designed and deployed Hadoop clusters and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, Sqoop, Flume, Spark, and Impala, with the Cloudera distribution.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Pig, Hive, and HBase.
  • Developed workflows and coordinator jobs in Oozie.
  • Developed Spark scripts using Scala shell commands as per the requirements.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into an RDBMS through Sqoop (see the Spark sketch after this list).
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using the Spark context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
  • Experience in deploying data from various sources into HDFS and building reports using Tableau.
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Performed real time analysis on the incoming data.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory data computation to generate the output response.
  • Loaded data into HBase using both bulk and non-bulk loads.
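
As an illustration of the Spark SQL work above, here is a minimal sketch using the Spark 2.x Java API; the table, column, and JDBC connection details are placeholder assumptions, and the export is shown directly over JDBC rather than through Sqoop.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveAggregateExport {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("HiveAggregateExport")
                    .enableHiveSupport() // read Hive tables through the metastore
                    .getOrCreate();

            // Aggregate a Hive table with Spark SQL ("sales" is a placeholder name).
            Dataset<Row> totals = spark.sql(
                    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region");

            // Write the result to an RDBMS over JDBC (URL and credentials are placeholders).
            totals.write()
                  .format("jdbc")
                  .option("url", "jdbc:mysql://dbhost:3306/reporting")
                  .option("dbtable", "region_totals")
                  .option("user", "etl_user")
                  .option("password", "secret")
                  .mode("overwrite")
                  .save();

            spark.stop();
        }
    }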

Environment: Hadoop, HDFS, MapReduce, Spark, Pig, Hive, Sqoop, Flume, Kafka, HBase, Oozie, Java, SQL scripting, Linux shell scripting, Eclipse and Cloudera.

Hadoop Developer

Confidential - Columbus, OH

Responsibilities:

  • Helped business processes by developing, installing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
  • Hands-on experience using Apache Ambari to install, configure, and monitor the prime components of the cluster on CentOS systems.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Supported code/design analysis, strategy development and project planning.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing; a representative cleaning mapper sketch follows this list.
  • Created reports for the BI team, using Sqoop to import data into HDFS and Hive.
  • Developed UDFs in Java, and Pig and Hive queries, as per task requirements.
  • Created Sqoop job with incremental load to populate Hive External tables.
  • Assisted with data capacity planning and node forecasting.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Acted as administrator for Pig, Hive, and HBase, installing updates, patches, and upgrades.
  • Handled structured and unstructured data and applied ETL processes.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Gained very good experience with both MapReduce 1 and MapReduce 2 (YARN).
  • Configured Hadoop system files to accommodate new data sources and updated the existing Hadoop cluster configuration.
  • Actively participated in code reviews and meetings and resolved technical issues.
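
The cleaning jobs mentioned above typically take the shape below; this is a minimal sketch in which the delimiter, field count, and validity rule are illustrative assumptions rather than the original job's specification.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Drops malformed CSV records and passes the rest through unchanged.
    public class CleaningMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5; // hypothetical schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            // Keep only records with the expected field count and a non-empty id.
            if (fields.length == EXPECTED_FIELDS && !fields[0].trim().isEmpty()) {
                context.write(NullWritable.get(), value);
            }
        }
    }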

Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hortonworks Hadoop distribution, Ambari, DataStax, Spring 2.5, Hibernate 3.0, JSF, Servlets, JDBC, JSP, JSTL, JPA, JavaScript, Eclipse 3.4, log4j, CVS, CSS, XML, XSLT, SMTP.

Hadoop Developer

Confidential - Denver, CO

Responsibilities:

  • Installed and configured fully distributed Hadoop cluster.
  • Performed Hadoop cluster environment administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
  • Extensively used Cloudera Manager to manage the Hadoop cluster.
  • Configured Hive metastore, which stores the metadata for Hive tables and partitions in a relational database.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Used Oozie to automate/schedule business workflows which invoke Sqoop, MapReduce and Pig jobs as per the requirements.
  • Developed Sqoop scripts to import and export the data from relational sources.
  • Worked with various HDFS file formats like Avro, Sequence File and various compression formats like Snappy, bzip2.
  • Developed efficient MapReduce programs for filtering out the unstructured data.
  • Developed Pig UDFs to pre-process the data for analysis; a brief EvalFunc sketch follows this list.
  • Developed Hive queries for data sampling and analysis for the analysts.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Developed custom Unix shell scripts to run pre- and post-validations of master and slave nodes, before and after configuring the NameNode and DataNodes respectively.
  • Involved in HDFS maintenance and administered it through the Hadoop Java API.
  • Supported MapReduce programs running on the cluster.
  • Identified several PL/SQL batch applications in General Ledger processing and conducted performance comparison to demonstrate the benefits of migrating to Hadoop.
  • Involved in implementing several POCs demonstrating the advantages businesses gain by migrating to Hadoop.
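
The Pig UDFs noted above follow the standard EvalFunc pattern; here is a minimal sketch in which the normalization behavior and class name are illustrative assumptions.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical pre-processing UDF: trims and lower-cases a text field.
    public class NormalizeText extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null; // pass nulls through so a downstream FILTER can drop them
            }
            return ((String) input.get(0)).trim().toLowerCase();
        }
    }

After REGISTER-ing the jar in a Pig script, such a UDF is invoked from a FOREACH ... GENERATE statement like any built-in function.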

Environment: RedHat Linux 5, MS SQL Server, MongoDB, Oracle, Hadoop CDH 3/4/5, Pig, Hive, ZooKeeper, HDFS, HBase, Sqoop, Python, Java, Oozie, Hue, Tez, UNIX Shell Scripting, PL/SQL, Maven, Ant

Java/J2EE Developer

Confidential - New York City, NY

Responsibilities:

  • Involved in analysis, design, coding and testing phases of software development.
  • Analyzed and prepared time estimates for the assigned tasks.
  • Implemented the MVC design pattern using the Struts MVC framework.
  • Used JSP, JavaScript, XHTML, and CSS to develop the view components.
  • Wrote validation classes using core Java and Struts validation rules.
  • Developed the business layer and data layer with Enterprise Java and Hibernate.
  • Created Hibernate mapping files for business objects with tables in the database.
  • Developed test cases using JUnit and followed test-first development; a brief test sketch follows this list.
  • Wrote stored procedures and triggers. Also involved in SQL query tuning and optimization.
  • Implemented the application using the Eclipse IDE.
  • Resolved issues and dependencies with components of different subsystems through effective communication with other groups.
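
As an example of the test-first style mentioned above, here is a minimal JUnit 4 sketch; the validator class and the ten-digit account rule are hypothetical illustrations, not project code.

    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    public class AccountValidatorTest {

        // Hypothetical core-Java validation helper of the kind used with Struts.
        static class AccountValidator {
            boolean isValidAccountNumber(String s) {
                return s != null && s.matches("\\d{10}");
            }
        }

        @Test
        public void acceptsTenDigitAccountNumbers() {
            assertTrue(new AccountValidator().isValidAccountNumber("0123456789"));
        }

        @Test
        public void rejectsMalformedAccountNumbers() {
            AccountValidator v = new AccountValidator();
            assertFalse(v.isValidAccountNumber(null));
            assertFalse(v.isValidAccountNumber("12345"));
            assertFalse(v.isValidAccountNumber("12345abcde"));
        }
    }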

Environment: J2EE, MVC Architecture, Struts 1.3, Java, JSP, Servlets, Hibernate, JSTL, JUnit, XML, HTML, JavaScript, DB2, Informix, CVS, UNIX, Windows XP, UML, Eclipse 3.0, WebLogic 8.0 Application Server

Jr. Java/J2EE Developer

Confidential

Responsibilities:

  • Involved in requirements analysis and prepared Technical Design Document.
  • Designed implementation logic for core functionalities.
  • Developed service layer logic for core modules using JSPs and Servlets and involved in integration with presentation layer.
  • Involved in implementation of presentation layer logic using HTML, CSS, JavaScript and XHTML.
  • Designed the MySQL database to store customers' general and billing details.
  • Used JDBC connections to store and retrieve data from the database; a brief DAO sketch follows this list.
  • Developed complex SQL queries and stored procedures to process and store the data.
  • Used Ant as the build tool to configure the application.
  • Developed test cases using JUnit.
  • Involved in unit testing and bug fixing using Bugzilla.
  • Prepared design documents for the developed code and maintained the defect tracker.
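
The JDBC access described above generally looks like the sketch below; the table, columns, and connection details are placeholder assumptions, and try-with-resources is used as a modern shorthand for the explicit close() calls of the era.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class CustomerDao {

        private static final String URL =
                "jdbc:mysql://localhost:3306/billing"; // placeholder connection URL

        public void saveCustomer(String name, String email) throws SQLException {
            String sql = "INSERT INTO customers (name, email) VALUES (?, ?)";
            try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                // Bind parameters instead of concatenating to avoid SQL injection.
                ps.setString(1, name);
                ps.setString(2, email);
                ps.executeUpdate();
            }
        }
    }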

Environment: Java, J2EE (JSPs & Servlets), JUnit, HTML, Subversion, CSS, JavaScript, Apache Tomcat, MySQL.
