Hadoop Developer Resume

Scottsdale, AZ

SUMMARY

  • 6 years of professional IT experience, including experience with Big Data ecosystem technologies.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, ZooKeeper, Flume, Storm, and Kafka.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
  • Experience in algorithm analysis using the R language and in implementing regression models on big data using Spark MLlib by converting scripts to Scala.
  • Developed an Oozie coordinator for scheduling and orchestrating the ETL process.
  • Implemented ETL process with Pentaho and Talend.
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Experience with NoSQL column-oriented databases such as HBase and their integration with Hive and Pig.
  • Implemented Pig scripts, integrated them into Oozie workflows and performed integrated testing.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in importing streaming logs and aggregating the data to HDFS through Flume.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Excellent understanding of Virtualization, with experience of setting up a POC multi-node virtual cluster by leveraging underlying Bridge Networking and NAT technologies.
  • Strong Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC.
  • Experience in Extraction, Transformation and Loading (ETL) of data from multiple sources.
  • Extensive use of Teradata, Oracle, SQL Server, Teradata SQL Assistant, and BTEQ.
  • Excellent experience in creating join indexes and partitioned indexes and in collecting statistics for better query performance.
  • Proficient in using Teradata utilities (BTEQ, FastLoad, MultiLoad, FastExport, and TPump) for development.
  • Extensive experience in developing applications using Java and multi-threading.
  • Experience in designing and building web applications using Core Java and Java Enterprise Technologies- JSP, Servlets and JDBC.
  • Good knowledge of Apache Spark and Scala.
  • Experience in converting PL/SQL packages to Scala as part of a client requirement.
  • Experienced in building projects using Ant and Maven.
  • Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.

TECHNICAL SKILLS

Big Data Technologies: Hadoop, Spark, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, ZooKeeper, Crunch, Oozie, Hue, Spark MLlib, Kafka, Spark SQL, Spark Streaming, Cloudera CDH3, Hortonworks HDP2, HCatalog, Tez, and HBase

Scripting/ Web Languages: JavaScript, Perl, Python, HTML, XML, SQL, Shell Scripting

Programming Languages: C, C++, R, Java, and Scala

Java/J2EE Technologies: Java, Java Beans, J2EE (JSP, Servlets, EJB), JDBC, SOLR, Pentaho.

Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x/4.x

Databases/ RDBMS: MySQL, PL/SQL, PostgreSQL, MS SQL Server 2005/2008, Oracle 9i/10g/11g, HBase, Cassandra.

Statistical Programming: Programming in R, SAS, H2O

Operating Systems: Unix, Windows XP/7/NT/8/2003, MS-DOS, Mac, Linux (SUSE, RHEL, Ubuntu)

Software Life Cycles: SDLC, Waterfall and Agile models

Office Tools: MS Office (Excel, Word, and PowerPoint)

Utilities/ Tools: Eclipse, Tomcat, NetBeans, TOAD, JUnit, SQL, SVN, Log4j, Tiles, Developer, SQL*Plus, Advanced REST client, Ant, Maven, Visio, Mule ESB, MRUnit, RStudio, Talend

Cloud Platforms: AWS EC2, VPC, Redshift, EMR, S3

PROFESSIONAL EXPERIENCE

Confidential, Scottsdale, AZ

Hadoop Developer

Environment: Cloudera, Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Linux Shell Scripting, Oracle MySQL, Java.

Responsibilities:

  • Installed and configured Hadoop and developed multiple MapReduce jobs in Java for data cleaning and processing.
  • Installed and configured Pig for ETL jobs.
  • Troubleshot the cluster by reviewing Hadoop log files.
  • Imported data from Teradata using Sqoop with the Teradata connector.
  • Used Oozie to orchestrate the workflow.
  • Created Hive tables and worked with them for data analysis to meet the business requirements.
  • Good experience with NoSQL databases.
  • Designed and implemented a MapReduce-based, large-scale, parallel relation-learning system.
  • Installed and benchmarked Hadoop / HBase clusters for internal use.
  • Wrote an HBase client program in Java and web services.
  • Modeled, serialized, and manipulated data in multiple forms (XML).
  • Supported post-production enhancements.
  • Experience with data modeling concepts: star schema, dimensional modeling, and relational (ER) design.

Confidential, Cary, NC

Hadoop Developer

Environment: Cloudera, Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Flume, HBase, ZooKeeper, Oracle, NoSQL, MySQL and Unix/Linux.

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop for a POC.
  • Created HDFS (Hadoop Distributed File System) and MapReduce jobs in Java.
  • Implemented NameNode backup using NFS for high availability.
  • Used Pig as an ETL tool to perform transformations, event joins, and some aggregations before storing the data in HDFS.
  • Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Used the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Used Sqoop to import and export data between HDFS and RDBMS.
  • Created Hive tables and was involved in loading data and writing Hive UDFs.
  • Exported the analyzed data to the MySQL relational database using Sqoop for visualization and report generation.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.

Confidential, NJ

Hadoop Developer

Environment: Cloudera, Eclipse, Hadoop, HDFS, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Core Java.

Responsibilities:

  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH distribution.
  • Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
  • Involved in writing MapReduce jobs.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop, HDFS get, or copyToLocal.
  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Experienced in managing and reviewing Hadoop log files.
  • Used Pig to perform transformations, event joins, bot-traffic filtering, and aggregations before storing the data in HDFS.
  • Wrote Hive queries on the data to meet the business requirements.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Worked on tuning the performance of Pig queries.
  • Involved in developing Pig Scripts for data change capture and delta record processing between newly arrived data and already existing data in HDFS.

Confidential

Java Developer

Environment: JDK, J2EE, Eclipse IDE, ANT, JDBC, Servlets, JSP, EJB, Struts, XML and Oracle.

Responsibilities:

  • Developed Stateless Session Beans in the model layer to implement business logic for the application.
  • Developed Action classes for workflow control and Data Access Objects for getting database connections from the connection pool.
  • Extensively used the Jakarta Struts Framework.
  • Implemented user session management using HTTP sessions.
  • Used JDBC to access the Oracle database and used stored procedures.
  • Developed JSP pages and made them accessible to the client using WebLogic Application Server.
  • Extensively used complex SQL statements, including joins and nested queries.
  • Developed stored procedures.
  • Extensively used XPath to navigate elements and attributes and to find information in XML documents.
  • Coded JSP pages and used JavaScript for client-side validations and other client-side functionality.
  • Worked extensively with AJAX.
  • Used Ant scripts for building the application.
  • Developed Java Helper classes for updating Customer Accounts and Customer information.
  • Adopted Sun's coding and documentation standards.

Confidential

Java Developer

Environment: Java, Eclipse, Oracle, HTML, JSP, Tomcat

Responsibilities:

  • Involved in design, development, testing, and integration of the application.
  • Involved in development of user interface modules using HTML, CSS and JSP.
  • Involved in writing SQL queries.
  • Involved in coding, maintaining, and administering Servlets and JSP components deployed on Apache Tomcat application servers.
  • Accessed the database and its stored procedures using JDBC.
  • Worked on bug fixes and enhancements for change requests.
  • Coordinated tasks with clients, support groups, and the development team.
  • Worked with the QA team on test automation using QTP.
  • Participated in weekly design reviews and walkthroughs with project manager and development teams.
