
Hadoop Developer Resume

San Rafael, CA

SUMMARY

  • 7+ years of experience in the IT industry, including 2+ years in Big Data technologies and 4+ years in Java, database management systems, and data warehouse systems.
  • Hands-on experience working with Hadoop ecosystem components including Hive, Pig, HBase, Cassandra, Oozie, Kafka, and Flume.
  • Excellent understanding of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and the MapReduce programming paradigm.
  • Highly capable of processing large sets of structured, semi-structured, and unstructured data and supporting Big Data applications.
  • Strong experience writing custom UDFs for Hive and Pig, with a strong understanding of Pig and Hive analytical functions.
  • Experience importing and exporting data with Sqoop, from relational databases to HDFS and from HDFS to relational databases.
  • Extensively worked on Oozie for workflow management, with separate workflows for each layer, such as the Staging, Transformation, and Archive layers.
  • Experienced in installing and configuring Hadoop clusters on the major Hadoop distributions.
  • Extensively worked on NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Worked on MapReduce programs for parallel processing of data and for custom input formats.
  • Extensively worked on Pig for ETL Transformations and optimized Hive Queries.
  • Worked on Flume to maintain log data from external source systems to HDFS.
  • Developed workflows in Oozie to automate loading data into HDFS and preprocessing it with Pig, and used Zookeeper to coordinate the clusters.
  • Deployed, configured and managed Linux servers in VM.
  • Strong UNIX Shell Scripting skills.
  • Extensive experience working with databases such as SQL Server and MySQL, and writing stored procedures, functions, joins, and triggers for different data models.
  • Strong coding experience in Core Java and strong hands-on experience with J2EE frameworks.
  • Experience working with Java, J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB, and Servlets.
  • Developed web page interfaces using JSP, Java Swing, and HTML.
  • Excellent understanding of JavaBeans and the Hibernate framework for implementing model logic that interacts with relational databases.
  • Always looking for new challenges that broaden my experience and knowledge and further develop the skills I have already acquired.
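The MapReduce programming paradigm referenced above can be sketched with a minimal word count in Python, in the style of a Hadoop Streaming job (all names and the sample input are illustrative, not from the resume's projects):

```python
from itertools import groupby

def mapper(lines):
    """Map phase: emit a (word, 1) pair for every token on every input line."""
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    """Reduce phase: pairs arrive sorted by key; sum the counts per word."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

# In a real Streaming job, the mapper and reducer run as separate processes
# reading stdin; here they are chained in-process for illustration.
sample = ["Hive and Pig", "Pig and Oozie"]
shuffled = sorted(mapper(sample))  # stands in for the shuffle/sort phase
for word, total in reducer(shuffled):
    print(f"{word}\t{total}")  # and 2, hive 1, oozie 1, pig 2
```

The same mapper/reducer pair could be submitted unchanged via Hadoop Streaming, which pipes HDFS splits through the scripts' stdin/stdout.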

TECHNICAL SKILLS

Big Data Ecosystems: HDFS, Hive, Pig, MapReduce, Sqoop, HBase, Cassandra, Zookeeper, Flume, Kafka, and Oozie.

Languages: C, C++, Java, J2EE, Spring, Hibernate, Java Servlets, JDBC, JUnit, Python, and Perl

Web Technologies: HTML, DHTML, XHTML, XML, CSS, Ajax, and JavaScript

Databases: MySQL, Oracle 10g/11g, NoSQL (MongoDB), Microsoft SQL Server, DB2, Sybase, PL/SQL, and SQL*Plus

Operating Systems: Linux, Unix, Windows, and Mac OS X

Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, and IBM WebSphere 6.0/5.1.1

IDEs: Eclipse and NetBeans

Design & Modelling Tools: UML (use case, sequence, and class diagrams)

Methodologies: Waterfall, Scrum, and Agile

Distributions: Cloudera, Hortonworks, and Apache Hadoop

PROFESSIONAL EXPERIENCE

Confidential, San Rafael, CA

Hadoop Developer

Responsibilities:

  • Configured, implemented, deployed, and maintained the Hadoop/Big Data ecosystem.
  • Performed extraction, transformation, and loading (ETL) of data from multiple sources such as flat files, XML files, and databases.
  • Used an ETL tool for processing based on business needs, and extensively used the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Loaded and transferred large, complex sets of structured, semi-structured, and unstructured data using Sqoop.
  • Implemented MapReduce jobs using Hive, Pig, and Sqoop on the YARN architecture.
  • Provided NoSQL solutions in MongoDB and Cassandra for extracting and storing large volumes of data.
  • Integrated business intelligence reporting solutions such as Tableau with various databases.
  • Used Apache Spark for large-scale data processing, handling real-time analytics and streaming data.
  • Wrote and performance-tuned complex SQL queries.
  • Worked closely with business stakeholders, UX designers, solution architects, and other team members to achieve results together.
  • Participated in business requirement analysis, solution design, detailed design, solution development, testing, and deployment of various products.
  • Delivered robust, flexible, and scalable solutions with a dedication to high quality, meeting or exceeding customer requirements and expectations.
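The kind of SQL performance tuning mentioned in the bullets above usually comes down to making the optimizer use an index instead of a full table scan. A hedged sketch using the stdlib's sqlite3 as a stand-in for the production RDBMS (table, column, and index names are illustrative):

```python
import sqlite3

# sqlite3 stands in for the production database; the tuning idea --
# compare the query plan before and after adding an index -- carries over.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 100}", i * 1.5) for i in range(1000)],
)

query = "SELECT SUM(total) FROM orders WHERE customer = ?"

def plan(sql):
    """Return SQLite's query-plan detail strings for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("cust7",)).fetchall()
    return " ".join(r[3] for r in rows)  # column 3 holds the detail text

before = plan(query)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = plan(query)   # with the index: a keyed search

print(before)  # e.g. "SCAN orders"
print(after)   # e.g. "SEARCH orders USING INDEX idx_orders_customer (customer=?)"
```

On large fact tables the difference between the two plans is what turns a minutes-long report query into a sub-second one.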

Environment: Java, Hadoop, Hive, Pig, Oozie, Sqoop, YARN, MongoDB, Cassandra, Tableau, Spark, SQL, XML, Eclipse, Maven, JUnit, Linux, Windows, Subversion

Confidential, Kansas City, MO

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Defined job flows.
  • Managed and reviewed Hadoop log files.
  • Extracted data from relational databases through Sqoop, placed it in HDFS, and processed it.
  • Ran Hadoop Streaming jobs to process terabytes of XML-format data.
  • Gained solid experience with NoSQL databases.
  • Supported MapReduce programs running on the cluster.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Replaced Hive's default Derby metastore with MySQL.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Developed Pig UDFs to preprocess the data for analysis.
  • Developed Hive queries for the analysts.
  • Loaded data from Linux/Unix file systems into HDFS.
  • Loaded and transformed large data sets of structured, semi-structured, and unstructured data.
  • Worked with various Hadoop file formats, including TextFile, SequenceFile, and RCFile.
  • Supported setup of the QA environment and updated configurations for implementing Pig scripts.
  • Developed a custom file system plug-in for Hadoop so it can access files on the data platform; this plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
  • Designed and implemented a MapReduce-based, large-scale parallel relation-learning system.
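Streaming over XML-format data, as described above, typically means a mapper that parses one small XML record per line and emits key/value pairs for a downstream reducer. A minimal hypothetical sketch (the record shape and field names are invented for illustration):

```python
import xml.etree.ElementTree as ET

def map_events(lines):
    """Hypothetical Streaming mapper: each input line carries one small XML
    record; emit (event_type, 1) so a summing reducer can count per type.
    Malformed records are skipped rather than failing the whole job."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            elem = ET.fromstring(line)
        except ET.ParseError:
            continue  # tolerate bad records in terabyte-scale input
        yield elem.get("type", "unknown"), 1

sample = ['<event type="click"/>', '<event type="view"/>', 'not xml at all']
for key, value in map_events(sample):
    print(f"{key}\t{value}")  # click 1, view 1 (the bad record is dropped)
```

Skipping unparseable records in the mapper is the usual defensive choice at this scale, since a single corrupt line should not kill a multi-hour job.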

Environment: Hadoop, Hive, HBase, MapReduce, HDFS, Pig, Cassandra, Java (JDK 1.6), Cloudera distribution of Hadoop, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Linux, Unix Shell Scripting

Confidential, Columbus, OH

Hadoop Developer

Responsibilities:

  • Responsible for building a system that ingests terabytes of data per day into Hadoop from a variety of data sources, providing high storage efficiency and an optimized layout for analytics.
  • Responsible for converting a wide online video and ad-impression tracking system, the source of truth for billing, from a legacy stream-based architecture to a MapReduce architecture, reducing support effort.
  • Used Cloudera Crunch to develop data pipelines that ingest data from multiple sources and process it.
  • Used Sqoop to move data from relational databases to HDFS, and Flume to move data from web logs onto HDFS.
  • Used Pig for transformations, cleaning, and deduplication of data from raw data sources.
  • Used MRUnit for unit testing.
  • Managed and reviewed Hadoop log files.
  • Created an ad hoc analytical job pipeline using Hive and Hadoop Streaming to compute various metrics and stored them in HBase for downstream applications.
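The deduplication step mentioned above relies on a property of MapReduce that is easy to demonstrate: after the shuffle/sort phase, identical records sit next to each other, so a reducer only has to emit one record per group. A small sketch of that idea (the records are illustrative):

```python
from itertools import groupby

def dedupe(sorted_records):
    """After MapReduce's shuffle/sort, identical records are adjacent, so
    emitting one record per group mirrors Pig's DISTINCT operator."""
    for record, _ in groupby(sorted_records):
        yield record

raw = ["a,1", "b,2", "a,1", "c,3", "b,2"]
print(list(dedupe(sorted(raw))))  # ['a,1', 'b,2', 'c,3']
```

In Pig the same effect is a one-liner (`DISTINCT raw;`); the sketch just shows why it is cheap once the framework has already sorted by key.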

Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Zookeeper, Oozie, Python, Crunch, HBase, MRUnit

Confidential

Java Developer

Responsibilities:

  • Involved in designing and implementing the User Interface for the General Information pages and Administrator functionality.
  • Designed front end using JSP and business logic in Servlets.
  • Used the Struts framework for the application, based on the MVC-II architecture, and implemented the Validator framework.
  • Mapped servlets in the XML deployment descriptor.
  • Used HTML, JSP, JSP Tag Libraries, and Struts Tiles to develop presentation tier.
  • Deployed the application on JBoss Application Server and configured database connection pooling.
  • Involved in writing JavaScript functions for front-end validations.
  • Developed stored procedures and triggers for business rules.
  • Performed unit tests and integration tests of the application.
  • Used CVS as a documentation repository and version-control tool.
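Pushing business rules into the database with triggers, as in the bullet above, can be sketched with the stdlib's sqlite3 standing in for the project's MySQL backend (table names and the audit rule are invented for illustration; sqlite3 has no stored procedures, so only the trigger side is shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL NOT NULL);
CREATE TABLE audit_log (account_id INTEGER, old_balance REAL, new_balance REAL);

-- Business rule: every balance change is recorded in the audit log,
-- enforced in the database so no application path can skip it.
CREATE TRIGGER log_balance_change AFTER UPDATE OF balance ON accounts
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;
""")
conn.execute("INSERT INTO accounts (balance) VALUES (100.0)")
conn.execute("UPDATE accounts SET balance = 250.0 WHERE id = 1")
print(conn.execute("SELECT * FROM audit_log").fetchall())  # [(1, 100.0, 250.0)]
```

The point of the pattern is that the rule travels with the schema rather than being re-implemented in every servlet that touches the table.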

Environment: Java, J2EE, JDBC, Servlets, JSP, Struts, HTML, CSS, JavaScript, UML, JBoss Application Server 4.2, MySQL

Confidential

Java Developer

Responsibilities:

  • Developed the complete business tier with session beans.
  • Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
  • Used Web services (SOAP) for transmission of large blocks of XML data over HTTP.
  • Used XSL/XSLT to transform a common XML format into an internal XML format.
  • Apache Ant was used for the entire build process.
  • Implemented database connectivity using JDBC with an Oracle 9i database as the backend.
  • Designed and developed Application based on the Struts Framework using MVC design pattern.
  • Used CVS for version controlling and JUnit for unit testing.
  • Deployed the application on JBoss Application server.

Environment: EJB 2.0, Struts 1.1, JSP 2.0, Servlets, XML, XSLT, SOAP, JDBC, JavaScript, CVS, Log4J, JUnit, JBoss 2.4.4, Eclipse 2.1.3, Oracle 9i.
