Hadoop Developer Resume
San Rafael, CA
SUMMARY
- 7+ years of experience in the IT industry, including 2+ years in Big Data technologies and 4+ years in Java, database management systems, and data warehouse systems.
- Hands-on experience with Hadoop ecosystem components including Hive, Pig, HBase, Cassandra, Oozie, Kafka, and Flume.
- Excellent understanding of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and the MapReduce programming paradigm.
- Highly capable of processing large structured, semi-structured, and unstructured datasets and supporting Big Data applications.
- Strong experience writing custom UDFs for Hive and Pig, with a solid understanding of Hive and Pig analytical functions (see the sketch after this summary).
- Experience importing and exporting data between relational databases and HDFS using Sqoop.
- Extensively worked on Oozie for workflow management, with separate workflows for the staging, transformation, and archive layers.
- Experienced in installing and configuring Hadoop clusters on the major Hadoop distributions.
- Extensively worked on NoSQL databases such as HBase, Cassandra, and MongoDB.
- Wrote MapReduce programs for parallel data processing, including custom input formats.
- Extensively used Pig for ETL transformations and optimized Hive queries.
- Used Flume to move log data from external source systems into HDFS.
- Developed Oozie workflows to automate loading data into HDFS and preprocessing it with Pig, and used ZooKeeper to coordinate the cluster.
- Deployed, configured, and managed Linux servers on virtual machines.
- Strong UNIX Shell Scripting skills.
- Extensive experience with databases such as SQL Server and MySQL, writing stored procedures, functions, joins, and triggers for different data models.
- Strong coding experience in Core Java and strong hands-on experience with Java and J2EE frameworks.
- Experience working with Java, J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB, and Servlets.
- Built web page interfaces using JSP, Java Swing, and HTML.
- Excellent understanding of JavaBeans and the Hibernate framework for implementing model logic that interacts with relational databases.
- Always looking for new challenges that broaden my experience and knowledge and further develop the skills I have already acquired.
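For illustration, a minimal sketch of the kind of custom Hive UDF described above; the package, class name, and normalization logic are hypothetical rather than taken from any specific project:

package com.example.hive.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: trims whitespace and lower-cases a string
// column. Hive calls evaluate() once per row.
public final class NormalizeString extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // preserve SQL NULL semantics
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}

A UDF like this is packaged in a jar, registered in Hive with ADD JAR, and exposed with CREATE TEMPORARY FUNCTION normalize_string AS 'com.example.hive.udf.NormalizeString'.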
TECHNICAL SKILLS
Big Data Ecosystems: HDFS, Hive, Pig, MapReduce, Sqoop, HBase, Cassandra, Zookeeper, Flume, Kafka, and Oozie.
Languages: C, C++, Java, J2EE, Spring, Hibernate, Java Servlets, JDBC, JUnit, Python, and Perl
Web Technologies: HTML, DHTML, XHTML, XML, CSS, Ajax, and JavaScript
Databases: MySQL, Oracle 10g/11g, NoSQL (MongoDB), Microsoft SQL Server, DB2, Sybase, PL/SQL, and SQL*Plus
Operating Systems: Linux, Unix, Windows, and Mac OS X
Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, and IBM WebSphere 6.00/5.11
IDEs: Eclipse and NetBeans
Design & Modelling Tools: UML use case, sequence, and class diagrams
Methodologies: Waterfall, Scrum, and Agile
Distributions: Cloudera, Hortonworks, and Apache Hadoop
PROFESSIONAL EXPERIENCE
Confidential, San Rafael, CA
Hadoop Developer
Responsibilities:
- Configured, implemented, deployed, and maintained the Hadoop/Big Data ecosystem.
- Performed extraction, transformation, and loading (ETL) of data from multiple sources such as flat files, XML files, and databases.
- Used an ETL tool for processing based on business needs, and extensively used the Oozie workflow engine to run multiple Hive and Pig jobs.
- Loaded and transferred large, complex sets of structured, semi-structured, and unstructured data using Sqoop.
- Implemented MapReduce jobs using Hive, Pig, and Sqoop on the YARN architecture.
- Provided NoSQL solutions in MongoDB and Cassandra for extracting and storing large volumes of data.
- Integrated Business Intelligence reporting solutions such as Tableau with various databases.
- Used Apache Spark for large-scale data processing, real-time analytics, and streaming data (see the sketch after this list).
- Wrote complex queries in SQL for performance tuning
- Worked closely with Business Stakeholders, UX Designers, Solution Architects and other team members to achieve results together
- Participated in business requirement analysis, solution design, detailed design, solution development, testing and deployment of various products
- Delivered robust, flexible and scalable solutions with a dedication to high quality that meet or exceed customer requirements and expectations.
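As a concrete illustration of the Spark work noted above, a minimal sketch of a batch aggregation job against the Spark Java API; the application name, HDFS paths, and record layout are assumptions:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

// Minimal Spark (Java API) sketch: counts events per key in a
// tab-delimited text file on HDFS.
public final class EventCounts {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("EventCounts");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile("hdfs:///data/events/part-*");
        JavaPairRDD<String, Integer> counts = lines
            .mapToPair(line -> new Tuple2<>(line.split("\t")[0], 1))
            .reduceByKey((a, b) -> a + b);

        counts.saveAsTextFile("hdfs:///data/event-counts");
        sc.stop();
    }
}

A job like this would be packaged with Maven (as listed in the environment below) and launched with spark-submit.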
Environment: Java, Hadoop, Hive, Pig, Oozie, Sqoop, YARN, MongoDB, Cassandra, Tableau, Spark, SQL, XML, Eclipse, Maven, JUnit, Linux, Windows, Subversion
Confidential, Kansas City, MO
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Imported and exported data into HDFS and Hive using Sqoop.
- Defined job flows.
- Managed and reviewed Hadoop log files.
- Extracted data from relational databases through Sqoop, placed it in HDFS, and processed it.
- Ran Hadoop Streaming jobs to process terabytes of XML-format data.
- Gained solid experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Loaded data from Linux/Unix file systems into HDFS.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Replaced Hive's default Derby metastore with MySQL.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Pig UDFs to preprocess the data for analysis.
- Developed Hive queries for the analysts.
- Loaded and transformed large structured, semi-structured, and unstructured datasets.
- Worked with various Hadoop file formats, including TextFile, SequenceFile, and RCFile.
- Helped set up the QA environment and updated configurations for implementing Pig scripts.
- Developed a custom file system plugin for Hadoop so it can access files on the data platform; the plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based, large-scale parallel relation-learning system.
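A minimal sketch of the kind of Java MapReduce data-cleaning job described in the first bullet above; it is map-only, and the class names, filter rules, and paths are illustrative:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal map-only data-cleaning job: drops blank and comment lines
// and trims whitespace before records reach downstream tools.
public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            if (!line.isEmpty() && !line.startsWith("#")) {
                context.write(NullWritable.get(), new Text(line));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: no aggregation needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}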
Environment: Hadoop (Cloudera distribution), Hive, HBase, MapReduce, HDFS, Pig, Cassandra, Java (JDK 1.6), IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Linux, Unix Shell Scripting
Confidential, Columbus, OH
Hadoop Developer
Responsibilities:
- Responsible for building a system that ingests terabytes of data per day into Hadoop from a variety of data sources, providing high storage efficiency and an optimized layout for analytics.
- Responsible for converting a wide-reaching online video and ad impression tracking system, the source of truth for billing, from a legacy stream-based architecture to a MapReduce architecture, reducing support effort.
- Used Cloudera Crunch to develop data pipelines that ingest data from multiple sources and process it.
- Used Sqoop to move data from relational databases to HDFS, and Flume to move web log data onto HDFS.
- Used Pig to apply transformations, cleaning, and deduplication to data from raw data sources.
- Used MRUnit for unit testing (see the sketch after this list).
- Managed and reviewed Hadoop log files.
- Created an ad hoc analytical job pipeline using Hive and Hadoop Streaming to compute various metrics, and dumped the results into HBase for downstream applications.
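A minimal sketch of the MRUnit-style test mentioned above; TokenCountMapper is a hypothetical stand-in for the real mappers under test:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Minimal MRUnit sketch: exercises a mapper in isolation, with no
// running cluster required.
public class TokenCountMapperTest {

    public static class TokenCountMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                context.write(new Text(token), ONE);
            }
        }
    }

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new TokenCountMapper());
    }

    @Test
    public void emitsOneCountPerToken() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("ad click"))
                 .withOutput(new Text("ad"), new IntWritable(1))
                 .withOutput(new Text("click"), new IntWritable(1))
                 .runTest();
    }
}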
Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, Python, Crunch, HBase, MRUnit
Confidential
Java Developer
Responsibilities:
- Involved in designing and implementing the User Interface for the General Information pages and Administrator functionality.
- Designed the front end using JSP and implemented business logic in Servlets.
- Used the Struts framework for the application, based on the MVC-II architecture, and implemented the Validator framework.
- Mapped the servlets in the XML deployment descriptor.
- Used HTML, JSP, JSP Tag Libraries, and Struts Tiles to develop presentation tier.
- Deployed the application on JBoss Application Server and configured database connection pooling.
- Involved in writing JavaScript functions for front-end validations.
- Developed stored procedures and triggers for business rules (see the sketch after this list).
- Performed unit tests and integration tests of the application.
- Used CVS as a documentation repository and version control tool.
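For illustration, a minimal JDBC sketch of invoking a stored procedure like those described above; the procedure name, parameters, and connection details are hypothetical:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Minimal JDBC sketch: calling a stored procedure that applies a
// business rule.
public class ApplyBusinessRule {
    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/appdb", "appuser", "secret");
        try {
            CallableStatement cs =
                    conn.prepareCall("{call apply_discount(?, ?)}");
            cs.setInt(1, 42);      // customer id (illustrative)
            cs.setDouble(2, 0.10); // discount rate (illustrative)
            cs.execute();
            cs.close();
        } finally {
            conn.close();
        }
    }
}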
Environment: Java, J2EE, JDBC, Servlets, JSP, Struts, HTML, CSS, JavaScript, UML, JBoss Application Server 4.2, MySQL
Confidential
Java Developer
Responsibilities:
- Developed the complete business tier with session beans.
- Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
- Used SOAP web services for transmitting large blocks of XML data over HTTP.
- Used XSL/XSLT to transform a common XML format into the internal XML format (see the sketch after this list).
- Used Apache Ant for the entire build process.
- Implemented database connectivity using JDBC with an Oracle 9i database as the backend.
- Designed and developed the application based on the Struts framework using the MVC design pattern.
- Used CVS for version controlling and JUnit for unit testing.
- Deployed the application on JBoss Application server.
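A minimal sketch of the XSL/XSLT transformation step mentioned above, using the standard JAXP API; the stylesheet and file names are illustrative:

import java.io.File;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Minimal JAXP sketch: transforming a common XML format into the
// internal format with an XSLT stylesheet.
public class XmlFormatConverter {
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(
                new StreamSource(new File("common-to-internal.xsl")));
        transformer.transform(new StreamSource(new File("order-common.xml")),
                              new StreamResult(new File("order-internal.xml")));
    }
}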
Environment: EJB 2.0, Struts 1.1, JSP 2.0, Servlets, XML, XSLT, SOAP, JDBC, JavaScript, CVS, Log4J, JUnit, JBoss 2.4.4, Eclipse 2.1.3, Oracle 9i