Hadoop Developer Resume
Sacramento, CA
SUMMARY
- 6+ years of progressive experience in software development, including 3 years as a Hadoop Developer in Big Data/Hadoop technology using HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Flume, Zookeeper, and Apache Spark.
- Experience with Apache Hadoop ecosystem components such as HDFS, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, Oozie, Spark, and Kafka.
- Experience working with major Hadoop distributions such as Cloudera (CDH 3.x and above).
- Experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
- Experience in creating UDFs for Hive and Pig.
- Experience with various performance-tuning techniques such as partitioning and bucketing.
- Experience with bulk loading/extracting data from HBase tables and performing all CRUD operations on HBase.
- Solid understanding of file formats such as SequenceFile, RC, ORC, and Parquet, and compression codecs such as gzip, Snappy, and LZO.
- Experience in writing both time-driven and data-driven workflows using Oozie.
- Worked on Apache Flume for collecting and aggregating large volumes of log data and storing it in HDFS for further analysis.
- Good knowledge of NoSQL columnar databases such as HBase.
- Experience with Cloudera Impala and Apache Spark for real-time analytical processing.
- Hands-on experience with Spark SQL queries, DataFrames, and RDDs: importing data from data sources, performing transformations and read/write operations, and saving results to output directories in HDFS (see the sketch after this list).
- Implemented POCs using Kafka, Spark Streaming, and Spark SQL.
- Good knowledge and understanding of Java and Scala programming languages.
- Expertise in relational databases such as Oracle, MySQL, and SQL Server.
- Experience in Agile methodologies.
- Extensive experience with Java-compliant IDEs such as Eclipse.
- Excellent analytical, consultative, communication, and management skills.
- Self-motivated, easily adaptable to new environments and ability to work independently as well as in small groups.
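A minimal sketch of the Spark SQL workflow referenced above (read from a source, apply transformations, write results back to HDFS). The application name, input/output paths, and column names are hypothetical placeholders, not details from an actual project:

```scala
import org.apache.spark.sql.SparkSession

object SparkSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SparkSqlSketch").getOrCreate()
    import spark.implicits._

    // Hypothetical input: CSV files on HDFS loaded into a DataFrame.
    val df = spark.read.option("header", "true").csv("hdfs:///data/input/events")

    // Transformations: filter rows, then aggregate by a hypothetical column.
    val summary = df.filter($"status" === "active").groupBy($"region").count()

    // Persist the result back to an HDFS output directory as Parquet.
    summary.write.mode("overwrite").parquet("hdfs:///data/output/region_counts")

    spark.stop()
  }
}
```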
PROFESSIONAL EXPERIENCE
Confidential, Sacramento, CA
Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources, performed data control checks using Spark, and loaded the data into HDFS.
- Developed several RESTful web services supporting JSON.
- Built a real-time pipeline for streaming data using Kafka and Spark Streaming.
- Spark Streaming consumes this data from Kafka in near real time and performs the necessary transformations and aggregations on the fly to build the common learner data model, persisting the results in HBase (a sketch of this pattern follows this list).
- Developed Spark scripts in Scala (via the Spark shell) as per requirements.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Worked on Spark SQL for analyzing the data.
- Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
- Designed and developed Hive managed and external tables using complex types (structs, maps, and arrays) in various storage formats.
- Implemented partitioning and bucketing in Hive to improve query performance (see the DDL sketch after this list).
- Imported Hive tables into Impala for generating reports using Tableau.
- Developed workflows using Oozie to automate the tasks of loading data into HDFS.
- Used Oozie for automating end-to-end data pipelines and Oozie coordinators for scheduling the workflows.
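A minimal sketch of the Kafka-to-Spark-Streaming pattern described in this list, using the spark-streaming-kafka-0-10 integration. The broker address, topic name, consumer group, and the HBase write step are hypothetical placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object LearnerPipeline {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LearnerPipeline")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    // Hypothetical broker list and consumer group.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "learner-model",
      "auto.offset.reset" -> "latest"
    )

    // Subscribe to a hypothetical topic of learner events.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("learner-events"), kafkaParams)
    )

    // Transform each micro-batch and persist it; the HBase write would use
    // the HBase client API (e.g. Table.put) inside foreachPartition.
    stream.map(_.value).foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        records.foreach { json =>
          // parse, aggregate, and write to HBase here
        }
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```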
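And a sketch of the kind of Hive DDL behind the external-table and partitioning bullets, issued here through Spark SQL with Hive support. The table name, columns, and partition column are hypothetical; bucketing (CLUSTERED BY ... INTO n BUCKETS) would typically be added in the Hive DDL itself:

```scala
import org.apache.spark.sql.SparkSession

object HiveTableSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveTableSketch")
      .enableHiveSupport() // needed so Spark talks to the Hive metastore
      .getOrCreate()

    // Hypothetical external ORC table with a complex (struct) column,
    // partitioned by date so queries can prune irrelevant partitions.
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS learner_events (
        user_id BIGINT,
        event   STRING,
        profile STRUCT<name: STRING, level: INT>
      )
      PARTITIONED BY (event_date STRING)
      STORED AS ORC
      LOCATION 'hdfs:///warehouse/learner_events'
    """)

    spark.stop()
  }
}
```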
Environment: Hadoop, Sqoop, Hive, HDFS, YARN, Zookeeper, HBase, Apache Spark, Scala, Kafka, Oracle, Java, Spring IoC, RESTful web services.
Confidential, Newark, DE
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster using various Big Data ecosystem components, including Hive, Sqoop, Pig, Flume, and HBase.
- Imported data from Oracle into HDFS using Sqoop; performed full and incremental imports using Sqoop jobs.
- Responsible for managing data coming from various sources; involved in HDFS maintenance and the loading of structured and unstructured data.
- Used Pig to preprocess the data.
- Used Hive to form an abstraction layer on top of structured data residing in HDFS, and implemented partitions, dynamic partitions, and buckets on Hive tables.
- Handled various Hadoop file formats, including ORC and Parquet.
- Involved in integration of Hive with HBase.
- Wrote custom Hive UDFs in Java where the required functionality was too complex for built-in functions (a minimal UDF sketch follows this list).
- Used Flume to collect, aggregate, and store the log data from different web servers.
- Defined job flows in Oozie to automate data loading into the Hadoop Distributed File System, enabling speedy reviews and first-mover advantages.
- Involved in Business Requirement and Functional Specification document reviews; developed low-level design documents.
- Involved in creating POCs to ingest and process streaming data using HBase and HDFS.
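A minimal sketch of the custom Hive UDF pattern mentioned above. The resume names Java; for consistency with the other sketches this one is in Scala (any JVM language works against the same org.apache.hadoop.hive.ql.exec.UDF API), and the masking logic is a hypothetical example:

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF that masks all but the last four characters of a value.
// Registered in Hive with, e.g.:
//   ADD JAR mask-udf.jar;
//   CREATE TEMPORARY FUNCTION mask_tail AS 'MaskTail';
class MaskTail extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    val s = input.toString
    val keep = 4
    if (s.length <= keep) new Text(s)
    else new Text("*" * (s.length - keep) + s.takeRight(keep))
  }
}
```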
Environment: Cloudera, MapReduce, Hive, HBase, Flume, Sqoop, Zookeeper, CentOS, Ubuntu, Hadoop, Linux, Oracle, SQL Developer, Putty, WinSCP
Confidential, Tampa, FL
Applications Programmer
Responsibilities:
- Worked on portal development using the Liferay portal framework.
- Responsible for Liferay server setup and Liferay performance tuning.
- Member of a development team that worked together to fix bugs, write and test software enhancements, and update test documentation.
- Involved in defect fixing to make the application meet WCAG 2.0 accessibility guidelines.
- Involved in developing and fixing UI-related bugs using XHTML, CSS, and ICEfaces 1.8.
- Involved in production support, log-file monitoring, and server maintenance activities.
- Provided weekly statistics to upper management by writing and executing database queries against the production database.
- Involved in the complete deployment activity of delivering new versions of the application on Linux servers.
- Implemented fully automated build and deployment scripts for distributed environments using Jenkins, Ant scripts, and shell scripts.
- Involved in sprint planning, Daily Stand-ups, Sprint Retrospectives.
- Responsible for mentoring new joiners on the overall flow of the application and the technologies used, setting up their development environments, and preparing the required installation documentation.
- Used VersionOne, an agile software management tool, for tracking stories, tasks, and defects.
Environment: Linux servers, SQL Server 2005, Java, Eclipse, Tomcat, Liferay, OpenATNA, OpenXDS, Mirth, Ant, HL7, Linux, shell scripts, Putty, WinSCP, CSS, XHTML, ICEfaces 1.8, MySQL, MultiVue, EMPI, SVN.
Confidential, San Antonio, TX
Java Developer
Responsibilities:
- Designed and developed use-case diagrams, class diagrams, and object diagrams in UML for OOA/OOD using Rational Rose and Enterprise Architect.
- Developed ER and UML diagrams for the design and documented all process flows using Enterprise Architect.
- Analyzed, designed, and implemented software applications using Java, J2EE, XML, and XSLT.
- Designed and implemented an MVC architecture using the Struts framework; coding involved writing Action classes, custom tag libraries, and JSPs.
- Developed Action Forms and Controllers in the Struts 2.0/1.2 framework. Utilized various Struts features such as Tiles, tag libraries, and declarative exception handling via XML in the design.
- Responsible for developing a system to synchronize database repositories with external databases every month.
- Designed, developed and maintained the data layer using Hibernate.
- Designed and developed web services (SOAP, WSDL) and compiled XML Schemas to generate JavaBean classes. Built a new system, the Candidate Address System (CAS), which allows a client to enter a postal code and returns the list of all street names in that postal code.
- Involved in writing stored procedures in Oracle PL/SQL for the back end, used to apply business-logic updates on a set of scheduled timers.
- Used JUnit for unit testing the application.
- Used Apache Ant to compile Java classes and package them into JAR archives.
- Managed and fixed bugs and client issues in the application.
Environment: Java, J2EE, JSP, Servlets, Struts 2.0/1.2, Hibernate, CSS, DHTML, SOA, JavaScript, JSTL, HTML 5, XML, XPath, Web Services (SOAP, WSDL), JUnit, Eclipse, JMS, PL/SQL, Oracle, Apache Ant, Rational Rose, ClearCase.