Senior Hadoop Developer Resume
Long Beach, CA
PROFESSIONAL SUMMARY:
- Senior Hadoop Developer with 7 years of experience in software design, development and implementation of Big Data, mobile, cloud, web and client/server applications.
- Well versed in installing, configuring, supporting and managing Big Data platforms and the underlying infrastructure of Hadoop clusters.
- Excellent knowledge of Hadoop architecture and its major components, including Hadoop MapReduce, the HDFS framework, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume and Avro.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop and Flume.
- Experience with the Oozie workflow engine to automate and parallelize Hadoop MapReduce and Pig jobs.
- Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
- Experience in writing custom UDFs, including UDAFs and UDTFs, to extend Hive and Pig core functionality.
- Hands-on experience in Python for working with Hadoop Streaming.
- Excellent understanding of NoSQL databases like HBase, Cassandra and MongoDB.
- Designed and implemented a Cassandra-based database and related web services for storing unstructured data.
- Experience in managing Hadoop clusters using Cloudera Manager.
- Good understanding of cloud configuration in Amazon Web Services (AWS).
- Good knowledge of Solr/Lucene for indexing log files.
- Good understanding of MPP databases such as Teradata, Greenplum and Netezza.
- Worked closely with admin teams in monitoring Hadoop clusters.
- Good experience in complete project life cycle (design, development and testing) of Client Server and Web applications.
- Hands-on experience in application development with Java, RDBMS and UNIX shell scripting.
- Implemented SOAP-based web services.
- Extensive knowledge in Java, J2EE, JSP, JDBC, Servlets, Hibernate, Struts and Spring Framework.
- Good communication skills, strong work ethic, leadership skills and the ability to work efficiently in a team.
TECHNICAL SKILLS:
Methodologies: Agile, Waterfall model.
Big Data: HDFS, Hadoop, MapReduce, Hive, Pig, Flume, HBase, Sqoop, Spark
Operating Systems: Linux, Windows XP, Windows Server 2003, Windows Server 2008, Windows 7, Windows 8
Databases: MySQL, Oracle, MS SQL Server
Languages: Java, Python, C, C++, Java EE, Visual C++, C#
Web Tools/Frameworks: HTML, XML, JDBC, EJB, JSP, Servlets, Struts, REST API, JMS, Spring, and Hibernate.
PROFESSIONAL EXPERIENCE:
Senior Hadoop Developer
Confidential, Long Beach, CA
Responsibilities:
- Designed HBase tables for time series data. Designed the rowkey to avoid region hotspotting and accommodate the desired read access/query patterns, and used FuzzyRowFilter for fast key search across HBase regions.
- Designed and developed a solution for real-time data ingestion using Kafka, Storm and HBase. This involved Kafka and Storm cluster design, installation and configuration.
- Designed and configured a Kafka cluster to accommodate heavy throughput of 1 million messages per second. Used the Kafka producer 0.8.3 APIs to produce messages.
- Installed and configured Phoenix on HDP 2.1. Created views over HBase tables and used SQL queries to retrieve alerts and metadata.
- Used HBase APIs to get and scan event data stored in HBase.
- Implemented the Douglas-Peucker decimation algorithm to reduce data size for a given epsilon.
- Implemented a 9-point smoothing algorithm to smooth metric values so that visualizations show a smooth pattern.
- Performed cluster design, installation and configuration using the HDP 2.1 stack on Azure and HDP 2.2 on premises.
- Implemented NameNode backup using NFS for high availability.
- Used Pig as an ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
- Responsible for developing a data pipeline using HDInsight, Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created Hive tables, loaded data and wrote Hive UDFs.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
Environment: HDP 2.2 - Hadoop, Kafka, Storm, HBase, HDFS, ZooKeeper, Java, Spark, Shell Scripting, JSON, Protobuf/Protocol Buffers, JUnit, MySQL, IDEA, MS Excel, MS Word, JIRA, Confluence, vi, vim, PuTTY
Senior Hadoop Developer
Confidential, Dallas, TX
Responsibilities:
- Involved in designing and developing Hadoop MapReduce jobs using the Java Runtime Environment for batch processing to search and match scores
- Reviewed functional and non-functional requirements
- Studied the existing system to come up with a migration plan to the Hadoop system
- Designed, developed, tested and deployed the new system in the Hadoop environment
- Imported and exported data between relational databases and HDFS, Hive and HBase using Sqoop
- Extracted files from HBase and placed them in HDFS/Hive for processing
- Exported the analyzed data to the relational database / data warehouse for visualization and report generation (RHadoop and OBIEE)
- Used Hive and Pig to analyze data from HDFS
- Loaded and transformed large sets of structured, semi-structured and unstructured data
- Developed MapReduce jobs to implement COMET batch processing for all regions using HDFS, HBase and Hive data
- Loaded log data into HDFS using Flume
- Designed & scheduled workflows for recurring jobs using Oozie coordinator
- Datacenter infrastructure design and implementation for cluster, hardware (rack and servers), network (switches) and other (power and cooling) infrastructure related requirements.
- Hadoop clusters installation, configuration and maintenance for application development (local and internal cloud)
- Development and test environment setup (installation and configuration) with Hadoop Core, HDFS, MapReduce, YARN, HBase, MongoDB, Hive, Pig, ZooKeeper, Oozie, Ambari, Java and all other major components used in the project
- Actively participated in various internal meetings / sessions with other project team members to understand the requirements and suggest suitable Hadoop stack technologies, along with training sessions
- Explored different machine learning approaches (Mahout), declarative and predictive analysis tools, along with data visualization (RHadoop packages RHadoop, RHbase, etc.) for the next generation Business Intelligence era
Environment: CDH5, Hadoop MapReduce, Java, YARN, Kafka, Pig, Hive, Sqoop, HBase, Flume, Oozie, ZooKeeper, Perl, Cognos.
Hadoop Developer/Admin
Confidential, Mountain View, CA
Responsibilities:
- The service model that Confidential has developed allows it to perform analysis on customer data. Earlier, MySQL servers were used for data storage and analysis. The use case for this project was to use HDFS for storage and pre-processing of the data. Once the data was pre-processed, the resulting filtered data was loaded into MySQL servers.
- Built a 20-node Hadoop cluster from bare-metal hardware.
- Configured hardware RAID on the NameNode and JobTracker nodes: RAID 1 mirror for the OS and RAID 5 for data.
- Configured iptables to allow required services and block unwanted ports.
- Installed and configured PXE-server, DHCP, TFTP, HTTP and FTP as a utility server for bare metal deployment.
- Upgraded the cluster from CDH3U1 to CDH3U3. The upgrade was first performed on the staging platform before the production cluster.
- Automated all the jobs, from pulling the data from MySQL to pushing the result set data to the Hadoop Distributed File System.
- Implemented NameNode backup using NFS for high availability.
- Used Ganglia to monitor the cluster around the clock.
- Loaded log data into HDFS using Flume and Kafka.
- Used Log4J for logging purposes.
- Wrote shell scripts to automate day-to-day log-rolling processes.
- Implemented the Capacity Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Responsible for architecting Hadoop clusters.
- Developed a high-performance cache, making the site stable and improving its performance.
- Created a Java utility to sort and convert log files to XML files based on different parameters.
- Supported data analysts in running MapReduce programs.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Worked on analyzing data with Hive and Pig.
- Implemented rack topology scripts for the Hadoop cluster.
- Managed the day-to-day operations of the cluster for backup and support.
- Upgraded the Hadoop cluster to Cloudera Manager 3.7.
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Java, Hibernate, Spring, Maven, SVN, SQL, JUnit, Eclipse, XML, Log4j.
Hadoop Developer
Confidential, San Antonio, TX
Responsibilities:
- Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in writing MapReduce jobs.
- Used Sqoop and HDFS put or copyFromLocal to ingest data.
- Used Pig to do transformations, event joins, filter bot traffic and some pre-aggregations before storing the data onto HDFS.
- Involved in developing Pig UDFs for functionality not available out of the box in Apache Pig.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Involved in developing Hive UDFs for functionality not available out of the box in Apache Hive.
- Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
- Used Java MapReduce to compute various metrics that define user experience, revenue, etc.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
- Designed and implemented various metrics that can statistically signify the success of the experiment.
- Used Eclipse and Ant to build the application.
- Performed unit testing using MRUnit.
- Involved in using Sqoop for importing and exporting data into HDFS and Hive.
- Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
- Involved in developing Pig Scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
- Involved in pivoting HDFS data from rows to columns and columns to rows.
- Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop, HDFS get or copyToLocal.
- Involved in developing shell scripts to orchestrate execution of all other scripts (Pig, Hive, MapReduce) and move data files into and out of HDFS.
Environment: Hadoop, MapReduce MRv1, MRv2, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java.
Java Developer
Confidential
Responsibilities:
- Developed dynamic web pages using JSP.
- Developed control components and model components using Java and Servlets in Struts.
- Designed a scalable, extensible system using the J2EE (Java 2 Enterprise Edition) framework architecture.
- Designed and developed the action form beans and action classes and implemented MVC using Struts framework.
- Installed, configured and administered WebLogic Application Server and deployed JSP, Servlet and EJB applications.
- Developed and tested the Java classes used in JSPs and Servlets.
- Developed user interfaces using JSP, HTML.
Environment: Java, JSP, Struts framework, Tiles, Servlets, WebSphere 5.1, Oracle, EJB, Ant, and MyEclipse.
Java Developer
Confidential
Responsibilities:
- Increasing productivity, improving the customer experience, and enabling management to make informed decisions based on the latest data are just some of the benefits that companies derive from adopting mobility solutions.
- Involved in the Design, Coding, Testing and Implementation of the web application.
- Developed JavaServer Pages (JSP) starting from HTML pages and detailed technical design specification documents. Pages included HTML, CSS, JavaScript, Hibernate and JSTL.
- Developed SOAP-based requests for communicating with web services.
- Used agile practices to provide quick and feasible solutions to the organization.
- Implemented HTTP Modules for different applications in Struts Framework that uses Servlets, JSP, ActionForm, ActionClass and ActionMapping.
- Developed web applications using the MVC framework, Spring, Struts and Hibernate.
- Involved in the creation of custom interceptors for Validation purposes.
- Analyzed and fixed defects in the Login application.
- Involved in configuration and deployment of the application on the JBoss Application Server.
- Involved in dynamic creation of error elements on demand when there is an error.
- Involved in developing Ajax-based rich browser user interfaces.
- Ensured design consistency with client’s development standards and guidelines.
- Improved user experience by designing and creating new web components and features.
Environment: Java, J2EE, Struts, SOAP web services, SOA, Spring, Hibernate, JavaScript, jQuery, JBoss Application Server, Oracle, AJAX, JSP, Servlets, Eclipse, CVS Source control, Linux.
