
Sr. Hadoop Developer Resume


Bellevue

SUMMARY

  • 8+ years of experience in data analysis, data modeling and implementation of enterprise class systems spanning Big Data, Data Integration, Object Oriented programming and Advanced Analytics
  • 3+ years of Full life cycle experience delivering end to end big data projects processing petabytes of activity data by gathering business requirements, creating data models, developing logic, testing, deploying and maintaining
  • Expertise in Big Data technologies and the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Apache Kafka, Apache Storm, Apache Spark, Zookeeper and Avro, as well as NoSQL databases like HBase, Cassandra and MongoDB
  • Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala (a minimal sketch appears after this summary)
  • Hands on experience in installation, configuration, management and deployment of Big Data solutions and the underlying infrastructure of Hadoop Cluster using Cloudera, Hortonworks distributions and MapR
  • Good understanding of data mining and machine learning techniques and experience implementing them with Big Data tools like Mahout, Spark MLlib and H2O
  • Excellent understanding of Hadoop architecture and the daemons of a Hadoop cluster, including Job Tracker, Task Tracker, Name Node and Data Node, as well as the YARN daemons Resource Manager, Node Manager and Job History Server
  • Hands on experience with various Hadoop distributions: IBM BigInsights, Cloudera, Hortonworks and MapR
  • Experience with Apache Crunch for composing UDFs into pipelines that are easy to write, easy to test and efficient to run
  • Experience with Hadoop Business Intelligence (BI) tools to speed up analyzing, transforming and loading huge volumes of data
  • Extensive experience in data modelling, database design and development using relational databases (Oracle 9i/10g, DB2, MySQL Server 2003/2005) and NoSQL databases (HBase, Cassandra, MongoDB)
  • Proficient in Big data ingestion and streaming tools like Flume, Sqoop, Spark - Streaming, Kafka and Storm and developing Customized Map Reduce programs using Apache Hadoop
  • Strong knowledge of Hive storage formats (Sequence Files, Avro and RC Files), compression techniques (Record, Block) and the use of SerDes for insert/load operations
  • Knowledge on Apache Solr as an end point in various data processing frameworks and Enterprise integration frameworks
  • Experience in importing and exporting data between RDBMS and HDFS, Hive tables and HBase using Sqoop, and in implementing ad-hoc MapReduce programs using Pig scripts and Hive queries
  • Hands on experience in creating Hive scripts, HIVE tables, UDAFs, UDFs, UDTFs, Partition/ Bucketing
  • Hands on experience in creating Apache Spark RDD transformations on datasets in the Hadoop data lake
  • Experience in importing streaming data into HDFS using Flume sources and sinks, transforming the data with Flume interceptors, and building custom Flume interceptors and serializers
  • Experience in integrating Apache Kafka with Apache Storm and creating Storm data pipelines for real-time processing
  • Knowledge in handling Kafka cluster and created several topologies to support real-time processing
  • Worked on ETL reports using Tableau and created statistical dashboards for analytics
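
A minimal sketch of the MapReduce-to-Spark RDD migration mentioned above, written in Scala; the application name, input path and output path are illustrative placeholders, not details from an actual engagement:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountMigration {
      def main(args: Array[String]): Unit = {
        // A classic word-count MapReduce job expressed as Spark RDD transformations
        val sc = new SparkContext(new SparkConf().setAppName("WordCountMigration"))

        val lines  = sc.textFile("hdfs:///data/raw/events")        // hypothetical input path
        val counts = lines.flatMap(_.split("\\s+"))                // map phase: tokenize lines
                          .map(word => (word, 1))                  // emit (word, 1) pairs
                          .reduceByKey(_ + _)                      // reduce phase: sum the counts

        counts.saveAsTextFile("hdfs:///data/refined/word_counts")  // hypothetical output path
        sc.stop()
      }
    }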

TECHNICAL SKILLS

Big Data Technologies: Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Cassandra, Flume, Zookeeper, Oozie, JSON, Spark, Kafka

Batch Processing: Cascading/Scalding, Apache Spark, Hadoop MapReduce

Programming Languages: C, C++, Java, Shell Scripting

Java/J2EE Technologies: Java, Java Beans, J2EE (JSP, Servlets, EJB), JDBC, MySQL, Spring.

DB Languages: SQL, PL/SQL

NoSQL Databases: HBase, MongoDB, Riak

Application Servers: Tomcat

Operating Systems: LINUX, MS-DOS, Windows XP, Windows Vista, Windows Server 2003, Windows Server 2008 and UNIX

PROFESSIONAL EXPERIENCE

Confidential, Bellevue

Sr. Hadoop Developer

Responsibilities:

  • T-Mobile is building a centralized data lake, with Hadoop as the core platform, for managing all of its analytical processes
  • Involved in developing a customized in-house tool, the Data Movement Framework (DMF), for ingesting data from external and internal sources into Hadoop using Sqoop and shell scripts
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW (a simplified mapper sketch follows this list)
  • Used sed, awk and Pig scripts to clean and scrub the data before loading it into the data lake
  • Proposed an automated system using Shell script to implement import using Sqoop
  • Worked in an Agile development approach and managed the Hadoop team's work across multiple sprints
  • Experienced in migrating data from RDBMS systems, databases and mount locations to Hadoop using the DMF framework
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig
  • Production Rollout Support and resolving any issues that are discovered by the client and client services teams
  • Extensively involved in review sessions to determine, understand dataflow and data mapping from source to target databases by coordinating with End Users, Business Analysts, DBAs and Application Architects
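
A simplified sketch of the raw-data parsing mapper described above, assuming pipe-delimited input records; the field layout and key column are hypothetical:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Mapper

    // Parses raw pipe-delimited records and emits (record key, remaining columns)
    // so a downstream reducer or Hive load can populate the staging tables.
    class RawRecordMapper extends Mapper[LongWritable, Text, Text, Text] {
      private val outKey = new Text()
      private val outVal = new Text()

      override def map(key: LongWritable, value: Text,
                       context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
        val fields = value.toString.split("\\|", -1)    // assumed pipe-delimited feed
        if (fields.length >= 2 && fields(0).nonEmpty) { // drop malformed rows while scrubbing
          outKey.set(fields(0))                         // assumed natural key in the first column
          outVal.set(fields.drop(1).mkString("\t"))
          context.write(outKey, outVal)
        }
      }
    }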

Confidential, Atlanta

Kafka, Storm and Hadoop Developer

Responsibilities:

  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters
  • Developed multiple Kafka producers and consumers from scratch using both the low-level and high-level APIs
  • Hands on experience managing Kafka topics, including adding, modifying and deleting them
  • Developed code to write canonical-model JSON records from various input sources to Kafka queues (a minimal producer sketch follows this list)
  • Implemented high-level Kafka consumers to read data from Kafka partitions and move it into Cassandra for near-real-time analytics
  • Developed and led the engineering of a log analysis platform built on Elasticsearch to process large amounts of data
  • Working on implementing Solr 4.0 to search insurance seekers across 25,000 sources
  • Hands on experience with Solr indexing, automated index replication and server statistics logging
  • Involved in designing the Cassandra data model and used CQL (Cassandra Query Language) to perform CRUD operations on the Cassandra file system
  • Utilized RabbitMQ as the messaging middleware
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch
  • Experienced in implementing Spark RDD transformations and actions for business analysis, and migrated HiveQL queries on structured data into Spark SQL to improve performance
  • Configured, deployed and maintained a single-node Storm cluster in the DEV environment
  • Worked on setting and tuning configurations for partitions, TTL and replication factors
  • Developed a code base to stream data from sample data files through Kafka, a Kafka spout, Storm bolts and an HDFS bolt
  • Developed predictive analytics using the Apache Spark Scala APIs
  • Documented the data flow from the application through Kafka and Storm into HDFS and Hive tables
  • Created MapReduce GenericWritable classes to wrap Writable implementations and send them to the same reducer
  • Involved in creating Hive tables, loading data into them and writing and optimizing Hive queries
  • Developed topologies and storm bolts involving Kafka spouts to stream data from Kafka
  • Designed and developed tests and POC’s to benchmark and verify data flow through the Kafka clusters
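
A minimal sketch of a producer writing canonical-model JSON records to a Kafka topic, as referenced above; the broker list, topic name and payload are assumptions for illustration only:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object CanonicalJsonProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092,broker2:9092")  // assumed broker list
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        try {
          // Canonical-model JSON record keyed by source system id (illustrative payload)
          val record = new ProducerRecord[String, String](
            "canonical-events", "crm",
            """{"source":"crm","eventType":"update","payload":{"id":42}}""")
          producer.send(record)
        } finally {
          producer.close()
        }
      }
    }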

Confidential, Sterling VA

Hadoop Developer

Responsibilities:

  • Installed and configured Cassandra, with good knowledge of Cassandra architecture, read/write paths and query execution
  • Good knowledge and experience with Hadoop stack - internals, Hive, Pig and Map Reduce
  • Worked on writing Map Reduce jobs to discover trends in data usage by customers
  • Worked on and designed Big Data analytics platform for processing customer interface preferences and comments using Java, Hadoop, Hive and Pig
  • Experienced in defining job flows to run multiple Map Reduce and Pig jobs using Oozie
  • Installed and configured Hive and wrote HiveQL scripts
  • Experience loading data into a relational database for reporting, dashboarding and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming
  • Experience creating ETL jobs to load JSON and server data into MongoDB and moving data from MongoDB into the data warehouse
  • Created reports and dashboards using structured and unstructured data
  • Experienced with performing analytics on Time Series data using HBase
  • Implemented HBase coprocessors (observers) for event-based analysis
  • Hands on experience installing and configuring the nodes of a CDH4 Hadoop cluster on CentOS
  • Implemented Hive Generic UDFs to implement business logic
  • Experienced with accessing Hive tables from Java applications using JDBC to perform analytics (a minimal JDBC sketch follows this list)
  • Experienced in running batch processes using Pig Scripts and developed Pig UDFs for data manipulation according to Business Requirements
  • Applied hot fixes and version patches for Netezza, Oracle, SQL Server and Informatica in a Windows environment
  • Performed regular maintenance and monitoring of the Netezza PROD appliances (TF120, TF96, TF48 & TF12)
  • Implemented POC to migrate map reduce jobs into Spark RDD transformations using Scala.
  • Experience with streaming workflow operations and Hadoop jobs using Oozie workflows, scheduled through AutoSys on a regular basis
  • Performed operations using the partitioning pattern in MapReduce to move records into different categories
  • Experience implementing Spark's machine learning library in Scala
  • Developed Spark SQL scripts and involved in converting Hive UDFs to Spark SQL UDFs
  • Responsible for batch processing and real-time processing in HDFS and NoSQL databases
  • Responsible for retrieving data from Cassandra and ingesting it into Pig
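
A minimal sketch of querying a Hive table over JDBC from a JVM application, as mentioned above; the HiveServer2 URL, credentials, table and column names are hypothetical:

    import java.sql.DriverManager

    object HiveUsageReport {
      def main(args: Array[String]): Unit = {
        Class.forName("org.apache.hive.jdbc.HiveDriver")           // HiveServer2 JDBC driver
        val conn = DriverManager.getConnection(
          "jdbc:hive2://hiveserver:10000/default", "etl_user", "") // assumed connection details
        try {
          val stmt = conn.createStatement()
          // Hypothetical daily usage rollup used for trend analysis
          val rs = stmt.executeQuery(
            "SELECT usage_date, SUM(bytes_used) FROM customer_usage GROUP BY usage_date")
          while (rs.next()) {
            println(s"${rs.getString(1)}\t${rs.getLong(2)}")
          }
          rs.close()
          stmt.close()
        } finally {
          conn.close()
        }
      }
    }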

Environment: Cassandra, MapReduce jobs, Spark SQL, ETL, Pig scripts, Hadoop BI, Pig UDFs, Oozie, Hive, Avro, Scala, MapReduce, Java, Eclipse

Confidential, San Francisco, CA

Java/Hadoop Developer

Responsibilities:

  • Developed web components using JSP, Servlets and JDBC
  • Worked closely with EJBs to code reusable components and business logic while working with Java Beans
  • Installed and configured Hadoop Map Reduce, HDFS, developed multiple Map Reduce jobs in java for data cleaning and preprocessing
  • Supported setting up and updating configurations for implementing scripts with Pig and Sqoop
  • Migrated existing SQL queries to HiveQL queries to move to big data analytical platform
  • Integrated Cassandra file system to Hadoop using Map Reduce to perform analytics on Cassandra data
  • Wrote test cases in Junit for unit testing of classes
  • Used Hibernate ORM framework with Spring framework for data persistence
  • Implemented Real time analytics on Cassandra data using thrift API
  • Responsible to manage data coming from different sources
  • Supported the clusters that were running on Map Reduce Programs
  • Involved in loading data from UNIX file system to HDFS
  • Involved in developing templates and screens in HTML and JavaScript
  • Loaded and transformed large data sets into HDFS using Hadoop fs commands
  • Designed the logical and physical data models and wrote DML scripts for the Oracle 9i database
  • Experience installing the cluster and commissioning and decommissioning different types of nodes, such as Name Nodes and Data Nodes
  • Worked on capacity planning, node recovery and slots configuration

Confidential

JAVA/J2EE Developer

Responsibilities:

  • Working experience in different phases of the Software Development Life Cycle (SDLC) of the application, including analysis, requirements gathering, design, development and deployment
  • Implemented the Model View Controller (MVC) design pattern with Struts MVC, Servlets, JSP, HTML, AJAX, JavaScript and CSS to control the flow of the application across the Presentation/Web tier, the Application/Business layer (JDBC) and the Data layer (Oracle 10g)
  • Hands on Analysis, Design, and Implementation of software applications using Java, J2EE, XML and XSLT
  • Developed Action Forms and Controllers in the Struts 2.0/1.2 framework
  • Utilized various Struts features like Tiles, tag libraries and declarative exception handling via XML in the design
  • Developed JavaScript validations on order submission forms
  • Designed, developed and maintained the data layer using Hibernate

Confidential

Software Engineer

Responsibilities:

  • Developed web components using JSP, Servlets and JDBC
  • Used EJBs to develop business logic and coded reusable components in Java Beans
  • Developed database interaction code against the JDBC API, making extensive use of SQL query statements and advanced prepared statements (a minimal sketch follows this list)
  • Used connection pooling through the JDBC interface for optimization
  • Used EJB entity and session beans to implement business logic and session handling and transactions
  • Developed user-interface using JSP, Servlets, and JavaScript
  • Wrote complex SQL queries and stored procedures
  • Actively involved in the system testing
  • Implemented the presentation layer with HTML, XHTML and JavaScript
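
A minimal sketch of the prepared-statement style of JDBC access described above; the connection URL, credentials, table and column names are placeholders:

    import java.sql.DriverManager

    object OrderLookup {
      def main(args: Array[String]): Unit = {
        // Placeholder connection details
        val conn = DriverManager.getConnection("jdbc:mysql://dbhost:3306/orders", "app_user", "secret")
        try {
          // A parameterized query avoids string concatenation and lets the driver reuse the plan
          val stmt = conn.prepareStatement(
            "SELECT order_id, status FROM orders WHERE customer_id = ? AND created_at > ?")
          stmt.setInt(1, 42)
          stmt.setDate(2, java.sql.Date.valueOf("2010-01-01"))
          val rs = stmt.executeQuery()
          while (rs.next()) {
            println(s"${rs.getInt(1)} -> ${rs.getString(2)}")
          }
          rs.close()
          stmt.close()
        } finally {
          conn.close()
        }
      }
    }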
