
Sr. Hadoop Developer Resume


Bellevue

SUMMARY

  • 8+ years of experience in data analysis, data modeling and implementation of enterprise class systems spanning Big Data, Data Integration, Object Oriented programming and Advanced Analytics
  • 3+ years of Full life cycle experience delivering end to end big data projects processing petabytes of activity data by gathering business requirements, creating data models, developing logic, testing, deploying and maintaining
  • Expertise in Big Data technologies and the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Apache Kafka, Apache Storm, Apache Spark, Zookeeper and Avro, as well as NoSQL databases like HBase, Cassandra and MongoDB
  • Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala (a minimal sketch appears after this summary)
  • Hands on experience in installation, configuration, management and deployment of Big Data solutions and the underlying infrastructure of Hadoop Cluster using Cloudera, Hortonworks distributions and MapR
  • Good understanding of data mining and machine learning techniques and experience implementing them with Big Data tools like Mahout, Spark MLlib and H2O
  • Excellent understanding of Hadoop architecture and the daemons of a Hadoop cluster, including Job Tracker, Task Tracker, Name Node and Data Node, as well as the YARN daemons Resource Manager, Node Manager and Job History Server
  • Hands on experience with various Hadoop distributions: IBM BigInsights, Cloudera, Hortonworks and MapR
  • Experience with Apache Crunch for composing UDFs into pipelines that are easy to write, easy to test and efficient to run
  • Experience with Hadoop Business Intelligence (BI) tools to speed up analyzing, transforming and loading huge volumes of data
  • Extensive experience in data modelling, database design and development using relational databases (Oracle 9i/10g, DB2, MySQL Server 2003/2005) and NoSQL databases (HBase, Cassandra, MongoDB)
  • Proficient in Big data ingestion and streaming tools like Flume, Sqoop, Spark - Streaming, Kafka and Storm and developing Customized Map Reduce programs using Apache Hadoop
  • Strong knowledge of Hive storage formats (Sequence Files, Avro and RC Files), compression techniques (Record, Block) and the use of SerDes for insert/load operations
  • Knowledge on Apache Solr as an end point in various data processing frameworks and Enterprise integration frameworks
  • Experience in importing and exporting data between RDBMS and HDFS, Hive tables and HBase using Sqoop, and in implementing ad-hoc MapReduce programs using Pig scripts and Hive queries
  • Hands on experience in creating Hive scripts, HIVE tables, UDAFs, UDFs, UDTFs, Partition/ Bucketing
  • Hands on experience in creating Apache Spark RDD transformations on datasets in the Hadoop data lake
  • Experience in importing streaming data into HDFS using Flume sources and sinks, transforming the data with Flume interceptors, and building custom Flume interceptors and serializers
  • Experience in integrating Apache Kafka with Apache Storm and creating Storm data pipelines for real-time processing
  • Knowledge in handling Kafka cluster and created several topologies to support real-time processing
  • Worked on ETL reports using Tableau and created statistical dashboards for analytics
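
A minimal sketch of the MapReduce-to-Spark RDD migration mentioned above, written in Scala; the application name, input path and output path are illustrative placeholders, not details from an actual engagement:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountMigration {
      def main(args: Array[String]): Unit = {
        // A classic word-count MapReduce job expressed as Spark RDD transformations
        val sc = new SparkContext(new SparkConf().setAppName("WordCountMigration"))

        val lines  = sc.textFile("hdfs:///data/raw/events")        // hypothetical input path
        val counts = lines.flatMap(_.split("\\s+"))                // map phase: tokenize lines
                          .map(word => (word, 1))                  // emit (word, 1) pairs
                          .reduceByKey(_ + _)                      // reduce phase: sum the counts

        counts.saveAsTextFile("hdfs:///data/refined/word_counts")  // hypothetical output path
        sc.stop()
      }
    }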

TECHNICAL SKILLS

Big Data Technologies: Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Cassandra, Flume, Zookeeper, Oozie, JSON, Spark, Kafka

Batch Processing: Cascading/Scalding, Apache Spark, Hadoop MapReduce

Programming Languages: C, C++, Java, Shell Scripting

Java/J2EE Technologies: Java, Java Beans, J2EE (JSP, Servlets, EJB), JDBC, MySQL, Spring.

DB Languages: SQL, PL/SQL

NoSQL Databases: HBase, MongoDB, Riak

Application Servers: Tomcat

Operating Systems: LINUX, MS-DOS, Windows XP, Windows Vista, Windows Server 2003, Windows Server 2008 and UNIX

PROFESSIONAL EXPERIENCE

Confidential, Bellevue

Sr. Hadoop Developer

Responsibilities:

  • T-Mobile is building a centralized data lake, with Hadoop as the core platform, for managing all of its analytical processes
  • Involved in developing a customized in-house tool, the Data Movement Framework (DMF), for ingesting data from external and internal sources into Hadoop using Sqoop and shell scripts
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW (a simplified mapper sketch follows this list)
  • Used sed, awk and Pig scripts to clean and scrub the data before loading it into the data lake
  • Proposed an automated system using Shell script to implement import using Sqoop
  • Worked in an Agile development approach and managed the Hadoop team's work across multiple sprints
  • Experienced in migrating data from RDBMS systems, databases and mount locations to Hadoop using the DMF framework
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig
  • Production Rollout Support and resolving any issues that are discovered by the client and client services teams
  • Extensively involved in review sessions to determine, understand dataflow and data mapping from source to target databases by coordinating with End Users, Business Analysts, DBAs and Application Architects
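
A simplified sketch of the raw-data parsing mapper described above, assuming pipe-delimited input records; the field layout and key column are hypothetical:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Mapper

    // Parses raw pipe-delimited records and emits (record key, remaining columns)
    // so a downstream reducer or Hive load can populate the staging tables.
    class RawRecordMapper extends Mapper[LongWritable, Text, Text, Text] {
      private val outKey = new Text()
      private val outVal = new Text()

      override def map(key: LongWritable, value: Text,
                       context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
        val fields = value.toString.split("\\|", -1)    // assumed pipe-delimited feed
        if (fields.length >= 2 && fields(0).nonEmpty) { // drop malformed rows while scrubbing
          outKey.set(fields(0))                         // assumed natural key in the first column
          outVal.set(fields.drop(1).mkString("\t"))
          context.write(outKey, outVal)
        }
      }
    }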

Confidential, Atlanta

Kafka, Storm and Hadoop Developer

Responsibilities:

  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters
  • Developed multiple Kafka producers and consumers from scratch using both the low-level and high-level APIs
  • Hands on experience managing Kafka topics, including adding, modifying and deleting them
  • Developed code to write canonical-model JSON records from various input sources to Kafka queues (a minimal producer sketch follows this list)
  • Implemented high-level Kafka consumers to read data from Kafka partitions and move it into Cassandra for near-real-time analytics
  • Developed and led the engineering of a log analysis platform built on Elasticsearch to process large amounts of data
  • Working on implementing Solr 4.0 to search insurance seekers across 25,000 sources
  • Hands on experience with Solr indexing, automated index replication and server statistics logging
  • Involved in designing the Cassandra data model and used CQL (Cassandra Query Language) to perform CRUD operations on the Cassandra file system
  • Utilized RabbitMQ as the messaging middleware
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch
  • Experienced in implementing Spark RDD transformations and actions for business analysis, and migrated HiveQL queries on structured data into Spark SQL to improve performance
  • Configured, deployed and maintained a single-node Storm cluster in the DEV environment
  • Worked on setting and tuning configurations for partitions, TTL and replication factors
  • Developed a code base to stream data from sample data files through Kafka, a Kafka spout, Storm bolts and an HDFS bolt
  • Developed predictive analytics using the Apache Spark Scala APIs
  • Documented the data flow from the application through Kafka and Storm into HDFS and Hive tables
  • Created MapReduce GenericWritable classes to wrap Writable implementations and send them to the same reducer
  • Involved in creating Hive tables, loading data into them and writing and optimizing Hive queries
  • Developed topologies and storm bolts involving Kafka spouts to stream data from Kafka
  • Designed and developed tests and POC’s to benchmark and verify data flow through the Kafka clusters
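
A minimal sketch of a producer writing canonical-model JSON records to a Kafka topic, as referenced above; the broker list, topic name and payload are assumptions for illustration only:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object CanonicalJsonProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092,broker2:9092")  // assumed broker list
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        try {
          // Canonical-model JSON record keyed by source system id (illustrative payload)
          val record = new ProducerRecord[String, String](
            "canonical-events", "crm",
            """{"source":"crm","eventType":"update","payload":{"id":42}}""")
          producer.send(record)
        } finally {
          producer.close()
        }
      }
    }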

Confidential, Sterling VA

Hadoop Developer

Responsibilities:

  • Installed and configured Cassandra, with good knowledge of Cassandra architecture, read/write paths and query execution
  • Good knowledge and experience with Hadoop stack - internals, Hive, Pig and Map Reduce
  • Worked on writing Map Reduce jobs to discover trends in data usage by customers
  • Worked on and designed Big Data analytics platform for processing customer interface preferences and comments using Java, Hadoop, Hive and Pig
  • Experienced in defining job flows to run multiple Map Reduce and Pig jobs using Oozie
  • Installed and configured Hive and wrote HiveQL scripts
  • Experience loading data into a relational database for reporting, dashboarding and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming
  • Experience creating ETL jobs to load JSON and server data into MongoDB and moving data from MongoDB into the data warehouse
  • Created reports and dashboards using structured and unstructured data
  • Experienced with performing analytics on Time Series data using HBase
  • Implemented HBase coprocessors (observers) for event-based analysis
  • Hands on experience installing and configuring the nodes of a CDH4 Hadoop cluster on CentOS
  • Implemented Hive Generic UDFs to implement business logic
  • Experienced with accessing Hive tables from Java applications using JDBC to perform analytics (a minimal JDBC sketch follows this list)
  • Experienced in running batch processes using Pig Scripts and developed Pig UDFs for data manipulation according to Business Requirements
  • Applied hot fixes and version patches for Netezza, Oracle, SQL Server and Informatica in a Windows environment
  • Performed regular maintenance and monitoring of the Netezza PROD appliances (TF120, TF96, TF48 & TF12)
  • Implemented POC to migrate map reduce jobs into Spark RDD transformations using Scala.
  • Experience with streaming workflow operations and Hadoop jobs using Oozie workflows, scheduled through AutoSys on a regular basis
  • Performed operations using the partitioning pattern in MapReduce to move records into different categories
  • Experience implementing Spark's machine learning library in Scala
  • Developed Spark SQL scripts and involved in converting Hive UDFs to Spark SQL UDFs
  • Responsible for batch processing and real-time processing in HDFS and NoSQL databases
  • Responsible for retrieving data from Cassandra and ingesting it into Pig
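
A minimal sketch of querying a Hive table over JDBC from a JVM application, as mentioned above; the HiveServer2 URL, credentials, table and column names are hypothetical:

    import java.sql.DriverManager

    object HiveUsageReport {
      def main(args: Array[String]): Unit = {
        Class.forName("org.apache.hive.jdbc.HiveDriver")           // HiveServer2 JDBC driver
        val conn = DriverManager.getConnection(
          "jdbc:hive2://hiveserver:10000/default", "etl_user", "") // assumed connection details
        try {
          val stmt = conn.createStatement()
          // Hypothetical daily usage rollup used for trend analysis
          val rs = stmt.executeQuery(
            "SELECT usage_date, SUM(bytes_used) FROM customer_usage GROUP BY usage_date")
          while (rs.next()) {
            println(s"${rs.getString(1)}\t${rs.getLong(2)}")
          }
          rs.close()
          stmt.close()
        } finally {
          conn.close()
        }
      }
    }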

Environment: Cassandra, MapReduce jobs, Spark SQL, ETL, Pig scripts, Hadoop BI, Pig UDFs, Oozie, Hive, Avro, Scala, MapReduce, Java, Eclipse

Confidential, San Francisco, CA

Java/Hadoop Developer

Responsibilities:

  • Developed web components using JSP, Servlets and JDBC
  • Worked closely with EJBs to code reusable components and business logic while working with Java Beans
  • Installed and configured Hadoop Map Reduce, HDFS, developed multiple Map Reduce jobs in java for data cleaning and preprocessing
  • Supported setting up and updating configurations for implementing scripts with Pig and Sqoop
  • Migrated existing SQL queries to HiveQL queries to move to big data analytical platform
  • Integrated Cassandra file system to Hadoop using Map Reduce to perform analytics on Cassandra data
  • Wrote test cases in Junit for unit testing of classes
  • Used Hibernate ORM framework with Spring framework for data persistence
  • Implemented Real time analytics on Cassandra data using thrift API
  • Responsible to manage data coming from different sources
  • Supported the clusters that were running on Map Reduce Programs
  • Involved in loading data from UNIX file system to HDFS
  • Involved in developing templates and screens in HTML and JavaScript
  • Loaded and transformed large data sets into HDFS using Hadoop fs commands
  • Designed the logical and physical data models and wrote DML scripts for the Oracle 9i database
  • Experience installing the cluster and commissioning and decommissioning different types of nodes, such as Name Nodes and Data Nodes
  • Worked on capacity planning, node recovery and slots configuration

Confidential

JAVA/J2EE Developer

Responsibilities:

  • Working experience in different phases of the Software Development Life Cycle (SDLC) of the application, including analysis, requirements gathering, design, development and deployment
  • Implemented the Model View Controller (MVC) design pattern with Struts MVC, Servlets, JSP, HTML, AJAX, JavaScript and CSS to control the flow of the application across the Presentation/Web tier, the Application/Business layer (JDBC) and the Data layer (Oracle 10g)
  • Hands on Analysis, Design, and Implementation of software applications using Java, J2EE, XML and XSLT
  • Developed Action Forms and Controllers in the Struts 2.0/1.2 framework
  • Utilized various Struts features like Tiles, tag libraries and declarative exception handling via XML in the design
  • Developed JavaScript validations on order submission forms
  • Designed, developed and maintained the data layer using Hibernate

Confidential

Software Engineer

Responsibilities:

  • Developed web components using JSP, Servlets and JDBC
  • Used EJBs to develop business logic and coded reusable components in Java Beans
  • Developed database interaction code against the JDBC API, making extensive use of SQL query statements and advanced prepared statements (a minimal sketch follows this list)
  • Used connection pooling through the JDBC interface for optimization
  • Used EJB entity and session beans to implement business logic and session handling and transactions
  • Developed user-interface using JSP, Servlets, and JavaScript
  • Wrote complex SQL queries and stored procedures
  • Actively involved in the system testing
  • Implemented the presentation layer with HTML, XHTML and JavaScript
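
A minimal sketch of the prepared-statement style of JDBC access described above; the connection URL, credentials, table and column names are placeholders:

    import java.sql.DriverManager

    object OrderLookup {
      def main(args: Array[String]): Unit = {
        // Placeholder connection details
        val conn = DriverManager.getConnection("jdbc:mysql://dbhost:3306/orders", "app_user", "secret")
        try {
          // A parameterized query avoids string concatenation and lets the driver reuse the plan
          val stmt = conn.prepareStatement(
            "SELECT order_id, status FROM orders WHERE customer_id = ? AND created_at > ?")
          stmt.setInt(1, 42)
          stmt.setDate(2, java.sql.Date.valueOf("2010-01-01"))
          val rs = stmt.executeQuery()
          while (rs.next()) {
            println(s"${rs.getInt(1)} -> ${rs.getString(2)}")
          }
          rs.close()
          stmt.close()
        } finally {
          conn.close()
        }
      }
    }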
