Hadoop And Big Data Developer Resume

Woodlands, Texas


  • Over 8 years of IT experience in analysis, design, development, implementation, maintenance, and support, including developing strategies for deploying big data technologies to efficiently meet Big Data processing requirements
  • More than 4 years of experience with Big Data using the Hadoop framework and related technologies such as HDFS, HBase, MapReduce, Hive, Pig, Impala, Flume, Oozie, Sqoop, and Spark
  • Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS
  • Strong knowledge of Spark for large-scale data processing in streaming applications, using Scala and Python
  • Experience in data analysis using HIVE, PIG, HBASE and custom Map Reduce programs in Java
  • Experience with Cloudera and Hortonworks distributions
  • Extensive experience in data ingestion, in-stream data processing, batch analytics, and data persistence strategy
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS
  • Experience in working with Flume to load log data from multiple sources directly into HDFS
  • Expert in Data Extraction, Transformation and Loading (ETL process) from Source to target systems.
  • Experience in analyzing the different types of data that flow from data lakes to Hadoop clusters
  • Excellent understanding of Hadoop and its components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and Resource Manager (YARN)
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files
  • Experienced in loading data to Hive partitions and bucketing
  • Experience in using SequenceFile, Avro, and RCFile formats
  • Experience in developing custom UDFs in Java to extend Hive and Pig functionality
  • Experience in all stages of SDLC (Agile, Waterfall), writing Technical Design document, Development, Testing and Implementation of Enterprise level Data mart and Data warehouses
  • Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML methodology; good knowledge of J2EE design patterns and Core Java design patterns
  • Developed applications to report specific data using jQuery, AJAX, CSS, and XML/JSON
  • Worked on multiple RESTful web-based applications using Spring MVC, Hibernate, jQuery, Oracle, and MySQL
  • Experience in creating Tables, Transactions, Views, Joins, Indexes, Cursors, Triggers, User Profiles, User Defined Functions, Relational Database Models and Data Integrity in observing Business Rules
  • Experience in Oracle supplied packages, Dynamic SQL, Records, DML, DDL, PL/SQL Tables and Exception Handling
  • Ability to manage and deliver results on multiple tasks by effectively managing time and priority constraints


Big Data Technologies: Hadoop MapReduce, HDFS, Hue, HBase, Hive, Oozie, Sqoop, Pig, Flume, Impala, Spark, SparkSQL, Apache Kafka, Cassandra

Programming Languages: JAVA/J2EE, C, UNIX Shell commands, Java Beans, JDBC, HTML, Servlets

Scripting Languages: JavaScript, Shell Script

Web Development: HTML, JavaScript, CSS

Databases: Oracle, MySQL, Hive, HBase, Cassandra

Technologies/Tools: SQL Development, JDBC

Operating Systems: UNIX, Microsoft Windows XP/07/08/10, Linux

Reporting Tools: Tableau

IDE: Eclipse, NetBeans

Hadoop Distributions: Cloudera, Hortonworks

Methodologies: Waterfall, Agile

Data Importing Tools: Sqoop, Flume

Data Analysis Tools: Pig, Hive


Confidential - Woodlands, Texas

Hadoop and Big Data Developer


  • Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs in Java for data cleaning and processing
  • Developed a data pipeline using Kafka, Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis
  • Involved in moving data from HDFS to Amazon S3
  • Experience in creating real-time data streaming solutions using Spark Streaming, Kafka, and Flume
  • Developed MapReduce programs in Java to perform various ETL, cleaning, and scrubbing tasks
  • Involved in building all domain pipelines using Spark DataFrames and Spark batch processing
  • Developed Spark SQL jobs to load tables into HDFS and run SELECT queries on them
  • Imported data from different sources such as HDFS, HBase, Hive, and S3 into Spark RDDs
  • Implemented Spark using Scala and Java, utilizing DataFrames and the Spark SQL API for faster data processing
  • Worked on Apache Spark with Python to develop and execute Big Data analytics
  • Involved in converting Hive queries into Spark transformations using Spark RDDs in Scala, Java, and Python
  • Created tables and loaded data into the NoSQL databases HBase and Cassandra
  • Implemented workflows using Apache Oozie to automate tasks
  • Excellent understanding and knowledge of the NoSQL databases HBase and Cassandra
  • Involved in developing Hive DDLs to create, alter, and drop Hive tables and store them in Amazon S3
  • Developed Pig scripts to extract data from web server output files and load it into HDFS.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Applied Spark for processing a wide variety of larger data sets.
  • Implemented the Fair Scheduler on the Job Tracker to share cluster resources among the MapReduce jobs submitted by users.
  • Developed Java programs, Hive scripts, Pig scripts, and Unix shell scripts for all ETL loading processes, converting the files and storing them on the Hadoop File System
  • Experience in retrieving data from databases such as MySQL and Oracle into HDFS using Sqoop and ingesting it into HBase and Hive

Environment: Hortonworks, Hadoop, Spark (Scala/Python), HDFS, Hive, Pig, Sqoop, MapReduce, Impala, NoSQL, HBase, Shell Scripting, Linux, MySQL, Apache Kafka

Confidential - Livonia, MI

Hadoop Developer


  • Involved in use case analysis, design, and development of Big Data solutions using Hadoop for customer spend analysis, product performance monitoring, and an offer-generating engine
  • Involved in creating MapReduce pipelines in Java
  • Created and loaded tables and processed the data using Hive, which runs MapReduce jobs in the backend
  • Involved in loading data into HBase using HBase Shell, HBase Client API, Pig and Sqoop
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume
  • Implemented job lists for incremental imports into Hive tables
  • Developed custom UDFs for Pig and Hive using Java to process and analyze the data
  • Created Hive jobs to parse the log data and structure it in tabular format so the data can be queried effectively
  • Migrated ETL jobs to Pig Scripts to do transformations, joins and some pre-aggregations before storing data onto HDFS
  • Developed MapReduce programs to perform data transformation in Java
  • Experienced in managing and reviewing Hadoop log files
  • Created workflows and scheduling for the applications using the Oozie coordinator
  • Excellent hands-on experience in creating and publishing Tableau reports for various analytical user requirements
  • Played a key role in getting the Hadoop data to Tableau for analytical purposes
  • Implemented a proof of concept in Spark using Python for live chat analysis, which evaluates the customer chat experience and various attributes to define the type of customer and their navigation path for any clarification
  • Implemented Spark Core in Scala to process data in memory
  • Performed job functions using Spark APIs in Scala for real-time analysis and fast querying
  • Worked on different file formats like Sequence files, XML files and Map files using MapReduce Programs.
  • Worked with Avro Data Serialization system to work with JSON data formats

Environment: Hadoop, HDFS, Tableau, MapReduce, Sqoop, Oozie, Pig, Hive, HBase, Flume, Linux, Java, Cassandra, Cloudera Hadoop distribution, PuTTY, and Eclipse.

Confidential, Flint

Big Data Research Assistant


  • Worked as a Research Assistant under the supervision of the professors.
  • Provided consulting services, solutions, and training around the Big Data ecosystem (Hadoop, NoSQL, and Cloud).
  • Trained internally on Hadoop with the team. Ran a popular webinar series on Hadoop with the team.
  • Worked with the Big Data Storage Team that was building the University’s Big Data application. Virtualized Hadoop in a Linux environment to provide a safer, scalable analytics sandbox for the application.
  • Developed an HDFS plugin to interface with the University file system, with Hadoop-specific optimizations.
  • Built a scalable, cost-effective, and fault-tolerant data warehouse system and developed MapReduce jobs to analyze the data.
  • Worked on multiple virtual machines such as Cloudera.
  • Implemented Map-Reduce Programming on Classical MapReduce daemon
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Performed transformations using Hive and MapReduce; extracted the data into HDFS using Sqoop and Oozie workflows for visualization.
  • Analyzed the data by executing Hive queries and Pig scripts to understand the various scenarios of product sales.

Environment: Cloudera, Apache Hadoop, Linux, HDFS, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, Oozie, Hortonworks, Java, MapReduce


Java Developer


  • Responsible for and active in the analysis, design, implementation, and deployment phases of the project's full software development life cycle (SDLC)
  • Designed and developed user interface using JSP, HTML and JavaScript
  • Developed Struts action classes and action forms, performed action mapping using the Struts Framework, and performed data validation in form beans and action classes
  • Involved in multi-tiered J2EE design utilizing MVC architecture (Struts Framework) and Hibernate
  • Extensively used the Struts Framework as the controller to handle subsequent client requests and invoke the model based upon user requests
  • Involved in system design and development in core Java using Collections and multithreading
  • Defined the search criteria and pulled the customer's record from the database; made the required changes to the record and saved the updated information back to the database
  • Wrote JavaScript validations to validate the fields of the user registration screen and login screen
  • Developed stored procedures and triggers using PL/SQL to calculate and update the tables to implement business logic
  • Designed and developed XML processing components for dynamic menus in the application
  • Involved in postproduction support and maintenance of the application

Environment: Oracle 11g, Java 1.5, Struts 1.2, Servlets, HTML, XML, MS SQL Server 2005, J2EE, JUnit, Tomcat 6


Java Developer


  • Developed Admission & Census module, which monitors a wide range of detailed information for each resident upon pre-admission or admission to your facility.
  • Involved in development of Care Plans module, which provides a comprehensive library of problems, goals and approaches. You have the option of tailoring (adding, deleting, or editing problems, goals and approaches) these libraries and the disciplines you will use for your care plans.
  • Involved in development of General Ledger module, which streamlines analysis, reporting and recording of accounting information. General Ledger automatically integrates with a powerful spreadsheet solution for budgeting, comparative analysis and tracking facility information for flexible reporting.
  • Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
  • Designed the user interface and validation checks using JavaScript.
  • Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
  • Developed various EJBs for handling business logic and data manipulations from database.
  • Involved in design of JSPs and Servlets for navigation among the modules.
  • Designed cascading style sheets and the XML part of the Order Entry Module & Product Search Module and did client-side validations with JavaScript.

Environment: J2EE, Java/JDK, JDBC, JSP, Servlets, JavaScript, EJB, JNDI, JavaBeans, XML, XSLT, Oracle 9i, Eclipse, HTML
