Big Data Developer Resume

Durham, North Carolina


  • Over 7 years of IT experience, including analysis, design, development, testing, database programming, and user training for software applications using Big Data analytics, Java/J2EE, Informatica, MySQL, and Oracle.
  • 3 years of experience in Hadoop ecosystem development and administration, including MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, and HDFS administration.
  • Experience in developing MapReduce jobs in Java for data cleansing, transformation, preprocessing, and analysis, including implementing multiple mappers to handle data from multiple sources.
  • Skilled in installing, configuring, performance tuning, monitoring, and using Apache Hadoop ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume.
  • Professional experience in using Cloudera Manager for installation, configuration, and management of single-node and multi-node Hadoop clusters (CDH3/CDH4).
  • Involved in implementing Hadoop-based data warehouses and integrating Hadoop with enterprise data warehouse systems.
  • Skilled in working with Flume to load the log data from multiple sources (Teradata, Oracle, MySQL, etc.) directly into HDFS.
  • Professional experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as Regex, JSON, and Avro.
  • Hands-on experience in writing UNIX shell scripts, with strong analytical and problem-solving skills.
  • Experience in using Sqoop to import data into HDFS from RDBMS and vice versa.
  • Good understanding of Hadoop architecture and its components, such as HDFS, MapReduce programming, and the daemons (JobTracker, TaskTracker, NameNode, DataNode, and Secondary NameNode).
  • Experience in using Spark to improve job performance.
  • Hands-on experience in implementing business logic by writing UDFs in Java and using UDFs from Piggybank and other sources.
  • Experience in working with Oracle, DB2, and SQL Server, and with core Java concepts such as multithreading, collections, OOP, and I/O operations.
  • Professional experience in Java programming (JDK 1.4/1.5/1.6/1.7), J2EE (JSF, Struts, jQuery, JSON, CSS, JDBC, JSP, Spring, Hibernate), SQL, and PL/SQL.
  • Good understanding of the usage of repositories.
  • Hands-on experience in creating web pages using HTML, XHTML, CSS, and JavaScript.
  • Experience in troubleshooting and resolving issues related to system performance and Informatica applications.
  • Strong analytical and creative problem-solving skills, good communication skills, and the ability to work efficiently and effectively both in teams and individually.
  • Exceptional ability to learn and master new technologies quickly and to deliver on short deadlines, with effective team spirit.
  • Good global exposure to various work cultures and client interaction with diverse teams.


Hadoop: Hadoop, HDFS, MapReduce, HBase, Hive, Pig, Sqoop, ZooKeeper, Oozie, Flume, YARN, Cassandra, Solr, Spark, Storm

Server Side Scripting: UNIX Shell Scripting

Database: Oracle 8i/9i/10g/11g, Teradata, Microsoft SQL Server, MySQL 4.x/5.x, DB2

Programming Languages: Core Java (JDK 1.4/1.5/1.6/1.7), J2EE (JSF, Struts, jQuery, JSON, CSS, JDBC, JSP, Spring, Hibernate), C, C++, SQL, PL/SQL

Frameworks: Hibernate 3.x, Spring 3.x, Struts 2.x

Application Servers: Tomcat, WebLogic, IBM WebSphere

Web Services: Apache Axis, Apache CXF/XFire, SOAP, WSDL, REST, Jersey

Client Technologies: JavaScript, CSS, HTML5, XHTML, XML, jQuery

Web Technologies: JSP, Servlets, JNDI, JDBC, Java Beans, JavaScript

Operating Systems: UNIX (Solaris), Linux (CentOS, Ubuntu), MS Windows (various versions)

Tools: NetBeans, Eclipse, RAD, WSAD, TOAD, SoapUI, Visio, Informatica PowerCenter 8.5/9.1, Cognos 8, ERwin Data Modeler


Confidential, Durham, North Carolina

Big Data Developer


  • Involved in the architecture design, development, and implementation of Hadoop deployment, backup, and recovery systems.
  • Imported data from various sources and performed transformations using MapReduce and Hive to load data into HDFS.
  • Developed HiveQL scripts and implemented shell scripts to run them and perform an audit process once each job completed.
  • Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
  • Converted customer transaction information from JSON into pipe-separated data using different SerDes.
  • Used compression techniques with file formats to make efficient use of storage in HDFS.
  • Moved log files generated by various sources into HDFS through Flume for further processing.
  • Developed Hive tables to load data and wrote Hive queries that run internally as MapReduce jobs.
  • Designed Avro models based on the required database fields to present the data on dashboards.
  • Exported the analyzed data to relational databases using Sqoop for visualization and for generating reports for the BI team.
  • Automated all jobs that pull data from an HTTP server and load it into Hive tables, using Oozie workflows.
  • Used Cloudera Manager to monitor and manage the Hadoop cluster.

Environment: Hadoop CDH 4, Hive, Sqoop, Flume, Oozie, Shell scripting, Java (JDK 1.7), Eclipse, SQL.

Confidential, Boston, MA

Hadoop/Big Data Engineer


  • Involved in requirements gathering and business analysis, and translated business requirements into technical designs for Hadoop and Big Data.
  • Designed and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported data from databases into HDFS and exported it back using Sqoop.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Implemented business logic by writing UDFs in Java and used UDFs from Piggybank and other sources.
  • Used Pig as an ETL tool to perform transformations, event joins, and some pre-aggregations before storing data in HDFS.
  • Worked on analyzing Hadoop clusters and various Big Data analytics tools, including Pig, HBase, and Sqoop.
  • Responsible for operational support of the production system.
  • Involved in creating workflows to run multiple Hive and Pig jobs that run independently based on time and data availability.
  • Installed and configured a Spark cluster integrated with the Hadoop cluster to run programs in memory, faster than Hadoop MapReduce.
  • Collected and aggregated large amounts of web log data from sources such as web servers, mobile devices, and network devices using Apache Flume, and stored the data in HDFS for analysis.
  • Skilled in handling and reviewing Hadoop log files.
  • Implemented a custom Avro framework to address the small-files problem in Hadoop, and extended Pig and Hive to work with it.
  • Used ZooKeeper for cluster coordination and Oozie for scheduling workflows.
  • Developed Pig scripts to generate MapReduce jobs and performed ETL procedures on HDFS data.
  • Involved in creating UML diagrams, including package, state, and class diagrams.
  • Responsible for loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Developed Java APIs for invocation in Pig Scripts to solve complex problems.
  • Carried out ETL preparation for various reporting requirements using Informatica PowerCenter.

Environment: Hadoop, HDFS, Hive, Sqoop, Flume, Pig, Zookeeper, Oozie, HBase, Spark, Linux, Teradata, MySQL, Informatica, ETL.


Hadoop/Big Data Engineer


  • Responsible for the design and implementation of the complete framework.
  • Installed, configured, and performance-tuned Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Responsible for creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression techniques.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from different sources.
  • Used Hive and Pig for data analysis and ETL of the raw data coming in flat files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs, and used it for exporting HBase data.
  • Used Flume to import log data from the reaper logs and syslogs into the Hadoop cluster.
  • Responsible for debugging problems related to system performance and Informatica applications.
  • Used TOAD for data analysis and Informatica PowerCenter for data mapping and data transformation between the source and target databases.
  • Involved in project sizing and team discussions.
  • Involved in developing scripts and automating data management, from end-to-end integration work to synchronization between clusters.

Environment: Core Java, MapReduce, Hive, Pig, Flume, Sqoop, Oozie, HBase, MySQL, Oracle, Linux, Informatica, TOAD.


Java Developer


  • Responsible for the J2EE front end and back end, supporting business logic, integration, and persistence.
  • Wrote and maintained the Software Requirements Specification (SRS) for the project.
  • Involved in developing business logic components using Java Beans and Servlets.
  • Responsible for providing work direction, tracking progress, and managing the workload of other application developers as required.
  • Implemented a combination of JavaServer Pages to render the HTML and a well-defined API to allow access to the application services layer.
  • Responsible for developing SQL queries used within the Struts framework implementation.
  • Used Struts tag libraries, JAR files, and custom tags.
  • Responsible for writing and maintaining the Ant build script for the project.

Environment: Java, JSP, Servlets, JavaScript, JDBC, IBM WebSphere 5.1 Application Server, WSAD, TOAD, ChangeMan, MS Windows 2000, LDAP, Oracle JTA, JMS and JNDI.


Jr J2EE Developer


  • Responsible for drawing use case diagrams, class diagrams, and sequence diagrams for each scenario.
  • Designed and developed web interfaces and business logic using the Jakarta Struts framework (MVC architecture), JSP, Servlets, Java Beans, JDBC, AJAX, JavaScript, HTML, DHTML, and XML technologies.
  • Developed web services for data transfer from client to server and vice versa using AJAX and SOAP.
  • Implemented the user interface (UI) using JavaServer Pages.
  • Developed Server-side validation classes using JDBC calls.
  • Developed Java code for authentication.
  • Used JDBC to connect web applications to the Oracle 8.0 database.

Environment: JSP, JDBC, JDK, HTML, WebLogic, XML, Oracle 8i, AJAX, SOAP, Windows NT, UNIX.