Hadoop Engineer Resume

San Francisco, CA


  • 6+ years of experience in Java EE development and 3+ years of hands-on experience with Big Data technologies in the Hadoop ecosystem
  • Deep understanding of Hadoop architecture and its major components, such as HDFS, MapReduce v1, and YARN, and a good understanding of scalability, workload management, schedulers, and distributed platform architectures
  • Technical expertise in Hadoop MapReduce, Amazon EMR, HDFS, HBase, Hive, Cloudera Manager, ZooKeeper, Pig, Sqoop, Impala, SQL, Oracle SQL, MongoDB, and Linux/UNIX shell scripting
  • Experience developing MapReduce jobs on Hadoop with Java 8 and Python 2.7/3.x
  • Experience developing Spark applications using Scala and Python 2.7/3.x
  • Extensive experience using Sqoop to import and export data between HDFS/Hive/HBase and relational database management systems (RDBMS)
  • Experience aggregating and moving large volumes of streaming data using Flume, Kafka, and RabbitMQ
  • Extensive experience writing Pig scripts and Hive/Impala queries to process and analyze large volumes of data with varying levels of structure
  • Experience designing both time-driven and data-driven automated workflows using Oozie
  • Hands-on experience with data mining, machine learning, and the underlying algorithms
  • Strong understanding of core Java, data structures, algorithm design, Object-Oriented Design (OOD), Object-Oriented Programming (OOP), and MVC
  • Hands-on experience with the Java Collections Framework, exception handling, and concurrent programming
  • Hands-on experience with Amazon Web Services (AWS EC2, EMR)
  • Hands-on experience with HTML, CSS, JSP, Bootstrap, and Java frameworks such as Spring MVC
  • Experience with Apache Tomcat and Java Servlets
  • Experienced in Tableau Desktop for data visualization and analysis
  • Experience with Java unit tests (JUnit) and integration tests
  • Hands-on experience with flowchart and diagram software (Microsoft Visio), Illustrator, and Photoshop
  • Experience with source code management tools such as Git, GitHub, SVN, Bitbucket
  • Experience with project tracking software such as Jira
  • Experience in Agile Development environments
  • Hardworking professional with a strong ability to work well in a team environment. Exceptional time management skills with a strong work ethic.


Apache Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, ZooKeeper, Flume, EMR, Impala, Kafka, RabbitMQ, Spark, Oozie, MRUnit

Relational Databases: Oracle 11g/10g, PostgreSQL 8.0, MySQL 5.0, Microsoft SQL Server 9.0

NoSQL Databases: HBase, Cassandra, MongoDB

Scripting: Linux/UNIX shell scripting

Languages: Java, Python, Scala, SQL, HiveQL, Pig

Operating Systems: Linux, Windows, Mac OS

Others: Spring MVC/Hibernate, REST, SOAP, VMware software, VirtualBox, XAMPP, LAMP, Git, GitHub, SVN, Bitbucket, AWS, Splunk, Mongoose


Confidential, San Francisco, CA

Hadoop Engineer


  • Installed and configured Hadoop HDFS and developed MapReduce jobs in Java for data cleaning
  • Configured Hadoop 2.0 and MapReduce v2.6.x
  • Used ZooKeeper for cluster coordination services
  • Wrote Sqoop scripts to transfer data between Hive and Oracle Database 11g
  • Helped move all log files to HDFS through Flume 1.5 for further processing
  • Created Hive tables to store the processed results in a tabular format
  • Optimized Hive tables to provide better HiveQL query performance
  • Performed real-time analytics using Kafka 0.8 and HBase 0.9
  • Modified database tables and used HBase queries to insert and fetch data from tables
  • Used Apache Spark 1.0 for joining disparate datasets and aggregating statistics (e.g., average or sum)
  • Used Tableau 9.2 to graph results for reporting
  • Used the Splunk platform to review and monitor server log files
  • Participated in reviews of functional and performance requirements

Environment: ZooKeeper, Hadoop 2.0, YARN, Spark 1.0, MapReduce v2.6.x, Kafka 0.8, Flume 1.5, Hive, Pig, Oracle Database 11g, HBase 0.9, Tableau 9.2, Splunk dashboard

Confidential, Ann Arbor, MI

Software Engineer


  • Designed and implemented large-scale parallel processing based on Hadoop 1.x and MapReduce
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager
  • Wrote Sqoop scripts to transfer data between databases
  • Experience with Oracle SQL Database and SQL queries
  • Experience with relational databases such as Teradata, Vertica, and DB2
  • Wrote Hive queries for data analysis and query optimization
  • Used Flume 1.x to stream and analyze the log data
  • Queried data on HDFS and HBase using Impala 1.x
  • Designed a technical solution for real-time analytics using Kafka and HBase
  • Set up standards and processes for Hadoop-based application design and implementation
  • Reviewed and monitored data processing log files

Environment: Hadoop 1.x, MapReduce, Cloudera Manager, HDFS, Sqoop, Hive, Oracle SQL Database, Teradata, Vertica, DB2, Flume, HBase, Kafka

Confidential, Detroit, MI

Database and Software Developer


  • Worked with the Google Maps API and RESTful services
  • Implemented the system using Java 7, Eclipse, and Apache Tomcat
  • Tested code with JUnit unit tests and integration tests
  • Performed SQL Server database maintenance
  • Migrated the current Microsoft SQL Server database to Oracle Database
  • Collaborated with teammates to set up the sample database for scale testing
  • Set up the scale-testing environment using VMware software
  • Set up and configured Amazon EMR on AWS
  • Configured and optimized the Hadoop 1.x and MapReduce v1 environment
  • Implemented ETL (Extract, Transform, Load) to load data into HDFS with Sqoop 1.4
  • Handled data transformations using Hive and loaded data into HDFS
  • Implemented CRUD operations on HBase 0.9 data
  • Analyzed large data sets to determine the optimal way to aggregate and report on them
  • Used Oozie to manage Apache Hadoop jobs
  • Extended Hive and Pig 0.1 core functionality using HCatalog

Environment: Google Maps API, RESTful, Apache Tomcat, JUnit, Hadoop 1.x, Microsoft SQL Server, Oracle Database, MapReduce v1, Hive 0.1, Pig 0.1, HCatalog, Sqoop 1.4, HBase 0.9, HDFS, Oozie, Amazon EMR, Amazon AWS

Confidential, Detroit, MI

Software Developer


  • Designed a point of storage interface using OOP/OOD, based on Java 6 and NetBeans
  • Understanding of database design/architecture and experience with MS SQL/RDBMS/DB2 databases
  • Strong understanding of IIS, XML (DTD, SOAP), JSON, XSL, WSDL
  • Familiar with large data models
  • Experience with Spring MVC
  • Implemented the system using Java 6, Eclipse, Log4j, and Apache Tomcat
  • Experience with JSP and Servlets
  • Worked in an Agile team and teamed up with QA
  • Experience with project tracking software such as Jira
  • Experience with the Git version control tool

Environment: Java 6, Spring MVC, Log4j, Apache Tomcat, SQL/RDBMS/DB2, XML, DTD, SOAP, WSDL, JSON, OOP/OOD, Shell Scripting, Linux


Java Developer


  • Used HTML, CSS, and JSP to implement the payment interface, based on Java 6 and Eclipse
  • Designed the MySQL database to handle large amounts of data using a relational database schema, table partitioning, etc.
  • Familiar with MySQL data query and JDBC for database connectivity
  • Experience with RESTful web services
  • Implemented information authentication with a cryptographic algorithm
  • Experience with Apache Tomcat and Java Servlets
  • Experience with unit tests (JUnit), integration tests, and configuration changes
  • Worked in an Agile team

Environment: Java 6, MySQL, JUnit, J2EE, Eclipse, Apache Tomcat 6, HTML, CSS, JSP, JDBC, JavaScript, Agile