Java Hadoop Developer Resume



  • 8+ years of overall IT experience across a variety of industries, including 4+ years of hands-on experience with Big Data technologies and with designing and implementing MapReduce jobs.
  • Expertise with the tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Kafka, YARN, Oozie, and ZooKeeper.
  • Excellent knowledge of Hadoop components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the MapReduce programming paradigm.
  • Experience in designing and developing POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Proficient in implementing HBase and Spark SQL.
  • Explored various Spark modules and worked with DataFrames, RDDs, and SparkContext.
  • Experience in migrating data using Sqoop from HDFS to relational database systems and vice versa, according to client requirements.
  • Experience in data analysis using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Experience using the Oozie workflow scheduler to manage Hadoop jobs as a Directed Acyclic Graph (DAG) of actions with control flows.
  • Strong understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Proficient in row key and schema design for NoSQL databases.
  • Experience working with CQL (Cassandra Query Language) to query and retrieve data from Cassandra clusters.
  • Proficient with cluster management and configuration of Cassandra databases.
  • Extensive experience importing and exporting data using Flume and Kafka.
  • Experience working with Flume to load log data from multiple sources directly into HDFS.
  • Experience configuring ZooKeeper to coordinate the servers in clusters and to maintain data consistency.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Developed a data pipeline using Kafka, Spark, and Hive to ingest, transform, and analyze data.
  • Proficient in coding optimized Teradata batch processing scripts for data transformation, aggregation, and load using BTEQ.
  • Extensive experience with Teradata databases and with analyzing clients' business needs; developed and performed data integration on top of Hadoop using Talend.
  • Involved in building comparison graphs and a comparison matrix with all details listed according to requirements, using Talend Open Studio.
  • Strong experience in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, and JDBC.
  • Good understanding of writing Python scripts.
  • Experience working with BI teams to translate big data requirements into Hadoop-centric technologies.
  • Strong experience with Data Warehousing ETL concepts using Informatica PowerCenter, OLAP, OLTP, and AutoSys.
  • Experienced in working with Amazon Web Services (AWS), using EC2 for computing and S3 for storage.
  • Good understanding of and working experience with cloud-based architectures.
  • Strong experience in object-oriented design, analysis, development, testing, and maintenance.
  • Excellent implementation knowledge of enterprise, web, and client-server applications using Java and J2EE.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development, and Agile Scrum.
  • Experience working with the Spring and Hibernate frameworks in Java.
  • Good experience and expertise in Oracle ORMB and stored procedure concepts.
  • Experience in performance tuning and query optimization.
  • Experience using various IDEs (Eclipse, IntelliJ) and version control repositories (SVN, CVS).


Big Data Technologies: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, HBase, Scala, Spark, Apache Kafka, Cassandra, MongoDB, Solr, Ambari, Ab Initio, Akka Framework

Database: Oracle 10g/11g, PL/SQL, MySQL, MS SQL Server 2012

SQL Server Tools: Enterprise Manager, SQL Profiler, Query Analyzer, SQL Server 2008, SQL Server 2005 Management Studio, DTS, SSIS, SSRS, SSAS

Language: C, C++, Java, Python

Development Methodologies: Agile, Waterfall

Testing: JUnit, Selenium WebDriver

ETL Tools: Talend Open Studio, Pentaho, Tableau

IDE Tools: Eclipse, NetBeans, IntelliJ

Modelling Tools: Rational Rose, StarUML, Visual Paradigm for UML

Architecture: Relational DBMS, Client-Server Architecture, OLAP, OLTP

Cloud Platforms: AWS Cloud, Google Cloud

Operating System: Windows 7/8/10, Vista, UNIX, Linux, Ubuntu, Mac OS X


Confidential, CA

Java Hadoop Developer


  • Worked on IoT and the ThingSpace platform, Confidential's built-in authentication service for accessing information and development within the intranet.
  • Also worked on Couchbase DB to maintain the application stack built for the COHO product.
  • Worked with Docker containers; strong knowledge of building images and composing services.
  • Strong experience working wif Amazon AWS for accessing Hadoop cluster components.
  • Solid understanding of cloud and open source technologies: AWS, Docker, Elasticsearch, Git, Stash.
  • Implemented Elasticsearch/Scala/Akka streaming; deployed and maintained multi-node Dev and Test Kafka clusters.
  • Experience in job management using the Fair Scheduler; developed job processing scripts using Oozie workflows.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze the logs produced by the Spark cluster.
  • Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
  • Experienced in handling large datasets using partitions, Spark's in-memory capabilities, broadcasts, effective and efficient joins, transformations, and other techniques during the ingestion process itself. Managed and reviewed Hadoop log files.
  • Strong working experience with Cassandra, querying and retrieving data from Cassandra clusters.
  • Responsible for developing services that integrate two different environments sharing common connectivity, using a simulator built in Java. Strong knowledge of and work experience in building Java services such as REST and SOAP APIs.
  • Developed dashboards using Splunk to search logs, identify problems, and resolve them as quickly as possible.
  • Strong working knowledge of deployment tools such as Mesos and Marathon.

Environment: Hadoop YARN, Spark-Core, Spark-Streaming, Spark-SQL, Scala, Python, Kafka, Hive, Sqoop, Amazon AWS, Elasticsearch, Cassandra, Java APIs, Splunk, Docker, Mesos & Marathon, ReadyAPI tool for testing Java web services

Confidential, NY

Hadoop Scala/Spark Developer


  • Worked on clusters of 150-200 nodes.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Migrated MapReduce programs to Spark transformations using Spark and Scala.
  • Used Spark Streaming APIs to perform transformations and actions on the fly to build the common learner data model, which receives data from Kafka in near real time and persists it into Cassandra.
  • Developed Spark scripts using Scala shell commands as per requirements.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark 1.6 for data aggregation, queries, and writing data back into the OLTP system through Sqoop; also developed an enterprise application using Scala.
  • Expertise in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory tuning.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Experience and hands-on knowledge of the Akka and Lift frameworks.
  • Used PostgreSQL and NoSQL databases, integrated with Hadoop, to develop datasets on HDFS.
  • Created partitioned Hive tables, loaded and analyzed data using Hive queries, and implemented partitioning and bucketing in Hive.
  • Worked on a POC comparing the processing time of Impala with that of Apache Hive for batch applications, with a view to adopting Impala in the project.
  • Developed Hive queries to process the data and generate data cubes for visualization.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Good experience with Talend Open Studio for designing ETL jobs for data processing. Experience designing, reviewing, implementing, and optimizing data transformation processes in the Hadoop and Talend/Informatica ecosystems.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Coordinated with admins and senior technical staff on migrating Teradata and Ab Initio to Hadoop.
  • Configured Hadoop clusters and coordinated with Big Data admins for cluster maintenance.

Environment: Hadoop YARN, Spark-Core, Spark-Streaming, Spark-SQL, Scala, Python, Kafka, Hive, Sqoop, Amazon AWS, Elasticsearch, Impala, Cassandra, Tableau, Informatica, Cloudera, Oracle 10g, Linux.

Confidential, Irving TX

Hadoop Developer


  • Detailed understanding of the existing build system and of the tools that provide information on various products, releases, and test results.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
  • Developed UDFs to provide custom Hive and Pig capabilities using SOAP/RESTful services.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Built a mechanism for Talend that automatically moves the existing proprietary binary-format data files to HDFS using a service called Ingestion service.
  • Performed data transformations in Scala and Hive, and used partitions and buckets for performance improvements.
  • Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
  • Wrote custom Writable classes for Hadoop serialization and deserialization of time series tuples.
  • Implemented a custom file loader for Pig so that large data files such as build logs can be queried directly.
  • Used Python for pattern matching in build logs to format errors and warnings.
  • Developed Pig Latin scripts and shell scripts for validating the different query modes in Historian.
  • Created Hive external tables on the MapReduce output, then applied partitioning and bucketing on top of them.
  • Used Solr for database integration with SQL Server.
  • Monitored clusters and provided reporting using Solr.
  • Improved performance by tuning Scala, Hive, and MapReduce jobs, using Talend, ActiveMQ, and JBoss.
  • Developed a daily test engine using Python for continuous testing.
  • Expertise in performing Cloudera operations on HDFS data.
  • Used shell scripting for Jenkins job automation with Talend.
  • Built a custom calculation engine that can be programmed according to user needs.
  • Ingested data into Hadoop using shell scripting and Sqoop, and applied data transformations using Pig and Hive.
  • Researched, evaluated, and used new technologies, tools, and frameworks around the Hadoop ecosystem within a Scrum process.

Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Cloudera, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Solr, Oracle, and CDH 4.x.

Confidential, Pasadena, CA

Hadoop Developer


  • Created, validated, and maintained scripts to load data manually using Sqoop.
  • Created Oozie workflows and coordinators to automate weekly and monthly Sqoop jobs.
  • Worked on reading multiple data formats on HDFS using.
  • Involved in converting Hive/SQL queries into transformations.
  • Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Expertise in RDBMS and in database normalization and denormalization concepts and principles.
  • Strong experience creating database objects such as tables, views, functions, stored procedures, indexes, triggers, and cursors in Teradata.
  • Strong skills in coding and debugging Teradata utilities such as FastLoad, FastExport, MultiLoad, and TPump for Teradata ETL processing of huge data volumes.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and Spark, and loaded data into HDFS.
  • Analyzed the SQL scripts, designed the solution to implement them, and ran reports in Pig and Hive.
  • Developed, validated, and maintained HiveQL queries.
  • Accessed MongoDB and Cassandra to perform operations on both clusters.
  • Created, validated, and maintained scripts to load data from and into tables in SQL Server 2012.
  • Wrote MapReduce jobs in both Python and Java.
  • Fetched data to and from HBase using MapReduce jobs.
  • Ran reports using Pig and Hive queries.
  • Analyzed data with Hive and Pig.
  • Designed Hive tables to load data to and from external tables.
  • Wrote shell scripts to load data across servers.
  • Imported data from MySQL databases into Hive using Sqoop.
  • Designed Hive tables to load data to and from external files.
  • Wrote and implemented Apache Pig scripts to load data from and store data into Hive.

Environment: Hadoop Stack (Hive, Pig, HCatalog, Sqoop, Oozie), QlikView, Linux, SQL Server 2010, Bitbucket, Java, Python.


Java Developer


  • Developed the application using Spring as the front-end architecture, Hibernate as the data access layer, WebLogic as the application server, and Oracle as the database.
  • Designed and developed the system components using Agile software methodology.
  • Created Spring controllers and integrated them with business components and view components.
  • Used Jenkins and Maven.
  • Developed Spring and Hibernate data layer components for the application.
  • Involved in database updates and DDL creation.
  • Developed RESTful web services for accessing ordering information.
  • Helped the UI team integrate using Spring and RESTful services.
  • Created unit test cases using the JUnit testing framework.
  • Involved in deploying the application on the WebLogic server.
  • Used SVN for version control.
  • Coordinated with various teams, including support and test teams.

Environment: Java, J2EE (Servlets, JSP, JDBC), Spring, Hibernate, XML, Web Services, Oracle SQL/PLSQL, Jenkins, Maven


Java Backend Developer


  • Developed a bulk loading module, a consumer-specific access control model, and search functionality for the catalog server in our product.
  • Responsible for designing and developing the catalog server UI using XML/XSLT.
  • Developed shopping cart functionality (back end and UI) for the catalog.
  • Responsible for meeting the specific performance goals for the catalog server put forth by the customer.
  • Deployed the application using the Tomcat web server.
  • Worked on a web-based reporting system with HTML, JavaScript, and JSP.
  • Involved in designing the database schema and writing complex SQL queries.
  • Accessed the database using JDBC.
  • Used Oracle database for the application.
  • Expertise in and working knowledge of Oracle ORMB and performance tuning.
  • Extensively involved in writing stored procedures for data retrieval, storage, and updates in the Oracle database using JDBC.

Environment: HTML, CSS, JSP, Java 1.2, JavaScript, JVM, JDBC, Tomcat, Oracle, Servlets, XML, XSLT, MySQL

Educational Qualification: Bachelor’s in Information Technology, JNTU, Hyderabad, 2008
