
Big Data/Hadoop Developer Resume

Tampa, FL


  • 8+ years of experience across the SDLC with emphasis on Big Data technologies: Spark, Scala, Spark MLlib, Hadoop, Tableau, Cassandra, Java and J2EE.
  • Built streaming applications using Spark Streaming.
  • Knowledge of the big data database HBase and the NoSQL databases MongoDB and Cassandra.
  • Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns and AJAX.
  • Experience in working with MapReduce programs, Pig scripts and Hive commands to deliver the best results.
  • Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata big data analytics.
  • Experienced in collecting log data and JSON data into HDFS using Flume and processing the data using Hive/Pig.
  • Developed core modules in large cross-platform applications using Java, JSP, Servlets, JDBC, JavaScript, XML and HTML.
  • Hands-on experience with relational databases like Oracle, MySQL, PostgreSQL and MS SQL Server.
  • Experience with Java web framework technologies like Struts2, Camel and Spring Batch.
  • Good experience with the Hadoop framework and related technologies like HDFS, MapReduce, Pig, Hive, HBase, Sqoop and Oozie.
  • Expertise in data development on the Hortonworks HDP platform and Hadoop ecosystem tools like Hadoop, HDFS, Spark, Zeppelin, Hive, HBase, Sqoop, Flume, Atlas, Solr, Pig, Falcon, Oozie, Hue, Tez, Apache NiFi and Kafka.
  • Expertise in developing presentation layer components using HTML, CSS, JavaScript, jQuery, XML, JSON, AJAX and D3.
  • Managed the project based on Agile-Scrum Methods.
  • Worked with Bootstrap, AngularJS, Node.js, Knockout, Ember.js and the Java Persistence API (JPA).
  • Strong experience in developing Enterprise and Web applications on n-tier architecture using Java/J2EE based technologies such as Servlets, JSP, Spring, Hibernate, Struts, EJBs, Web Services, XML, JPA, JMS, JNDI and JDBC.
  • Experienced in managing and reviewing the Hadoop log files.
  • Developed applications based on Model-View-Controller (MVC).
  • Extensive experience in building and deploying applications on web/application servers like WebLogic, WebSphere and Tomcat.
  • Expert in Amazon EMR, Spark, Kinesis, S3, Boto3, Elastic Beanstalk, ECS, CloudWatch, Lambda, ELB, VPC, ElastiCache, DynamoDB, Redshift, RDS, Athena, Zeppelin and Airflow.
  • Good problem-solving skills for identifying areas of improvement and incorporating best practices to deliver quality work.
  • Excellent communication and interpersonal skills, which contribute to completing project deliverables on or ahead of schedule.


Languages: Java, J2EE, PL/SQL, Pig Latin, HQL, R, Python, XPath, Spark

Hadoop/Big Data: MapReduce, HDFS, Hive, Pig, HBase, Zookeeper, Sqoop, Oozie, Flume, Scala, Akka, Kafka, Storm, MongoDB

Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery

NoSQL Databases: Cassandra, MongoDB

Web Technologies: HTML, DHTML, XML, XHTML, JavaScript, CSS, XSLT, AWS, DynamoDB

Web/Application servers: Apache Tomcat 6.0/7.0/8.0, JBoss

Frameworks: MVC Struts, Spring, Hibernate.


Operating Systems: UNIX, Ubuntu Linux, CentOS, Sun Solaris and Windows.

Network protocols: TCP/IP fundamentals, LAN and WAN.

Databases: Oracle, MySQL, DB2, Derby, PostgreSQL, NoSQL databases (HBase, Cassandra), Microsoft Access, MS SQL


Confidential, Tampa, FL

Big Data/Hadoop Developer


  • Implemented solutions for ingesting data from various sources and processing data at rest using Big Data technologies such as Hadoop, MapReduce frameworks, HBase and Hive.
  • Performed data analysis, feature selection, feature extraction using Apache Spark Machine Learning streaming libraries in Python.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
  • Implemented solutions on AWS using services such as EC2, S3, RDS, Redshift and VPC.
  • Extensive development experience in IDEs such as Eclipse, NetBeans and IntelliJ.
  • Worked as a Hadoop consultant on MapReduce, Pig, Hive and Sqoop.
  • Worked with Apache Hadoop ecosystem components like HDFS, Hive, Sqoop, Pig and MapReduce.
  • Good exposure to GitHub and Jenkins.
  • Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
  • As a POC, used Spark for data transformation of larger data sets.
  • Set up and configured AWS EMR clusters and used Amazon IAM to grant users fine-grained access to AWS resources.
  • Enabled and configured Hadoop services such as HDFS, YARN, Hive, Ranger, HBase, Kafka, Sqoop, Zeppelin notebooks and Spark/Spark2.
  • Worked on Spark, Scala, Python, Storm and Impala.
  • Extensive experience in Spark Streaming (version 1.5.2) through the core Spark API in Scala and Java, transforming raw data from several sources into baseline data.
  • Created dashboards in Tableau and in Elasticsearch with Kibana.
  • Hands-on expertise in running Spark and Spark SQL.
  • Experienced in analyzing and optimizing RDDs by controlling partitions for the given data.
  • Worked on the MapR Hadoop platform to implement Big Data solutions using Hive, MapReduce, shell scripting and Java technologies.
  • Used Struts (MVC) to implement business model logic.
  • Evaluated deep learning algorithms for text summarization using Python, Keras, TensorFlow and Theano on a Cloudera Hadoop system.
  • Deployed and managed applications on a Tomcat server.
  • Experienced in querying data using Spark SQL on top of Spark engine.
  • Experience in managing and monitoring Hadoop cluster using Cloudera Manager.
  • Developed different kinds of interactive graphs in RStudio.
  • Set up a Shiny Server instance on CentOS Linux and deployed reports to it.
  • Created ER diagrams for data modeling.

Environment: Big Data, JDBC, NoSQL, Spark, YARN, Hive, Pig, Scala, NiFi, IntelliJ, AWS EMR, Python, Hadoop, Redshift.

Confidential, DE

Big Data/Hadoop Developer


  • Single View of Product: developed scripts using Sqoop, SCP and Hive to consolidate PCM and PSSA attributes of all products sold at Lowe's, scheduled with an Oozie coordinator.
  • Consolidation of Allied BU sales, inventory, customer, GL and other data: developed a data ingestion pipeline using Sqoop and Falcon, wrote scripts in Bash, Spark, Hive and Pig, and built data visualizations using MSTR VI.
  • Performed data modeling and schema design for NoSQL (HBase) and Impala tables.
  • Worked on Spark, Storm, Apache Apex and Python.
  • Implemented machine learning algorithms using Spark with Python.
  • Developed the application by using the Spring MVC framework.
  • Delivery experience with major Hadoop ecosystem components such as Pig, Hive, Spark, Kafka, Elasticsearch and HBase, with monitoring through Cloudera Manager.
  • Involved in installing EMR clusters on AWS.
  • Developed HiveQL scripts for performing transformation logic and also loading the data from staging zone to landing zone and Semantic zone.
  • Used Git for version control.
  • Used the Spark API for machine learning; translated a predictive model from SAS code to Spark.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Involved in implementing REST- and SOAP-based web services.
  • Worked on improving the performance of the application.
  • Used AWS Data Pipeline to schedule an Amazon EMR cluster to clean and process web server logs stored in an Amazon S3 bucket.
  • Developed shell and Hive scripts for consolidating BrandView pricing data, scheduled with Oozie.
  • Used Spring IoC to inject values for dynamic parameters.
  • Actively involved in code review and bug fixing for improving the performance.

Environment: Apache Hadoop, HDFS, Hive, MapReduce, Impala, Cloudera, Pig, Sqoop, Kafka, Spark, Apache Cassandra, Oozie, Zookeeper, MySQL, Eclipse, PL/SQL and Python.

Confidential, Minnesota, MN

Hadoop Developer


  • Responsible for building scalable distributed data solutions using Hadoop.
  • Wrote multiple MapReduce programs in Java for data analysis.
  • Wrote MapReduce jobs using Pig Latin and the Java API.
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Collected logs from the physical machines and the OpenStack controller and integrated them into HDFS using Flume.
  • Designed and presented a plan for a POC on Impala.
  • Experienced in migrating HiveQL to Impala to minimize query response time.
  • Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
  • Worked on Sequence files, RC files, Map side joins, bucketing, partitioning for Hive performance enhancement and storage improvement.
  • Performed extensive data mining using Hive.
  • Responsible for performing extensive data validation using Hive.
  • Created Sqoop jobs and Pig and Hive scripts to ingest data from relational databases for comparison with historical data.
  • Set up a Hadoop cluster on Amazon EC2 using Apache Whirr for a POC.
  • Implemented test scripts to support test driven development and continuous integration.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, Zookeeper, SVN, AutoSys, Tableau, HBase.

Confidential, Houston, TX

Hadoop Developer


  • Installed and configured Cloudera Manager for easy management of the existing Hadoop cluster.
  • Deployed Network File System for NameNode metadata backup.
  • Set up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Used Sqoop to transfer data between RDBMS and HDFS.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Designed and implemented custom Writables, custom input formats, custom partitioners and custom comparators in MapReduce.
  • Thoroughly tested MapReduce programs using the MRUnit and JUnit testing frameworks.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Converted existing SQL queries into HiveQL queries.
  • Implemented UDFs, UDAFs and UDTFs in Java for Hive to handle processing that Hive's built-in functions cannot perform.
  • Wrote Pig Latin scripts for advanced analytics on data for recommendations.
  • Used Oozie to develop automated workflows of Sqoop, MapReduce and Hive jobs.
  • Organized daily Scrum meetings with the team, prioritized product backlog items and was responsible for timely delivery and deployment of product releases.

Environment: CDH5.4 Cloudera Distribution, Sqoop, Pig Latin, Hive, Flume, HDFS, MapReduce, Eclipse IDE, UNIX Shell Scripting, Apache Solr


Sr. Java/J2EE Developer


  • Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring Aspect Oriented Programming (AOP) functionality.
  • Developed the application framework using Struts and J2EE design principles, applying the Business Delegate, Service Locator, Session Facade, Domain Object and DAO patterns, and developed a Stateless Session Bean to implement the Session Facade pattern.
  • Developed Stored Procedures and triggers using PL/SQL in order to calculate and update the tables to implement business logic.
  • Developed SQL queries and Stored Procedures using PL/SQL to retrieve and insert into multiple database schemas.
  • Helped DevOps teams configure servers by building cookbooks to install and configure Tomcat.
  • Developed the XML Schema and web services for data maintenance and structures. Wrote test cases in JUnit for unit testing of classes.
  • Worked with the DOM and DOM functions using Firefox and the IE Developer Toolbar.
  • Used JSP, HTML, JavaScript, AngularJS and CSS3 for content layout and presentation.
  • Did core Java coding using JDK 1.3, the Eclipse Integrated Development Environment (IDE), ClearCase and Ant.
  • Developed user interface screens using Spring MVC to enable customers to obtain auto finance. Extensive experience in developing web-based applications using Hibernate 3.0 and Spring frameworks.
  • Developed Authentication layer using Spring Interceptors.
  • Used Log4j to write debug, warning and info log messages to the server console.
  • Built test cases using JUnit and carried out unit testing.
  • Developed Spring REST Exception Mappers.

Environment: Java, XML, HTML, JavaScript, JDBC, UNIX, CSS, SQL, PL/SQL, Web MVC, Eclipse, Ajax, jQuery, Spring with Hibernate, ActiveMQ, Jasper Reports, Ant as build tool, MySQL and Apache Tomcat.


Software Engineer


  • Developed the code using the Struts framework.
  • Involved in requirement analysis.
  • Developed UI components using JSP and JavaScript.
  • Involved in writing the technical design document.
  • Developed the front end for the site based on the MVC design pattern using the Struts framework.
  • Created a data access layer to make the rest of the code database-independent.
  • Developed JSPs and Servlets and created JavaBeans for the application.
  • Developed sample requests and responses for testing web services.
  • Deployed web applications on the server using Apache Tomcat.
  • Developed new code for the change requests.
  • Developed complex PL/SQL queries to access data.
  • Coordinated across multiple development teams for quick resolution to blocking issues.
  • Prioritized tasks and coordinated assignments with the team.
  • Performed on-call support on a weekly rotation.
  • Performed manual and automated testing.
  • Involved in writing and updating the Test cases in the Quality tool.

Environment: JSP, JavaBeans, Servlets, Oracle, HTML, JavaScript, JDBC, PL/SQL. The web-tier consists of Apache
