Sr. Big Data Developer Resume

Charlotte, NC


  • Results-driven IT professional with demonstrable expertise in developing, integrating, analyzing, deploying, and maintaining Big Data and Hadoop ecosystems.
  • Experience in all stages of the SDLC (Agile, Waterfall): writing technical design documents, development, testing, and implementation of enterprise-level data marts and data warehouses.
  • Highly skilled in installing, customizing, and testing Big Data and Hadoop ecosystem components such as Hive, Pig, Sqoop, Spark, and Oozie.
  • Experience in Hadoop Map Reduce, Pig, Hive, Oozie, Sqoop, Flume, Zookeeper.
  • Strong experience with the Cloudera and Hortonworks Hadoop distributions on AWS, maintaining and optimizing AWS infrastructure (EC2 and EBS); good knowledge of MS Azure as well.
  • Hands on experience with NoSQL Databases like HBase, Cassandra and relational databases like Oracle and MySQL.
  • Proficient in Java, Collections, J2EE, Servlets, JSP, Spring, Hibernate and JDBC/ODBC.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Experience in setting up Test, QA, and Prod environment.
  • Experience in extending Hive and Pig core functionality by writing custom UDFs using Java.
  • Experience in developing MapReduce (Yarn) jobs for cleaning, accessing and validating the data.
  • Experience in Different Distributions like Cloudera, Hortonworks and MapR.
  • Experience in creating different visualizations using bars, lines, pies, maps, scatter plots, bubbles, histograms, bullets, heat maps, and highlight tables.
  • Expertise in web page development using JSP, HTML, JavaScript, jQuery, and Ajax.
  • Experience in writing database objects like Stored Procedures, Functions, Triggers, PL/SQL packages and Cursors for Oracle, SQL Server, and MySQL.
  • Experience working with structured and unstructured data in various file formats such as Avro, XML, JSON, sequence files, ORC, and Parquet.
  • Expertise with Application servers and web servers like Oracle WebLogic, IBM WebSphere and Apache Tomcat.
  • Experience working in environments using Agile (Scrum) and Waterfall methodologies.
  • Expertise in database modeling and development using SQL and PL/SQL, MySQL, Teradata.
  • Strong programming experience using Java, Scala, Python and SQL.
  • Expertise in developing production ready Spark applications utilizing Spark-Core, Data frames, Spark-SQL, Spark-ML and Spark-Streaming API's.
  • Experience in working with NoSQL database like HBase, Cassandra and Mongo DB.
  • Good experience in developing applications using Java, J2EE, JSP, MVC, EJB, JMS, JSF, Hibernate, AJAX, and web-based development tools.
  • Good experience using modern version control systems such as GitHub and Bitbucket.
  • Experience with web app development with NodeJS, ExpressJS, HTML, CSS and JavaScript.
  • Extensive experience with advanced J2EE frameworks such as Spring, Struts, JSF, and Hibernate.


Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper

Hadoop Distributions: Cloudera, Hortonworks, MapR

Cloud: AWS, Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Databases: Oracle 12c/11g, SQL Server, MySQL

Database Tools: TOAD, SQL*Plus

Operating Systems: Linux, UNIX, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB

Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS


Confidential - Charlotte, NC

Sr. Big Data Developer

Roles & Responsibilities:

  • Extensively worked on Hadoop eco-systems including Hive, MongoDB, Zookeeper, Spark Streaming with MapR distribution.
  • Used Agile methodology process in the development project and used JIRA to manage the issues/project work flow.
  • Worked in Azure environment for development and deployment of Custom Hadoop Applications.
  • Designed and implemented scalable Cloud Data and Analytical architecture solutions for various public and private cloud platforms using Azure.
  • Involved in the end-to-end process of Hadoop jobs using technologies such as Sqoop, Pig, Hive, MapReduce, Spark, and shell scripts (for scheduling a few jobs).
  • Implemented various Azure platforms such as Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
  • Extracted and loaded data into Data Lake environment (MS Azure) by using Sqoop which was accessed by business users and data scientists.
  • Managed and supported enterprise Data Warehouse operations and advanced predictive big data application development using Cloudera and Hortonworks HDP.
  • Designed, developed and maintained Big Data streaming and batch applications using Storm.
  • Worked with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files.
  • Implemented and configured workflows using Oozie to automate jobs.
  • Participated in the managing and reviewing of the Hadoop log files.
  • Used Elastic Search & MongoDB for storing and querying the offers and non-offers data.
  • Developed web applications using Servlets, JSP, JDBC, and EJB, and web services using the JAX-WS and JAX-RS APIs.
  • Improved the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark-SQL, Data Frames, Pair RDDs, and Spark on YARN.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Developed a Spark job in Java which indexes data into Elastic Search from external Hive tables which are in HDFS.
  • Used Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS using Scala, as well as in NoSQL databases such as HBase and Cassandra.
  • Performed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Built Hadoop solutions for big data problems using MR1 and MR2 in Yarn.
  • Handled importing of data from various data sources, performed transformations using Hive, Pig, and loaded data into HDFS.
  • Involved in identifying job dependencies to design workflow for Oozie & Yarn resource management.
  • Used Windows Azure SQL Reporting Services to create reports with tables, charts, and maps.
  • Configured Oozie workflow to run multiple Hive and Pig jobs which run independently with time and data availability.
  • Imported and exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Supported the Cloud Strategy team in integrating analytical capabilities into an overall cloud architecture and business case development.

Environment: Azure, Hadoop 3.0, Sqoop 1.4.6, Pig 0.17, Hive 2.3, MapReduce, Spark 2.2.1, shell scripts, SQL, Hortonworks, Python 3.6, MLlib, HDFS, YARN 2.9, Java, Kafka 1.0, Cassandra 3.11, Oozie 4.3
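The Oozie workflows mentioned above, which ran Hive and Pig jobs on time and data availability, can be sketched minimally as a workflow definition with a single Hive action; all names, paths, and schema versions below are illustrative, not taken from the actual project:

```xml
<!-- Minimal sketch of an Oozie workflow running one Hive script.
     Application name, script name, and schema versions are assumptions. -->
<workflow-app name="daily-hive-load" xmlns="uri:oozie:workflow:0.5">
    <start to="hive-load"/>
    <action name="hive-load">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>load_offers.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive load failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

In practice a coordinator definition would wrap this workflow to trigger it on a schedule or on dataset availability, as the bullets describe.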

Confidential - Peoria, IL

Sr. Big Data/Hadoop Developer

Roles & Responsibilities:

  • Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
  • Worked on all activities related to the development, implementation and support for Hadoop.
  • Designed custom re-usable templates in Nifi for code reusability and interoperability.
  • Involved in Installing, Configuring Hadoop Eco System, and Cloudera Manager using CDH4 Distribution.
  • Worked with teams in setting up AWS EC2 instances by using different AWS services like S3, EBS, and Elastic Load Balancer, Auto scaling groups, VPC subnets and CloudWatch.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
  • Worked with Kafka streaming tool to load the data into HDFS and exported it into MongoDB database.
  • Created partitions and buckets based on State for further processing using bucket-based Hive joins.
  • Installed and Configured Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Implemented multiple MapReduce Jobs in java for data cleansing and pre-processing.
  • Wrote complex Hive queries and UDFs in Java and Python.
  • Worked on AWS provisioning EC2 Infrastructure and deploying applications in Elastic load balancing.
  • Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters; experienced in converting MapReduce applications to Spark.
  • Worked with Hadoop eco system covering HDFS, HBase, YARN and MapReduce.
  • Used Scala and Spark-SQL to develop Spark code for faster processing and testing, and performed complex Hive queries on Hive tables.
  • Wrote and executed SQL queries to work with structured data in relational databases and to validate transformation/business logic.
  • Used Flume to move data from individual data sources into the Hadoop system.
  • Used the MRUnit framework to test MapReduce code.
  • Responsible for building scalable distributed data solutions using Hadoop Eco system and Spark.
  • Involved in data acquisition and pre-processing of various types of source data using StreamSets.
  • Responsible for design & development of Spark SQL Scripts using Scala/Java based on Functional Specifications.
  • Worked with NoSQL Cassandra to store, retrieve, update, and manage all the details for Ethernet provisioning and customer order tracking.
  • Analyzed data using Hive queries (HiveQL), Pig scripts, Spark SQL, and Spark Streaming.
  • Developed tools using Python, shell scripting, and XML to automate menial tasks.
  • Wrote scripts in Python to extract data from HTML files.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Configured Hive Meta store with MySQL, which stores the metadata for Hive tables.
  • Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
  • Performance tuning of Hive queries, MapReduce programs for different applications.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.

Environment: Nifi 1.1, Hadoop 2.6, JSON, XML, Avro, HDFS, Teradata r15, Sqoop, Kafka, MongoDB, Hive 2.3, Pig 0.17, HBase, Zookeeper, MapReduce, Java, Python 3.6, Yarn, Flume, NoSQL, Cassandra 3.11
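The data-cleansing MapReduce jobs above were implemented in Java; a minimal Python sketch of the same map/reduce shape (e.g., runnable under Hadoop Streaming) could look like the following, with the tab-separated "key, value" record layout assumed purely for illustration:

```python
from itertools import groupby

# Hadoop Streaming-style cleansing job (illustrative Python equivalent of
# the Java MapReduce jobs described above). Input records are assumed to be
# tab-separated "key<TAB>value" lines.

def map_records(lines):
    """Mapper: emit (key, value) pairs, dropping malformed or blank records."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 2 and parts[0] and parts[1].strip():
            yield parts[0], parts[1].strip()

def reduce_records(pairs):
    """Reducer: group pairs by key (mimicking the shuffle's sort) and
    deduplicate the values seen for each key."""
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        seen = []
        for _, value in group:
            if value not in seen:
                seen.append(value)
        yield key, seen
```

In a real Streaming job the mapper and reducer would each read stdin and write stdout in separate processes; here they are plain functions so the cleansing logic is easy to unit-test, in the spirit of the MRUnit testing mentioned above.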

Confidential - Wilmington, DE

Sr. Java/Hadoop Developer

Roles & Responsibilities:

  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Supported HBase Architecture Design with the Hadoop Architect team to develop a Database Design in HDFS.
  • Supported MapReduce programs running on the cluster and wrote MapReduce jobs using the Java API.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Imported data from mainframe dataset to HDFS using Sqoop.
  • Handled importing of data from various data sources (i.e. Oracle, DB2, Cassandra, and MongoDB) to Hadoop, performed transformations using Hive, MapReduce.
  • Created mock-ups using HTML and JavaScript to understand the flow of the web application.
  • Integrated Cassandra with Talend and automated jobs.
  • Used the Struts framework to develop the MVC architecture and modularize the application.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Involved in managing and reviewing Hadoop log files.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Utilized Agile Scrum Methodology to help manage and organize with developers and regular code review sessions.
  • Upgraded the Hadoop cluster from CDH4 to CDH5 and set up a high-availability cluster to integrate Hive with existing applications.
  • Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Developed Hive queries to process the data and generate the data cubes for visualizing.
  • Optimized the mappings using various optimization techniques and debugged existing mappings using the Debugger to test and fix them.
  • Used SVN version control to maintain the different versions of the application.
  • Updated maps, sessions, and workflows as part of ETL changes; modified existing ETL code and documented the changes.
  • Involved in coding, maintaining, and administering EJB, Servlets, and JSP components to be deployed on a WebLogic Server.
  • Worked on importing data from HDFS to MySQL and vice versa using Sqoop.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Extracted meaningful data from unstructured data on Hadoop Ecosystem.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.

Environment: Hadoop 2.4, Java, MapReduce, HDFS, Hive 2.0, Pig 0.15, Linux, XML, Eclipse, Cloudera, CDH4/5 Distribution, DB2, SQL Server, Oracle 11g, MySQL, WebLogic Application Server 8.1, EJB 2.0, Struts 1.1
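The Hive queries above that "generate the data cubes for visualizing" can be sketched with Hive's `GROUP BY ... WITH CUBE` aggregation; the table and column names below are hypothetical, standing in for whatever the actual fact table contained:

```sql
-- Illustrative HiveQL: aggregate a hypothetical sales fact table over
-- every combination of region and product (including grand totals),
-- producing the cube that downstream visualization would consume.
SELECT region,
       product,
       SUM(sales) AS total_sales
FROM   sales_fact
GROUP  BY region, product
WITH   CUBE;
```

The resulting rows (with NULLs marking the rolled-up dimensions) could then be exported to a relational database with Sqoop for the BI team, as the earlier bullets describe.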


Java/J2EE Developer

Roles & Responsibilities:

  • As a Java/J2EE Developer, my role was to design, develop, deploy, and maintain websites and applications.
  • Involved in Software Development Life Cycle (SDLC) of the application: Requirement gathering, Design Analysis and Code development.
  • Implemented Struts framework based on the Model View Controller design paradigm.
  • Designed the application by implementing Struts based on MVC Architecture, simple Java Beans as a Model, JSP UI Components as View and Action Servlet as a Controller.
  • Used JNDI to perform lookup services for the various components of the system.
  • Involved in designing and developing dynamic web pages using HTML and JSP with Struts tag libraries.
  • Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML and AJAX and developed web services by using SOAP UI.
  • Used JPA to persistently store large amounts of data in the database.
  • Implemented modules using Java APIs, Java collection, Threads, XML, and integrating the modules.
  • Applied J2EE Design Patterns such as Factory, Singleton, and Business delegate, DAO, Front Controller Pattern and MVC.
  • Used JPA for the management of relational data in application.
  • Designed and developed business components using Session and Entity Beans in EJB.
  • Developed the EJBs (Stateless Session beans) to handle different transactions to the service providers.
  • Developed JMS Sender and Receivers for the loose coupling between the other modules and Implemented asynchronous request processing using Message Driven Bean.
  • Used JDBC for data access from Oracle tables.
  • Successfully installed and configured the IBM WebSphere Application server and deployed the business tier components using EAR file.
  • Used Maven for build framework and Jenkins for continuous build system.
  • Deployed application on JBOSS application server environment.
  • Provided SQL scripts and PL/SQL stored procedures for querying the database.
  • Used Eclipse for writing JSPs, Struts and other java code snippets.
  • Used JUnit framework for Unit testing of application and Clear Case for version control.
  • Built application using ANT and used Log4J to generate log files for the application.

Environment: Java, J2EE, Hibernate 4.3, JSON, XML, HTML, CSS, JavaScript, AJAX, JQuery, Apache Tomcat, Maven, JBOSS, PL/SQL, Eclipse, JUnit, ANT
