
Sr. Big Data Developer Resume


Charlotte, NC

SUMMARY:

  • Over 9 years of experience as a Sr. Big Data/Hadoop Developer designing and developing applications with Big Data, Hadoop, and Java/J2EE open-source technologies.
  • Worked extensively with Core Java, Struts 2, JSF, Spring, Hibernate, Servlets, and JSP, with hands-on experience in PL/SQL, XML, and SOAP.
  • Well versed in working with relational database management systems such as Oracle, MS SQL Server, and MySQL.
  • Hands-on experience with Hadoop/Big Data technologies for storage, querying, processing, and analysis of data.
  • Hands-on experience with the XML suite of technologies: XSL, XSLT, DTD, XML Schema, SAX, DOM, and JAXB.
  • Experience working with web servers such as Apache Tomcat and application servers such as IBM WebSphere and JBoss.
  • Hands-on experience with advanced Big Data technologies such as the Spark ecosystem (Spark SQL, MLlib, SparkR, and Spark Streaming), Kafka, and predictive analytics (see the brief Spark SQL sketch at the end of this summary).
  • Knowledge of the Software Development Life Cycle (SDLC) and Agile and Waterfall methodologies.
  • Experience developing applications using Java, Python, and UNIX shell scripting.
  • Experience consuming web services with Apache Axis and JAX-RS (REST) APIs.
  • Experienced with the build tools Maven and ANT and the logging tool Log4j.
  • Experience programming and developing Java modules for an existing Java-based web portal using JSP, Servlets, JavaScript, and HTML in an SOA/MVC architecture.
  • Expertise in ingesting real-time and near-real-time data using Flume, Kafka, and Storm.
  • Good knowledge of NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Strong development skills in Hadoop, HDFS, MapReduce, Hive, Sqoop, and HBase, with a solid understanding of Hadoop internals.
  • Excellent knowledge of the Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MRv1 and MRv2 (YARN).
  • Experience working with the Eclipse IDE, NetBeans, and Rational Application Developer.
  • Experience using PL/SQL to write stored procedures, functions, and triggers.
  • Expertise in developing web-based applications using J2EE technologies such as JSP, Servlets, and JDBC.
  • Experience working with EC2 (Elastic Compute Cloud) cluster instances, setting up data buckets on S3 (Simple Storage Service), and setting up EMR (Elastic MapReduce).
  • Hands-on experience installing, configuring, and using Apache Hadoop ecosystem components such as the Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, Apache Crunch, ZooKeeper, Sqoop, Hue, Scala, and Avro.
  • Strong programming skills in designing and implementing multi-tier applications using Java, J2EE, JDBC, JSP, JSTL, HTML, CSS, JSF, Struts, JavaScript, and JAXB.
  • Extensive experience in SOA-based solutions: web services, Web API, WCF, and SOAP, including RESTful API services.
  • Good knowledge of Amazon Web Services (AWS) concepts such as the EMR and EC2 services, which provide fast and efficient processing for Teradata Big Data Analytics.
  • Experience collecting log data and JSON data into HDFS using Flume and processing the data with Hive/Pig.
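
A minimal Java sketch of the kind of Spark SQL usage noted in this summary, assuming Spark 2.x or later; the input path, column names, and query are illustrative and not taken from any project below.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    // Reads JSON records from HDFS into a DataFrame and runs a Spark SQL aggregation.
    // Path, column names, and the query are assumptions for illustration only.
    public class JsonEventSummary {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("json-event-summary")
                    .getOrCreate();

            Dataset<Row> events = spark.read().json("hdfs:///data/events/*.json");
            events.createOrReplaceTempView("events");

            // Count events per type, the kind of summary metric produced for reporting.
            Dataset<Row> summary = spark.sql(
                    "SELECT event_type, COUNT(*) AS cnt FROM events GROUP BY event_type");
            summary.show();

            spark.stop();
        }
    }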

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig 0.17, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, YARN, Apache Flume 1.8, Kafka 1.1, ZooKeeper

Cloud Platform: Amazon AWS, EC2, S3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory

Hadoop Distributions: Cloudera, Hortonworks, MapR

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, jQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Operating Systems: Linux, Unix, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB

Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS

PROFESSIONAL EXPERIENCE:

Confidential - Charlotte, NC

Sr. Big Data Developer

Responsibilities:

  • As a Big Data Developer, worked on Hadoop ecosystem components including Hive, HBase, Oozie, Pig, ZooKeeper, Spark Streaming, and MCS (MapR Control System) on the MapR distribution.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing (see the sketch after this list).
  • Primarily involved in the data migration process on Azure, integrating with a GitHub repository and Jenkins.
  • Built real-time data ingestion code using Java, MapR Streams (Kafka), and Storm.
  • Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
  • Worked on Apache Solr, which was used as the indexing and search engine.
  • Involved in developing the Hadoop system and improving multi-node Hadoop cluster performance.
  • Analyzed the Hadoop stack and different Big Data tools including Pig, Hive, the HBase database, and Sqoop.
  • Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked with different data sources such as Avro data files, XML files, JSON files, SQL Server, and Oracle to load data into Hive tables.
  • Used J2EE design patterns such as the Factory and Singleton patterns.
  • Used Spark to create structured data from large amounts of unstructured data from various sources.
  • Implemented usage of Amazon EMR for processing Big Data across Hadoop Cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3)
  • Performed transformations, cleaning and filtering on imported data using Hive, MapReduce, Impala and loaded final data into HDFS.
  • Developed Python scripts to find SQL injection vulnerabilities in SQL queries.
  • Designed and developed POCs in Spark using Scala to compare the performance of Spark with Hive and SQL.
  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into Cassandra.
  • Involved in data acquisition, pre-processing, and exploration for a telecommunications project in Scala.
  • Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
  • Specified the cluster size, resource pool allocation, and Hadoop distribution by writing the specifications in JSON format.
  • Imported weblogs and unstructured data using Apache Flume and stored the data in a Flume channel.
  • Exported event weblogs to HDFS by creating an HDFS sink that deposits the weblogs directly into HDFS.
  • Used RESTful web services with MVC for parsing and processing XML data.
  • Utilized XML and XSL Transformation for dynamic web-content and database connectivity.
  • Loaded data from the UNIX file system into HDFS; designed schemas, wrote CQL queries, and loaded data using Cassandra.
  • Built the automated build and deployment framework using Jenkins, Maven, etc.
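
A minimal sketch of the kind of Java data-cleaning MapReduce job referenced in this list; the tab delimiter, expected field count, and validity check are illustrative assumptions, not the actual project logic.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Map-only job that drops malformed weblog records before downstream processing.
    public class LogCleanJob {

        public static class CleanMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
            private static final int EXPECTED_FIELDS = 7;   // assumed record width

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\t");
                // Keep only complete records with a non-empty id in the first column.
                if (fields.length == EXPECTED_FIELDS && !fields[0].trim().isEmpty()) {
                    context.write(value, NullWritable.get());
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "weblog-clean");
            job.setJarByClass(LogCleanJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0);                        // map-only: no aggregation needed
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }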

Environment: Hadoop 3.0, Hive 2.3, HBase 1.2, Oozie, Pig 0.17, Zookeeper, Spark, MapReduce, Azure, Java, Agile, J2EE, Cassandra 3.11, Jenkins, Maven

Confidential - Peoria, IL

Big Data/Hadoop Developer

Responsibilities:

  • Responsible for the installation and configuration of Hive, Pig, HBase, and Sqoop on the Hadoop cluster and created Hive tables to store the processed results in a tabular format.
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
  • Integrated visualizations into a Spark application using Databricks and popular visualization libraries (ggplot, matplotlib).
  • Involved in all phases of the Software Development Life Cycle (SDLC) and worked on all activities related to the development, implementation, and support of Hadoop.
  • Configured Spark Streaming to receive real-time data from Apache Kafka and store the streamed data to HDFS using Scala.
  • Developed Sqoop scripts to move data between Hive and the Vertica database.
  • Processed data into HDFS by developing solutions and analyzed the data using MapReduce, Pig, and Hive to produce summary results from Hadoop for downstream systems.
  • Streamed an AWS log group into a Lambda function to create ServiceNow incidents.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Created managed and external tables in Hive and loaded data from HDFS.
  • Developed Spark code by using Scala and Spark-SQL for faster processing and testing and performed complex HiveQL queries on Hive tables.
  • Scheduled several time-based Oozie workflows by developing Python scripts.
  • Exported data to RDBMS servers using Sqoop and processed that data for ETL operations.
  • Worked with S3 buckets on AWS to store CloudFormation templates and created EC2 instances on AWS.
  • Designed an ETL data pipeline flow to ingest data from an RDBMS source into Hadoop using shell scripts, the Sqoop package, and MySQL.
  • Handled end-to-end architecture and implementation of client-server systems using Scala, Akka, Java, JavaScript, and related technologies on Linux.
  • Optimized Hive tables using techniques such as partitioning and bucketing to provide better query performance (see the sketch after this list).
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Implemented Hadoop on AWS EC2 using a few instances for gathering and analyzing data log files.
  • Worked with Spark and Spark Streaming, creating RDDs and applying operations (transformations and actions).
  • Created partitioned tables and loaded data using both static partition and dynamic partition method.
  • Developed custom Apache Spark programs in Scala to analyze and transform unstructured data.
  • Used Kafka for publish-subscribe messaging as a distributed commit log, gaining experience with its speed, scalability, and durability.
  • Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
  • Involved in cluster maintenance, monitoring, and troubleshooting; managed and reviewed data backups and log files.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
  • Analyzed the Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase, and Sqoop.
  • Improved performance by tuning Hive and MapReduce.
  • Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala.
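
A minimal sketch of the Hive partitioning and bucketing work referenced in this list, issued from Java over HiveServer2 JDBC; the host, database, table, and column names are illustrative assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Creates a partitioned, bucketed Hive table and loads it with dynamic partitioning.
    public class HivePartitionLoad {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hiveserver:10000/default", "hive", "");
                 Statement stmt = conn.createStatement()) {

                // Allow dynamic partitions so the INSERT derives partition values from the data.
                stmt.execute("SET hive.exec.dynamic.partition=true");
                stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");

                stmt.execute("CREATE TABLE IF NOT EXISTS weblogs_opt ("
                        + " user_id STRING, url STRING, response_ms INT)"
                        + " PARTITIONED BY (log_date STRING)"
                        + " CLUSTERED BY (user_id) INTO 32 BUCKETS"
                        + " STORED AS ORC");

                // Dynamic-partition insert from an assumed raw staging table.
                stmt.execute("INSERT OVERWRITE TABLE weblogs_opt PARTITION (log_date)"
                        + " SELECT user_id, url, response_ms, log_date FROM weblogs_raw");
            }
        }
    }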

Environment: Hive 2.3, Pig 0.17, HBase 1.2, Sqoop, Hadoop 3.0, Agile, Spark, Scala, AWS, JavaScript

Confidential - Menlo Park, CA

Hadoop Developer

Responsibilities:

  • Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data using Lambda Architecture.
  • Loaded data from various sources into HDFS and built reports using Tableau.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Developed REST APIs using Scala and Play framework to retrieve processed data from Cassandra database.
  • Re-engineered n-tiered architecture involving technologies like EJB, XML and Java into distributed applications.
  • Explored the possibilities of using technologies like JMX for better monitoring of the system.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning and filtering on imported data using Hive, MapReduce, and loaded final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response (see the sketch after this list).
  • Responsible for cluster maintenance, monitoring, management, commissioning and decommissioning of DataNodes, troubleshooting, reviewing data backups, and managing and reviewing log files for Hortonworks.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Building and maintaining scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Installed and configured high availability for Hue, pointing to the Hadoop cluster in Cloudera Manager.
  • Installed and configured MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Responsible for developing data pipeline using HDInsight, Flume, Sqoop and Pig to extract the data from weblogs and store in HDFS.
  • Developed and maintained the continuous integration and deployment systems using Jenkins, ANT, Akka and MAVEN.
  • Effectively used GIT (version control) to collaborate with the Akka team members.
  • Loaded huge amounts of data into HDFS using Apache Kafka.
  • Collected the log data from web servers and integrated into HDFS using Flume.
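
A minimal Java sketch of loading data into a Spark RDD and computing in memory, as referenced in this list; the input path, field positions, and per-URL count logic are illustrative assumptions.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    // Loads weblog lines from HDFS into an RDD, caches them in memory,
    // and counts requests per URL.
    public class UrlHitCount {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("url-hit-count");
            JavaSparkContext sc = new JavaSparkContext(conf);

            JavaRDD<String> logs = sc.textFile("hdfs:///data/weblogs/*.log").cache();

            JavaPairRDD<String, Long> hits = logs
                    .map(line -> line.split("\\t"))
                    .filter(fields -> fields.length > 1)               // drop malformed rows
                    .mapToPair(fields -> new Tuple2<>(fields[1], 1L))  // URL assumed in column 2
                    .reduceByKey(Long::sum);

            hits.saveAsTextFile("hdfs:///output/url-hits");
            sc.stop();
        }
    }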

Environment: Spark, HDFS, Kafka, Scala, Cassandra, Java, Hive, MapReduce, Hortonworks, Oozie, Hadoop, HBase, Flume, Sqoop, Pig, MAVEN, ANT

Confidential - San Francisco, CA

Java/J2ee Developer

Responsibilities:

  • Worked closely with the requirements team and analyzed the use cases; elaborated on the use cases based on business requirements and was responsible for creating class diagrams and sequence diagrams.
  • Adopted J2EE best Practices, using Core J2EE patterns. Developed in Eclipse environment using Struts based MVC framework.
  • Designed and developed presentation layer using JSP, HTML and JavaScript.
  • Created JSPs using JSTL and Spring tag libraries. Developed Struts Action and Action Form classes.
  • Deployed J2EE components (EJB, Servlets) in Tomcat Application server.
  • Involved in understanding business requirements and provide technical designs and necessary documentation.
  • Implemented Microservices architecture to make application smaller and independent.
  • Developed new Spring Boot application with microservices and added functionality to existing applications using Java/ J2EE technologies.
  • Used Hibernate as Object relational mapping tool for mapping Java Objects to database tables.
  • Used Hibernate Query Language (HQL), annotations, and the Criteria API for accessing and updating data.
  • Implemented REST web services and performed the HTTP operations (a minimal sketch follows this list).
  • Implemented multithreading to process multiple tasks concurrently in order to perform the read/write operations.
  • Worked extensively with Core Java concepts like Collections, Exception Handling, Java I/O, and Generics to implement business logic.
  • Developed web pages using HTML, CSS, JavaScript, jQuery, and Ajax.
  • Used JavaScript for client-side validations in the JSP and HTML pages.
  • Created SQL queries and stored procedures to create, retrieve and update data from database.
  • Developed Maven Scripts to build and deploy EAR files.
  • GitHub was used for the version control and source code management.
  • Followed Test Driven Development (TDD), responsible for testing, debugging and bug fixing of the application.
  • Used log4j to capture the logs that included runtime exceptions and debug information.
  • The application was deployed in LINUX environment.
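
A minimal sketch of a JAX-RS resource of the kind referenced in this list for the REST web services; the resource path, entity shape, and lookup logic are hypothetical.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    // Minimal JAX-RS resource exposing a read operation over HTTP.
    @Path("/orders")
    public class OrderResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public Response getOrder(@PathParam("id") long id) {
            // In the real service this would delegate to a DAO / service layer.
            String order = "{\"id\": " + id + ", \"status\": \"OPEN\"}";
            return Response.ok(order).build();
        }
    }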

Environment: J2EE, MVC, HTML, JavaScript, Java, Hibernate, jQuery, Ajax, Maven

Confidential

Java Developer

Responsibilities:

  • Involved in the complete Software Development Life Cycle including Requirement Analysis, Design, Implementation, Testing, and Maintenance.
  • Designed the front end using JSP, jQuery, CSS, and HTML as per the requirements that are provided.
  • Developed Responsive web application for the backend system using AngularJS with HTML and CSS.
  • Used Spring AOP to enable the log interfaces and cross-cutting concerns.
  • Developed payment flow using AJAX partial page refresh, validation, and dynamic drop-down list.
  • Responsible for use case diagrams, class diagrams and sequence diagrams using Rational Rose in the Design phase.
  • Used Spring Core for dependency injection/Inversion of Control (IoC) to achieve loose coupling (see the sketch after this list).
  • Implemented the application using an MVC architecture integrating the Hibernate and Spring frameworks.
  • Implemented the Enterprise JavaBeans (EJBs) to handle various transactions and incorporated the validation framework for the project.
  • Designed and developed all UI Screens using Java Server Pages (JSP), Static Content, HTML, CSS and JavaScript.
  • Worked on server-side web applications using Node.js and involved in Construction of UI using jQuery, Bootstrap and JavaScript.
  • Developed Database access components using Spring DAO integrated with Hibernate for accessing the data.
  • Integrated the pages with AngularJS to make them dynamic.
  • Developed Custom Tags and JSTL to support custom user interfaces.
  • Used CSS style sheets for presenting data from XML documents and data from databases to render on HTML web pages.
  • Used Spring Framework for integrating Hibernate for dependency injection.
  • Extensively used Eclipse for writing code.
  • Used Maven as a build tool and deployed on WebSphere Application Server.
  • Developed test cases with JUnit.
  • Used Log4J for logging and tracing the messages.
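
A minimal sketch of Spring Core dependency injection as referenced in this list, using annotation-based configuration; the PaymentService and PaymentDao names are hypothetical, not from the project.

    import org.springframework.context.annotation.AnnotationConfigApplicationContext;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    // Illustrates constructor-based dependency injection with Spring Core.
    interface PaymentDao {
        void save(String orderId);
    }

    class JdbcPaymentDao implements PaymentDao {
        public void save(String orderId) {
            System.out.println("Persisting payment for order " + orderId);
        }
    }

    class PaymentService {
        private final PaymentDao dao;

        // Spring injects the DAO; the service never constructs its own dependency.
        PaymentService(PaymentDao dao) {
            this.dao = dao;
        }

        void pay(String orderId) {
            dao.save(orderId);
        }
    }

    @Configuration
    class AppConfig {
        @Bean
        PaymentDao paymentDao() {
            return new JdbcPaymentDao();
        }

        @Bean
        PaymentService paymentService(PaymentDao paymentDao) {
            return new PaymentService(paymentDao);
        }
    }

    public class Main {
        public static void main(String[] args) {
            try (AnnotationConfigApplicationContext ctx =
                         new AnnotationConfigApplicationContext(AppConfig.class)) {
                ctx.getBean(PaymentService.class).pay("ORD-1001");
            }
        }
    }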

Environment: HTML, CSS, JavaScript, jQuery, AngularJS, AJAX, Bootstrap, XML, Maven, Eclipse, JUnit
