
Sr. Big Data/Hadoop Developer Resume


Charlotte, NC

SUMMARY:

  • 9+ years of professional IT experience in the analysis, design, administration, development, deployment, and maintenance of critical software and Big Data applications.
  • Hands-on experience in developing and deploying enterprise applications using major Hadoop ecosystem components such as MapReduce, YARN, Hive, Pig, HBase, Flume, Sqoop, Spark, Spark Streaming, Spark SQL, Storm, Kafka, Oozie, and Cassandra.
  • Hands-on experience using the MapReduce programming model for batch processing of data stored in HDFS.
  • Experience in using Maven to build and deploy J2EE application archives (JAR and WAR) on WebLogic and IBM WebSphere.
  • Experience in installing, customizing, and testing Hadoop ecosystem components such as Hive, Pig, Sqoop, PySpark, and Oozie.
  • Exposure to administrative tasks such as installing Hadoop and ecosystem components like Hive and Pig.
  • Installed and configured multiple Hadoop clusters of different sizes and with ecosystem Components like Pig, Hive, Sqoop, Flume, HBase, Oozie and Zookeeper.
  • Worked on the major Hadoop distributions Cloudera and Hortonworks.
  • Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
  • Deployed services such as Spark, MongoDB, and Cassandra on Kubernetes and Hadoop clusters using Docker.
  • Handled data movement, data transformation, analysis, and visualization across the lake by integrating it with various tools.
  • Defined extract-transform-load (ETL) and extract-load-transform (ELT) processes for the Data Lake.
  • Good knowledge of the Hadoop ecosystem, HDFS, Big Data, ETL concepts, and RDBMS.
  • Good expertise in planning, installing, and configuring Hadoop clusters based on business needs.
  • Good experience working with cloud environments such as Amazon Web Services (AWS), including EC2 and S3.
  • Strong background in Java/J2EE environments; well experienced in the MVC architecture of the Spring framework.
  • Transformed and aggregated data for analysis by implementing workflow management of Sqoop, Hive and Pig scripts.
  • Experience in using PySpark for big data analysis.
  • Experience in retrieving data from databases like MySQL, Teradata, Informix, DB2, and Oracle into HDFS using Sqoop and ingesting it into HBase and Cassandra.
  • Experience writing Oozie workflows and Job Controllers for job automation.
  • Integrated Oozie with Hue and scheduled workflows for multiple Hive, Pig and Spark Jobs.
  • In-depth knowledge of Scala and experience building Spark applications using Scala.
  • Adequate knowledge of Scrum, Agile and Waterfall methodologies.
  • Designed and developed multiple MVC-based web applications using J2EE.
  • Worked on various tools and IDEs like Eclipse, IBM Rational, the Apache Ant build tool, MS Office, PL/SQL, and SQL*Plus.
  • Experience in writing complex SQL queries, PL/SQL, views, stored procedures, and triggers.
  • Experience in OLTP and OLAP design, development, testing and support of Data warehouses.
  • Good experience in optimizing MapReduce algorithms using mappers, reducers, combiners, and partitioners to deliver the best results for large datasets (a minimal sketch follows this summary).
  • Operated on Java/J2EE systems with different databases, which include Oracle, MySQL and DB2.
  • Good knowledge of build and logging tools like Maven, Ant, and Log4j.
  • Hands-on experience with various Hadoop distributions: Cloudera, Hortonworks, MapR, IBM BigInsights, Apache, and Amazon EMR.
  • Knowledge of installing, configuring, supporting, and managing Hadoop clusters using Apache, Cloudera, and Hortonworks distributions and on Amazon Web Services (AWS).
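
Illustrative sketch for the MapReduce optimization bullet above: a minimal word-count-style job in Java showing where a combiner and a custom partitioner plug in. The class name, tokenization logic, and paths are hypothetical and not drawn from any specific project listed here.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TokenCount {

    // Emits (token, 1) for every whitespace-separated token in the input line
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text token = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String t : value.toString().toLowerCase().split("\\s+")) {
                if (!t.isEmpty()) {
                    token.set(t);
                    ctx.write(token, ONE);
                }
            }
        }
    }

    // Sums counts per token; also reused as the combiner to cut shuffle volume
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    // Routes keys by first character so lexically related tokens land on the same reducer
    public static class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            if (key.getLength() == 0) return 0;
            return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "token-count");
        job.setJarByClass(TokenCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);           // combiner: local pre-aggregation on the map side
        job.setPartitionerClass(FirstCharPartitioner.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Reusing the reducer as the combiner is valid here because summation is associative and commutative, which is the usual precondition for this optimization.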

TECHNICAL SKILLS:

Languages: C, C++, Python, PL/SQL, Java, HiveQL, Pig Latin, Scala, UNIX shell scripting.

Hadoop Ecosystem: HDFS, YARN, Spark Core, Spark SQL, Spark Streaming, Scala, MapReduce, Hive 2.3, Pig 0.17, Zookeeper 3.4.11, Sqoop 1.4, Oozie 4.3, Bedrock, Apache Flume 1.8, Kafka 2.0, Impala 3.0, Nifi, MongoDB, HBase.

Databases: Oracle 12c, MS SQL Server 2017, MySQL, PostgreSQL, NoSQL (HBase, Cassandra 3.11, MongoDB), Teradata r14.

Tools: Eclipse 4.8, NetBeans 9.0, Informatica, IBM Data Stage, Talend, Maven, Jenkins 2.12.

Hadoop Platforms: Hortonworks, Cloudera, Azure, Amazon Web services (AWS)

Operating Systems: Windows XP/2000/NT, Linux, UNIX.

Amazon Web Services: Redshift, EMR, EC2, S3, RDS, Cloud Search, Data Pipeline, Lambda.

Version Control: GitHub, SVN, CVS.

Packages: MS Office Suite 2016, MS Visio, MS Project Professional.

SDLC Methodologies: Agile, Waterfall, Scrum

Web/Application Server: Apache Tomcat 9.0.7, JBoss, Web Logic, Web Sphere

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB

WORK EXPERIENCE:

Confidential, Charlotte, NC

Sr. Big Data/Hadoop Developer

Responsibilities:

  • As a Sr. Big Data/Hadoop Developer, worked on Hadoop ecosystem components including Hive, MongoDB, Zookeeper, and Spark Streaming with the MapR distribution.
  • Developed Big Data solutions focused on pattern matching and predictive modeling.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Developed Apache Spark 2.0 applications with PySpark using RDD transformations and actions and Spark SQL.
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
  • Primarily involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
  • Used XML and JSON for transferring/retrieving data between different Applications.
  • Upgraded the Hadoop Cluster from CDH3 to CDH4, setting up High Availability Cluster and integrating Hive with existing applications.
  • Used Java Persistence API (JPA) framework for object relational mapping which is based on POJO Classes.
  • Extracted the data from legacy systems into staging area using ETL jobs & SQL queries.
  • Designed & Developed a Flattened View (Merge and Flattened dataset) de-normalizing several Datasets in Hive/HDFS which consists of key attributes consumed by Business and other down streams.
  • Used Hive as ETL tool to do transformations and some pre-aggregations before storing the data onto HDFS
  • Worked on NoSQL support enterprise production and loading data into HBase using Impala and Sqoop.
  • Performed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Built Hadoop solutions for big data problems using MR1 and MR2 on YARN.
  • Handled importing of data from various data sources, performed transformations using Hive, PIG, and loaded data into HDFS.
  • Extensively used jQuery to provide a dynamic user interface and for client-side validations.
  • Involved in identifying job dependencies to design workflow for Oozie & YARN resource management.
  • Used the MyEclipse IDE and deployed the application on the BEA WebLogic Application Server using ANT scripts.
  • Designed solution for various system components using Microsoft Azure.
  • Moved data between HDFS and relational database systems using Sqoop, and maintained and troubleshot these transfers.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Created Hive tables, loaded claims data from Oracle using Sqoop, and loaded the processed data into the target database (a minimal sketch of this pattern follows this list).
  • Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
  • Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
  • Developed a fully functional prototype application using JavaScript and Bootstrap, connecting to a Restful server on a different domain.
  • Developed Nifi flows dealing with various kinds of data formats such as XML, JSON, Avro.
  • Developed and designed data integration and migration solutions in Azure.
  • Worked on importing data from HDFS to a MySQL database and vice versa using Sqoop.
  • Implemented MapReduce jobs in HIVE by querying the available data.
  • Configured Hive Meta store with MySQL, which stores the metadata for Hive tables.
  • Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
  • Performance tuning of Hive queries, MapReduce programs for different applications.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Specified the cluster size, resource pool allocation, and Hadoop distribution by writing specification files in JSON format.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
  • Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Collaborated with business users/product owners/developers to contribute to the analysis of functional requirements.
  • Used ANT as build tool and developed build file for compiling the code of creating WAR files.
  • Worked on MongoDB and HBase databases, which differ from classic relational databases.
  • Involved in converting HiveQL into Spark transformations using Spark RDD and through Scala programming.
  • Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
  • Integrated Kafka with Spark Streaming for high-throughput, reliable data processing.
  • Worked on Apache Flume to collect and aggregate large amounts of log data and stored it on HDFS for further analysis.
  • Tuned Hive and Pig scripts to improve performance and solved performance issues in both.
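
A minimal sketch of the Sqoop-loaded-Hive-to-Spark pattern referenced above, written against the Spark Java API for illustration (the project work itself used Scala and PySpark); the database, table, and column names are hypothetical.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ClaimsAggregation {
    public static void main(String[] args) {
        // Hive-enabled Spark session so Spark SQL can read tables registered in the Hive metastore
        SparkSession spark = SparkSession.builder()
                .appName("ClaimsAggregation")
                .enableHiveSupport()
                .getOrCreate();

        // Read claims data previously landed in a Hive staging table via Sqoop (names are hypothetical)
        Dataset<Row> claims = spark.sql("SELECT claim_id, claim_state, claim_amount FROM staging.claims");

        // The kind of HiveQL-style aggregation rewritten as a Spark SQL/DataFrame transformation
        Dataset<Row> byState = claims.groupBy("claim_state").sum("claim_amount");

        // Persist the result to a target Hive table for downstream reporting and export
        byState.write().mode("overwrite").saveAsTable("reporting.claims_by_state");

        spark.stop();
    }
}

The same flow in PySpark follows the identical SparkSession, spark.sql, and groupBy pattern.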

Environment: Hadoop 3.0, Hive 2.3, MongoDB, Zookeeper, MapR, Pig 0.17, HBase, Agile, Azure, Jenkins, XML, JSON, HDFS, NoSQL, MapReduce, YARN, Oozie, Eclipse, ANT, Spark, Oracle 12c, PL/SQL, JavaScript, Bootstrap, Nifi, MySQL, Scala, Flume, Kafka

Confidential, Troy, NY

Big Data/Hadoop Developer/Tester

Responsibilities:

  • As a Big Data/Hadoop Developer, involved in the installation and configuration of Hadoop distribution systems.
  • Involved in the complete SDLC implementation, specializing in writing custom Spark and Hive programs.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
  • Worked on MongoDB by using CRUD (Create, Read, Update and Delete), Indexing, Replication and Sharding features.
  • Involved in designing the row key in HBase to store Text and JSON as key values in HBase table and designed row key in such a way to get/scan it in a sorted order.
  • Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
  • Created and ran Sqoop jobs with incremental load to populate Hive external tables.
  • Implemented Amazon EMR for processing Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
  • Worked with Apache Nifi to develop custom processors for processing and distributing data among cloud systems.
  • Involved in development of Hadoop System and improving multi-node Hadoop Cluster performance.
  • Involved in complete Big Data flow of the application data ingestion from upstream to HDFS, processing the data in HDFS and analyzing the data using several tools.
  • Imported data in various formats such as JSON, SequenceFile, Text, CSV, Avro, and Parquet into the HDFS cluster with compression for optimization.
  • Configured Hive and wrote Hive UDFs and UDAFs; also created static and dynamic partitions with bucketing (see the UDF sketch after this list).
  • Imported and exported data into HDFS and Hive using Sqoop and Kafka in both batch and streaming modes.
  • Created Hive tables, loaded the data, and performed data manipulations using Hive queries in MapReduce execution mode.
  • Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
  • Developed Spark scripts by using Python shell commands as per the requirement.
  • Used Hive join queries to join multiple tables of a source system and load them into Elasticsearch tables.
  • Responsible for developing data pipeline with Amazon AWS to extract the data from weblogs and store in MongoDB.
  • Created Hive tables and worked on them using HiveQL.
  • Designed and implemented static and dynamic partitioning and bucketing in Hive.
  • Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Used Spark to create the structured data from large amount of unstructured data from various sources.
  • Used Apache Spark on YARN for fast, large-scale data processing and improved performance.
  • Responsible for design & development of Spark SQL Scripts using Scala/Java based on Functional Specifications.
  • Worked on Cluster co-ordination services through Zookeeper.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Built applications using Maven and integrated them with CI servers like Jenkins to run build jobs.
  • Developed Python scripts to find SQL-injection vulnerabilities in SQL queries.
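
A minimal sketch of the kind of Hive UDF mentioned above; the class name and normalization logic are hypothetical. After packaging into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use in queries.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical one-argument UDF that normalizes free-form ZIP codes to a 5-digit form
public final class NormalizeZip extends UDF {
    public Text evaluate(Text zip) {
        if (zip == null) {
            return null;
        }
        String digits = zip.toString().replaceAll("[^0-9]", "");
        if (digits.isEmpty()) {
            return null;
        }
        // Keep only the 5-digit prefix, left-padding short values with zeros
        if (digits.length() > 5) {
            digits = digits.substring(0, 5);
        }
        while (digits.length() < 5) {
            digits = "0" + digits;
        }
        return new Text(digits);
    }
}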

Environment: Hadoop 3.0, Spark, Hive 2.3, MapReduce, Java, Agile, MongoDB, JSON, HBase, YARN, Sqoop, Apache Nifi, HDFS, Kafka, Python, AWS, Zookeeper, Maven, Jenkins

Confidential, Hillsboro, OR

Sr. Java/Hadoop Developer

Responsibilities:

  • Researched and recommended a suitable technology stack for Hadoop migration considering the current enterprise architecture.
  • Involved in the installation, configuration, and design of Hadoop distributions using Cloudera and Hortonworks.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Used the Spring framework for dependency injection (IoC) and integrated it with the Hibernate framework (a minimal sketch follows this list).
  • Involved in the design by preparing UML diagrams using Microsoft Visio tool.
  • Designed and developed application modules using the Spring and Hibernate frameworks.
  • Designed and developed the front end with Swing and the Spring MVC framework, tag libraries, and custom tag libraries.
  • Used Hibernate to develop persistent classes following ORM principles.
  • Deployed Spring configuration files such as application context, application resources, and application files.
  • Worked with Maven for build scripts and Setup the Log4J Logging framework.
  • Responsible for developing various modules, front-end and back-end components using several design patterns based on client's business requirements.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Worked on analyzing Hadoop stack and different big data analytic tools including Pig, Hive, HBase database and Sqoop.
  • Built the automated build and deployment framework using GitHub, Maven, and related tools.
  • Used jQuery to make the front-end components interact with JavaScript functions and add dynamism to the web pages on the client side.
  • Worked with AngularJS filters in expressions and directives to filter data rendered in the UI.
  • Created Hibernate mapping files, sessions, transactions, HQL Queries to fetch data from database.
  • Developed front end using JSP and JavaScript pages as per the client requirements.
  • Used JDBC statements, prepared statements and callable statements to get data from database.
  • Worked on presentation layer to develop JSP pages and embedding CSS, JavaScript, DOM and HTML.
  • Developed internal coding using J2EE technologies based on the MVC Architecture.
  • Implemented Business Logic using Java, Spring, and Hibernate.
  • Developed Business objects using POJOs and data access layer using Hibernate framework.
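
A minimal sketch of the Spring dependency-injection plus Hibernate pattern described above, assuming annotation-driven configuration; the entity, table, and DAO names are hypothetical.

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
import org.hibernate.SessionFactory;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;

// Persistent class mapped with annotations, following ORM principles (names are hypothetical)
@Entity
@Table(name = "products")
class Product {
    @Id
    private Long id;
    private String name;
    // getters and setters omitted for brevity
}

// DAO whose SessionFactory is supplied by the Spring container (constructor injection / IoC)
@Repository
@Transactional
class ProductDao {
    private final SessionFactory sessionFactory;

    ProductDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // HQL query executed through the current Spring-managed Hibernate session
    public List<Product> findByName(String name) {
        return sessionFactory.getCurrentSession()
                .createQuery("from Product p where p.name = :name", Product.class)
                .setParameter("name", name)
                .getResultList();
    }
}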

Environment: Hadoop, Spring, Hibernate, Microsoft Visio, MVC, Maven, Apache Spark, Elasticsearch, Pig, Hive, HBase, Sqoop, jQuery, JavaScript, AngularJS, JSP, HTML, J2EE, POJO

Confidential, Durham, NC

Java/J2EE Developer

Responsibilities:

  • Used Spring JDBC and DAO layers to abstract the business logic from database-related code (CRUD); a minimal sketch follows this list.
  • Created RESTful web services interface to Java-based runtime engine.
  • Developed the persistence layer using Hibernate Framework, created the POJO objects and mapped using Hibernate annotations.
  • Developed Maven scripts to build and deploy the application in the WebSphere Application Server.
  • Developed Unit test cases using JUnit and involved in User Acceptance Testing and Bug Fixing.
  • Designed and developed Class diagrams and sequence diagrams using Unified Modeling Language (UML).
  • Worked on JSP, HTML, jQuery, CSS, and JavaScript for developing the GUI of the application.
  • Developed Hibernate POJO Classes, Hibernate Configuration file and Hibernate Mapping files
  • Involved in the configuration of Spring Framework and Hibernate mapping tool.
  • Worked on Java Messaging Services (JMS) for developing messaging services to interact with different application modules.
  • Used Maven for building and deploying the web application into WebSphere and configuring the dependency plug-ins and resources.
  • Wrote JUnit test cases covering the application code and performed validation.
  • Configured JNDI resources, Data Base resources, JMS and other configurations on the Application Server
  • Used the Eclipse IDE for writing code and BEA WebLogic as the application server.
  • Used JDBC, MQ Series, Web Service, and Hibernate framework to access the data from back-end MS SQL database server.
  • Used Java Persistence API (JPA) for managing relational data mapping.
  • Used Hibernate object relational data mapping framework to persist and retrieve the data from database.
  • Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
  • Developed ANT Scripts to do compilation, packaging and deployment in the WebSphere server.
  • Implemented various J2EE design patterns in the project such as Factory, Singleton, Service Locator and Data Access Object.
  • Implemented the business layer by using Hibernate with Spring DAO and developed mapping files and POJO java classes using ORM tool.
  • Used JSF validators for server-side data validation with the Spring and Hibernate frameworks.
  • Participated in object-oriented design, development and testing of REST APIs using Java.
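
A minimal sketch of the Spring JDBC DAO abstraction referenced above; the table and column names are hypothetical and the DataSource is assumed to be configured on the application server.

import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

// DAO wrapping CRUD operations behind JdbcTemplate so business code never touches raw JDBC
class AccountDao {
    private final JdbcTemplate jdbc;

    AccountDao(DataSource dataSource) {
        this.jdbc = new JdbcTemplate(dataSource);
    }

    // Create
    public int insert(long id, String owner) {
        return jdbc.update("INSERT INTO accounts (id, owner) VALUES (?, ?)", id, owner);
    }

    // Read
    public List<String> findOwners() {
        return jdbc.queryForList("SELECT owner FROM accounts", String.class);
    }

    // Update
    public int rename(long id, String owner) {
        return jdbc.update("UPDATE accounts SET owner = ? WHERE id = ?", owner, id);
    }

    // Delete
    public int delete(long id) {
        return jdbc.update("DELETE FROM accounts WHERE id = ?", id);
    }
}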

Environment: JDBC, Spring, CRUD, RESTful, Hibernate, POJO, Maven, JUnit, JSP, HTML, jQuery, CSS, JavaScript, JMS, JNDI, Eclipse, SQL queries, ANT, J2EE

Confidential

Java/J2EE Developer

Responsibilities:

  • Implemented Java/J2EE Design patterns like Business Delegate and Data Transfer Object (DTO), Data Access Object.
  • Developed REST-based web services to allow communication between the applications (see the controller sketch after this list).
  • Developed the application using Spring Framework that leverages classical Model View Controller (MVC) architecture.
  • Developed complete Web tier of the application with Spring MVC framework.
  • Created dynamic pages using HTML, CSS, jQuery, and JavaScript for client-side validation.
  • Responsible for setting up Angular.js framework for UI development.
  • Extensively used Hibernate in data access layer to access and update information in the database.
  • Used Spring IoC Framework for Dependency injection, Spring MVC, Web services, Spring security.
  • Used Spring Integration to integrate the application with microservices using Spring Integration workflow files.
  • Developed JSP pages and client side validation using JavaScript tags.
  • Developed user interface using JSP with Java Beans, JSTL and Custom Tag Libraries and Ajax to improve the performance of the application.
  • Extensively used HTML, JavaScript, Angular.js and Ajax for client side development and validations.
  • Developed views using Bootstrap components, Angular-UI and involved in configuring routing for various modules using angular UI router.
  • Worked on JBoss locally and Used JSF to build application and to create a page structure by arranging JSF components.
  • Used Spring framework for implementing Dependency Injection, Spring ORM.
  • Extensively used JavaScript and Angular.js to provide dynamic User Interface and for the client side validations.
  • Used JSP for presentation layer, developed high performance object/relational persistence and query service for entire application utilizing Hibernate.
  • Implemented a microservices-based cloud architecture using Spring Boot.
  • Used JUnit framework for unit testing of application and Log4j to capture the log that includes runtime exceptions.
  • Designed and developed business and persistence layer components using Spring IOC and Hibernate and Spring JDBC framework.
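
A minimal sketch of the Spring MVC REST style described above, assuming Spring 4.3+ annotations; the resource names and service interface are hypothetical.

import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// REST controller exposing read endpoints; the OrderService is injected by the Spring IoC container
@RestController
@RequestMapping("/api/orders")
class OrderController {

    private final OrderService orderService;

    OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @GetMapping
    public List<String> listOrders() {
        return orderService.findAllOrderIds();
    }

    @GetMapping("/{id}")
    public String getOrder(@PathVariable String id) {
        return orderService.findOrder(id);
    }
}

// Minimal service contract so the sketch is self-contained (hypothetical)
interface OrderService {
    List<String> findAllOrderIds();
    String findOrder(String id);
}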

Environment: Java, J2EE, Spring, MVC, HTML, CSS, jQuery, JavaScript, Angular.js, Hibernate, JSP, JSTL, Java Beans, Ajax, Bootstrap, JBoss, Log4j, JUnit
