
Sr. Big Data Hadoop Developer Resume


Shelton, CT

SUMMARY:

  • Overall 8+ years of professional IT industry experience encompassing a wide range of skills in Big Data technologies.
  • Experience with all stages of the SDLC and the Agile development model, from requirement gathering to deployment and production support.
  • Excellent experience with AWS and Cloudera, maintaining and optimizing AWS infrastructure (EC2 and EBS); also good knowledge of MS Azure.
  • Hands-on experience in configuring and working with Flume to load data from multiple sources directly into HDFS.
  • Experience in developing custom UDFs for Pig and Apache Hive to incorporate methods and functionality of Java into Pig Latin and HiveQL.
  • Experience in collecting log data and JSON data into HDFS using Flume and processing the data using Hive/Pig.
  • Strong knowledge of the Hadoop ecosystem, including HDFS, Hive, Sqoop, Zookeeper, etc.
  • Experience in developing Java MapReduce jobs for data cleaning and data manipulation as required by the business.
  • Good knowledge of using Hibernate for mapping Java classes to database tables and using Hibernate Query Language (HQL).
  • Extensive experience with advanced J2EE frameworks such as Spring, Struts, JSF and Hibernate.
  • Deployed customized UDFs in Java to extend Hive and Pig core functionality.
  • Expertise in web page development using JSP, HTML, JavaScript, jQuery and Ajax.
  • Involved in efficiently converting JSON files to XML and CSV files in Talend.
  • Worked on Bootstrap, AngularJS, NodeJS, Knockout, Ember and the Java Persistence API (JPA).
  • Experience in developing Spark batch applications in Scala to ingest data into a common data lake.
  • Good understanding of Amazon Web Services for designing data pipelines using various services.
  • Excellent experience in installing and running various Oozie workflows and automating parallel job executions.
  • Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
  • Excellent networking and communication with all levels of stakeholders as appropriate, including executives, application developers, business users, and customers.
  • Expertise in developing production-ready Spark applications utilizing the Spark Core, DataFrame, Spark SQL, Spark ML and Spark Streaming APIs.
  • Used the DataFrame API in Scala to work with distributed collections of data organized into named columns, as sketched after this list.
  • Proven ability to manage all stages of project development; strong problem-solving and analytical skills with the ability to make balanced, independent decisions.
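
A minimal sketch in Scala of the DataFrame usage described above, assuming a local SparkSession and hypothetical record/column names (illustrative only, not code from the projects below):

    import org.apache.spark.sql.SparkSession

    object DataFrameExample {
      // Hypothetical record type supplying the named columns
      case class Event(userId: String, action: String, amount: Double)

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("DataFrameExample")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // A distributed collection converted into a DataFrame with named columns
        val events = Seq(Event("u1", "click", 1.0), Event("u2", "purchase", 25.0)).toDF()

        // Column-oriented aggregation through the DataFrame API
        events.groupBy($"action").sum("amount").show()

        spark.stop()
      }
    }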

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper

Cloud Platform: Amazon AWS, EC2, S3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory

Hadoop Distributions: Cloudera, Hortonworks, MapR

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML5, CSS3, JavaScript, jQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Databases: Oracle 12c/11g, SQL

Operating Systems: Linux, Unix, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB, Accumulo

Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS

WORK EXPERIENCE:

Confidential, Shelton, CT

Sr. Big Data Hadoop Developer

Responsibilities:

  • Worked as a Big Data Developer on Hadoop ecosystem components including Hive, HBase, Oozie, Pig, Zookeeper, Spark Streaming and MCS, on the MapR distribution.
  • Designed and deployed the full SDLC of a Hadoop cluster based on the client's business needs.
  • Primarily involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
  • Performed transformations and analysis by writing complex HQL queries in Hive and exported the results to HDFS in discrete file formats.
  • Involved in developing the Hadoop system and improving multi-node Hadoop cluster performance.
  • Used Spark to create structured data from large amounts of unstructured data from various sources.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames and loaded the data into Cassandra, as sketched after this list.
  • Developed data pipelines using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Implemented Security in Web Applications using Azure and deployed Web Applications to Azure.
  • Participated in maintaining data integrity between Oracle and SQL databases.
  • Used RESTful web services with MVC for parsing and processing XML data.
  • Managed real time data processing and real time Data Ingestion in MongoDB and Hive using Storm.
  • Installed and configured Hortonworks Ambari for easy management of existing Hadoop cluster.
  • Developed Oozie workflows for daily incremental loads, which pull data from Teradata and import it into Hive tables.
  • Developed PL/SQL scripts to validate and load data into interface tables
  • Designed, developed and maintained Big Data streaming and batch applications using Storm.
  • Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Developed and designed data integration and migration solutions in Azure.
  • Integrated Apache Kafka with Elasticsearch using the Kafka Elasticsearch Connector to stream all messages from different partitions and topics into Elasticsearch for search and analysis.
  • Exported event weblogs to HDFS by creating an HDFS sink which directly deposits the weblogs in HDFS.
  • Implemented Cassandra and managed other processing tools observed running on YARN.
  • Developed customized classes for serialization and deserialization in Hadoop.
  • Worked on Apache NiFi as an ETL tool for batch and real-time processing.
  • Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
  • Used Spring JDBC DAO as the data access technology to interact with the database.
  • Involved in data acquisition, data pre-processing and data exploration for a telecommunication project in Scala.
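
A minimal sketch of that Spark Streaming flow in Scala, assuming a socket text source, a hypothetical keyspace and table, and the DataStax spark-cassandra-connector on the classpath (names and schema are illustrative, not from the project):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamToCassandra {
      // Hypothetical record schema for the incoming feed
      case class Reading(sensorId: String, value: Double)

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("StreamToCassandra")
          .config("spark.cassandra.connection.host", "127.0.0.1")
          .getOrCreate()
        import spark.implicits._

        val ssc = new StreamingContext(spark.sparkContext, Seconds(10))

        // Real-time feed arriving as lines of "sensorId,value"
        val lines = ssc.socketTextStream("localhost", 9999)

        lines.foreachRDD { rdd =>
          // Each micro-batch is an RDD; convert it to a DataFrame
          val df = rdd.map(_.split(","))
                      .map(parts => Reading(parts(0), parts(1).toDouble))
                      .toDF()

          // Append the batch to a Cassandra table (hypothetical keyspace/table)
          df.write
            .format("org.apache.spark.sql.cassandra")
            .options(Map("keyspace" -> "iot", "table" -> "readings"))
            .mode("append")
            .save()
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }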

Environment: Hadoop 3.0, Oozie 4.3, Pig 0.17, Zookeeper 3.4, Hive 2.3, HBase 1.2, Jenkins 2.1, Azure, GitHub, MapReduce, Spark 2.4, Cassandra 3.0, Flume 1.8, Sqoop 1.4, Oracle 12c, XML, MongoDB 4.0, Teradata r15, SQL, PL/SQL, MySQL 8.0, Kafka 1.1, Elasticsearch 6.6, HDFS, ETL, Scala 2.1.

Confidential, Lowell, AR

Spark/Hadoop Developer

Responsibilities:

  • Created Spark applications using Spark Data frames and Spark SQL API extensively.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
  • Prepared Spark builds from MapReduce source code for better performance.
  • Involved in importing the real-time data to Hadoop using Kafka and implemented Oozie jobs for daily imports.
  • Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Implemented NiFi flow topologies to perform cleansing operations before moving data into HDFS.
  • Used Spark API over Hortonworks, Hadoop YARN to perform analytics on data in Hive.
  • Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi structured data coming from various sources.
  • Worked on provisioning AWS EC2 infrastructure and deploying applications behind Elastic Load Balancing.
  • Implemented Kafka high-level consumers to get data from Kafka partitions and move it into HDFS.
  • Used the Parquet file format for published tables and created views on the tables.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Spark and Scala, as sketched after this list.
  • Wrote an event-driven link-tracking system to capture user events and feed them to Kafka, which pushes them into HBase.
  • Used Maven to build the application and auto-deploy it to the environment.
  • Developed several REST web services which produce both XML and JSON to perform tasks, leveraged by both web and mobile applications.
  • Developed stored procedures and triggers using PL/SQL to calculate and update tables implementing business logic.
  • Worked with teams in setting up AWS EC2 instances using different AWS services like S3, EBS, Elastic Load Balancing, Auto Scaling groups, VPC subnets and CloudWatch.
  • Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
  • Worked on NoSQL databases like Cassandra and MongoDB for POC purposes, storing images and URIs.
  • Coordinated with Hortonworks support team through support portal to sort out the critical issues during upgrades.
  • Migrated complex MapReduce programs into Spark RDD transformations and actions.
  • Explored MLlib algorithms in Spark to understand the possible machine learning functionalities that could be used for our use case.
  • Used Spring Framework for logging implementation and extensively used Spring AOP to reduce cross cutting concerns.
  • Involved in unit and integration testing, bug fixing, acceptance testing with test cases, and code reviews.
  • Wrote ad-hoc queries using Presto and Impala and used Impala analytical functions.
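
A minimal sketch of the MapReduce-to-Spark-RDD migration pattern mentioned above, using the classic word count as a stand-in and hypothetical HDFS paths (illustrative only):

    import org.apache.spark.sql.SparkSession

    object WordCountRdd {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("WordCountRdd").getOrCreate()
        val sc = spark.sparkContext

        // The mapper's tokenize-and-emit step becomes flatMap/map transformations
        val counts = sc.textFile("hdfs:///data/input")
          .flatMap(_.split("\\s+"))
          .filter(_.nonEmpty)
          .map(word => (word, 1))
          .reduceByKey(_ + _)   // the reducer's sum step

        // Write results back to HDFS (hypothetical output path)
        counts.saveAsTextFile("hdfs:///data/output")

        spark.stop()
      }
    }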

Environment: Spark 2.4, Scala 2.1, Oozie 4.3, Kafka 1.1, Hadoop 3.0, MySQL 8.0, YARN, HBase 1.2, AWS, MapReduce, Maven 2.0, XML, JSON, PL/SQL, Sqoop 1.4, Cassandra 3.0, MongoDB 4.0.

Confidential, Austin, TX

Hadoop Developer

Responsibilities:

  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using CDH3 and CDH4 distributions.
  • Involved in creating Hive tables and applying HiveQL to those tables, which invokes and runs MapReduce jobs automatically.
  • Implemented a POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
  • Created conversion scripts using Oracle SQL queries, functions and stored procedures, along with test cases and plans, before ETL migrations.
  • Worked on Spark for in-memory computations and compared DataFrames to optimize performance.
  • Developed Java modules implementing business rules and workflows using the Spring MVC web framework.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Developed different kinds of custom filters and handled pre-defined filters on HBase data using the API.
  • Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop Components
  • Installed and configured Hadoop Ecosystem components and Cloudera manager using CDH distribution.
  • Used Open Source packages, designed POC to demonstrate Integration of Kafka/Flume with Spark Streaming for real-time data Ingestion and processing.
  • Developed shell scripts to read files from the edge node and ingest them into HDFS partitions based on the file naming pattern.
  • Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Involved in creating Hive tables, loading them with data and writing Hive queries which run internally as MapReduce jobs, as sketched after this list.
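
A minimal sketch of creating and querying a Hive table from Spark with Hive support enabled, using hypothetical table and column names (illustrative only; on classic Hive the same HiveQL would run as MapReduce jobs):

    import org.apache.spark.sql.SparkSession

    object HiveTableExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HiveTableExample")
          .enableHiveSupport()   // connect to the Hive metastore
          .getOrCreate()

        // Create a Hive table and load a couple of rows (hypothetical schema)
        spark.sql("CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE, dt STRING) STORED AS PARQUET")
        spark.sql("INSERT INTO sales VALUES (1, 10.0, '2019-01-01'), (2, 5.5, '2019-01-02')")

        // HiveQL aggregation over the table
        spark.sql("SELECT dt, SUM(amount) AS total FROM sales GROUP BY dt").show()

        spark.stop()
      }
    }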

Environment: Hadoop 2.3, ETL, Oracle, HDFS, MySQL, Sqoop, Hive 2.1, Zookeeper, Pig, Spring MVC 4, Scala, MapReduce.

Confidential, McLean, VA

Java/J2EE Developer

Responsibilities:

  • Developed extensive additions to an existing Struts/Java/J2EE web application utilizing Service-Oriented Architecture (SOA) techniques.
  • Involved in developing the application using the Java/J2EE platform. Implemented the Model View Controller (MVC) structure using Spring.
  • Performed J2EE application tuning, performance testing and analysis.
  • Developed User Interface having animations and effects using JSF, JavaScript and HTML.
  • Used Hibernate for mapping POJO's to relational database tables using XML files.
  • Used Java Servlets, JSPs, AJAX, HTML and CSS for developing the Web component of the application.
  • Developed RESTful web services using Java, Spring Boot, databases like PostgreSQL.
  • Created several JSPs and populated them with data from databases using JDBC.
  • Used Object/Relational mapping Hibernate framework as the persistence layer for interacting with Database.
  • Developed the Product Builder UI screens using AngularJS, NodeJS, HTML5, CSS, and JavaScript.
  • Wrote Hibernate annotation-based mappings of Java classes to Oracle database tables.
  • Designed, developed and maintained the data layer using Hibernate and performed configuration of Struts Application Framework.
  • Used log4j for logging and SVN for version control.
  • Implemented log4j for application logging and to troubleshoot issues in debug mode.
  • Involved in developing test cases using JUnit testing during development mode.
  • Used Eclipse as the Java IDE for creating various J2EE artifacts like Servlets, JSPs and XML.
  • Used JMS (Asynchronous/Synchronous) for sending and getting messages from the MQ Series.
  • Created and implemented insert, update and delete queries for the Cassandra and MongoDB databases.
  • Implemented SOA architecture with Web Services using SOAP, WSDL and XML to integrate other legacy systems.
  • Involved in using continuous integration tool Jenkins to push and pull the project code into GitHub repositories.
  • Involved in Unit Testing and Bug-Fixing and achieved the maximum code coverage using JUNIT test cases.

Environment: Java 8, J2EE, POJOs, XML, Spring Boot, HTML5, log4j, SOA, WSDL, Jenkins, JUnit, SOAP, Hibernate, JMS.

Confidential

Java Developer

Responsibilities:

  • Used JavaScript to update content in the database and manipulate files, and generated Java and Spring MVC forms to record data of online users.
  • Developed and tested many features for dashboard, created using Bootstrap, CSS, and JavaScript.
  • Developed single-page applications using AngularJS and implemented two-way data binding.
  • Used the Spring DAO concept to interact with the database using JdbcTemplate and HibernateTemplate.
  • Created an XML configuration file for Hibernate for Database connectivity.
  • Created Test suites in SOAP UI projects and created internal test cases depending on the requirement.
  • Used Maven for building and deploying the project on the WebSphere application server.
  • Involved in the deployment of the application using JBoss, WebLogic servers.
  • Involved in developing web services using REST for sending and getting data from the external interface in JSON format.
  • Created reusable components using TypeScript on the client side in AngularJS, and used ReactJS and NodeJS for fast data access.
  • Built web-based application using Spring MVC Architecture and REST Web-services.
  • Developed the custom Logging framework used to log transactions executed across the various applications using Log4j.
  • Used SVN as version control system for the source code and project documents.
  • Used React.js to render changing currency spreads and to dynamically update the DOM.
  • Developed Enterprise Java Beans (EJB) with both stateless session beans and entity beans using CMP.
  • Used XSL/XSLT for transforming and displaying reports. Developed DTDs for XML.
  • Developed common business-related custom tags using JSP and published them to the rest of the teams. Developed the user interface using JSP and the Struts tag library.
  • Implemented various design patterns like Singleton, Data Access Object, Data Transfer Object and MVC.
  • Developed many JSP pages and used the Dojo JavaScript library and jQuery UI for client-side validation.

Environment: Java 7, Bootstrap, JavaScript, AngularJS, JDBC, Hibernate 3.0, XML, SOAP, ReactJS, NodeJS, JSON, DOM, EJB, Log4j, Dojo, jQuery.
