
Sr. Big Data/Hadoop Developer Resume


Arlington, VA

SUMMARY

  • Over 9 years of IT experience in analysis, design and development using Hadoop, HDFS, Hortonworks, MapReduce and the Hadoop ecosystem (Pig, Hive, Impala, Spark, Scala), Java and J2EE.
  • Experience with the Hortonworks Hadoop distribution and Ambari to install, configure and monitor Hadoop clusters.
  • Experience in writing complex SQL queries, creating reports and dashboards.
  • Experience with Agile and Waterfall methodologies, including Extreme Programming, Scrum and Test-Driven Development (TDD).
  • Experience importing and exporting data between HDFS/Hive and relational databases such as MySQL using Sqoop and Spark.
  • Experience developing MapReduce jobs for data cleaning and data manipulation as required by the business.
  • Installation, configuration and administration experience on Cloudera's Big Data platform using Cloudera Manager.
  • Experience in moving data into and out of the HDFS and Relational Database Systems (RDBMS) using Apache Sqoop.
  • Significant experience writing custom UDFs in Hive and custom InputFormats in MapReduce.
  • Expertise in Object Oriented Analysis and Design (OOAD) and knowledge in Unified Modeling Language (UML).
  • Experience in using IDEs like Eclipse, NetBeans and IntelliJ.
  • Good knowledge of workflow scheduling and coordination tools such as Oozie and ZooKeeper.
  • Hands-on experience with NoSQL databases including HBase, Cassandra and MongoDB, and their integration with Hadoop and Kubernetes clusters.
  • Proficient in cluster management and in configuring Cassandra databases.
  • Extensive experience in developing Pig Latin Scripts and using Hive Query Language for data analytics.
  • Expert in developing web page interfaces using JSP, Java Swing and HTML.
  • Extensive experience in developing and deploying applications using WebLogic, Apache Tomcat and JBoss.
  • Knowledge of configuring and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
  • Experience working with the Spring and Hibernate frameworks for Java.
  • Expertise in Big Data architectures built on Hadoop distributions (Azure HDInsight, Hortonworks, Cloudera) and NoSQL stores such as MongoDB.
  • Expertise in using J2EE application servers such as IBM WebSphere, JBoss and web servers like Apache Tomcat.
  • Experience working with build tools such as Ant, Maven, SBT and Gradle to build and deploy applications onto servers.
  • Experience with different Hadoop distributions such as Cloudera (CDH3 & CDH4) and the Hortonworks Data Platform (HDP).
  • Experience with Hadoop/Hive on AWS, using both EMR and self-managed Hadoop on EC2.
  • Hands on experience in installing, configuring and using Apache Hadoop ecosystem components.
  • Strong technical, administration and mentoring knowledge in Linux and Big data/Hadoop technologies.
  • Expertise in J2EE technologies like JSP, Servlets, EJBs, JDBC, JNDI and AJAX.
  • Expertise in applying the Java Message Service (JMS) for reliable information exchange across Java applications.
  • Hands-on use of Apache Hadoop alongside the enterprise Cloudera and Hortonworks distributions.
  • Good knowledge of the MapR distribution and Amazon EMR.
  • Replaced slower MapReduce functions with Spark RDD-based parallel processing over datasets in HDFS, HBase and other data sources (a minimal Java sketch follows this summary).
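The last bullet refers to replacing MapReduce functions with Spark RDD processing. The following is a minimal Java sketch of that pattern, not code from any of the engagements below; the HDFS paths and the position of the URL field are assumptions for illustration.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class PageViewCounts {
    public static void main(String[] args) {
        // Hypothetical HDFS locations; replace with real input/output paths.
        String input = "hdfs:///data/weblogs/*.log";
        String output = "hdfs:///data/output/pageview_counts";

        SparkConf conf = new SparkConf().setAppName("PageViewCounts");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read raw log lines from HDFS as a distributed RDD.
        JavaRDD<String> lines = sc.textFile(input);

        // Equivalent of a MapReduce map + reduce: count views per URL
        // (assumes the URL is the 7th whitespace-separated field).
        JavaPairRDD<String, Long> counts = lines
                .map(line -> line.split("\\s+"))
                .filter(fields -> fields.length > 6)
                .mapToPair(fields -> new Tuple2<>(fields[6], 1L))
                .reduceByKey(Long::sum);

        counts.saveAsTextFile(output);
        sc.stop();
    }
}
```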

TECHNICAL SKILLS

Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper

Cloud Platform: Amazon AWS, EC2, S3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory

Hadoop Distributions: Cloudera, Hortonworks, MapR

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Databases: Oracle 12c/11g, MySQL, Teradata

Operating Systems: Linux, Unix, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB, Accumulo

Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS

PROFESSIONAL EXPERIENCE

Confidential, Arlington, VA

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Responsible for building and configuring a distributed data solution using the MapR distribution of Hadoop.
  • Applied Big Data architecture expertise across Hadoop distributions (Azure HDInsight, Hortonworks, Cloudera), MongoDB and other NoSQL stores.
  • Involved in the complete Big Data flow of the application: ingesting data from upstream sources into HDFS, processing it in HDFS and analyzing it.
  • Involved in Agile methodologies, daily scrum meetings, sprint planning.
  • Installed and configured Hive, HDFS and NiFi and implemented the CDH cluster; assisted with performance tuning and monitoring.
  • Worked on installing and configuring Hortonworks (HDP) and Cloudera clusters in the Dev and Production environments.
  • Worked with HBase, creating tables to load large sets of semi-structured data coming from various sources.
  • Wrote multiple Spark jobs to perform data-quality checks before files were moved to the data-processing layer.
  • Migrated complex MapReduce programs into Spark RDD transformations and actions.
  • Implemented Cassandra and managed the other processing tools running on YARN.
  • Implemented Kafka high-level consumers to read data from Kafka partitions and move it into HDFS.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
  • Implemented custom Kafka encoders for custom input formats to load data into Kafka partitions.
  • Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Evaluated the performance of Apache Spark in analyzing genomic data.
  • Implemented complex Hive UDFs to execute business logic within Hive queries.
  • Prepared Linux shell scripts for automating the process.
  • Implemented Spark RDD transformations to map business analysis and apply actions on top of transformations.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data with MapReduce, Hive and Pig.
  • Managed and reviewed large Hadoop log files; involved in cluster maintenance, cluster monitoring, troubleshooting and data cleansing.
  • Created technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Imported data from MySQL into HDFS and vice versa using Sqoop, and configured the Hive Metastore with MySQL to store the metadata for Hive tables.
  • Loaded the customer's data and event logs from Kafka into HBase using REST API.
  • Used Hive data-warehouse modeling to interface Hadoop with BI tools such as Tableau, and enhanced the existing applications.
  • Built the automated build and deployment framework using Jenkins and Maven.
  • Wrote Scala-based Spark applications to perform various data transformations, de-normalization and other custom processing.
  • Tuned Spark applications by setting the right batch interval, the correct level of parallelism and appropriate memory settings.
  • Created a multi-threaded Java application running on an edge node to pull raw clickstream data from FTP servers.
  • Worked on converting Hive/SQL queries into Spark transformations using Spark RDDs and DataFrames.
  • Involved in Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Deployed data from various sources into HDFS and built reports using Tableau.
  • Extended Hive and Pig core functionality by writing custom UDFs in Java.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames and loaded it into Cassandra (a minimal sketch follows this list).
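The final bullet describes a Spark Streaming pipeline from Kafka into Cassandra. Below is a minimal Java sketch of that flow, assuming the spark-streaming-kafka-0-10 integration and the DataStax spark-cassandra-connector; the broker address, topic name, record layout, keyspace and table are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

public class ClickStreamToCassandra {

    /** Simple bean matching a hypothetical analytics.click_events table. */
    public static class ClickEvent implements java.io.Serializable {
        private String userId;
        private String url;
        private long eventTime;
        public ClickEvent() { }
        public ClickEvent(String userId, String url, long eventTime) {
            this.userId = userId; this.url = url; this.eventTime = eventTime;
        }
        public String getUserId() { return userId; }
        public void setUserId(String userId) { this.userId = userId; }
        public String getUrl() { return url; }
        public void setUrl(String url) { this.url = url; }
        public long getEventTime() { return eventTime; }
        public void setEventTime(long eventTime) { this.eventTime = eventTime; }
    }

    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf()
                .setAppName("ClickStreamToCassandra")
                .set("spark.cassandra.connection.host", "cassandra-host"); // hypothetical host
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "kafka-broker:9092");          // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "clickstream-loader");
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream from a hypothetical "clicks" topic; each record value is
        // assumed to be a comma-separated line: userId,url,epochMillis
        JavaDStream<String> lines = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(Arrays.asList("clicks"), kafkaParams))
            .map(record -> record.value());

        JavaDStream<ClickEvent> events = lines
                .map(line -> line.split(","))
                .filter(parts -> parts.length == 3)
                .map(parts -> new ClickEvent(parts[0], parts[1], Long.parseLong(parts[2])));

        // Write each micro-batch to Cassandra via the DataStax connector.
        events.foreachRDD(rdd ->
                javaFunctions(rdd)
                        .writerBuilder("analytics", "click_events", mapToRow(ClickEvent.class))
                        .saveToCassandra());

        jssc.start();
        jssc.awaitTermination();
    }
}
```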

Environment: Hadoop 3.0, HDFS, NiFi, Hive 2.3, Pig 0.17, XML, Spark, Python, MySQL, Sqoop 1.4, NoSQL, HBase 1.2, Kafka 1.0, Tableau, Hortonworks, Cassandra 3.11, Yarn, Maven, Cloudera, Jenkins, Scala 2.12, Java.

Confidential - Boston, MA

Big Data/Hadoop Developer

Responsibilities:

  • Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
  • Worked on all activities related to the development, implementation and support for Hadoop.
  • Designed custom reusable templates in NiFi for code reusability and interoperability.
  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
  • Worked with teams in setting up AWS EC2 instances using AWS services such as S3, EBS, Elastic Load Balancing, Auto Scaling groups, VPC subnets and CloudWatch.
  • Managed data coming from different sources and involved in HDFS maintenance and loading of structured and unstructured data.
  • Worked with Kafka to stream data into HDFS and exported it into a MongoDB database.
  • Created partitions and buckets based on state for further processing using bucket-based Hive joins.
  • Installed and Configured Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing (see the sketch after this list).
  • Wrote complex Hive queries and UDFs in Java and Python.
  • Worked on provisioning EC2 infrastructure on AWS and deploying applications behind Elastic Load Balancing.
  • Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters; converted MapReduce applications to Spark.
  • Worked with Hadoop eco system covering HDFS, HBase, YARN and MapReduce.
  • Used Scala and Spark SQL to develop Spark code for faster processing and testing, and performed complex Hive queries on Hive tables.
  • Wrote and executed SQL queries to work with structured data in relational databases and to validate the transformation/business logic.
  • Used Flume to move data from individual data sources to Hadoop system.
  • Responsible for design & development of Spark SQL Scripts using Scala/Java based on Functional Specifications.
  • Worked with NoSQL Cassandra to store, retrieve, update and manage all the details for Ethernet provisioning and customer order tracking.
  • Analyzed the data by performing Hive queries (HiveQL), ran Pig scripts, Spark SQL and Spark streaming.
  • Developed tools using Python, shell scripting and XML to automate routine tasks.
  • Wrote scripts in Python for extracting data from HTML files.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
  • Configured the Hive Metastore with MySQL, which stores the metadata for Hive tables.
  • Performed tuning of Hive queries, MapReduce programs for different applications.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
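As referenced above, here is a minimal Java sketch of a map-only MapReduce data-cleansing job; the pipe-delimited record layout and expected field count are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCleansingJob {

    /** Map-only task: drop malformed rows and trim whitespace from each field. */
    public static class CleansingMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 8; // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("cleansing", "malformed").increment(1);
                return; // skip malformed records
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) cleaned.append('|');
                cleaned.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(cleaned.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-cleansing");
        job.setJarByClass(RecordCleansingJob.class);
        job.setMapperClass(CleansingMapper.class);
        job.setNumReduceTasks(0); // map-only: cleansed records go straight to HDFS
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```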

Environment: Apache NiFi 1.1, Hadoop 3.0, JSON, XML, Avro, HDFS, Teradata r15, Sqoop, Kafka, MongoDB, Hive 2.3, Pig 0.17, HBase, Zookeeper, MapReduce, Java, Python 3.6, Yarn, Flume, NoSQL, Cassandra 3.11

Confidential, Washington, DC

Sr. Spark Developer

Responsibilities:

  • Created Spark applications extensively using the DataFrame and Spark SQL APIs (a minimal sketch follows this list).
  • Developed Java modules implementing business rules and workflows using the Spring MVC web framework.
  • Performed performance testing, analysis and tuning of J2EE applications.
  • Developed the Product Builder UI screens using AngularJS, Node.js, HTML5, CSS and JavaScript.
  • Created Sqoop scripts to import user profile and sale order purchase information from Teradata to S3 data store.
  • Developed various Spark applications using Scala to perform enrichments, aggregations and other business-metric processing of clickstream data along with user-profile data.
  • Worked on fine-tuning Spark applications/jobs to improve efficiency and overall processing time for the pipelines.
  • Worked on development of Hibernate, including mapping files, configuration file and classes to interact with the database.
  • Installed and configured Hadoop, MapReduce, HDFS, developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Worked extensively on the Spring framework, implementing Spring MVC, Spring Security, IoC (dependency injection) and Spring AOP.
  • Used AngularJS, Bootstrap and AJAX to get data from the server asynchronously using JSON objects.
  • Used Oracle as the database and Toad for query execution; involved in writing SQL scripts and PL/SQL code for procedures and functions.
  • Developed Restful web services using Java, Spring Boot, NoSQL databases like MongoDB.
  • Developed classes using core Java (multithreading, concurrency, collections, memory management) and Spring IoC.
  • Designed & developed the UI application using React JS, Rest, Spring MVC, Spring Data JPA, Spring Batch, Spring Integration & Hibernate.
  • Utilized the Spark Scala API to implement batch processing of jobs.
  • Used broadcast variables in Spark, efficient joins, transformations and other capabilities for data processing.
  • Used Spark-SQL to perform event enrichment and to prepare various levels of user behavioral summaries.
  • Designed dynamic and browser compatible pages using HTML5, DHTML, CSS3 and JavaScript.
  • Used AngularJS and Bootstrap to consume service and populated the page with the products and pricing returned.
  • Used RESTful services in conjunction with AJAX calls, built using JAX-RS and Jersey.
  • Designed and developed the application using the Spring and Hibernate frameworks.
  • Developed Intranet Web Application using J2EE architecture, using JSP to design the user interfaces and Hibernate for database connectivity.
  • Worked extensively with Eclipse as the IDE to develop, test and deploy the complete application.
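A minimal Java sketch of the DataFrame/Spark SQL usage referenced at the top of this list is shown below; the input locations, column names and the click/profile schema are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.broadcast;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;

public class ClickstreamEnrichment {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("ClickstreamEnrichment")
                .getOrCreate();

        // Hypothetical inputs: click events as JSON, user profiles as Parquet.
        Dataset<Row> clicks = spark.read().json("s3a://example-bucket/clickstream/");
        Dataset<Row> profiles = spark.read().parquet("s3a://example-bucket/user_profiles/");

        // Broadcast the small profile table to avoid shuffling the large click data,
        // then summarise page views per customer segment.
        Dataset<Row> summary = clicks
                .join(broadcast(profiles), clicks.col("user_id").equalTo(profiles.col("user_id")))
                .groupBy(profiles.col("segment"))
                .agg(count(col("page_url")).alias("page_views"));

        summary.write().mode("overwrite").parquet("s3a://example-bucket/summaries/segment_views/");

        // The same summary expressed through the Spark SQL API:
        clicks.createOrReplaceTempView("clicks");
        profiles.createOrReplaceTempView("profiles");
        spark.sql("SELECT p.segment, COUNT(c.page_url) AS page_views "
                + "FROM clicks c JOIN profiles p ON c.user_id = p.user_id "
                + "GROUP BY p.segment").show();

        spark.stop();
    }
}
```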

Environment: Java, J2EE, JavaScript, HTML5, Hibernate 4.2, Hadoop 2.5, MapReduce, HDFS, Ajax, Oracle 11g, PL/SQL, MongoDB, SQL, Scala, Spark, Kafka, Storm, Zookeeper, Eclipse

Confidential - Mount Laurel, NJ

Java/J2ee Developer

Responsibilities:

  • Worked as a Java/J2EE developer on middleware architecture using Java technologies such as J2EE and Servlets, and application servers such as WebSphere and WebLogic.
  • Worked as a Java/J2EE Developer to manage data and to develop web applications.
  • Implemented MVC architecture by separating the business logic from the presentation layer using Spring.
  • Involved in documentation and use-case design using UML modeling, including development of class diagrams, sequence diagrams and use-case diagrams.
  • Extensively worked on an n-tier architecture system with application development using Java, JDBC, Servlets, JSP, Web Services, WSDL, SOAP, Spring, Hibernate, XML, SAX and DOM.
  • Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
  • Developed UI using HTML, CSS, Bootstrap, jQuery and JSP for interactive cross-browser functionality and complex user interfaces.
  • Developed Service layer interfaces by applying business rules to interact with DAO layer for transactions.
  • Used the Spring MVC framework for writing controllers, validations and views (see the sketch after this list).
  • Provided utility classes for the application using core Java and extensively used the Collections framework.
  • Used Core Spring for Dependency Injection of various component layers.
  • Used SOA REST (JAX-RS) web services to provide/consume the Web services from/to down-stream systems.
  • Developed web-based reporting for a credit monitoring system with HTML, CSS, XHTML, JSTL and custom tags using Spring.
  • Developed user interface using JSP, JSP Tag libraries and Struts Tag Libraries to simplify the complexities of the application.
  • Implemented business logic using POJOs and used WebSphere to deploy the applications.
  • Used the build tools Maven to build JAR and WAR files and Ant to package all source files and web content into WAR files.
  • Worked on various SOAP and RESTful services used in various internal applications.
  • Developed JSP and Java classes for various transactional/ non-transactional reports of the system using extensive SQL queries.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
  • Implemented Storm topologies to pre-process data before moving into HDFS system.
  • Implemented POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
  • Involved in configuring builds using Jenkins with Git and used Jenkins to deploy the applications onto the Dev and QA environments.
  • Involved in unit testing, system integration testing and enterprise user testing using JUnit.
  • Used Maven to build, run and create Aerial-related JARs and WAR files among other uses.
  • Used JUnit for unit testing of the system and Log4J for logging.
  • Worked with production support team in debugging and fixing various production issues.
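A minimal Java sketch of a Spring MVC controller with form validation, as referenced above, is shown below; AccountForm and AccountService are hypothetical form-backing and service-layer types, not taken from the project.

```java
import javax.validation.Valid;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.validation.BindingResult;
import org.springframework.web.bind.annotation.ModelAttribute;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

@Controller
@RequestMapping("/accounts")
public class AccountController {

    private final AccountService accountService; // hypothetical service-layer interface

    @Autowired
    public AccountController(AccountService accountService) {
        this.accountService = accountService;
    }

    /** Render the registration form view. */
    @RequestMapping(method = RequestMethod.GET)
    public String showForm(Model model) {
        model.addAttribute("accountForm", new AccountForm()); // hypothetical form-backing bean
        return "accounts/register";
    }

    /** Validate the submitted form and delegate to the service layer. */
    @RequestMapping(method = RequestMethod.POST)
    public String register(@Valid @ModelAttribute("accountForm") AccountForm form,
                           BindingResult result, Model model) {
        if (result.hasErrors()) {
            return "accounts/register"; // redisplay the form with validation messages
        }
        accountService.createAccount(form);
        return "redirect:/accounts/confirmation";
    }
}
```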

Environment: Java, Spring 3.0, XML, Jenkins, Hibernate 3.0, JUnit, HTML 4.0.1, CSS, AngularJS, Bootstrap, WebSphere, Maven 3.0, Eclipse, Spark, jQuery.

Confidential

Java Developer

Responsibilities:

  • Involved in the complete Software Development Life Cycle (SDLC) including Requirement Analysis, Design, Implementation, Testing and Maintenance.
  • Used core java to design application modules, base classes and utility classes.
  • Used Dependency Injection (DI)/Inversion of Control (IoC) to obtain bean references in the Spring framework using annotations.
  • Involved in Implementation of the application by following the Java best practices and patterns.
  • Used plain Java objects and the Hibernate framework to develop business components mapping Java classes to the database.
  • Used the Spring framework for dependency injection and transaction management, and Spring MVC controllers for the controller part of MVC.
  • Implemented business logic using POJOs and used WebSphere to deploy the applications.
  • Used the Spring MVC framework for writing controllers, validations and views.
  • Used Eclipse as IDE for development of the application.
  • Built data-driven web applications with server-side Java technologies such as Servlets/JSP and generated dynamic web pages with JavaServer Pages (JSP).
  • Involved in mapping of data representation from MVC model to Oracle Relational data model with a SQL-based schema using Hibernate, object/relational-mapping (ORM) solution.
  • Used the Spring IoC framework to integrate with Hibernate.
  • Implemented a Maven script to create the JAR and dependency JARs and deploy the entire project onto the WebLogic Application Server.
  • Coded JavaBeans and implemented Model View Controller (MVC) Architecture.
  • Developed Client applications to consume the Web services based on both SOAP and REST protocol.
  • Utilized Log4j for logging and for debugging the application.
  • Created and implemented Oracle Queries, functions using SQL and PL/SQL.
  • Involved in bug fixing during the System testing, Joint System testing and User acceptance testing.
  • Worked on various SOAP and RESTful services used in various internal applications.
  • Consumed REST-based microservices with RestTemplate based on RESTful APIs.
  • Developed front end web application using AngularJS along with cutting edge HTML and CSS.
  • Developed a processing component to retrieve customer information from a MySQL database and developed the DAO layer using Hibernate (a minimal sketch follows this list).
  • Used Maven for developing build scripts and deploying the application onto WebLogic.
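A minimal Java sketch of the Hibernate-backed DAO layer referenced above is shown below; the Customer entity, table and column names are hypothetical.

```java
// Customer.java - hypothetical entity mapped to a MySQL "customer" table
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "customer")
public class Customer {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(name = "email", nullable = false, unique = true)
    private String email;

    @Column(name = "full_name")
    private String fullName;

    public Long getId() { return id; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }
}

// CustomerDao.java - DAO layer using the Hibernate SessionFactory managed by Spring
import org.hibernate.SessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;

@Repository
@Transactional
public class CustomerDao {

    private final SessionFactory sessionFactory;

    @Autowired
    public CustomerDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    /** Look up a single customer by email using an HQL query. */
    public Customer findByEmail(String email) {
        return (Customer) sessionFactory.getCurrentSession()
                .createQuery("from Customer c where c.email = :email")
                .setParameter("email", email)
                .uniqueResult();
    }

    /** Insert or update a customer within the Spring-managed transaction. */
    public void save(Customer customer) {
        sessionFactory.getCurrentSession().saveOrUpdate(customer);
    }
}
```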

Environment: Java, Spring, Hibernate, MVC, POJO, WebSphere, Eclipse, Maven, JavaBeans, SOAP, Log4j, SQL, PL/SQL, CSS, MySQL
