Sr. Big Data Developer Resume

Shelton, CT

SUMMARY:

  • Around 8 years of professional IT experience, including Java/J2EE and Big Data ecosystem experience developing Spark/Hadoop applications.
  • Good knowledge of using Hibernate to map Java classes to database tables and of Hibernate Query Language (HQL).
  • Proficient in importing/exporting data from RDBMS to HDFS using Sqoop.
  • Used Hive extensively to perform various data analytics required by business teams.
  • Solid experience working with various data formats such as Parquet, ORC, Avro and JSON.
  • Experience automating end-to-end data pipelines with strong resilience and recoverability.
  • Strong knowledge of NoSQL databases; worked with HBase, Cassandra and MongoDB.
  • Extensively used various IDEs such as IntelliJ, NetBeans and Eclipse.
  • Expert in SQL; worked extensively with RDBMSs such as Oracle, SQL Server, DB2, MySQL and Teradata.
  • Worked with Apache NiFi to ingest data into HDFS from a variety of sources.
  • Proficient with Git, Jenkins and Maven.
  • Good understanding of and experience with Agile and Waterfall methodologies of the Software Development Life Cycle (SDLC).
  • Operated on Java/J2EE systems with different databases, which include Oracle, MySQL and DB2.
  • Knowledge of implementing Big Data workloads on Amazon Elastic MapReduce (Amazon EMR), processing and managing the Hadoop framework on dynamically scalable Amazon EC2 instances.
  • Built secured AWS solutions by creating VPCs with private and public subnets.
  • Good working experience using Sqoop to import data into HDFS from RDBMS and vice-versa.
  • Good knowledge in developing data pipeline using Flume, Sqoop, and Pig to extract the data from weblogs and store in HDFS.
  • Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
  • Experience in building Pig scripts to extract, transform and load data onto HDFS for processing.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing and reviewing Hadoop log files.
  • Experience in Big Data analysis using PIG and HIVE and understanding of SQOOP and Puppet.
  • Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Load streaming log data from various web servers into HDFS using Flume.
  • Experience in scheduling Cron jobs on EMR, Kafka, and Spark using Clover Server.
  • Proficient in using RDBMS concepts with Oracle, SQL Server and MySQL.
  • Hands-on experience with build and deployment tools such as Maven and GitHub, using Bash scripting.
  • Hands-on experience with Spring Tool Suite for development of Scala applications.
  • Experience using Hadoop distributions such as Cloudera and Hortonworks.
  • Experience in creating Hive tables and queries using HiveQL, with a good understanding of Hive concepts such as Partitioning, Bucketing and Joins.
  • Developed Pig Latin scripts to extract and load the data into HDFS.
  • Experience in migrating the data using Sqoop from Relational Database Systems to HDFS and vice-versa.
  • Experience in loading unstructured data (log files, XML data) into HDFS using Flume.
  • Hands-on expertise with different file formats such as JSON, XML and CSV.
  • Good exposure to setting up job streaming and scheduling with Oozie, and to working with messaging systems such as Kafka integrated with ZooKeeper.

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, Hive 2.3, Pig 0.17, Sqoop 1.4, Flume 1.8, Oozie 4.3, Spark 2.3, Kafka 1.1, Storm 1.0.5 and ZooKeeper 3.4

Languages: C, Java, Python 3.7, Scala 2.12, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, Java Script, JSP, Servlets, EJB, JSF, JQuery

Frameworks: MVC Struts, Spring, Hibernate 5.3.1

NoSQL Databases: HBase, Cassandra 3.11, MongoDB 4.0.0

Operating Systems: Unix, RedHat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML5, DHTML, XML, AJAX, WSDL

Web/Application servers: Apache Tomcat 9.0.10, WebLogic, JBoss

Databases: Oracle 12c, DB2, SQL Server, MySQL, Teradata r15

Tools and IDE: Eclipse 4.8, NetBeans, Toad, Maven, ANT 1.10.3, Sonar, JDeveloper, DB Visualizer

Version control & Web Services: SVN, CVS, GIT, REST, SOAP

PROFESSIONAL EXPERIENCE:

Confidential - Shelton, CT

Sr. Big Data Developer

Responsibilities:

  • As a Big Data Developer, worked on Hadoop ecosystem components including Hive, Oozie, Pig, ZooKeeper, Spark Streaming and MCS (MapR Control System) with the MapR distribution.
  • Primarily involved in Data Migration process using Azure by integrating with Github repository and Jenkins.
  • Built code for real-time data ingestion using Java, MapR Streams (Kafka) and Storm.
  • Involved in various phases of development; analysed and developed the system following Agile Scrum methodology.
  • Worked on Apache Solr which is used as indexing and search engine.
  • Involved in development of Hadoop System and improving multi-node Hadoop Cluster performance.
  • Migrated to Azure cloud and created end-to-end architecture for running in Cloud.
  • Worked on analysing Hadoop stack and different Big data tools including Pig and Hive database and Sqoop.
  • Developed data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store in HDFS.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked with different data sources like Avro data files, XML files, JSON files, SQL server and Oracle to load data into Hive tables.
  • Used J2EE design patterns like Factory pattern & Singleton Pattern.
  • Used Spark to create the structured data from large amount of unstructured data from various sources.
  • Implemented usage of Amazon EMR for processing Big Data across Hadoop Cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3)
  • Performed transformations, cleaning and filtering on imported data using Hive, MapReduce, Impala and loaded final data into HDFS.
  • Developed Python scripts to find vulnerabilities with SQL Queries by doing SQL injection.
  • Designed and developed POCs in Spark using Scala to compare the performance of Spark with Hive.
  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs and processed the data into DataFrames.
  • Involved in the process of data acquisition, data pre-processing and data exploration of telecommunication project in Scala.
  • Specified the cluster size, resource pool allocation and Hadoop distribution by writing the specifications in JSON file format.
  • Imported weblogs and unstructured data using Apache Flume and stored the data in a Flume channel.
  • Exported event weblogs to HDFS by creating an HDFS sink which directly deposits the weblogs in HDFS.
  • Used RESTful web services with MVC for parsing and processing XML data.
  • Utilized XML and XSL Transformation for dynamic web-content and database connectivity.
  • Involved in loading data from the UNIX file system to HDFS, and in designing schemas, writing CQL queries and loading data.
  • Designed and developed a Data Factory pipeline which performs the data flow from on-premises SQL to Azure Data Lake.
  • Built the automated build and deployment framework using Jenkins and Maven.

Environment: Hive 2.3, Oozie 4.3, Pig 0.17, Zookeeper, Spark 2.3, Hadoop 3.0, MapReduce, HDFS, Azure, J2EE

Confidential - West Point, PA

Hadoop Developer

Responsibilities:

  • Installed and configured Apache Hadoop to test the maintenance of log files in Hadoop cluster.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in installing, configuring and managing Hadoop Ecosystem components like Hive, Pig, Sqoop, Kafka and Flume.
  • Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
  • Worked on analysing Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
  • Ingested data into HDFS using SQOOP and scheduled an incremental load to HDFS.
  • Imported data from AWS S3 into Spark RDDs and performed transformations and actions on the RDDs.
  • Developed data pipeline using Flume, Sqoop to ingest customer behavioral data and purchase histories into HDFS for analysis.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Used Pig to perform data validation on the data ingested using Sqoop and Flume, and pushed the cleansed data set into HBase.
  • Participated in development/implementation of Cloudera Hadoop environment.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Migrated an existing on-premises application to AWS. Used AWS services such as EC2 and S3 for processing and storage of small data sets.
  • Worked with Zookeeper, Oozie, and Data Pipeline Operational Services for coordinating the cluster and scheduling workflows.
  • Designed and built the reporting application, which uses Spark SQL to fetch data from HBase tables and generate reports.
  • Extracted the needed data from the server into HDFS and bulk loaded the cleaned data into HBase.
  • Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
  • Loaded the data into Simple Storage Service (S3) in the AWS Cloud.
  • Involved in running MapReduce jobs for processing millions of records.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Developed Hive queries and Pig scripts to analyze large datasets.
  • Involved in importing and exporting the data from RDBMS to HDFS and vice versa using Sqoop.
  • Involved in generating ad-hoc reports using Pig and Hive queries.
  • Used Hive to analyze data ingested into HBase by using Hive-HBase integration and compute various metrics for reporting on the dashboard.
  • Provided operational support for Hadoop and/or MySQL databases.
  • Developed job flows in Oozie to automate the workflow for Pig and Hive jobs.
  • Loaded the aggregated data into Oracle from the Hadoop environment using Sqoop for reporting on the dashboard.

Environment: Hadoop 3.0, Hive 2.1, Pig 0.17, Sqoop, Kafka, Flume, MapReduce, Spark 4.1, Scala, AWS, HDFS

Confidential - Menlo Park, CA

Java/Hadoop Developer

Responsibilities:

  • Worked as Java/Hadoop Developer and responsible for taking care of everything related to the clusters.
  • Responsible for building scalable distributed data solutions using Hadoop cluster environment with Hortonworks distribution.
  • Developed Spark scripts using Java and Python shell commands as per the requirement.
  • Used Spark Streaming APIs to perform necessary transformations and actions on the data received from Kafka and persisted it into the Cassandra database.
  • Developed Spark scripts by writing custom RDDs in Scala and Python for data transformations and actions on RDDs.
  • Worked on developing the application involving Spring MVC implementations and RESTful web services.
  • Implemented some of the big data operations on AWS cloud.
  • Used Spark Streaming to divide streaming data into batches as an input to Spark engine for batch processing.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Involved in performance tuning of Spark jobs using caching and taking full advantage of the cluster environment.
  • Developed Spark scripts using Scala shell commands as per the requirement.
  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
  • Responsible for designing rich user interface applications using JavaScript, CSS, HTML, XHTML and AJAX.
  • Configured Spark Streaming to receive real-time data from Kafka and store it in HDFS.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Worked with different file formats such as Text, Sequence files, Avro, ORC and Parquet.
  • Implemented the use of Amazon EMR for Big Data processing across a Hadoop cluster of virtual servers on Amazon EC2 and S3.
  • Worked on custom Pig loaders and storage classes to work with a variety of data formats such as JSON and XML.
  • Developed code using Core Java to implement technical enhancement following Java Standards.
  • Worked on Spark SQL and DataFrames for faster execution of Hive queries using the Spark SQLContext.
  • Used DataFrames/Datasets to write SQL-type queries using Spark SQL to work with datasets stored on HDFS.
  • Implemented Hibernate utility classes, session factory methods, and different annotations to work with back-end database tables.
  • Created Hive tables and wrote Hive queries for data analysis to meet business requirements; used Sqoop to import and export data from Oracle and MySQL.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala.
  • Used Hibernate reverse engineering tools to generate domain model classes, perform association mapping and inheritance mapping using annotations and XML.
  • Involved in implementing and integrating various NoSQL databases such as HBase and Cassandra.
  • Ran Hadoop streaming jobs to process terabytes of XML-format data.
  • Used Spark API over Hadoop Yarn as execution engine for data analytics using Hive.
  • Configured a Continuous Integration system to execute suites of automated tests at desired frequencies using Jenkins, Maven and Git.

Environment: Hadoop, Pig, Hive, HBase, Oozie, Sqoop, Kafka, Spark, AWS, EC2, Scala, Zookeeper, HDFS, JSON, XML, Oracle, MySQL, Cassandra, Jenkins, Maven, GIT

Confidential - Plano, TX

Java/J2EE Developer

Responsibilities:

  • Actively involved in requirement gathering, application design, development, code review and bug fixes.
  • Developed application using Java, J2EE, Spring Boot, AngularJS and REST web services.
  • Generated the flow diagram in the design phase using Microsoft Visio.
  • Built Spring Boot microservices for the delivery of software products across the enterprise.
  • Responsible for creating web-based applications using React JS, Node.js, and Redux workflow.
  • Built prototypes for various required services such as Scheduling, Logging and Notification Service using third party Node JS based
  • Deployed Spring Boot based microservices in Docker and Amazon EC2 container using Jenkins.
  • Developed UI using AngularJS, HTML, CSS, jQuery, Bootstrap, JavaScript, Ajax.
  • Developed and implemented Business Requirements using Spring MVC framework.
  • Used Maven to build and deploy the project.
  • Used Git for source control, Bitbucket for code review and Bamboo for continuous integration.
  • Implemented cross-cutting concerns such as logging, authentication and system performance using Spring AOP.
  • Developed reusable and interoperable services, based on SOAP, WSDL, JAXWS, JAXRPC Web services.
  • Developed microservices, with extensive use of GitLab, Jenkins, clustering and other tools and technologies.
  • Worked with the latest versions of object-oriented JavaScript frameworks.
  • Implemented all functionality using Spring Boot, Thymeleaf and Hibernate ORM.
  • Implemented integration with external SOAP and REST services.
  • Implemented a request/response model for marshalling/unmarshalling JSON and CSV files with Jackson libraries.
  • Successfully delivered design and code using Scrum methodology in agile environment.
  • Used JUnit, Mockito, WebMvcTest and Spring Boot Test to develop unit tests for controllers and services.
  • Used SonarLint plugins for code coverage and to track critical issues.
  • Built the application using a Test-Driven Development (TDD) approach.

Environment: Java, J2EE, Spring Boot, AngularJS, HTML, CSS, jQuery, Bootstrap, JavaScript, Ajax, Maven, Jenkins, JUnit, JSON

Confidential

Java Developer

Responsibilities:

  • Used Java Collection components (List, HashMap) for caching data.
  • Developed Front-End screens using JSP, HTML, CSS and JavaScript.
  • Developed Java DAO classes to manage database transactions using JDBC.
  • Used AJAX for suggestive search and to display dialog boxes, with JSF and Dojo for parts of the front end.
  • Involved in installing and configuring Maven for application builds and deployment.
  • Used application servers such as WebLogic and Tomcat for deploying the web application.
  • Involved in multi-tiered J2EE design utilizing Struts framework and JDBC.
  • Developed the project using Java concepts such as multithreading, creating various Java classes.
  • Implemented various design patterns in Java code.
  • System was built using Model-View-Controller (MVC) architecture.
  • Designed the front end using JSP, jQuery.
  • Designed and implemented the application using JSP, Struts MVC, JDBC and MySQL.
  • Used the SVN version control tool.
  • Automated the build process by writing Maven build scripts.
  • Created class diagrams and sequence diagrams using Rational Rose and prepared the application design document.
  • Designed and developed the UI using HTML, CSS, JSP and Struts, where users have all the services listed.
  • Developed Servlets and Java files for control of the business processes in the middle tier.
  • Used JavaScript for user validation, suggestion lists and to display dialog boxes.
  • Worked on creating CSS style, JavaScript and AJAX.
  • Performed Validations on UI data using JSF validations and JavaScript.
  • Involved in writing the test cases for the application using JUnit.
  • Extensively worked with Spring MVC for developing J2EE Components.
  • Developed servlets and JSPs with Custom Tag Libraries for control of the business processes in the middle-tier and was involved in their integration.

Environment: Java, JSP, HTML, CSS, JavaScript, AJAX, Maven, MVC