
Hadoop/Spark Developer Resume


Rochester, MN

SUMMARY:

  • Over 8 years of experience in data analysis, data modeling, and implementation of enterprise-class systems spanning Big Data, Data Integration, Object-Oriented Programming, Data Warehousing, and Advanced Analytics.
  • 3+ years of experience delivering end-to-end Big Data projects that process petabytes of activity data.
  • Hands-on experience with Big Data technologies and Hadoop ecosystem components such as HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Spark, Kafka, and HBase.
  • In-depth knowledge of data structures and design and analysis of algorithms.
  • Excellent understanding of Hadoop architecture and the different daemons of a Hadoop cluster, including the NameNode, DataNode, JobTracker, and TaskTracker.
  • Analyzed large data sets using Hive (HQL) and Pig scripts, and wrote custom UDFs.
  • Experience in importing and exporting data between RDBMSs and HDFS, Hive tables, and HBase using Sqoop.
  • Analyzed datasets with the Spark framework using the Spark Core and Spark SQL modules.
  • Experience in analyzing data in Spark using Scala and PySpark.
  • Optimized Hive tables using techniques such as partitioning and bucketing to improve the performance of HiveQL queries (see the sketch after this list).
  • Good working experience with file formats such as Avro, JSON, and Parquet in Hadoop tools, using SerDe concepts.
  • Experience in importing streaming data into HDFS using Flume sources and sinks and transforming the data with Flume interceptors.
  • Experience in developing Kafka producers and consumers in Java, integrating them with Apache Storm, and ingesting data into HDFS and HBase by implementing rules in Storm.
  • Experience in Database Design and Development using Relational Databases (Oracle, MySQL, MS-SQL Server 2005/2008) and NoSQL Databases (MongoDB, Cassandra, HBase)
  • Design and programming experience in developing Internet applications using Java, J2EE, JSP, MVC, Servlets, Struts, Hibernate, JDBC, JSF, JMS, EJB, XML, AJAX, and web-based development tools.
  • Efficient use of various design patterns such as MVC (Model-View-Controller), Singleton, Session Facade, Service Locator, Strategy, DAO (Data Access Object), DTO (Data Transfer Object), and Business Delegate in the development of distributed enterprise applications.
  • Proficient in using various IDEs such as Eclipse, Visual Studio, and MATLAB.
  • Implemented unit testing using JUnit and LoadRunner during projects.
  • Experienced in writing Ant and Maven scripts to build and deploy Java applications.
  • Good experience with client-side web technologies such as HTML, CSS, JavaScript, and AJAX, as well as Servlets, JSP, JSON, XML, JSF, and AWS.
  • Experience in reporting using tools such as SSRS and Tableau.
  • Experienced with scripting languages such as Python and shell scripting.
  • Good knowledge of the R language and experience implementing various statistical algorithms using RStudio.
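
A minimal sketch of the partitioning and bucketing approach referenced above, written in Scala against Spark's HiveContext; the table name, columns, and bucket count are illustrative assumptions, not taken from any specific project.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HiveLayoutSketch extends App {
      val sc   = new SparkContext(new SparkConf().setAppName("hive-layout"))
      val hive = new HiveContext(sc)

      // Partitioning by event date lets queries that filter on dt skip whole
      // directories; bucketing by user_id spreads each partition into a fixed
      // number of files, which helps joins and sampling on that key.
      hive.sql(
        """CREATE TABLE IF NOT EXISTS activity (
          |  user_id BIGINT,
          |  action  STRING
          |) PARTITIONED BY (dt STRING)
          |CLUSTERED BY (user_id) INTO 32 BUCKETS
          |STORED AS PARQUET""".stripMargin)

      // This query reads a single partition instead of scanning the whole table.
      hive.sql("SELECT action, COUNT(*) FROM activity WHERE dt = '2016-01-01' GROUP BY action").show()
    }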

TECHNICAL SKILLS:

HADOOP / NOSQL: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, Zookeeper, Spark, Storm, Kafka, HBase, Cassandra

LANGUAGES: Java, C#, Python, T-SQL, PL/SQL, R

WEB TECHNOLOGIES: HTML, CSS, AJAX, Servlets, JSON, JSP, JavaScript

FRAMEWORKS/TOOLS: Spring, Hibernate, Struts, JUnit, MRUnit

DEVELOPMENT ENVIRONMENTS: Eclipse, MATLAB, Visual Studio, Oracle SQL Developer, SQL Server Management Studio (SSMS), RStudio

OPERATING SYSTEMS: MS Windows 9x/XP/2003/7/8, Linux/Unix, CentOS

DATABASES: Oracle, Teradata, HBase, MS-SQL, MongoDB, MySQL

PROFESSIONAL EXPERIENCE:

Confidential, Rochester, MN

Hadoop/Spark Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Wrote multiple MapReduce programs for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, and CSV, as well as compressed formats.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Designed and implemented a data analytics engine based on Scala/Akka (Cluster)/Play to provide trend analysis, regression analysis, and machine-learning predictions as web services for survey data.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop.
  • Expertise in programming with Scala; built a Scala prototype for the application requirement, focusing on typed functional programming in Scala.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala, and Python (see the RDD sketch after this list).
  • Created Kafka topics for the desktop portal and consumed them using Spark Streaming with Zookeeper.
  • Involved in performing linear regression using the Scala API and Spark.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Experience in designing and developing POCs using Scala, Spark SQL, and MLlib, then deploying them on the cluster.
  • Installed Storm and Kafka on a 4-node cluster and wrote a Kafka producer to collect events from a REST API and push them to the broker (see the producer sketch after this list).
  • Analyzed Cassandra/SQL scripts and designed a solution to implement them using Scala.
  • Wrote a Storm topology to accept events from the Kafka producer and emit them into Cassandra.
  • Used Cassandra (CQL) with Java APIs to retrieve data from Cassandra tables.
  • Worked on analyzing and examining customer behavioral data using Cassandra.
  • Worked on data warehouse ETL using Talend and developed Business Objects (BO) reports for analyzing healthcare data.
  • Hands-on experience in the design and development of ETL processes with Talend across Hadoop, Hive, MySQL, SQL, Linux, and UNIX environments.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Developed ETL Jobs using Talend Tool, Data analysis, Performing Validation Checks.
  • Implemented row-level updates and real-time analytics using CQL on Cassandra data.
  • Stored data in AWS S3, similar to HDFS, and ran EMR jobs on the data stored in S3.
  • Built S3 buckets, managed their policies, and used them for storage and backup on AWS.
  • Experience in designing and configuring secure VPCs with private and public networks in AWS, and created subnets for the networks.
  • Involved in running Hadoop Streaming jobs to process terabytes of data.
  • Used JIRA for bug tracking and CVS for version control.
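
A minimal sketch of converting a Hive aggregation (roughly SELECT page, COUNT(*) FROM logs GROUP BY page) into Spark RDD transformations in Scala, as referenced above; the paths and field layout are illustrative assumptions.

    import org.apache.spark.{SparkConf, SparkContext}

    object HiveToRddSketch extends App {
      val sc = new SparkContext(new SparkConf().setAppName("hive-to-rdd"))

      val counts = sc.textFile("hdfs:///data/logs")   // one log line per record
        .map(_.split('\t'))
        .filter(_.length > 1)                         // drop malformed rows
        .map(fields => (fields(1), 1L))               // key by the page column
        .reduceByKey(_ + _)                           // the GROUP BY / COUNT(*)

      counts.saveAsTextFile("hdfs:///out/page_counts")
      sc.stop()
    }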
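
And a sketch of the kind of Kafka producer mentioned above, using the standard Java client from Scala; the broker address, topic name, and payload are placeholders (the real events came from a REST API).

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object EventProducerSketch extends App {
      val props = new Properties()
      props.put("bootstrap.servers", "broker1:9092")  // placeholder broker address
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

      val producer = new KafkaProducer[String, String](props)
      // A literal stands in for an event pulled from the REST API.
      producer.send(new ProducerRecord[String, String]("events", "user-42", """{"action":"click"}"""))
      producer.close()
    }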

Environment: Hadoop, MapReduce, HDFS, PIG, Hive, Flume, Sqoop, Oozie, Storm, Kafka, Spark, Scala, Akka, MongoDB, Cassandra, Cloudera, Zookeeper, AWS, MySQL, Talend, Shell Scripting, Java, Git, Jenkins.

Confidential, Dallas

Hadoop Developer

Responsibilities:

  • Migrated data to the HDFS from traditional RDBMS.
  • Applied a deep and thorough understanding of ETL tools and how they can be used in a Big Data environment.
  • Wrote Pig scripts, Hive queries, and UDFs for analyzing large datasets.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Implemented row-level updates and real-time analytics using CQL on a Cassandra database (see the CQL sketch after this list).
  • Developed Oozie workflows to automate loading data into HDFS and scheduling MapReduce jobs.
  • Created SQL*Loader scripts to load legacy data into Oracle staging tables, and wrote SQL queries to perform data validation and data integrity testing.
  • Performed Data Analysis and Data validation by writing SQL queries.
  • Created dashboards for customer analytics using Tableau reporting tool and uploaded in the server for availability to management and other teams.
  • Converted unstructured data to structured data using Pig scripting for testing and validation.
  • Performed complex Linux administrative activities, including creating, maintaining, and updating Linux shell scripts.
  • Developed automated build and deployment processes using Jenkins and shell scripts.
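
A minimal sketch of the row-level CQL updates referenced above, using the DataStax Java driver from Scala; the contact point, keyspace, table, and key values are illustrative assumptions.

    import com.datastax.driver.core.Cluster

    object CqlUpdateSketch extends App {
      val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
      val session = cluster.connect("analytics")   // placeholder keyspace

      // In Cassandra, UPDATE is an upsert against the row identified by the
      // full primary key, so individual rows can be rewritten in place.
      session.execute(
        "UPDATE customer_events SET status = 'active', visits = 10 " +
        "WHERE customer_id = 42 AND channel = 'web'")

      cluster.close()
    }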

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Oracle, MySQL, Unix, Shell Scripting, PL/SQL,Jenkins, Linux, Cassandra, Java, Servlets, Tableau.

Confidential, Phoenix, AZ

Hadoop Developer

Responsibilities:

  • Imported and exported data into HDFS and Hive using Sqoop.
  • Developed MapReduce programs to clean and aggregate the data.
  • Experienced in handling Avro data files by passing schemas into HDFS using Avro tools and MapReduce.
  • Implemented secondary sorting to sort on multiple fields in MapReduce.
  • Implemented a data pipeline by chaining multiple mappers using ChainMapper.
  • Created Hive Dynamic partitions to load time series data.
  • Developed a Pig program for loading and filtering the streaming data ingested into HDFS using Flume.
  • Experienced in handling data from different datasets, joining and preprocessing them using Pig join operations.
  • Utilized Scala pattern matching in coding Akka actors for POC purposes.
  • Developed an HBase data model on top of HDFS data to perform near-real-time analytics using the Java API.
  • Developed different kinds of custom filters and handled predefined filters on HBase data using the API (see the filter sketch after this list).
  • Implemented counters on HBase data to count total records in different tables.
  • Created tables, partitions, and buckets, and performed analytics using Hive ad hoc queries.
  • Experienced in importing/exporting data into HDFS/Hive from relational databases and Teradata using Sqoop.
  • Handled continuous streaming data coming from different sources using Flume, with HDFS set as the destination.
  • Integrated Spring schedulers with the Oozie client as beans to handle cron jobs.
  • Experience with the CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
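
A minimal sketch of the HBase filtering referenced above, using the classic HBase Java client API from Scala; the table, column family, and qualifier names are illustrative assumptions.

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.{HTable, Scan}
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
    import org.apache.hadoop.hbase.util.Bytes
    import scala.collection.JavaConverters._

    object HBaseFilterSketch extends App {
      val conf  = HBaseConfiguration.create()
      val table = new HTable(conf, "user_events")   // placeholder table name

      // Server-side filter: return only rows whose d:status column is "ACTIVE".
      val scan = new Scan()
      scan.setFilter(new SingleColumnValueFilter(
        Bytes.toBytes("d"), Bytes.toBytes("status"),
        CompareOp.EQUAL, Bytes.toBytes("ACTIVE")))

      val scanner = table.getScanner(scan)
      for (result <- scanner.asScala)
        println(Bytes.toString(result.getRow))

      scanner.close()
      table.close()
    }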

Environment: Hadoop, HDFS, Pig, Hive, Flume, Sqoop, Oozie, Cassandra.

Confidential, Houston

Java Developer

Responsibilities:

  • Worked on a J2EE application with an MVC architecture based on the Spring and JavaServer Faces (JSF) frameworks, along with PrimeFaces.
  • Implemented Spring-WS web services to communicate with different applications.
  • Generated and consumed SOAP messages using JAX-WS and Spring-WS web services.
  • Consumed RESTful web services using Apache HttpClient to get data from other systems.
  • Tested the web services using the SoapUI tool.
  • Developed user interfaces satisfying business requirements using JavaServer Faces (JSF 2.2, XHTML, JavaScript), Cascading Style Sheets (CSS), and XML.
  • Implemented and configured built-in validators and formatters of JSF.
  • Worked with the transaction manager provided by the Spring framework.
  • Implemented business logic using various design patterns, including the Singleton, Factory, Abstract Factory, Proxy, Facade, Observer, and Interpreter patterns.
  • Used Gradle and Ant for build processes. The version control system in use was TortoiseSVN.
  • Implemented unit tests for the business logic using JUnit.
  • Fine-tuned the JVM parameters of the application for better performance and less heap memory usage.
  • Worked with JMS queues for sending messages in point-to-point mode (see the JMS sketch after this list).
  • Reviewed design specification released by the design team on various enhancements.
  • Deployed the application using WebLogic Application Server.
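
A minimal sketch of point-to-point JMS messaging as referenced above, written in Scala against the standard javax.jms API; the JNDI names are illustrative assumptions that would normally come from the WebLogic configuration.

    import javax.jms.{ConnectionFactory, Session}
    import javax.naming.InitialContext

    object JmsPointToPointSketch extends App {
      val ctx     = new InitialContext()
      val factory = ctx.lookup("jms/ConnectionFactory").asInstanceOf[ConnectionFactory]
      val queue   = ctx.lookup("jms/OrderQueue").asInstanceOf[javax.jms.Queue]

      val connection = factory.createConnection()
      val session    = connection.createSession(false, Session.AUTO_ACKNOWLEDGE)

      // Point-to-point: messages go to a queue, and each message is consumed
      // by exactly one receiver, unlike publish/subscribe topics.
      val producer = session.createProducer(queue)
      producer.send(session.createTextMessage("order-1042 created"))

      session.close()
      connection.close()
    }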

Environment: WAS, Servlets, Ajax, JMS, Java Script, Spring, Applets, Java, Jasper reports, log4j, IBM DB2, SVN.

Confidential

Java Developer

Responsibilities:

  • Fully involved in requirement analysis and documentation of the requirements specification.
  • Developed prototype based on the requirements using Spring Web Flow framework as part of POC (Proof of Concept).
  • Prepared use-case diagrams, class diagrams and sequence diagrams as part of requirement specification documentation.
  • Involved in design of the core implementation logic using MVC architecture.
  • Used Apache Maven to build and configure the application.
  • Configured Spring XML files with the required action mappings for all the required services.
  • Implemented Hibernate at the DAO layer by configuring Hibernate configuration files for different databases.
  • Developed business services to utilize Hibernate service classes that connect to the database and perform the required action.
  • Developed dynamic web pages using JSP, JSTL, HTML, Spring tags, jQuery, and JavaScript, and used jQuery to make AJAX calls.
  • Used JSON as response type in SOAP and REST web services.
  • Developed JAX-WS web services to provide services to the other systems.
  • Used AJAX calls for SOAP and REST web services to get responses from the PCM, RBM, and Services components.
  • Handled logging, build management, transaction management, and testing using Log4j, Maven, JUnit, and WireMock.
  • Used WireMock to stub the data of all web pages, with Excel sheets as input for the flow methods and JSON objects as responses, so the whole application could be tested while building the project WAR and inconsistencies could be avoided under the Agile methodology (see the stub sketch after this list).
  • Involved in the coding, testing, maintenance, and support phases of three change requests in total for a telecom-domain client's web applications (CMC, B2B, and B2C) serving the client's users and agents.
  • Developed JSP pages using Spring JSP-tags and in-house tags to meet business requirements.
  • Involved in writing functions, PL/SQL queries to fetch the data from the MySQL database.
  • Developed JavaScript validations to validate form fields.
  • Worked rigorously on NBT, SIT, E2E, and final production testing to fix issues raised by the testing team and meet the project deadline for the production release.
  • Performed unit testing for the developed code using JUnit.
  • Developed design documents for the code developed.
  • Used SVN repository for version control of the developed code.
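
A minimal sketch of the WireMock stubbing referenced above, written in Scala; the port, URL, and JSON body are illustrative assumptions.

    import com.github.tomakehurst.wiremock.WireMockServer
    import com.github.tomakehurst.wiremock.client.WireMock._

    object WireMockStubSketch extends App {
      val server = new WireMockServer(8089)   // placeholder port
      server.start()
      configureFor("localhost", 8089)

      // Stub the downstream service so tests never touch the real system.
      stubFor(get(urlEqualTo("/pcm/customers/42"))
        .willReturn(aResponse()
          .withStatus(200)
          .withHeader("Content-Type", "application/json")
          .withBody("""{"id": 42, "plan": "prepaid"}""")))

      // Test code would now call http://localhost:8089/pcm/customers/42 ...
      server.stop()
    }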

Environment: Core java, Spring, Hibernate, HTML, CSS 2.0, PL/SQL, MySQL 5.1, Log4j, SOAP, REST, QT, JavaScript 1.5, AJAX, JSON, Junit, WireMock, SVN and Windows
