Spark/Hadoop Developer Resume
Newark, NJ
SUMMARY:
- Software Engineer with 6+ years of IT experience developing applications using Big Data technologies, AWS, Java, and Spark.
- 3+ years of experience with Big Data tools such as MapReduce, YARN, HDFS, HBase, Impala, Hive, AWS, and Apache Spark for data ingestion, storage, querying, processing, and analysis.
- Expertise in designing websites to W3C standards using HTML5 and CSS3 for a consistent cross-browser user experience that supports long-term user retention and engagement.
- Experienced in using front-end editors and IDEs such as Eclipse, Notepad++, NetBeans, and Dreamweaver.
- Expertise in JavaScript, with exposure to MVC JavaScript frameworks such as AngularJS and Node.js.
- Hands-on experience with data ingestion tools such as Kafka and Flume and the workflow management tool Oozie.
- Hands-on experience handling file formats such as JSON, Avro, ORC, and Parquet.
- Experience analyzing data in NoSQL databases such as HBase and Cassandra and integrating them with Hadoop clusters.
- Hands-on experience with Spark Core, Spark SQL, and the DataFrame/Dataset/RDD APIs.
- Experience using Kafka brokers as Spark Streaming sources and processing live streams as RDDs.
- Experience implementing services in a microservices architecture, where services are built and deployed independently.
- Good knowledge of Hibernate for mapping Java classes to database tables and of Hibernate Query Language (HQL).
- Extensive experience developing enterprise applications under SDLC methodologies such as Agile and Waterfall.
- Experience in database design, using PL/SQL to write stored procedures, functions, and triggers.
- Proficient in developing applications with Java and J2EE technologies, including the JSP, Servlets, Struts, and Hibernate frameworks.
- Knowledge of implementing Big Data workloads on Amazon Elastic MapReduce (Amazon EMR), which runs the Hadoop framework on dynamically scalable Amazon EC2 instances.
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
- Extensive experience developing Spark applications that transform data and load it into HDFS using RDDs, DataFrames, and Datasets (see the sketch after this list).
- Extensive knowledge of performance-tuning Spark applications and converting Hive/SQL queries into Spark transformations.
- Hands-on experience with AWS (Amazon Web Services): using Elastic MapReduce (EMR), creating and storing data in S3 buckets, creating Elastic Load Balancers (ELB) for Hadoop front-end WebUIs, and using IAM (Identity and Access Management) to create groups and users and assign permissions.
- Extensive programming experience with core Java concepts such as OOP, multithreading, collections, and I/O.
- Experience using Jira for issue tracking and Jenkins for continuous integration.
- Extensive experience with UNIX commands, shell scripting, and setting up cron jobs.
- Experience in software configuration management using Git.
- Good experience using relational databases such as Oracle and MySQL.
- Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping and design.
- Work successfully in fast-paced environments, both independently and in collaborative teams.
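For illustration, a minimal sketch of the kind of Spark application described above: an RDD transformation that loads its result into HDFS, written against Spark's Java API. The input and output paths are hypothetical.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class WordCountToHdfs {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCountToHdfs");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Read raw text from HDFS (hypothetical path)
            JavaRDD<String> lines = sc.textFile("hdfs:///data/raw/events.txt");

            // Transform: split into words, pair each word with 1, reduce by key
            JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey(Integer::sum);

            // Load the result back into HDFS (hypothetical path)
            counts.saveAsTextFile("hdfs:///data/out/word_counts");
        }
    }
}
```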
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, HDFS, MapReduce, Hive, Spark Core, Spark SQL, Spark Streaming, AWS
NoSQL Databases: Cassandra, MongoDB
Web Technologies: HTML5, DHTML, JavaScript, jQuery 3.3, CSS3, AJAX, Dojo, XML, Web Services (SOAP, REST, WSDL), Angular 4/6
Frameworks: Struts, Spring 5.0.4, Hibernate 5.2, JPA, JSF 2.0/1.2, Spring Core, Spring ORM, Spring MVC, Spring AOP
Cloud Platforms: AWS (EC2, S3), MS Azure, Azure Data Lake
Web Service Technologies: SOAP, REST, WebSockets
IDEs and Tools: Eclipse, JDK 1.7, SDK, Apache Tomcat, EditPlus, Visual Studio, RAD
Methodologies: Agile, Waterfall, TDD, Iterative
Databases: Oracle 12c, SQL Server 2016, MySQL
Operating Systems: Windows, macOS, Linux, UNIX
PROFESSIONAL EXPERIENCE:
Confidential - Newark, NJ
Spark/Hadoop Developer
Responsibilities:
- Working as a Spark Developer responsible for building scalable distributed data solutions using Hadoop.
- Responsible for requirements gathering, analysis, design, and documentation, as the application was built from scratch.
- Involved in all phases of development; analyzed and developed the system following the Agile Scrum methodology.
- Used the Java Persistence API (JPA) for object-relational mapping based on POJO classes.
- Primarily involved in the data migration process on Azure, integrating with a GitHub repository and Jenkins.
- Used Spark Streaming APIs to perform transformations and actions on the fly to build the common learner data model, which consumes data from Kafka in near real time and persists it into Cassandra (see the sketch after this list).
- Used the DataStax Spark-Cassandra Connector to load data into Cassandra and used CQL to analyze data from Cassandra tables for quick searching, sorting, and grouping.
- Loaded data into Spark RDDs and performed advanced procedures such as text analytics, using Spark's in-memory computation capabilities to generate output responses.
- Developed statistics graphs using JSP, custom tag libraries, Applets, and Swing in a multi-threaded architecture.
- Executed performance tests using the cassandra-stress tool to measure and improve the read and write performance of the cluster.
- Handled large datasets using partitions, broadcast variables, and effective, efficient joins and transformations in Spark during the ingestion process itself.
- Configured Spark Streaming to consume Kafka streams and store the data in HDFS.
- Partitioned data streams using Kafka, and designed and used the Kafka producer API to produce messages.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Tuned Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
- Ingested data from RDBMSs into Hive for data transformations, then exported the transformed data to Cassandra for data access and analysis.
- Implemented Spark scripts using Scala and Spark SQL to read Hive tables into Spark for faster data processing.
- Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases handling huge data volumes.
- Implemented Informatica procedures and standards while developing and testing Informatica objects.
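A minimal sketch of the Kafka-to-Cassandra streaming pattern described above, assuming the spark-streaming-kafka-0-10 direct-stream integration and the DataStax connector's Java API. The broker address, consumer group, topic, keyspace, table, and the Event bean are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

public class KafkaToCassandra {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToCassandra");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");  // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "learner-model");          // hypothetical group

        // Direct stream from a hypothetical Kafka topic
        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("learner-events"), kafkaParams));

        // Transform each micro-batch and persist it into Cassandra
        stream.foreachRDD(rdd ->
            javaFunctions(rdd.map(record -> new Event(record.key(), record.value())))
                .writerBuilder("learner_ks", "events", mapToRow(Event.class)) // hypothetical keyspace/table
                .saveToCassandra());

        jssc.start();
        jssc.awaitTermination();
    }

    // Hypothetical bean whose properties map to the Cassandra table's columns
    public static class Event implements java.io.Serializable {
        private String id;
        private String payload;
        public Event() {}
        public Event(String id, String payload) { this.id = id; this.payload = payload; }
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getPayload() { return payload; }
        public void setPayload(String payload) { this.payload = payload; }
    }
}
```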
Environment: Hadoop 3.0, Spark 2.1, Cassandra 1.1, Kafka 0.9, JSP, HDFS, Hive 1.9, MapReduce, MapR, Java, MVC, Scala, NoSQL
Confidential - Bellevue, WA
Sr. Java/Hadoop Developer
Responsibilities:
- Designed and developed application modules using the Spring and Hibernate frameworks.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in Agile methodologies, daily scrum meetings, and sprint planning.
- Used Maven for developing build scripts and deploying the application onto WebLogic.
- Implemented Spark RDD transformations to map business analysis requirements and applied actions on top of the transformations.
- Developed Spark jobs and Hive Jobs to summarize and transform data.
- Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames, Scala, and Python (see the sketch after this list).
- Implemented MVC architecture using the Spring Framework; coding involved writing action classes, custom tag libraries, and JSPs.
- Created Hive tables with periodic backups and wrote complex Hive/Impala queries to run on Impala.
- Implemented partitioning and bucketing in Hive, working with file formats and compression techniques for optimization.
- Involved in designing and developing modules on both the client and server side.
- Worked on a JDBC framework encapsulated with the DAO pattern to connect to the database.
- Developed UI screens using JSP and HTML and implemented client-side validation with JavaScript.
- Worked on various SOAP and RESTful services used in various internal applications.
- Developed JSP and Java classes for various transactional and non-transactional reports of the system using extensive SQL queries.
- Implemented Storm topologies to pre-process data before moving it into HDFS.
- Implemented a POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
- Involved in configuring builds using Jenkins with Git and used Jenkins to deploy applications onto the Dev and QA environments.
- Involved in unit testing, system integration testing, and enterprise user testing using JUnit.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
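A minimal sketch of the Hive-to-Spark conversion pattern mentioned in this list, using Spark's Java DataFrame API; the table and column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public class HiveQueryToDataFrame {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("HiveQueryToDataFrame")
            .enableHiveSupport()   // lets Spark read existing Hive tables
            .getOrCreate();

        // Original HiveQL, for reference:
        //   SELECT region, SUM(amount) AS total
        //   FROM sales WHERE year = 2017 GROUP BY region;

        // The same logic expressed as DataFrame transformations
        Dataset<Row> sales = spark.table("sales");          // hypothetical Hive table
        Dataset<Row> totals = sales
            .filter(col("year").equalTo(2017))
            .groupBy("region")
            .agg(sum("amount").alias("total"));

        totals.write().mode("overwrite").saveAsTable("sales_totals"); // hypothetical output table
        spark.stop();
    }
}
```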
Environment: Spring 4.0, Hibernate 5.0.7, Hadoop 2.6.5, Spark 1.1, Hive, Python 3.3, Scala, MapReduce, Linux
Confidential - Centennial, CO
Sr. Java Developer
Responsibilities:
- Worked on developing the application involving Spring MVC implementations and RESTful web services.
- Responsible for designing rich user interface applications using JavaScript, CSS, HTML, XHTML, and AJAX.
- Used Spring AOP to configure logging for the application.
- Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
- Developed code using core Java to implement technical enhancements following Java standards.
- Implemented Hibernate utility classes, session factory methods, and various annotations to work with back-end database tables.
- Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using jQuery Ajax methods.
- Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring.
- Used JPA (Java Persistence API) with Hibernate as the persistence provider for object-relational mapping (see the sketch after this list).
- Used JDBC and Hibernate for persisting data to different relational databases.
- Developed and implemented a Swing, Spring, and J2EE based MVC (Model-View-Controller) framework for the application.
- Implemented application-level persistence using Hibernate and Spring.
- Used XML and JSON for transferring/retrieving data between different Applications.
- Implemented a RESTful web services architecture for client-server interaction and implemented the corresponding POJOs.
- Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.
- Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
- Developed back-end interfaces using embedded SQL, PL/SQL packages, stored procedures, functions, triggers, and exception handling in PL/SQL programs.
- Used Log4j for logging, including capturing runtime exceptions and informational messages.
- Used Ant as the build tool and developed build files for compiling the code and creating WAR files.
- Used TortoiseSVN for source control and version management.
- Responsibilities included designing for future user requirements by interacting with users, as well as new development and maintenance of the existing source code.
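A minimal sketch of the JPA-with-Hibernate persistence pattern described in this list. The entity, table, and persistence-unit names are hypothetical; "appPU" is assumed to be defined in persistence.xml with Hibernate as the provider.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Persistence;
import javax.persistence.Table;

@Entity
@Table(name = "customers")                // hypothetical table
public class Customer {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(name = "full_name")
    private String fullName;

    protected Customer() {}               // no-arg constructor required by JPA
    public Customer(String fullName) { this.fullName = fullName; }

    public static void main(String[] args) {
        // Persistence-unit name must match persistence.xml (assumed here)
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("appPU");
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();
        em.persist(new Customer("Jane Doe"));  // Hibernate issues the INSERT
        em.getTransaction().commit();
        em.close();
        emf.close();
    }
}
```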
Environment: JDK 1.5, Servlets, JSP, XML, JSF, Spring MVC, JNDI, Hibernate 3.6, JDBC, SQL, PL/SQL, HTML, DHTML, JavaScript, Ajax, Oracle 10g, SOAP, SVN, Log4j, ANT.
Confidential
Java Developer
Responsibilities:
- Worked on developing the application involving Spring MVC implementations and RESTful web services (see the sketch after this list).
- Responsible for designing rich user interface applications using JavaScript, CSS, HTML, XHTML, and AJAX.
- Used Spring AOP to configure logging for the application.
- Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
- Developed code using core Java to implement technical enhancements following Java standards.
- Implemented Hibernate utility classes, session factory methods, and various annotations to work with back-end database tables.
- Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using jQuery Ajax methods.
- Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring.
- Used JPA (Java Persistence API) with Hibernate as the persistence provider for object-relational mapping.
- Used JDBC and Hibernate for persisting data to different relational databases.
- Developed and implemented a Swing, Spring, and J2EE based MVC (Model-View-Controller) framework for the application.
- Implemented application-level persistence using Hibernate and Spring.
- Used XML and JSON for transferring/retrieving data between different Applications.
- Implemented a RESTful web services architecture for client-server interaction and implemented the corresponding POJOs.
- Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.
- Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
- Developed back-end interfaces using embedded SQL, PL/SQL packages, stored procedures, functions, triggers, and exception handling in PL/SQL programs.
- Used Log4j for logging, including capturing runtime exceptions and informational messages.
- Used Ant as the build tool and developed build files for compiling the code and creating WAR files.
- Used TortoiseSVN for source control and version management.
- Responsibilities included designing for future user requirements by interacting with users, as well as new development and maintenance of the existing source code.
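A minimal sketch of a Spring MVC RESTful endpoint of the kind described in this list; the URL path and the Account POJO are hypothetical, and JSON serialization assumes Jackson is on the classpath.

```java
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class AccountController {

    // Simple POJO serialized to JSON by Spring's message converters
    public static class Account {
        private final long id;
        private final String owner;
        public Account(long id, String owner) { this.id = id; this.owner = owner; }
        public long getId() { return id; }
        public String getOwner() { return owner; }
    }

    // GET /accounts/{id} returns the account as JSON
    @RequestMapping(value = "/accounts/{id}", method = RequestMethod.GET)
    @ResponseBody
    public Account getAccount(@PathVariable("id") long id) {
        return new Account(id, "demo-owner");  // hypothetical lookup
    }
}
```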
Environment: JDK 1.5, Servlets, JSP, XML, JSF, Spring MVC, JNDI, Hibernate 3.6, JDBC, SQL, PL/SQL, HTML, DHTML, JavaScript, Ajax, Oracle 10g, SOAP, SVN, Log4j, ANT.