Sr. Java/Hadoop Developer Resume
New York
SUMMARY
- Over 7 years of working experience as a Big Data/Hadoop Developer, designing and developing applications with big data, Hadoop, and Java/J2EE open-source technologies.
- Experience in leveraging big data tools such as Spark, Hadoop, Hive, HBase, Kafka, Zookeeper, Flume, MapReduce, Oozie, Yarn and Pig.
- Good knowledge of Snowflake.
- Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
- Developed Scala scripts and UDFs using both DataFrames and RDDs in Spark for data aggregation, queries, and writing data back into OLTP systems.
- Created batch data using Spark with the Scala API while developing data ingestion pipelines with Kafka.
- Hands-on experience in designing and developing POCs in Scala to compare the performance of Spark with Hive and SQL/Oracle.
- Used Flume and Kafka to direct data from different sources to/from HDFS.
- Worked with AWS cloud and created EMR clusters with Spark for analyzing and processing raw data and accessing data from S3 buckets.
- Hands-on experience with various file formats such as ORC, Avro, Parquet, and JSON.
- Experience using CQL (Cassandra Query Language) to query and retrieve data from Cassandra clusters.
- Experienced in performing real-time analytics on distributed NoSQL databases like Cassandra, HBase, and MongoDB.
- Procedural knowledge of data cleaning and analysis using HiveQL and custom MapReduce programs.
- Good understanding of designing attractive data visualization dashboards using Tableau.
- Experience in using various IDEs/Text Editors such as PyCharm, Jupyter, Nano, Emacs and repositories such as Git, SVN.
- Scripted an ETL pipeline in Python that ingests files from AWS S3 into a Redshift table.
- Ability to develop MapReduce programs using Java and Python.
- Good understanding and exposure to Python programming.
- Experience in using IDEs such as Eclipse and IntelliJ and repositories such as SVN and Git.
- Exported and imported data to and from Oracle using SQL Developer for analysis.
- Good experience in using Sqoop for traditional RDBMS data pulls and worked with different distributions of Hadoop like Hortonworks and Cloudera.
- Experience in designing components from requirements using UML (Use Case, Class, Sequence, Deployment, and Component diagrams).
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, Hive 2.3, Pig 0.17, Sqoop 1.4, Flume 1.8, Oozie 4.3, Spark 2.3, Kafka 1.1, Storm 1.0.5 and Zookeeper 3.4
Languages: C, Java, Python 3.7, Scala 2.12, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts
Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery
Frameworks: MVC Struts, Spring, Hibernate 5.3.1
NoSQL Databases: HBase, Cassandra 3.11, MongoDB 4.0.0
Operating Systems: HP-Unix, RedHat Linux, Ubuntu Linux and Windows XP/Vista/7/8
Web Technologies: HTML5, DHTML, XML, AJAX, WSDL
Web/Application servers: Apache Tomcat 9.0.10, WebLogic, JBoss
Databases: Oracle 12c, DB2, SQL Server, MySQL, Teradata r15
Tools and IDEs: Eclipse 4.8, NetBeans, Toad, Maven, ANT 1.10.3, Sonar, JDeveloper, DB Visualizer
Version control & Web Services: SVN, CVS, Git, REST, SOAP
PROFESSIONAL EXPERIENCE
Confidential - New York
Sr. Data Engineer
Responsibilities:
- Working as Sr. Data Engineer with Hadoop Ecosystems, Apache Spark, and AWS.
- Designing and building robust services using streaming and batch data.
- Key contributor in building identity services that will enable Confidential to share profiles across the organization in support of marketing and analytics.
- Created Hive schemas using performance techniques like partitioning and bucketing.
- Developed analytical components using Kafka and Spark Streaming.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Created a Spark application to load data into Athena tables.
- Developed a PySpark script to set up the data pipeline.
- Worked on Spark SQL, created Data frames by loading data from Hive tables and created prep data and stored in AWS S3.
- Collaborated with product teams, data analysts, and data scientists to design and build data-forward solutions.
- Created Airflow jobs to orchestrate Spark workflows.
- Worked independently and as part of teams ingesting data from a variety of source types, including viewership, behavioral, attribution, and content metadata.
Environment: Hadoop 3.0, Scala, Airflow, Zookeeper 3.4, Python 2.7, Apache Spark 2.3, Apache Kafka, Cassandra 5.X, S3, Athena, Redshift.
Confidential - New York
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked as a Sr. Big Data/Hadoop Developer with Hadoop Ecosystems components like HBase, Sqoop, Zookeeper, Oozie, Hive and Pig with Cloudera Hadoop distribution.
- Followed Agile development methodology as an active member in Scrum meetings.
- Worked in Azure environment for development and deployment of Custom Hadoop Applications.
- Created Hive schemas using performance techniques like partitioning and bucketing.
- Developed analytical components using Kafka and Spark Streaming.
- Developed a POC using Scala, deployed it on the YARN cluster, and compared the performance of Spark with Hive and SQL.
- Involved in converting Hive queries into Spark transformations using Spark RDDs, Python and Scala.
- Performed transformations such as event joins, bot traffic filtering, and some pre-aggregations using Pig.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Explored MLlib algorithms in Spark to understand which machine learning functionality could be applied to our use case.
- Used Windows Azure SQL Reporting Services to create reports with tables, charts, and maps.
- Configured Oozie workflows to run multiple Hive and Pig jobs that trigger independently based on time and data availability.
- Imported and exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Environment: Hadoop 3.0, HBase, Sqoop 1.4, Zookeeper 3.4, Oozie, Hive 2.3, Pig 0.17, MS Azure, Scala 2.12, Spark 2.3, Apache Flume 1.8, NoSQL, MongoDB 4.0, MapReduce, HDFS, Cassandra 3.11, Kafka 1.1, Java
Confidential - Troy, NY
Big Data/Hadoop Developer
Responsibilities:
- Worked as a Big Data/Hadoop Developer providing solutions to big data problems.
- Worked in Agile development environment in sprint cycles of two weeks by dividing and organizing tasks. Participated in daily scrum and other design related meetings.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDD, Scala and Python.
- Developed NiFi flows to move data from different sources to HDFS and from HDFS to S3 buckets.
- Worked on Spark SQL, created Data frames by loading data from Hive tables and created prep data and stored in AWS S3.
- Responsible for loading customer data and event logs from Kafka into HBase using the REST API.
- Used Spark Streaming APIs to perform transformations and actions on the fly to build a common learner data model that receives data from Kafka in near real time and persists it to HBase.
- Created partitions and buckets based on state for further processing using bucket-based Hive joins.
- Installed and Configured Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Wrote complex Hive queries and UDFs in Python.
- Analyzed the data by performing Hive queries (HiveQL), ran Pig scripts, Spark SQL and Spark streaming.
- Developed tools using Python, Shell scripting, XML to automate some of the menial tasks.
Environment: Hadoop 3.0, Agile, Spark 2.3, Scala 2.12, Python 2.7, Hortonworks, NiFi, HDFS, Hive 2.3, AWS, NoSQL, HBase, Kafka, Java, EMR, MapReduce, Cassandra 3.11, MongoDB, MySQL, Zookeeper 3.4, Oozie, Pig 0.17, Sqoop 1.4, XML
Confidential - Boston, MA
Sr. Java/Hadoop Developer
Responsibilities:
- Worked as Sr. Java/Hadoop Developer and responsible for taking care of everything related to the clusters.
- Responsible for building scalable distributed data solutions using Hadoop cluster environment with Hortonworks distribution.
- Developed Spark scripts by using Python shell commands as per the requirement.
- Used Spark Streaming APIs to perform the necessary transformations and actions on data received from Kafka and persisted the results into a Cassandra database.
- Developed Spark scripts by writing custom RDDs in Scala and Python for data transformations and actions on RDDs.
- Implemented some of the big data operations on AWS cloud.
- Used Spark Streaming to divide streaming data into batches as an input to Spark engine for batch processing.
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Involved in performance tuning of Spark jobs using caching and taking full advantage of the cluster environment.
- Developed Spark scripts by using Scala Shell commands as per the requirement.
- Configured Spark Streaming to receive real-time data from Kafka and store it in HDFS.
- Scheduled Oozie workflows to run multiple Hive and Pig jobs.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Worked with different file formats such as Text, Sequence files, Avro, ORC and Parquet.
- Created Hive tables and wrote Hive queries for data analysis to meet business requirements; used Sqoop to import and export data from Oracle and MySQL.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala.
- Experienced in running Hadoop streaming jobs to process terabytes of XML data.
- Used Spark API over Hadoop Yarn as execution engine for data analytics using Hive.
Environment: Hadoop, Pig, Hive, HBase, Oozie, Sqoop, Kafka, Spark, AWS, EC2, Scala, Zookeeper, HDFS, JSON, XML, Oracle, MySQL, Cassandra, Jenkins, Maven, Git
Confidential - San Francisco, CA
Sr. Java/J2EE Developer
Responsibilities:
- Translated business requirements into technical documents by interacting with Business Analysts and Subject Matter Experts (SMEs) to carefully understand the requirements.
- Involved in the requirement analysis, design, and development of the application built in Java/J2EE using JavaScript, JSP, AJAX, JDBC, and Web Services with JAX-WS.
- Contributed to the design and development of a Spring MVC web-based application.
- Designed and developed highly scalable, fault-tolerant microservices using Spring Boot.
- Involved in the design, development, and implementation of the application using Spring and the J2EE framework.
- Used JSP, Servlets, and HTML to create web interfaces. Developed JavaBeans and used custom tag libraries for embedding dynamic content into JSP pages.
- Used advanced HTML, JavaScript, CSS, and pure CSS (table-less) layouts.
- Involved in designing and developing application modules using Spring.
- Designed and developed the user interface using JSP, JSTL, HTML, AJAX, and jQuery.
- Used Hibernate's JPA implementation to persist backend database transaction results in persistent classes.
- Built web-based applications using Spring MVC Architecture suitable for Apache Axis framework.
- Created an XML configuration file for Hibernate for Database connectivity.
- Created connections to database using Hibernate Session Factory, using Hibernate APIs to retrieve and store data to the database with Hibernate transaction control.
- Implemented Spring services and Spring DAOs for controller interactions to operate on data.
- Implemented Java and J2EE design patterns such as MVC and DAO.
Environment: Java, J2EE, JavaScript, AJAX, Spring MVC, HTML, JavaBeans, CSS, Struts, RESTful, SOAP, Hibernate, POJO, AngularJS, JUnit, jQuery, XML
Confidential
Java Developer
Responsibilities:
- Involved in the complete Software Development Life Cycle (SDLC) including Requirement Analysis, Design, Implementation, Testing and Maintenance.
- Used core Java to design application modules, base classes, and utility classes.
- Designed and implemented customized exception handling to handle the exceptions in the application.
- Used Dependency Injection (DI), also known as Inversion of Control (IoC), with annotations to obtain bean references in the Spring Framework.
- Involved in Implementation of the application by following the Java best practices and patterns.
- Used both Java Objects and Hibernate framework to develop Business components to map the Java classes to the database.
- Used the Spring Framework for dependency injection and transaction management. Used Spring MVC controllers for the controller layer of the MVC architecture.
- Implemented business logic using POJOs and used WebSphere to deploy the applications.
- Used Spring Framework for MVC for writing Controller, Validations and View.
- Used Eclipse as IDE for development of the application.
- Built data-driven web applications with server-side Java technologies like Servlets/JSP and generated dynamic web pages with Java Server Pages (JSP).
- Involved in mapping the data representation from the MVC model to the Oracle relational data model with a SQL-based schema using Hibernate, an object/relational mapping (ORM) solution.
- Used Spring IOC framework to integrate with Hibernate.
- Integrated the Apache HTTP Server plug-in with WebLogic servers.
- Implemented Maven scripts to create the JAR and dependency JARs and deploy the entire project onto the WebLogic application server.
- Coded JavaBeans and implemented Model View Controller (MVC) Architecture.
- Developed Client applications to consume the Web services based on both SOAP and REST protocol.
- Utilized log4j for logging and debugging the application.
- Created and implemented Oracle queries and functions using SQL and PL/SQL.
- Involved in bug fixing during the System testing, Joint System testing and User acceptance testing.
- Worked on various SOAP and RESTful services used in various internal applications.
- Consumed REST-based microservices using RestTemplate.
- Developed front end web application using AngularJS along with cutting edge HTML and CSS.
- Developed processing component to retrieve customer information from MySQL database, developed DAO layer using Hibernate.
- Used Maven for developing build scripts and deploying the application onto WebLogic.
Environment: Java, Spring, Hibernate, MVC, POJO, WebSphere, Eclipse, HTTP, Maven, JavaBeans, SOAP, log4j, SQL, PL/SQL, CSS, MySQL
