
Sr. Big Data/Hadoop Developer Resume


Charleston, SC

SUMMARY:

  • Over 9 years of experience in various IT technologies, including hands-on experience in Big Data and Java/J2EE technologies.
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
  • Rich working experience loading data into Hive tables and writing Hive queries with joins, ORDER BY, GROUP BY, etc. on data ingested from RDBMS via Sqoop.
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Strong experience on Hadoop distributions like Cloudera, MapR and Hortonworks.
  • Good understanding of NoSQL databases and hands on work experience in writing applications on NoSQL databases like HBase, Cassandra and MongoDB.
  • Solid grasp of Apache Spark concepts with Scala, writing Scala transformations for live streaming data; performed clickstream analysis using Spark with Scala on data gathered from Kafka and Flume.
  • Experienced in writing complex MapReduce programs that work with different file formats like Text, SequenceFile, XML, Apache Parquet and Avro.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Experience in migrating data using Sqoop between HDFS and relational database systems, in both directions, according to client requirements.
  • Extensive experience importing and exporting data using stream processing platforms like Flume and Kafka.
  • Wrote Scala code for data analytics in Spark using map, reduceByKey, groupByKey, etc. to analyze real-time streaming data (see the sketch after this list).
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC, SOAP and RESTful web services.
  • Experience in database design using PL/SQL to write Stored Procedures, Functions, Triggers and strong experience in writing complex queries for Oracle.
  • Experienced in working with Amazon Web Services (AWS) using EC2 for computing and S3 as storage mechanism.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Excellent implementation knowledge of Enterprise/Web/Client Server using Java, J2EE.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Experience using IDEs such as Eclipse and IntelliJ, and version control repositories such as SVN and Git.
  • Experience using build tools such as Ant and Maven.
  • Strong knowledge of Spark with Scala for handling large-scale streaming data processing.
  • Experience designing components with UML: use case, class, sequence, deployment, and component diagrams for the requirements.
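
The following is a minimal, illustrative Scala/Spark sketch of the kind of key-based aggregation mentioned above (map plus reduceByKey); the input path and the userId,pageId,timestamp record layout are assumptions made for the example, not details taken from any specific project.

```scala
import org.apache.spark.sql.SparkSession

object ClickCounts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ClickCounts").getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical input: one "userId,pageId,timestamp" record per line.
    val events = sc.textFile("hdfs:///data/clickstream/events.csv")

    // Count clicks per page with reduceByKey; unlike groupByKey it combines
    // partial counts map-side, so far less data is shuffled.
    val clicksPerPage = events
      .map(_.split(","))
      .filter(_.length == 3)
      .map(cols => (cols(1), 1L))
      .reduceByKey(_ + _)

    clicksPerPage.take(20).foreach { case (page, n) => println(s"$page -> $n") }
    spark.stop()
  }
}
```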

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Spark, Kafka, Storm and Zookeeper.

Languages: C, Java, Python, Scala, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery

Frameworks: MVC, Struts, Spring, Hibernate

NoSQL Databases: HBase, Cassandra, MongoDB

Operating Systems: HP-UNIX, RedHat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL.

Web/Application servers: Apache Tomcat, WebLogic, JBoss.

Databases: Oracle, DB2, SQL Server, MySQL, Teradata

Tools and IDE: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DB Visualizer

Version control: SVN, Confidential, GIT

Web Services: REST, SOAP

PROFESSIONAL EXPERIENCE:

Confidential - Charleston, SC

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Worked as a Sr. Big Data/Hadoop Developer with Hadoop Ecosystems components like HBase, Sqoop, Zookeeper, Oozie, Hive and Pig with Cloudera Hadoop distribution.
  • Involved in the Agile development methodology; active member in scrum meetings.
  • Worked in Azure environment for development and deployment of Custom Hadoop Applications.
  • Designed and implemented scalable Cloud Data and Analytical architecture solutions for various public and private cloud platforms using Azure.
  • Involved in the end-to-end process of Hadoop jobs that used various technologies such as Sqoop, Pig, Hive, MapReduce, Spark and shell scripts.
  • Implemented various Azure platforms such as Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
  • Extracted and loaded data into Data Lake environment (MS Azure) by using Sqoop which was accessed by business users.
  • Managed and supported enterprise data warehouse operations and advanced predictive big data application development using Cloudera and Hortonworks HDP.
  • Developed PIG scripts to transform the raw data into intelligent data as specified by business users.
  • Utilized Apache Spark with Python to develop and execute Big Data Analytics and Machine learning applications, executed machine learning use cases under Spark ML and MLlib.
  • Installed Hadoop, MapReduce and HDFS on Azure and developed multiple MapReduce jobs in Pig and Hive for data cleansing and pre-processing.
  • Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Improved the performance and optimization of the existing algorithms in Hadoop using Spark Context, Spark-SQL, Data Frame, Pair RDD's, Spark YARN.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Developed a Spark job in Java which indexes data into Elastic Search from external Hive tables which are in HDFS.
  • Performed transformations, cleaning and filtering on imported data using Hive, MapReduce, and loaded final data into HDFS.
  • Imported data from different sources like HDFS/HBase into Spark RDDs and developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Used Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS and NoSQL databases such as HBase and Cassandra using Scala (see the sketch after this list).
  • Documented the requirements including the available code which should be implemented using Spark, Hive, HDFS, HBase and Elastic Search.
  • Performed transformations such as event joins, filtering of bot traffic and some pre-aggregations using Pig.
  • Explored MLlib algorithms in Spark to understand the possible Machine Learning functionalities that can be used for our use case
  • Used Windows Azure SQL Reporting Services to create reports with tables, charts and maps.
  • Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
  • Developed Java code that creates mappings in Elasticsearch before data is indexed into it.
  • Configured Oozie workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Imported and exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
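
Below is a hedged sketch of the Kafka-to-HDFS streaming flow described above, written with Spark Structured Streaming (available in the Spark 2.2.1 version listed in the Environment line below); the broker address, topic name, HDFS paths and the spark-sql-kafka-0-10 connector dependency are assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("KafkaToHdfs").getOrCreate()

    // Read the raw event stream from Kafka (assumes the spark-sql-kafka-0-10
    // connector is on the classpath).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "clickstream")
      .load()

    // Kafka delivers binary key/value pairs; keep the value as a string.
    val events = raw.selectExpr("CAST(value AS STRING) AS event")

    // Land the stream on HDFS as Parquet, checkpointing offsets so the job
    // can resume after a restart.
    val query = events.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/landing/clickstream")
      .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
      .start()

    query.awaitTermination()
  }
}
```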

Environment: Azure, Hadoop 3.0, Sqoop 1.4.6, Pig 0.17, Hive 2.3, MapReduce, Spark 2.2.1, shell scripts, SQL, Hortonworks, Python, MLlib, HDFS, YARN, Java, Kafka 1.0, Cassandra 3.11, Oozie, Agile

Confidential - Deerfield, IL

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Worked as a Big Data/Hadoop Developer providing solutions for big data problems.
  • Worked in an Agile development environment in sprint cycles of two weeks by dividing and organizing tasks. Participated in daily scrum and other design related meetings.
  • Designed, architected, and helped maintain scalable solutions on the big data analytics platform for the enterprise module.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
  • Created real time data ingestion of structured and unstructured data using Kafka and Spark streaming to Hadoop and MemSQL.
  • Populated data into dimension and fact tables; involved in creating Talend mappings.
  • Used Apache NiFi to copy data from the local file system to HDP.
  • Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them (see the sketch after this list).
  • Migrated a physical data center environment to AWS; designed, built, and deployed a multitude of applications utilizing much of the AWS stack (EC2, S3, RDS).
  • Implemented solutions for ingesting data from various sources and processing it using big data technologies.
  • Read and wrote delimited files in HDFS using Talend Big Data Studio with different Hadoop components.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into the OLTP system through Sqoop.
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Created tables in the RDBMS, inserted data and then loaded the same tables into HDFS and Hive using Sqoop.
  • Worked with business stakeholders to translate business objectives and requirements into technical requirements and design.
  • Defined the application architecture and design for the Big Data Hadoop initiative to maintain structured and unstructured data; created a reference architecture for the enterprise.
  • Identified data sources, created source-to-target mappings, estimated storage, and provided support for Hadoop cluster setup and data partitioning.
  • Developed scripts for data ingestion using Sqoop and Flume, wrote Spark SQL and Hive queries for analyzing the data, and performed performance optimization.
  • Responsible for developing data pipeline with Amazon AWS to extract the data from weblogs and store in Amazon EMR.
  • Wrote DDL and DML files to create and manipulate tables in the database
  • Developed the Unix shell/Python scripts for creating the reports from Hive data.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Analyzed data using the Hadoop components Hive and Pig and created Hive tables for end users.
  • Collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis.
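
As a companion to the S3 and UDF items above, here is an illustrative Scala sketch (not project code): the bucket name, file layout and the region column are assumptions, and reading s3a:// paths presumes the hadoop-aws library is available on the cluster.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object S3Aggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("S3Aggregation").getOrCreate()
    import spark.implicits._

    // RDD path: pull raw lines from S3 and run a simple action.
    val lines = spark.sparkContext.textFile("s3a://example-bucket/raw/orders/*.csv")
    println(s"raw line count: ${lines.count()}")

    // DataFrame path: parse the same files and aggregate with a small UDF.
    val orders = spark.read
      .option("header", "true")
      .csv("s3a://example-bucket/raw/orders/")

    // UDF that normalizes a free-text region code (purely illustrative).
    val normalizeRegion = udf((r: String) => Option(r).map(_.trim.toUpperCase).getOrElse("UNKNOWN"))

    orders
      .withColumn("region", normalizeRegion($"region"))
      .groupBy($"region")
      .count()
      .show()

    spark.stop()
  }
}
```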

Environment: Agile, Hive 2.3, Pig 0.17, Kafka, Spark, Apache NiFi, AWS, HDFS, Scala, ZooKeeper, Sqoop, HBase, Spark SQL, Amazon EMR, Apache Flume

Confidential - Rocky Hill, CT

Sr. Java/Hadoop Developer

Responsibilities:

  • Worked as Java/Hadoop Developer and responsible for taking care of everything related to the clusters.
  • Developed Spark scripts using Java and Python shell commands as per the requirements.
  • Involved with ingesting data received from various relational database providers, on HDFS for analysis and other big data operations.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Worked on Spark SQL and DataFrames for faster execution of Hive queries using the Spark SQL Context.
  • Performed analysis on implementing Spark using Scala.
  • Used DataFrames/Datasets to write SQL-style queries with Spark SQL against datasets sitting on HDFS (see the sketch after this list).
  • Extracted data from MongoDB through Sqoop, placed it in HDFS and processed it.
  • Created and imported various collections, documents into MongoDB and performed various actions like query, project, aggregation, sort and limit.
  • Extensively experienced in deploying, managing and developing MongoDB clusters.
  • Created Hive tables to import large data sets from various relational databases using Sqoop and export the analyzed data back for visualization and report generation by the BI team.
  • Involved in creating Shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala and MapReduce) and move the data inside and outside of HDFS.
  • Implemented some of the big data operations on AWS cloud.
  • Used Hibernate reverse engineering tools to generate domain model classes, perform association mapping and inheritance mapping using annotations and XML.
  • Developed Pig Scripts, Pig UDFs and Hive Scripts, Hive UDFs to analyze HDFS data.
  • Secured the cluster using Kerberos and kept it up and running at all times.
  • Experienced in loading and transforming large sets of structured, semi-structured and unstructured data, using Sqoop to move data from the Hadoop Distributed File System to relational database systems.
  • Created Hive tables to store the processed results in a tabular format.
  • Used Hive QL to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Performed data transformations by writing MapReduce as per business requirements.
  • Implemented schema extraction for Parquet and Avro file Formats in Hive.
  • Involved in implementing and integrating various NoSQL databases such as HBase and Cassandra.
  • Queried and analyzed data from Cassandra for quick searching, sorting and grouping through CQL.
  • Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
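
The sketch below illustrates the Spark SQL querying referenced above, run against a partitioned Hive table; the database, table, column names and date range are assumptions, and it presumes a SparkSession built with Hive support against the cluster's metastore.

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionQuery {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HivePartitionQuery")
      .enableHiveSupport() // resolve table metadata from the Hive metastore
      .getOrCreate()

    // Filtering on the partition column (event_date) lets Spark prune
    // partitions and read only the matching HDFS directories.
    val daily = spark.sql(
      """SELECT event_date, COUNT(*) AS events
        |FROM analytics.web_events
        |WHERE event_date >= '2017-01-01'
        |GROUP BY event_date
        |ORDER BY event_date""".stripMargin)

    daily.show(50, truncate = false)
    spark.stop()
  }
}
```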

Environment: Java, Spark, Python, HDFS, YARN, Hive, Scala, SQL, MongoDB, Sqoop, AWS, Pig, MapReduce, Cassandra, NoSQL

Confidential - Philadelphia, PA

Sr. Java/J2EE Developer

Responsibilities:

  • Worked on developing the application involving Spring MVC implementations and RESTful web services.
  • Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML, XHTML and AJAX.
  • Used Spring AOP to configure logging for the application.
  • Involved in the analysis, design, development and testing phases of the Software Development Life Cycle (SDLC).
  • Developed code using Core Java to implement technical enhancement following Java Standards.
  • Worked with Swing and RCP using Oracle ADF to develop a search application which is a migration project.
  • Implemented Hibernate utility classes, session factory methods, and different annotations to work with back end data base tables.
  • Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using jQuery Ajax methods.
  • Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring.
  • Used JPA (Java Persistence API) with Hibernate as Persistence provider for Object Relational mapping.
  • Used JDBC and Hibernate for persisting data to different relational databases.
  • Developed and implemented a Swing, Spring and J2EE based MVC (Model-View-Controller) framework for the application.
  • Implemented application-level persistence using Hibernate and Spring.
  • Integrated Data Warehouse (DW) data from different sources in different formats (PDF, TIFF, JPEG, web crawl and RDBMS data from MySQL, Oracle, SQL Server, etc.).
  • Used XML and JSON for transferring/retrieving data between different applications.
  • Wrote complex PL/SQL queries using joins, stored procedures, functions, triggers, cursors and indexes in the data access layer.
  • Implemented a RESTful web services architecture for client-server interaction and implemented the respective POJOs for it.
  • Designed and developed SOAP Web Services using CXF framework for communicating application services with different application and developed web services interceptors.
  • Implemented the project using JAX-WS based Web Services using WSDL, UDDI, and SOAP to communicate with other systems.
  • Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.
  • Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
  • Developed back end interfaces using embedded SQL, PL/SQL packages, stored procedures, Functions, Procedures, Exceptions Handling in PL/SQL programs, Triggers.
  • Used Log4j to capture the log that includes runtime exception and for logging info.
  • Used Ant as the build tool and developed build files for compiling the code and creating WAR files.
  • Used Tortoise SVN for Source Control and Version Management.
  • Responsibilities include design for future user requirements by interacting with users, as well as new development and maintenance of the existing source code.

Environment: JDK 1.5, Servlets, JSP, XML, JSF, Web Services (JAX-WS: WSDL, SOAP), Spring MVC, JNDI, Hibernate 3.6, JDBC, SQL, PL/SQL, HTML, DHTML, JavaScript, Ajax, Oracle 10g, SOAP, SVN, SQL, Log4j, ANT.

Confidential

Java Developer

Responsibilities:

  • Involved in various Software Development Life Cycle (SDLC) phases of the project which was modeled using Rational Unified Process (RUP)
  • Prepared high level technical documents by analyzing the user requirements and implementing the use cases.
  • Implemented the DAO pattern for database connectivity and Hibernate for object persistence.
  • Used Maven for builds and Jenkins as the continuous integration tool for application development.
  • Used WebLogic application server for deploying in dev environments and used Apache Tomcat in local environment.
  • Responsible for the design and development of data loader and data exporter with file feed interface.
  • Troubleshooting and debugging applications and providing fixes in a timely manner.
  • Involved in SDLC stages of application including Requirements analysis, Implementation, Design and Testing.
  • Extensively Used Spring MVC Framework for Controlling the Application.
  • Extensively used Spring RESTful web services for designing the end points.
  • Developed web applications using Spring Core, Spring MVC, Apache Tomcat, JSTL and Spring tag libraries.
  • Developed the web interface using HTML, CSS, JavaScript, jQuery, AngularJS, and Bootstrap.
  • Used Ant to build and package the application.
  • Used XML for data loading and reading from different sources.
  • Enhanced and modified the presentation layer and GUI framework written in JSP, with client-side validation done using JavaScript, and designed enhanced wireframe screens.
  • Deployed the Application on Tomcat server.
  • Used Eclipse as IDE to write the code and debug application using separate log files.
  • Wrote unit and system test cases for modified processes and supported continuous integration with the help of the QC and Configuration teams in a timely manner.
  • Followed a test-driven development model using JUnit.
  • Developed JMS Sender and Receivers for the loose coupling between the other modules and Implemented asynchronous request processing using Message Driven Bean.
  • Developed XML configuration files, properties files used in Spring framework for validating Form inputs on server side.
  • Involved in deployment of application on WebLogic Application Server in Development & QA environment.
  • Used Log4j for External Configuration Files and debugging.
  • Used Git version control to track and maintain the different versions of the project.

Environment: Hibernate, Maven, Jenkins, Apache Tomcat, MVC, HTML, CSS, JavaScript, JQuery, AngularJS, Bootstrap, Ant, XML, Eclipse, JUnit
