Sr. Big Data/Hadoop Developer Resume
Charleston, SC
SUMMARY:
- Overall 9 years of experience in various IT technologies, including hands-on experience in Big Data and Java/J2EE technologies.
- Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and Zookeeper.
- Rich working experience loading data into Hive tables and writing Hive queries using joins, ORDER BY, GROUP BY, etc., on data imported from RDBMS sources via Sqoop.
- Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
- Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
- Strong experience on Hadoop distributions like Cloudera, MapR and Hortonworks.
- Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra and MongoDB.
- Strong grasp of Apache Spark concepts with Scala, writing transformations in Scala for live streaming data; performed clickstream analysis using Spark with Scala, gathering data from Kafka and Flume.
- Experienced in writing complex MapReduce programs that work with different file formats such as Text, SequenceFile, XML, Parquet and Avro.
- Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
- Experience in migrating the data using Sqoop from HDFS to Relational Database System and vice-versa according to client's requirement.
- Extensive Experience on importing and exporting data using stream processing platforms like Flume and Kafka.
- Wrote Scala code for data analytics in Spark using transformations such as map, reduceByKey and groupByKey to analyze real-time streaming data (a sketch of this aggregation pattern follows this summary).
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC, SOAP and RESTful web services.
- Experience in database design using PL/SQL to write Stored Procedures, Functions, Triggers and strong experience in writing complex queries for Oracle.
- Experienced in working with Amazon Web Services (AWS) using EC2 for computing and S3 as storage mechanism.
- Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
- Excellent implementation knowledge of Enterprise/Web/Client Server using Java, J2EE.
- Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
- Experience in using various IDEs (Eclipse, IntelliJ) and repositories (SVN, Git).
- Experience using build tools such as Ant and Maven.
- Strong knowledge of Spark for handling large data processing in streaming process along with Scala.
- Experience in designing components using UML Use Case, Class, Sequence, Deployment and Component diagrams for the requirements.
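Sample sketch of the Spark pair-RDD aggregation pattern mentioned above (see the reduceByKey/groupByKey bullet). The code uses the Spark Java API; the click-count use case, class name and HDFS paths are illustrative assumptions, and the batch form is shown for brevity rather than the streaming job itself.

    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.sql.SparkSession;
    import scala.Tuple2;

    public class ClickCountsByUser {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("ClickCountsByUser")
                    .getOrCreate();

            // Each input line is assumed to be "userId,pageUrl,timestamp".
            JavaRDD<String> lines = spark.read()
                    .textFile("hdfs:///data/clickstream/raw")   // hypothetical path
                    .javaRDD();

            // reduceByKey combines values on the map side, avoiding the full shuffle
            // that groupByKey would need for a simple count.
            JavaPairRDD<String, Long> clicksPerUser = lines
                    .mapToPair(line -> new Tuple2<>(line.split(",")[0], 1L))
                    .reduceByKey(Long::sum);

            clicksPerUser.saveAsTextFile("hdfs:///data/clickstream/clicks_per_user");
            spark.stop();
        }
    }

reduceByKey is generally preferred over groupByKey for simple aggregations because partial sums are computed before the shuffle, which keeps network traffic down.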
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Spark, Kafka, Storm and Zookeeper.
Languages: C, Java, Python, Scala, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts
Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, Java Script, JSP, Servlets, EJB, JSF, JQuery
Frameworks: MVC, Struts, Spring, Hibernate
NoSQL Databases: HBase, Cassandra, MongoDB
Operating Systems: HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8
Web Technologies: HTML, DHTML, XML, AJAX, WSDL.
Web/Application servers: Apache Tomcat, WebLogic, JBoss.
Databases: Oracle, DB2, SQL Server, MySQL, Teradata
Tools and IDEs: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DB Visualizer
Version control: SVN, Confidential, Git
Web Services: REST, SOAP
PROFESSIONAL EXPERIENCE:
Confidential - Charleston, SC
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked as a Sr. Big Data/Hadoop Developer with Hadoop ecosystem components such as HBase, Sqoop, Zookeeper, Oozie, Hive and Pig on the Cloudera Hadoop distribution.
- Followed the Agile development methodology and was an active member in scrum meetings.
- Worked in Azure environment for development and deployment of Custom Hadoop Applications.
- Designed and implemented scalable Cloud Data and Analytical architecture solutions for various public and private cloud platforms using Azure.
- Involved in the end-to-end process of Hadoop jobs that used various technologies such as Sqoop, Pig, Hive, MapReduce, Spark and shell scripts.
- Implemented solutions on various Azure services such as Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
- Extracted and loaded data into Data Lake environment (MS Azure) by using Sqoop which was accessed by business users.
- Managed and supported enterprise Data Warehouse operations and advanced predictive big data application development using Cloudera and Hortonworks HDP.
- Developed PIG scripts to transform the raw data into intelligent data as specified by business users.
- Utilized Apache Spark with Python to develop and execute Big Data Analytics and Machine learning applications, executed machine learning use cases under Spark ML and MLlib.
- Installed Hadoop, MapReduce and HDFS on Azure and developed multiple MapReduce jobs in Pig and Hive for data cleansing and pre-processing.
- Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
- Improved the performance and optimization of the existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Developed a Spark job in Java which indexes data into Elastic Search from external Hive tables which are in HDFS.
- Performed transformations, cleaning and filtering on imported data using Hive, MapReduce, and loaded final data into HDFS.
- Imported data from different sources such as HDFS and HBase into Spark RDDs and developed a data pipeline using Kafka and Storm to store data in HDFS.
- Used Spark Streaming to receive real-time data from Kafka and persisted the streams to HDFS and to NoSQL databases such as HBase and Cassandra using Scala (see the ingestion sketch after this project).
- Documented the requirements including the available code which should be implemented using Spark, Hive, HDFS, HBase and Elastic Search.
- Performed transformations such as event joins, filtering of bot traffic and some pre-aggregations using Pig.
- Explored MLlib algorithms in Spark to understand the possible machine learning functionality that could be used for our use case.
- Used Windows Azure SQL Reporting Services to create reports with tables, charts and maps.
- Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
- Developed Java code that creates mappings in Elasticsearch before data is indexed into it.
- Configured Oozie workflow to run multiple Hive and Pig jobs which run independently with time and data availability.
- Imported and exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Environment: Azure, Hadoop 3.0, Sqoop 1.4.6, Pig 0.17, Hive 2.3, MapReduce, Spark 2.2.1, shell scripts, SQL, Hortonworks, Python, MLlib, HDFS, YARN, Java, Kafka 1.0, Cassandra 3.11, Oozie, Agile
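A sketch of the Kafka-to-HDFS ingestion described in this project (see the Spark Streaming bullet). It uses Spark Structured Streaming, which is available in the Spark 2.2.x listed above, rather than the DStream API; the broker address, topic name and HDFS paths are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    public class KafkaToHdfs {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("KafkaToHdfs")
                    .getOrCreate();

            // Read the raw event stream from Kafka (requires the spark-sql-kafka connector).
            Dataset<Row> events = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker1:9092")   // hypothetical broker
                    .option("subscribe", "clickstream-events")           // hypothetical topic
                    .load()
                    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

            // Persist micro-batches to HDFS as Parquet, with a checkpoint for recovery.
            StreamingQuery query = events.writeStream()
                    .format("parquet")
                    .option("path", "hdfs:///data/streaming/clickstream")
                    .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
                    .start();

            query.awaitTermination();
        }
    }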
Confidential - Deerfield, IL
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked as a Big Data/Hadoop Developer providing solutions for big data problems.
- Worked in Agile development environment in sprint cycles of two weeks by dividing and organizing tasks. Participated in daily scrum and other design related meetings.
- Designed, architected and helped maintain scalable solutions on the big data analytics platform for the enterprise module.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
- Created real time data ingestion of structured and unstructured data using Kafka and Spark streaming to Hadoop and MemSQL.
- Populated data into dimension and fact tables and was actively involved in creating Talend mappings.
- Started using Apache NiFi to copy data from the local file system to HDP.
- Imported data from AWS S3 into Spark RDDs and performed transformations and actions on the RDDs.
- Migrated the physical data center environment to AWS; also designed, built and deployed a multitude of applications utilizing much of the AWS stack (EC2, S3, RDS).
- Implemented solutions for ingesting data from various sources and processing it utilizing big data technologies.
- Used input and output data as delimited files in HDFS with Talend Big Data Studio and various Hadoop components.
- Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark for data aggregation and queries, writing data back into the OLTP system through Sqoop (see the UDF sketch after this project).
- Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
- Created tables in the RDBMS, inserted data and then loaded the same tables into HDFS and Hive using Sqoop.
- Worked with business stakeholders and translated business objectives and requirements into technical requirements and design.
- Defined the application architecture and design for the Big Data Hadoop initiative to maintain structured and unstructured data; created the reference architecture for the enterprise.
- Identified data sources, created source-to-target mappings and storage estimates, and provided support for Hadoop cluster setup and data partitioning.
- Developed scripts for data ingestion using Sqoop and Flume, wrote Spark SQL and Hive queries for analyzing the data, and performed optimization.
- Responsible for developing data pipeline with Amazon AWS to extract the data from weblogs and store in Amazon EMR.
- Wrote DDL and DML files to create and manipulate tables in the database
- Developed the Unix shell/Python scripts for creating the reports from Hive data.
- Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms
- Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
- Analyzed data using the Hadoop components Hive and Pig and created Hive tables for end users.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
Environment: Agile, Hive 2.3, Pig 0.17, Kafka, Spark, Apache NiFi, AWS, HDFS, Scala, Zookeeper, Sqoop, HBase, Spark SQL, Amazon EMR, Apache Flume
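A sketch of the DataFrame/Spark SQL UDF aggregation pattern referenced in this project. The actual work was done in Scala; the example is kept in Java for consistency with the other sketches, and the column names, S3 bucket and output path are illustrative assumptions.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.api.java.UDF1;
    import org.apache.spark.sql.types.DataTypes;

    public class OrderAggregation {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("OrderAggregation")
                    .getOrCreate();

            // Register a simple UDF that normalizes free-text status codes.
            spark.udf().register("normalize_status",
                    (UDF1<String, String>) s -> s == null ? "UNKNOWN" : s.trim().toUpperCase(),
                    DataTypes.StringType);

            // Input is assumed to be delimited order data landed on S3.
            Dataset<Row> orders = spark.read()
                    .option("header", "true")
                    .csv("s3a://example-bucket/orders/");   // hypothetical bucket

            orders.createOrReplaceTempView("orders");

            // Aggregate with the UDF applied, then write the results back to HDFS.
            Dataset<Row> byStatus = spark.sql(
                    "SELECT normalize_status(status) AS status, COUNT(*) AS order_count "
                  + "FROM orders GROUP BY normalize_status(status)");

            byStatus.write().mode("overwrite").parquet("hdfs:///warehouse/orders_by_status");
            spark.stop();
        }
    }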
Confidential - Rocky Hill, CT
Sr. Java/Hadoop Developer
Responsibilities:
- Worked as a Java/Hadoop Developer responsible for all development and support tasks related to the clusters.
- Developed Spark scripts using Java and Python shell commands as per the requirements.
- Involved in ingesting data received from various relational database providers onto HDFS for analysis and other big data operations.
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Worked on Spark SQL and Data frames for faster execution of Hive queries using Spark SQL Context.
- Performed analysis on implementing Spark using Scala.
- Used DataFrames/Datasets to write SQL-style queries using Spark SQL against datasets residing on HDFS.
- Extracted data from MongoDB through Sqoop, placed it in HDFS and processed it.
- Created and imported various collections, documents into MongoDB and performed various actions like query, project, aggregation, sort and limit.
- Extensively experienced in deploying, managing and developing MongoDB clusters.
- Created Hive tables to import large data sets from various relational databases using Sqoop and export the analyzed data back for visualization and report generation by the BI team.
- Involved in creating Shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala and MapReduce) and move the data inside and outside of HDFS.
- Implemented some of the big data operations on AWS cloud.
- Used Hibernate reverse engineering tools to generate domain model classes, perform association mapping and inheritance mapping using annotations and XML.
- Developed Pig Scripts, Pig UDFs and Hive Scripts, Hive UDFs to analyze HDFS data.
- Maintained the cluster securely using Kerberos and kept the cluster up and running at all times.
- Experienced in loading and transforming large sets of structured, semi-structured and unstructured data, and in moving data between the Hadoop Distributed File System and relational database systems using Sqoop.
- Created Hive tables to store the processed results in a tabular format.
- Used Hive QL to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Performed data transformations by writing MapReduce jobs as per business requirements (see the MapReduce sketch after this project).
- Implemented schema extraction for Parquet and Avro file Formats in Hive.
- Involved in implementing and integrating NoSQL databases such as HBase and Cassandra.
- Queried and analyzed data from Cassandra for quick searching, sorting and grouping through CQL.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
Environment: Java, Spark, Python, HDFS, YARN, Hive, Scala, SQL, MongoDB, Sqoop, AWS, Pig, MapReduce, Cassandra, NoSQL
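A sketch of the kind of MapReduce transformation mentioned in this project (see the MapReduce bullet): a mapper that drops malformed records and a reducer that sums a numeric field per key. The record layout, class name and sales-by-region use case are illustrative assumptions.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class SalesByRegion {

        public static class SalesMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Expected record layout: region,productId,amount
                String[] fields = value.toString().split(",");
                if (fields.length == 3) {   // drop malformed rows
                    context.write(new Text(fields[0]), new LongWritable(Long.parseLong(fields[2])));
                }
            }
        }

        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long total = 0;
                for (LongWritable v : values) {
                    total += v.get();
                }
                context.write(key, new LongWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "sales-by-region");
            job.setJarByClass(SalesByRegion.class);
            job.setMapperClass(SalesMapper.class);
            job.setCombinerClass(SumReducer.class);   // combiner cuts shuffle volume
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }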
Confidential - Philadelphia, PA
Sr. Java/J2EE Developer
Responsibilities:
- Worked on developing the application using Spring MVC and RESTful web services (see the controller sketch after this project).
- Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML, XHTML and AJAX.
- Developed Spring AOP code to configure logging for the application.
- Involved in the analysis, design, development and testing phases of the Software Development Life Cycle (SDLC).
- Developed code using Core Java to implement technical enhancement following Java Standards.
- Worked with Swing and RCP using Oracle ADF to develop a search application which is a migration project.
- Implemented Hibernate utility classes, session factory methods and different annotations to work with back-end database tables.
- Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using jQuery Ajax methods.
- Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring functionality.
- Used JPA (Java Persistence API) with Hibernate as Persistence provider for Object Relational mapping.
- Used JDBC and Hibernate for persisting data to different relational databases.
- Developed and implemented a Swing, Spring and J2EE based MVC (Model-View-Controller) framework for the application.
- Implemented application-level persistence using Hibernate and Spring.
- Integrated Data Warehouse (DW) data from different sources in different formats (PDF, TIFF, JPEG, web crawls, and RDBMS data from MySQL, Oracle, SQL Server, etc.).
- Used XML and JSON for transferring/retrieving data between different Applications.
- Wrote complex PL/SQL queries using joins, stored procedures, functions, triggers, cursors and indexes in the data access layer.
- Implemented a RESTful web services architecture for client-server interaction and implemented the respective POJOs.
- Designed and developed SOAP Web Services using CXF framework for communicating application services with different application and developed web services interceptors.
- Implemented the project using JAX-WS based Web Services using WSDL, UDDI, and SOAP to communicate with other systems.
- Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.
- Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
- Developed back-end interfaces using embedded SQL, PL/SQL packages, stored procedures, functions, exception handling in PL/SQL programs, and triggers.
- Used Log4j to capture logs, including runtime exceptions, and for informational logging.
- Used ANT as the build tool and developed build files for compiling the code and creating WAR files.
- Used Tortoise SVN for Source Control and Version Management.
- Responsibilities include design for future user requirements by interacting with users, as well as new development and maintenance of the existing source code.
Environment: JDK 1.5, Servlets, JSP, XML, JSF, Web Services (JAX-WS: WSDL, SOAP), Spring MVC, JNDI, Hibernate 3.6, JDBC, SQL, PL/SQL, HTML, DHTML, JavaScript, Ajax, Oracle 10g, SOAP, SVN, SQL, Log4j, ANT.
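A sketch of the Spring MVC REST endpoint pattern referenced in this project (see the first bullet). The Account DTO and the in-memory lookup are hypothetical stand-ins for the real Hibernate-backed service layer.

    import java.util.HashMap;
    import java.util.Map;

    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.stereotype.Controller;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.ResponseBody;

    @Controller
    @RequestMapping("/accounts")
    public class AccountController {

        // Simple DTO; Spring's message converters serialize it to JSON.
        public static class Account {
            private final long id;
            private final String owner;

            public Account(long id, String owner) { this.id = id; this.owner = owner; }
            public long getId() { return id; }
            public String getOwner() { return owner; }
        }

        // Hypothetical in-memory store standing in for the Hibernate-backed DAO.
        private final Map<Long, Account> accounts = new HashMap<Long, Account>();

        public AccountController() {
            accounts.put(1L, new Account(1L, "sample-owner"));
        }

        // GET /accounts/{id} returns the account as JSON, or 404 if it is not found.
        @RequestMapping(value = "/{id}", method = RequestMethod.GET)
        @ResponseBody
        public ResponseEntity<Account> getAccount(@PathVariable("id") long id) {
            Account account = accounts.get(id);
            return account != null
                    ? new ResponseEntity<Account>(account, HttpStatus.OK)
                    : new ResponseEntity<Account>(HttpStatus.NOT_FOUND);
        }
    }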
Confidential
Java Developer
Responsibilities:
- Involved in various Software Development Life Cycle (SDLC) phases of the project which was modeled using Rational Unified Process (RUP)
- Prepared high level technical documents by analyzing the user requirements and implementing the use cases.
- Implemented the DAO pattern for database connectivity and Hibernate for object persistence.
- Used Maven for build and Jenkins as the continuous integration tool for the application development
- Used WebLogic application server for deploying in dev environments and used Apache Tomcat in local environment.
- Responsible for the design and development of data loader and data exporter with file feed interface.
- Troubleshot and debugged applications and provided fixes in a timely manner.
- Involved in SDLC stages of the application, including requirements analysis, design, implementation and testing.
- Extensively used the Spring MVC framework for controlling the application.
- Extensively used Spring RESTful web services for designing the endpoints.
- Developed web applications using Spring Core, Spring MVC, Apache Tomcat, JSTL and Spring tag libraries.
- Developed the web interface using HTML, CSS, JavaScript, JQuery, AngularJS, and Bootstrap
- Used Ant to build and package the application.
- Used XML for data loading and reading from different sources.
- Enhanced and modified the presentation layer and GUI framework written in JSP, implemented client-side validations using JavaScript, and designed enhanced wireframe screens.
- Deployed the Application on Tomcat server.
- Used Eclipse as IDE to write the code and debug application using separate log files.
- Wrote unit and system test cases for modified processes and supported continuous integration with the help of the QC and Configuration teams in a timely manner.
- Involved in a test-driven development model using JUnit.
- Developed JMS senders and receivers for loose coupling between modules and implemented asynchronous request processing using Message-Driven Beans (see the JMS sketch after this project).
- Developed XML configuration files, properties files used in Spring framework for validating Form inputs on server side.
- Involved in deployment of application on WebLogic Application Server in Development & QA environment.
- Used Log4j for External Configuration Files and debugging.
- Used Git version control to track and maintain the different versions of the project.
Environment: Hibernate, Maven, Jenkins, Apache Tomcat, MVC, HTML, CSS, JavaScript, JQuery, AngularJS, Bootstrap, Ant, XML, Eclipse, JUnit
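A sketch of the JMS sender / asynchronous receiver pattern referenced in this project (see the JMS bullet), using the plain javax.jms API; the connection-factory and queue JNDI names are illustrative assumptions about the server configuration, and the standalone listener plays the role a Message-Driven Bean plays inside the container.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.MessageListener;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class OrderMessaging {

        // Sender: publishes a text message to the order queue.
        public static void send(String payload) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory"); // hypothetical JNDI name
            Queue queue = (Queue) ctx.lookup("jms/OrderQueue");                             // hypothetical JNDI name

            Connection connection = cf.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                producer.send(session.createTextMessage(payload));
            } finally {
                connection.close();
            }
        }

        // Receiver: registers an asynchronous listener for incoming messages.
        public static void listen() throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/OrderQueue");

            Connection connection = cf.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(queue);
            consumer.setMessageListener(new MessageListener() {
                public void onMessage(Message message) {
                    try {
                        System.out.println("Received: " + ((TextMessage) message).getText());
                    } catch (JMSException e) {
                        e.printStackTrace();
                    }
                }
            });
            connection.start();   // deliveries begin only after start()
        }
    }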