Sr. Big Data / Hadoop Developer Resume
Eden Prairie, MN
SUMMARY
- Over 9 years of IT experience in software analysis, design, development, testing, and implementation of Big Data, Hadoop, NoSQL, and Java/J2EE technologies.
- Hands-on experience with the Big Data ecosystem, including Hadoop, Tableau, MapReduce, Pig, Hive, Impala, Sqoop, Flume, Oozie, MongoDB, ZooKeeper, Kafka, Maven, Spark, Scala, HBase, and Cassandra.
- Experience in installation, configuration and deployment of Big Data solutions.
- Strong command of Hive and Pig core functionality, including writing Pig UDFs in Java and using UDFs from Piggybank and other sources.
- Hands-on experience with NoSQL databases like HBase and Cassandra and relational databases like Oracle and MySQL.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Spark SQL, and Scala.
- Defined real-time data streaming solutions across the cluster using Spark Streaming, Apache Storm, Kafka, NiFi, and Flume.
- Experience in working with developer toolkits like Force.com IDE, Force.com Ant Migration Tool, Eclipse IDE, and Maven.
- Proficient in Java, Collections, J2EE, Servlets, JSP, Spring, Hibernate, and JDBC/ODBC.
- Knowledge of implementing Big Data solutions on Amazon Elastic MapReduce (Amazon EMR) for processing data and managing the Hadoop framework on dynamically scalable Amazon EC2 instances.
- In-depth knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
- Experience with Amazon Web Services, the AWS command line interface, and AWS Data Pipeline.
- Experience in writing SQL and PL/SQL queries and stored procedures for accessing and managing databases such as Oracle, MySQL, and IBM DB2.
- Good working experience in using Spark SQL to manipulate DataFrames in Python.
- Good knowledge in NoSQL databases including Cassandra and MongoDB.
- Designed and built solutions for real-time data ingestion using Kafka, Storm, Spark Streaming, and various NoSQL databases.
- Strong experience in developing and deploying applications using WebLogic, Apache Tomcat, and JBoss.
- Extensive knowledge of Teradata utilities (BTEQ, FastLoad, FastExport, MultiLoad Update/Insert/Delete/Upsert).
- Experience creating RDDs, DataFrames, and Datasets to work with structured and unstructured data.
- Good experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as JSON and Avro.
- Experience with build tools such as Ant and Maven and continuous integration tools such as Jenkins.
- Working experience in Development, Production and QA Environments.
- Experienced in setting up standards and processes for Hadoop-based application design and implementation.
- Experience in NoSQL column-oriented databases like HBase and their integration with Hadoop clusters.
- Wrote Python programs for web scraping and converting HTML to text.
- Executed MapReduce-style functions faster using Spark RDDs, either by parallelizing collections or by referencing datasets in HDFS, HBase, and other data sources.
- Good experience working with analysis tools like Tableau for regression analysis, pie charts, and bar graphs.
- Very good understanding of job workflow scheduling and monitoring tools like Oozie and Control-M.
- Experience in Front-end Technologies like HTML, CSS, HTML5, CSS3, and AJAX.
TECHNICAL SKILLS
Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper
Hadoop Distributions: Cloudera, Hortonworks, MapR
Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets
Cloud Platform: Amazon AWS, EC2, EC3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory
Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS
Web Technologies: HTML, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX
Databases: Oracle 12c/11g, SQL
Operating Systems: Linux, Unix, Windows 10/8/7
IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven
NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB, Accumulo
Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere
SDLC Methodologies: Agile, Waterfall
Version Control: GIT, SVN, CVS
PROFESSIONAL EXPERIENCE
Confidential - Eden Prairie, MN
Sr. Big Data / Hadoop Developer
Responsibilities:
- Worked as a Sr. Big Data/Hadoop Developer on the Hadoop ecosystem, including Hive, MongoDB, ZooKeeper, and Spark Streaming, with the MapR distribution.
- Developed Big Data solutions focused on pattern matching and predictive modeling.
- Analyzed the Hadoop cluster using big data analytics tools including Pig, HBase, and Sqoop.
- Involved in Agile methodologies, daily scrum meetings, and sprint planning.
- Primarily involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
- Upgraded the Hadoop cluster from CDH3 to CDH4, setting up a High Availability cluster and integrating Hive with existing applications.
- Designed and developed a Flattened View (merged and flattened dataset) by de-normalizing several datasets in Hive/HDFS, consisting of key attributes consumed by the business and other downstream systems.
- Supported NoSQL in enterprise production and loaded data into HBase using Impala and Sqoop.
- Ran multiple MapReduce jobs through Pig and Hive for data cleaning and pre-processing.
- Built Hadoop solutions for big data problems using MR1 and MR2 in YARN.
- Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded data into HDFS.
- Involved in identifying job dependencies to design workflow for Oozie & YARN resource management.
- Designed solution for various system components using Microsoft Azure.
- Transferred data between HDFS and relational database systems using Sqoop, and maintained and troubleshot these transfers.
- Explored Spark to improve the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
- Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
- Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
- Developed NiFi flows dealing with various data formats such as XML, JSON, and Avro.
- Developed and designed data integration and migration solutions in Azure.
- Worked on a proof of concept using Spark with Scala and Kafka.
- Worked on visualizing the aggregated datasets in Tableau.
- Transferred data between HDFS and MySQL databases using Sqoop.
- Implemented MapReduce jobs in Hive by querying the available data.
- Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
- Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
- Tuned the performance of Hive queries and MapReduce programs for different applications.
- Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
- Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Collaborated with business users/product owners/developers to contribute to the analysis of functional requirements.
- Worked with MongoDB and HBase, NoSQL databases that differ from classic relational databases.
- Involved in converting HiveQL into Spark transformations using Spark RDD and through Scala programming.
- Integrated Kafka with Spark Streaming for high throughput and reliability.
- Used Apache Flume to collect and aggregate large amounts of log data and store it in HDFS for further analysis.
- Tuned Hive and Pig scripts to improve performance and resolved performance issues in both.
Environment: Agile, Hadoop 3.0, Pig 0.17, HBase, Sqoop, Azure, Hive 2.3, HDFS, NoSQL, Impala, YARN, PL/SQL, NiFi, XML, JSON, Avro, Spark, Kafka, Tableau, MySQL, Apache Flume 2.3
Confidential - Hillsboro, OR
Big Data/Hadoop Developer
Responsibilities:
- Involved in analysis, design and development phases of the project. Adopted agile methodology throughout all the phases of the application.
- Performed Hadoop installation and configuration of multiple nodes on AWS EC2 using the Hortonworks platform.
- Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi-structured data coming from various sources.
- Analyzed the existing data flow to the warehouses and took a similar approach to migrate the data into HDFS.
- Applied partitioning, bucketing, map-side joins, and parallel execution to optimize Hive queries, reducing execution time from hours to minutes.
- Involved in gathering requirements from the client and estimating timelines for developing complex Hive queries for a logistics application.
- Worked with the cloud provisioning team on capacity planning and sizing of the nodes (master and slave) for an AWS EMR cluster.
- Worked with Amazon EMR to process data directly in S3 and to copy data from S3 to the Hadoop Distributed File System (HDFS) on the EMR cluster, setting up Spark Core for analysis work.
- Gained exposure to Spark architecture and RDD internals by processing data from local files, HDFS, and RDBMS sources, creating RDDs and optimizing them for performance.
- Contributed to a data pipeline using Pig and Sqoop to ingest cargo data and customer histories into HDFS for analysis.
- Transferred data between MySQL and HDFS using Sqoop, and configured the Hive Metastore with MySQL to store the metadata for Hive tables.
- Responsible for loading customer data and event logs from Kafka into HBase using the REST API.
- Worked on Kafka and Spark integration for real-time data processing using Kafka producers, and set up Kafka MirrorMaker for data replication across clusters.
- Created custom UDFs in Scala for Spark and Kafka procedures to replace non-working functionality in existing custom UDFs in the production environment.
- Developed workflows in Oozie, scheduled jobs on mainframes, and prepared the data refresh strategy and capacity planning documents required for project development and support.
- Designed Oozie workflows using different action types, such as Sqoop, Pig, Hive, and shell actions.
- Worked with major Hadoop distributions such as Hortonworks and numerous open source projects, and prototyped various applications that use modern Big Data tools.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Implemented reporting and notification services using the AWS API and used Amazon Web Services (AWS) compute servers extensively.
Environment: Hadoop 3.0, AWS, EC2, Hortonworks, NoSQL, HBase 1.2, HDFS 1.2, Hive, S3, Spark, RDBMS, Pig, Sqoop, MySQL, UDF, Oozie
Confidential - Washington, DC
Sr. Java/Hadoop Developer
Responsibilities:
- Involved in various phases of development; analyzed and developed the system following Agile Scrum methodology.
- Used Hive to analyze data ingested into HBase through Hive-HBase integration and computed various metrics for reporting on the dashboard.
- Loaded the aggregated data into Oracle from the Hadoop environment using Sqoop for reporting on the dashboard.
- Involved in installing, configuring, and maintaining the Hadoop cluster, including YARN configuration, using Cloudera and Hortonworks.
- Developed Java MapReduce programs for the analysis of sample log files stored in the cluster.
- Created and managed database schemas, common frameworks, XML schemas, and APIs.
- Developed an MVC design pattern-based user interface using JSP, XML, HTML4, CSS2, and Struts.
- Used Java/J2EE design patterns like Business Delegate and Data Transfer Object (DTO).
- Developed window layouts and screen flows using Struts Tiles.
- Developed structured, efficient, and error-free code for Big Data requirements, storing, processing, and analyzing huge datasets to extract valuable insights.
- Implemented an application-specific exception handling and logging framework using Log4j.
- Developed the user interface screens using JavaScript and HTML and implemented client-side validations.
- Used JDBC to connect to database and wrote SQL queries and stored procedures to fetch and insert/update to database tables.
- Applied machine learning principles for studying market behavior for trading platform.
- Used Maven as the build tool and TortoiseSVN for source version control.
- Developed data mapping to create a communication bridge between various application interfaces using XML and XSL.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
- Extensive work in writing SQL queries, stored procedures, and triggers using TOAD.
- Developed code using core Java concepts to provide service and persistence layers; used JDBC to provide the connectivity layer to the Oracle database for data transactions.
- Implemented core Java concepts like interfaces and the collections framework, using ArrayList, Map, and Set from the Collections API.
- Developed Entity Beans as Bean Managed Persistence Entity Beans and used JDBC to connect to the backend DB2 database.
- Used SOAP-UI for testing the Web-Services.
- Performed software development/enhancement using IBM Rational Application Developer (RAD).
- Integrated with the back-end code (web services) using jQuery, JSON, and AJAX to get and post data to backend servers.
- Developed the Sqoop scripts to make the interaction between HDFS and RDBMS (Oracle, MySQL).
- Worked with complex queries in Cassandra.
- Involved in loading data into HBase using the HBase Shell, HBase Client API, Pig, and Sqoop.
- Developed various data connections from data source to Tableau Server for report and dashboard development.
- Developed multiple scripts for analyzing data using Hive and Pig and integrating with HBase.
- Used Apache Maven to build, configure, package, and deploy the application project.
- Developed complex data representation for the adjustment claims using JSF Data Tables.
- Performed version control using PVCS.
- Used JAX-RPC web services over SOAP to process the application for the customer.
- Used various tools in the project, including Ant build scripts, JUnit for unit testing, ClearCase for source code version control, IBM Rational DOORS for requirements, and HP Quality Center for defect tracking.
Environment: Java 6, Oracle 11g, Hadoop, Hive, HBase, HDFS, SQL Server 2012, MapReduce, jQuery, JDBC, Eclipse 4.x, Apache POI, HTML4, XML, CSS2, JavaScript, Apache Server, PL/SQL, CVS.
Confidential - Chicago, IL
Sr. Java/J2EE Developer
Responsibilities:
- Involved in analysis, design and development phases of the project. Adopted agile methodology throughout all the phases of the application.
- Worked on developing the application using Spring MVC and RESTful web services.
- Responsible for designing rich user interface applications using JavaScript, CSS, HTML, XHTML, and AJAX.
- Used Spring AOP to configure logging for the application.
- Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
- Worked in an Agile team for rapid development and improved coding efficiency.
- Developed code using Core Java to implement technical enhancement following Java Standards.
- Worked with Swing and RCP using Oracle ADF to develop a search application as part of a migration project.
- Implemented Hibernate utility classes, session factory methods, and different annotations to work with back-end database tables.
- Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using jQuery Ajax methods.
- Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring functionality.
- Used JPA (Java Persistence API) with Hibernate as Persistence provider for Object Relational mapping.
- Used JDBC and Hibernate for persisting data to different relational databases.
- Developed and implemented a Swing, Spring, and J2EE based MVC (Model-View-Controller) framework for the application.
- Implemented application level persistence using Hibernate and Spring.
- Integrated Data Warehouse (DW) data from different sources in different formats (PDF, TIFF, JPEG, web crawl, and RDBMS data from MySQL, Oracle, SQL Server, etc.).
- Used XML and JSON for transferring/retrieving data between different Applications.
- Wrote complex PL/SQL queries using joins, stored procedures, functions, triggers, cursors, and indexes in the data access layer.
- Implemented a RESTful web services architecture for client-server interaction, along with the corresponding POJOs.
- Designed and developed SOAP web services using the CXF framework for communication between application services and other applications, and developed web service interceptors.
- Implemented the project using JAX-WS based Web Services using WSDL, UDDI, and SOAP to communicate with other systems.
- Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.
- Wrote JUnit test cases for all the classes. Worked with Quality Assurance team in tracking and fixing bugs.
- Developed back-end interfaces using embedded SQL, PL/SQL packages, stored procedures, functions, triggers, and exception handling in PL/SQL programs.
- Used Log4j to capture logs, including runtime exceptions and informational messages.
- Used Ant as the build tool and developed build files for compiling the code and creating WAR files.
- Used Tortoise SVN for Source Control and Version Management.
- Designed for future user requirements by interacting with users, and handled new development and maintenance of the existing source code.
Environment: Java, J2EE, JDK 1.5, Servlets, JSP, XML, JSF, Web Services (JAX-WS: WSDL, SOAP), Spring MVC, JNDI, Hibernate 3.6, JDBC, SQL, PL/SQL, HTML, DHTML, JavaScript, Ajax, Oracle 10g, SVN, Log4j, Ant.
Confidential
Java/J2EE Developer
Responsibilities:
- Analyzed use cases, created interfaces and designed the core functionality from presentation layer to business logic layer.
- Responsibilities included application analysis, enterprise application design, and functional, technical, and project management.
- Extensively developed stored procedures, triggers, functions, and packages in Oracle SQL and PL/SQL.
- Used Rational Rose for developing Use case diagrams, Activity flow diagrams, Class diagrams and Object diagrams in the design phase.
- Consumed web services and parsed WSDL files using a DOM parser.
- Developed a fully functional prototype application using JavaScript and Bootstrap, connecting to a RESTful server on a different domain.
- Implemented GUI pages using JSP, HTML, CSS, JavaScript, jQuery, JSON, and AJAX.
- Implemented the online application using Spring MVC framework, Core Java, JSP, Servlet.
- Re-created and transferred the entire workspace to Rational Application Developer (RAD 7.0) to gain its benefits over the Eclipse IDE going forward.
- Starting and stopping WebLogic servers, creating DB connection pools and queue connection factories, configuring SSL, and installing s.
- Implemented web services using a bottom-up approach with WSDL.
- Developed web-based customer management software using Facelets, ICEfaces, and JSF.
- Implemented Ajax frameworks and jQuery tools such as autocompleter, tab module, calendar, and floating windows.
- Developed web services for sending and getting raw Extract data from different applications using SOAP messages.
- Created REST Web Services for the management of data using Apache CXF (JAX-RS).
- Involved in JUnit Testing of various modules by generating the Test Cases.
- Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
- Developed JavaScript-based components using the Ext JS framework, such as Grid and Tree Panel, with client reports customized according to user requirements.
- Performed building and deployment of WAR, JAR files on test, stage, and production systems in JBoss application server.
- Used the MyEclipse IDE and deployed the application to BEA WebLogic Application Server using Ant scripts.
- Helped trainees finish their assignments using technologies such as Java applets, Spring MVC, JDBC, and Struts.
Environment: Core Java, Struts Framework, Spring Framework, Hibernate, JSON, JSP, Servlets, JavaScript, jQuery, Maven, JUnit, JIRA, Tomcat, XML, XSL, Ant, PL/SQL, Oracle, Eclipse IDE, HTML, CSS, UML, Unix.
