Sr. Big Data/Hadoop Developer Resume

Armonk, NY

SUMMARY:

  • Over 8 years of working experience as a Big Data & Hadoop Developer, designing and developing various applications on Hadoop and Java/J2EE technologies.
  • Strong development skills in Hadoop, HDFS, Map Reduce, Hive, Sqoop, HBase with solid understanding of Hadoop internals.
  • Strong understanding of Agile Scrum and Waterfall SDLC methodologies.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
  • Hands-on experience using Hadoop ecosystem components: Hive, Pig, Oozie, Sqoop, Flume, HUE and Zookeeper.
  • Experience with the Amazon AWS cloud, including services such as EC2, S3, EBS, ELB, Route 53, Auto Scaling, CloudFront, CloudWatch and Security Groups.
  • Familiar with data warehousing and ETL tools like Informatica.
  • Extensive experience in SOA-based solutions: Web Services, Web API, WCF and SOAP, including RESTful API services.
  • Experience with front-end technologies: HTML5, CSS, JavaScript, XML and jQuery.
  • Worked extensively on different databases such as Oracle and MySQL, with good database programming experience in SQL.
  • Experience in writing java programs to parse JSON files and XML files using SAX and DOM parsers.
  • Configured web applications in Spring, employing Spring MVC architecture and Inversion of Control.
  • Experience in building, deploying and integrating applications in Application Servers with ANT, Maven and Gradle.
  • Significant application development experience with REST Web Services, SOAP, WSDL, and XML.
  • Experience working with NoSQL database technologies, including MongoDB, Cassandra and HBase.
  • Experience in consuming web services with Apache Axis and using JAX-RS (REST) APIs.
  • Experienced with build tools Maven and ANT and the logging tool Log4j.
  • Experience working with web servers like Apache Tomcat and application servers like IBM WebSphere and JBoss.
  • Good knowledge of Amazon EMR, Amazon RDS, S3 buckets, DynamoDB and Redshift.
  • Expert in T-SQL, creating and using Stored Procedures, Views and User Defined Functions, and implementing Business Intelligence solutions using SQL Server.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Expertise with Application servers and web servers like Oracle Weblogic, IBM WebSphere and Apache Tomcat.
  • Experience in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Experience developing Kafka producers and consumers for streaming millions of events per second (a minimal producer sketch in Java follows this summary).
  • Experience in using various Hadoop Distributions like Cloudera, Hortonworks and Amazon EMR.
  • Expertise in database design, creation and management of schemas, writing stored procedures, functions, and DDL and DML SQL queries, including complex queries for Oracle.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS).
  • Good experience working with analysis tools like Tableau for regression analysis, pie charts and bar graphs.
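
A minimal sketch of the Kafka producer work noted above, assuming the standard Kafka Java client (the skills list below cites Kafka 1.0); the broker address, topic name and event payload are placeholders rather than details from any engagement on this resume.

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.Producer;
  import org.apache.kafka.clients.producer.ProducerRecord;

  public class EventProducer {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "broker1:9092");   // placeholder broker list
          props.put("acks", "all");                          // wait for full acknowledgement of each send
          props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

          try (Producer<String, String> producer = new KafkaProducer<>(props)) {
              // Keying by event source keeps related events in the same partition and in order.
              producer.send(new ProducerRecord<>("clickstream-events", "web-01",
                      "{\"page\":\"/home\",\"ts\":1514764800000}"));
          }
      }
  }

Setting acks to "all" trades a little producer latency for stronger delivery guarantees, which matters when the events feed downstream analytics.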

TECHNICAL SKILLS:

Hadoop/Big Data: MapReduce, HDFS, Hive 2.3, Pig 0.17, HBase 1.2, Zookeeper 3.4, Sqoop 1.4, Oozie 4.3, Flume 1.8, Scala 2.12, Kafka 1.0, Storm 1.0.5, MongoDB 3.6, Hadoop 3.0, Spark 2.3, Cassandra 3.11, Impala 2.1, Control-M

Languages: Java/J2EE, SQL, Shell Scripting, C/C++, Python 3.6

Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery

IDE and Build Tools: Eclipse, NetBeans, MS Visual Studio, Ant, Maven, JIRA, Confluence

Version Control: Git, SVN, CVS

Database: Oracle 12c, DB2, MySQL 5.7, MS SQL Server, Teradata 15.

Web Tools: HTML 5.1, JavaScript, XML, ODBC, JDBC, Hibernate, JSP, Servlets, Java, Struts, Spring, and Avro.

Operating System: Windows, Unix, Linux.

Tools: Eclipse, Maven, ANT, JUnit, Jenkins, SoapUI, Log4j

Scripting and Web Technologies: JavaScript, jQuery, AJAX, CSS, XML, DOM, SOAP, REST

PROFESSIONAL EXPERIENCE:

Confidential - Armonk, NY

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Architected, designed and developed business applications and data marts for the Marketing and IT departments to facilitate departmental reporting.
  • Developed Big Data solutions focused on pattern matching and predictive modeling
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Worked in AWS EC2, configuring the servers for Auto scaling and Elastic load balancing.
  • Upgraded the Hadoop Cluster from CDH3 to CDH4, setting up High Availability Cluster and integrating HIVE with existing applications.
  • Designed & developed a flattened view (merged and flattened dataset), de-normalizing several datasets in Hive/HDFS, consisting of key attributes consumed by the business and other downstream systems.
  • Worked on NoSQL (HBase) to support enterprise production, loading data into HBase using Impala and Sqoop.
  • Performed multiple MapReduce jobs in PIG and Hive for data cleaning and pre-processing.
  • Worked on AWS provisioning EC2 Infrastructure and deploying applications in Elastic load balancing.
  • Built Hadoop solutions for big data problems using MR1 and MR2 in YARN.
  • Handled importing of data from various data sources, performed transformations using Hive, PIG, and loaded data into HDFS.
  • Created tables in HBase to store variable data formats of PII data coming from different portfolios (see the HBase client sketch following these responsibilities).
  • Involved in identifying job dependencies to design workflow for Oozie & YARN resource management.
  • Moved data using Sqoop between HDFS and relational database systems in both directions; maintained and troubleshot these transfer jobs.
  • Explored Spark to improve performance and optimize the existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames and pair RDDs (see the Spark sketch following these responsibilities).
  • Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
  • Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
  • Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
  • Worked on a proof of concept with Spark, Scala and Kafka.
  • Worked on visualizing the aggregated datasets in Tableau.
  • Worked on importing data from HDFS to MYSQL database and vice-versa using SQOOP.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
  • Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
  • Performance tuning of Hive queries, MapReduce programs for different applications.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
  • Developed data pipelines using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Collaborated with business users, product owners and developers to contribute to the analysis of functional requirements.
  • Worked on MongoDB, HBase (NoSQL) databases which differ from classic relational databases
  • Involved in converting HiveQL into Spark transformations using Spark RDD and through Scala programming.
  • Integrated Kafka with Spark Streaming for high-throughput, reliable stream processing.
  • Worked on Apache Flume to collect and aggregate huge amounts of log data, storing it on HDFS for further analysis.
  • Tuned Hive and Pig scripts to improve performance and resolved performance issues in both.
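
A sketch of the Spark SQL/DataFrame usage referenced in the Spark bullets above, written against the Spark 2.x Java API with Hive support enabled; the database, table and column names are illustrative only.

  import org.apache.spark.sql.Dataset;
  import org.apache.spark.sql.Row;
  import org.apache.spark.sql.SparkSession;

  public class ClaimsAggregation {
      public static void main(String[] args) {
          // enableHiveSupport() lets Spark read tables registered in the existing Hive metastore.
          SparkSession spark = SparkSession.builder()
                  .appName("ClaimsAggregation")
                  .enableHiveSupport()
                  .getOrCreate();

          // Equivalent of a HiveQL aggregation expressed as Spark SQL over a DataFrame.
          Dataset<Row> summary = spark.sql(
                  "SELECT portfolio, COUNT(*) AS claim_count, SUM(claim_amount) AS total_amount "
                  + "FROM claims_db.claims GROUP BY portfolio");

          summary.write().mode("overwrite").saveAsTable("claims_db.claims_summary");
          spark.stop();
      }
  }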
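
A sketch of writing into an HBase table like the PII tables mentioned above, using the HBase 1.x Java client; the table name, column family and composite row-key layout are assumptions for illustration.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.Table;
  import org.apache.hadoop.hbase.util.Bytes;

  public class PiiRecordWriter {
      public static void main(String[] args) throws Exception {
          Configuration conf = HBaseConfiguration.create();
          try (Connection connection = ConnectionFactory.createConnection(conf);
               Table table = connection.getTable(TableName.valueOf("pii_records"))) {
              // Composite row key (portfolio id + reversed timestamp) keeps one portfolio's rows
              // together while spreading recent writes to avoid hot-spotting a single region.
              String rowKey = "portfolio01_" + (Long.MAX_VALUE - System.currentTimeMillis());
              Put put = new Put(Bytes.toBytes(rowKey));
              put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes("{\"record\":\"...\"}"));
              table.put(put);
          }
      }
  }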

Environment: HDFS, MapReduce, Pig, Hive, Sqoop, Oracle 12c, Flume, Oozie, HBase, Impala, Spark Streaming, YARN, Eclipse, Spring, PL/SQL, UNIX Shell Scripting, Cloudera.

Confidential - Chicago, IL

Sr. Hadoop Developer

Responsibilities:

  • Involved in Agile methodologies, daily scrum meetings and sprint planning.
  • Wrote scripts to distribute queries for performance-test jobs in the Amazon data lake.
  • Created Hive Tables, loaded transactional data from Teradata using Sqoop and worked with highly unstructured and semi structured data of 2 Petabytes in size.
  • Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
  • Created and worked Sqoop jobs with incremental load to populate Hive External tables.
  • Developed optimal strategies for distributing the web log data over the cluster importing and exporting the stored web log data into HDFS and Hive using Sqoop.
  • Apache Hadoop installation & configuration of multiple nodes on AWS EC2 system
  • Developed Pig Latin scripts for replacing the existing legacy process to the Hadoop and the data is fed to AWS S3.
  • Responsible for building scalable distributed data solutions using Hadoop Cloudera.
  • Designed and developed automation test scripts using Python
  • Integrated Apache Storm with Kafka to perform web analytics and to move clickstream data from Kafka to HDFS.
  • Wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Analyzed the SQL scripts and designed the solution to implement using PySpark.
  • Implemented Hive generic UDFs to incorporate business logic into Hive queries (see the UDF sketch following these responsibilities).
  • Responsible for developing data pipeline with Amazon AWS to extract the data from weblogs and store in HDFS.
  • Developed syllabus/Curriculum data pipelines from Syllabus/Curriculum Web Services to HBASE and Hive tables.
  • Uploaded streaming data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Analyzed the web log data using the HiveQL to extract number of unique visitors per day, page views, visit duration, most visited page on website.
  • Supporting data analysis projects by using Elastic MapReduce on the Amazon Web Services (AWS) cloud performed Export and import of data into S3.
  • Worked on MongoDB using CRUD operations (Create, Read, Update and Delete), indexing, replication and sharding features (a minimal CRUD sketch follows this list).
  • Designed the HBase row key to store text and JSON values, structuring the key so that rows can be retrieved and scanned in sorted order.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
  • Worked on custom Talend jobs to ingest, enrich and distribute data in Cloudera Hadoop ecosystem.
  • Creating Hive tables and working on them using Hive QL.
  • Designed and Implemented Partitioning (Static, Dynamic) Buckets in HIVE.
  • Developed multiple POCs using PySpark and deployed them on the YARN cluster, compared the performance of Spark with Hive and SQL, and was involved in end-to-end implementation of ETL logic.
  • Worked on Cluster co-ordination services through Zookeeper.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in building applications using Maven and integrating with CI servers like Jenkins to build jobs.
  • Exported the analyzed data to the RDBMS using Sqoop to generate reports for the BI team.
  • Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solution from disparate sources.
  • Created cubes in Talend to build different types of aggregations on the data and to visualize them.
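
A sketch of a simple Hive UDF in Java related to the UDF bullet above; the production versions were generic UDFs with richer type handling, and the class, function and column names here are illustrative.

  import org.apache.hadoop.hive.ql.exec.UDF;
  import org.apache.hadoop.io.Text;

  // Normalizes a course/curriculum code before it is joined against reference tables.
  public final class NormalizeCode extends UDF {
      public Text evaluate(Text input) {
          if (input == null) {
              return null;
          }
          return new Text(input.toString().trim().toUpperCase());
      }
  }

Once packaged into a jar and added to the Hive session, the function would be registered with CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode' and used like any built-in function in HiveQL.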
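
A compact sketch of the MongoDB CRUD operations mentioned above, assuming a recent MongoDB Java driver (MongoClients API); the connection string, database, collection and field names are placeholders.

  import com.mongodb.client.MongoClient;
  import com.mongodb.client.MongoClients;
  import com.mongodb.client.MongoCollection;
  import org.bson.Document;

  import static com.mongodb.client.model.Filters.eq;
  import static com.mongodb.client.model.Updates.set;

  public class WeblogCrud {
      public static void main(String[] args) {
          try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
              MongoCollection<Document> visits = client.getDatabase("weblogs").getCollection("visits");

              visits.insertOne(new Document("visitorId", "v-100").append("pageViews", 3)); // Create
              Document first = visits.find(eq("visitorId", "v-100")).first();              // Read
              visits.updateOne(eq("visitorId", "v-100"), set("pageViews", 4));             // Update
              visits.deleteOne(eq("visitorId", "v-100"));                                  // Delete
          }
      }
  }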

Environment: Hive, Teradata, MapReduce, HDFS, Sqoop, AWS, Hadoop, Pig, Python, Kafka, Apache Storm, SQL scripts, data pipeline, HBase, JSON, Oozie, ETL, Zookeeper, Maven, Jenkins, RDBMS

Confidential - Sparks, MD

Sr. Java/Hadoop Developer

Responsibilities:

  • Involved in analysis, design, testing phases and responsible for documenting technical specifications.
  • Worked as part of the Agile Application Architecture (A3) development team responsible for setting up the architectural components for different layers of the application.
  • Wrote data ingestion systems to pull data from traditional RDBMS platforms such as Oracle and Teradata and store it in NoSQL databases such as MongoDB.
  • Developed Pig Scripts, Pig UDFs and Hive Scripts, Hive UDFs to analyze HDFS data.
  • Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop Components
  • Designed & developed web based GUI architecture using HTML, CSS, AJAX, JQuery, AngularJS, and JavaScript.
  • Developed Map Reduce programs for some refined queries on big data.
  • Involved in loading data from UNIX file system to HDFS.
  • Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Extracted the data from Databases into HDFS using Sqoop.
  • Handled importing of data from various data sources, performed transformations using Hive, PIG and loaded data into HDFS.
  • Used PIG predefined functions to convert the fixed width file to delimited file.
  • Used HIVE join queries to join multiple tables of a source system and load them into Elastic Search Tables.
  • Responsible for loading the data from BDW Oracle database, Teradata into HDFS using Sqoop.
  • Implemented AJAX, JSON, and JavaScript to create interactive web screens.
  • Involved in creating Hive tables and applying HiveQL on those tables, which automatically invokes and runs MapReduce jobs.
  • Developed data-formatted web applications and deployed scripts using HTML, XHTML, CSS, and client-side scripting with JavaScript.
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
  • Managed and reviewed Hadoop log files. Implemented a Lambda architecture as a solution.
  • Adept at understanding partitioning and bucketing concepts; managed and created external tables in Hive to optimize performance.
  • Wrote Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin (data flow language), and custom MapReduce programs in Java (see the MapReduce sketch following these responsibilities).
  • Used Hadoop streaming jobs to process terabytes of data in Hive.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Created reports for the BI team using Sqoop to import data into HDFS and Hive.
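
A sketch of a custom Java MapReduce job of the kind referenced above, counting hits per page from tab-delimited web logs; the input layout and field index are assumptions for illustration.

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class PageHitCount {

      // Emits (page, 1) for every log line; the page URL is assumed to be the third field.
      public static class HitMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
          private static final IntWritable ONE = new IntWritable(1);
          private final Text page = new Text();

          @Override
          protected void map(LongWritable key, Text value, Context context)
                  throws IOException, InterruptedException {
              String[] fields = value.toString().split("\t");
              if (fields.length > 2) {
                  page.set(fields[2]);
                  context.write(page, ONE);
              }
          }
      }

      // Sums the per-page counts; also reused as a combiner since input and output types match.
      public static class HitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
          @Override
          protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                  throws IOException, InterruptedException {
              int sum = 0;
              for (IntWritable v : values) {
                  sum += v.get();
              }
              context.write(key, new IntWritable(sum));
          }
      }

      public static void main(String[] args) throws Exception {
          Job job = Job.getInstance(new Configuration(), "page hit count");
          job.setJarByClass(PageHitCount.class);
          job.setMapperClass(HitMapper.class);
          job.setCombinerClass(HitReducer.class);
          job.setReducerClass(HitReducer.class);
          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(IntWritable.class);
          FileInputFormat.addInputPath(job, new Path(args[0]));
          FileOutputFormat.setOutputPath(job, new Path(args[1]));
          System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
  }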

Environment: CDH, Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Oozie, Sqoop, Linux, Shell scripting, Java, SBT, Amazon S3, JIRA, Git Stash, Eclipse, SQL, Oracle 11g.

Confidential - New York, NY

Sr. Java/J2EE Developer

Responsibilities:

  • Played a key role in discussing the requirements and analyzing the entire system, along with estimation, development and testing, keeping BI requirements in view.
  • Involved in the analysis, design and development of the application based on J2EE using Spring and Hibernate.
  • Involved actively in designing web page using HTML, Backbone, AngularJS, JQuery, JavaScript, Bootstrap and CSS.
  • Created Application Configuration tool using Web works MVC framework and HTML, CSS and JavaScript.
  • Developed web applications using Spring Core, Spring MVC, iBatis, Apache Tomcat, JSTL and Spring tag libraries.
  • User help tooltips implemented with Dojo Tooltip Widget with multiple custom colors
  • Used eclipse as IDE to write the code and debug application using separate log files.
  • Designed and developed frameworks for Payment Workflow System, Confirmations Workflow System, Collateral System using GWT, Core Java, Servlets, JavaScript, XML, AJAX, J2EE design patterns and OOPS/J2EE technologies.
  • Used Hibernate to manage Transactions (update, delete) along with writing complex SQL and HQL queries.
  • The business logic is developed using J2EE framework and deployed components on Application server where Eclipse was used for component building.
  • Established continuous integration with JIRA and Jenkins.
  • Developed the user interface screens using JavaScript and HTML, and conducted client-side validations.
  • Used JDBC to connect to database and wrote SQL queries and stored procedures to fetch and insert/update to database tables.
  • Used Maven as the build tool and Tortoise SVN as the Source version controller.
  • Developed data mapping to create a communication bridge between various application interfaces using XML, and XSL.
  • Involved in developing JSPs for client data presentation and data validation on the client side within the forms.
  • Involved in various phases of the Software Development Life Cycle (SDLC) such as design, development and unit testing.
  • Extensive work writing SQL queries, stored procedures and triggers using TOAD.
  • Code development using core java concepts to provide service and persistence layers. Used JDBC to provide connectivity layer to the Oracle database for data transaction.
  • Implemented logging and transaction management using Spring's Aspect Oriented Programming (AOP) concept.
  • Created build scripts for compiling and creating war, jar using ANT tool kit.
  • Used Angular to connect the web application to back-end APIs and used RESTful methods to interact with several APIs.
  • Developed POJO classes and wrote Hibernate Query Language (HQL) queries (see the DAO sketch following these responsibilities).
  • Experience in using TIBCO Administrator for User Management, Resource Management and Application Management.
  • Developed user interface using JSP, JSP Tag libraries to simplify the complexities of the application.
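
A sketch of the Hibernate/HQL usage referenced above, assuming a mapped Payment entity (hypothetical) and the Hibernate 4.x session API.

  import java.util.List;
  import org.hibernate.Session;
  import org.hibernate.SessionFactory;
  import org.hibernate.Transaction;
  import org.hibernate.cfg.Configuration;

  public class PaymentDao {
      private static final SessionFactory FACTORY =
              new Configuration().configure().buildSessionFactory();

      @SuppressWarnings("unchecked")
      public List<Payment> findPendingPayments(String accountId) {
          Session session = FACTORY.openSession();
          Transaction tx = session.beginTransaction();
          try {
              // HQL queries the mapped entity and its properties, not the underlying table columns.
              List<Payment> payments = session
                      .createQuery("from Payment p where p.accountId = :accountId and p.status = 'PENDING'")
                      .setParameter("accountId", accountId)
                      .list();
              tx.commit();
              return payments;
          } catch (RuntimeException e) {
              tx.rollback();
              throw e;
          } finally {
              session.close();
          }
      }
  }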

Environment: Java 1.5/1.7, Core Java, Swing, Struts Framework 2.0, Hibernate 4.0, Eclipse 3.2, JUnit 4.x, JSP 2.x, Oracle SQL Developer 2.1, Oracle WebLogic 12.1, RESTful Web Services, SOAP, Tortoise SVN 1.5

Confidential

Java Developer

Responsibilities:

  • Involved in Software Development Life Cycle (SDLC) of the application: Requirement gathering, Design Analysis and Code development.
  • Implemented Struts framework based on the Model View Controller design paradigm.
  • Designed the application by implementing Struts based on MVC architecture, with simple JavaBeans as the Model, JSP components as the View and Action servlets as the Controller (see the Action sketch following these responsibilities).
  • Used JNDI to perform lookup services for the various components of the system.
  • Involved in designing and developing dynamic web pages using HTML and JSP with Struts tag libraries.
  • Used HQL (Hibernate Query Language) to query the Database System and used JDBC Thin Driver to connect to the database.
  • Developed Hibernate entities, mappings and customized criterion queries for interacting with database.
  • Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML and AJAX and developed WebServices by using SOAP UI.
  • Used JPA to persistently store large amounts of data in the database.
  • Implemented modules using Java APIs, Java collection, Threads, XML, and integrating the modules.
  • Applied J2EE Design Patterns such as Factory, Singleton, and Business delegate, DAO, Front Controller Pattern and MVC.
  • Used JPA for the management of relational data in application.
  • Designed and developed business components using Session and Entity Beans in EJB.
  • Developed the EJBs (Stateless Session beans) to handle different transactions such as online funds transfer, bill payments to the service providers.
  • Developed XML configuration files and properties files used in the Struts Validator framework for validating form inputs on the server side.
  • Extensively used AJAX technology to add interactivity to the web pages.
  • Developed JMS senders and receivers for loose coupling between the other modules and implemented asynchronous request processing using a Message-Driven Bean (see the MDB sketch following these responsibilities).
  • Used JDBC for data access from Oracle tables.
  • JUnit was used to implement test cases for beans.
  • Successfully installed and configured the IBM WebSphere Application server and deployed the business tier components using EAR file.
  • Involved in deployment of application on WebLogic Application Server in Development & QA environment.
  • Used Log4j for External Configuration Files and debugging.
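
A sketch of a Struts 1.x Action of the kind described in the MVC bullet above; TransferForm, the service call and the forward name are hypothetical.

  import javax.servlet.http.HttpServletRequest;
  import javax.servlet.http.HttpServletResponse;
  import org.apache.struts.action.Action;
  import org.apache.struts.action.ActionForm;
  import org.apache.struts.action.ActionForward;
  import org.apache.struts.action.ActionMapping;

  public class TransferFundsAction extends Action {
      @Override
      public ActionForward execute(ActionMapping mapping, ActionForm form,
                                   HttpServletRequest request, HttpServletResponse response)
              throws Exception {
          // Cast to the form bean populated by the framework from the JSP inputs (hypothetical bean).
          TransferForm transferForm = (TransferForm) form;

          // Delegate the actual funds transfer to the business/service layer here.

          // The forward name maps to a JSP defined in struts-config.xml.
          return mapping.findForward("success");
      }
  }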
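
A sketch of an EJB 2.x Message-Driven Bean like the one referenced above for asynchronous request processing; the class name and payload handling are illustrative, and the queue binding lives in the deployment descriptor.

  import javax.ejb.MessageDrivenBean;
  import javax.ejb.MessageDrivenContext;
  import javax.jms.Message;
  import javax.jms.MessageListener;
  import javax.jms.TextMessage;

  public class FundsTransferMDB implements MessageDrivenBean, MessageListener {
      private MessageDrivenContext context;

      public void setMessageDrivenContext(MessageDrivenContext context) {
          this.context = context;
      }

      public void ejbCreate() { }

      public void ejbRemove() { }

      // The container delivers each JMS message asynchronously; no caller is blocked while it runs.
      public void onMessage(Message message) {
          try {
              if (message instanceof TextMessage) {
                  String payload = ((TextMessage) message).getText();
                  // Process the funds-transfer or bill-payment request payload here.
              }
          } catch (Exception e) {
              // Mark the container-managed transaction for rollback so the message is redelivered.
              context.setRollbackOnly();
          }
      }
  }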

Environment: JSP 1.2, Servlets, Struts 1.2.x, JMS, EJB 2.1, Java, OOPS, Spring, Hibernate, JavaScript, AJAX, HTML, CSS, JDBC, Eclipse, WebSphere, DB2, JPA, ANT.
