
Java Hadoop Consultant Resume


Franklin Lakes

SUMMARY

  • Over 5.2 years of professional IT experience, including over 5 years of Hadoop/Spark experience in the ingestion, storage, querying, processing, and analysis of big data, and 5 years of Java development.
  • Experienced with real-time data processing mechanisms in the Big Data ecosystem, such as Apache Kafka and Spark Streaming.
  • Experienced with the Spark Scala API and Spark Python API (PySpark) for transferring, processing, and analyzing data in different formats and structures.
  • Knowledge of unit testing with ScalaCheck, ScalaTest, JUnit, and MRUnit; used JIRA for issue tracking, Jenkins for continuous integration, and A/B testing on certain projects.
  • Good exposure to Java web and client-server development, with knowledge of all phases of the software development life cycle (SDLC), including requirement analysis, design, coding, testing, deployment, change and configuration management, process definition, and documentation.
  • Proficient in installing, configuring, migrating, and upgrading Hadoop MapReduce, Hive, HDFS, HBase, Sqoop, Pig, Cloudera, and YARN.
  • Excellent understanding and knowledge of Hadoop architecture and its components, such as Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Experience in Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
  • Hands-on experience with Cloudera Hadoop environments.
  • Experienced in application design using Unified Modeling Language (UML), Sequence diagrams, Use Case diagrams, Entity Relationship Diagrams (ERD), and Data Flow Diagrams (DFD).
  • Worked with Amazon Web Services including EC2, ELB, VPC, S3, CloudFront, IAM, RDS, Route 53, CloudWatch, SNS, SQS, Auto Scaling, AMIs, EBS, SimpleDB, DynamoDB, firewalls, routing technologies, and DNS.
  • Good knowledge of NoSQL databases such as DynamoDB and MongoDB.
  • Proficient with Java IDEs such as Eclipse and NetBeans.
  • Experience in database development using SQL and PL/SQL and experience working on databases like Oracle, SQL Server and MySQL.
  • Developed Spark jobs and Hive Jobs to summarize and transform data.
  • Experience in writing producers/consumers and creating messaging-centric applications using Apache Kafka (see the sketch after this list).
  • Hands-on experience with Amazon Web Services (AWS) provisioning tools such as EC2, Simple Storage Service (S3), and Elastic MapReduce (EMR).
  • Extensive Java development experience using J2SE and J2EE technologies such as Servlets, Spring, Hibernate, JSP, and JDBC.
  • Experienced with core Java features such as the Collections Framework, exception handling, multithreading, and the I/O system.
  • Experience in SOA using SOAP and RESTful web services.
  • Knowledge of writing Hadoop jobs for analyzing data using Hive and Pig.
  • Experience with NoSQL databases.
  • Strong team player with the ability to work independently or in a team, adapt to a rapidly changing environment, and a commitment to learning.
  • Ability to blend technical expertise with strong conceptual, business, and analytical skills to deliver quality solutions, result-oriented problem solving, and leadership.
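
A minimal sketch of the kind of Kafka producer referenced above, written in Scala against the standard Kafka clients API; the broker address and the "events" topic are hypothetical placeholders:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

    object EventProducer {
      def main(args: Array[String]): Unit = {
        // Hypothetical broker address and topic name, for illustration only
        val props = new Properties()
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
          "org.apache.kafka.common.serialization.StringSerializer")
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
          "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        try {
          // Publish a single keyed message to the "events" topic
          producer.send(new ProducerRecord[String, String]("events", "user-1", "page_view"))
          producer.flush()
        } finally {
          producer.close()
        }
      }
    }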

TECHNICAL SKILLS

Big Data Ecosystems: Hadoop (HDFS & MapReduce), Pig, Hive, HBase, Sqoop, Kafka, Apache Spark

Databases: Oracle 9i/10g/11g, MS SQL Server 2008, DynamoDB, MongoDB, Neo4j Graph Database (less than 1 year)

Hadoop Distributions: Cloudera, Hortonworks.

Languages: Java, SQL, JavaScript, XML

Web Technologies: JavaScript, jQuery, Bootstrap, AJAX, XML, CSS, HTML, AngularJS.

Web Services: REST, SOAP, JAX-WS, JAX-RPC, JAX-RS, WSDL, Axis2, Apache HTTP.

Version Control: CVS, SVN.

IDE: Eclipse, NetBeans, IntelliJ IDEA.

Operating Systems: Windows Variants, Linux, UNIX.

Cloud Computing: Amazon EC2, Amazon S3, Amazon RDS

Hadoop Ecosystem: Hadoop 2.X, Spark 1.4+, MapReduce, Hive 2.1, Impala 1.2+, Sqoop 1.4, Flume 1.7, Kafka 1.2+, HBase 1.0+, Oozie 3.0+, Zookeeper 3.4+

Cloud Platforms: Google Cloud Platform (Dataproc, Compute Engine, Bucket, SQL), Amazon Web Services (EC2, S3, EMR), Databricks Cloud Community

Programming Languages: Java 8+, C++, Scala 2.1+

Operating Systems: Linux, Ubuntu, Mac OS, CentOS, Windows

Web Development: JavaScript, jQuery, AngularJS, HTML, CSS, Node.js

Databases: MySQL 5.X, Oracle 11g, PostgreSQL 9.X, Netezza 7.X, MongoDB 3.2, HBase 0.98

IDE Applications: NetBeans, Eclipse, Visual Studio Code, IntelliJ IDEA, SQL Server 2008 R2+

Data Analysis & Visualization: Python, R, Tableau, Matplotlib, D3.js

Scripting Languages: UNIX Shell, HTML, XML, CSS, JSP, SQL, Markdown

Machine Learning: Regression, Decision Tree, Random Forest, K-Means, Neural Networks, SVM, NLP

Environment: Agile, Scrum, Waterfall

Collaboration: Git, Microsoft TFS, JIRA, Jenkins

PROFESSIONAL EXPERIENCE

Confidential, Franklin Lakes

Java Hadoop consultant

Responsibilities:

  • Utilized object-oriented programming and Java for creating business logic.
  • Used the GCP Console to monitor Dataproc clusters and jobs.
  • Used GCP Stackdriver for monitoring and logging of Compute Engine and Dataproc.
  • Worked with Docker to improve the Continuous Delivery (CD) framework and streamline releases.
  • Developed Spark scripts using the Scala shell as per the requirements.
  • Created visualization reports using Tableau.
  • Designed and built a custom, generic ETL framework as a Spark application in Scala for data loading and transformations.
  • Performed development testing on different models.
  • Created phases of client-side and server-side web applications using HTML, CSS, Java and Neo4j Graph Database.
  • Used Hive to analyze the data and identify different correlations.
  • Wrote SQL queries to retrieve data from Database using JDBC.
  • Performed file transfers using Tectia SSH Client.
  • Implemented the Hadoop framework to capture user navigation across the application to validate the user interface and provide analytic feedback/results to the UI team.
  • Developed MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports.
  • Automated all the jobs for pulling data from the FTP server and loading it into Hive tables, using Oozie workflows.
  • Managing and scheduling Jobs on a Hadoop cluster.
  • Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
  • Wrote MapReduce jobs using the Java API and Pig Latin.
  • Wrote Pig scripts to run ETL jobs on the data in HDFS.
  • Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcasts in Spark, effective and efficient joins, transformations, and other operations during the ingestion process itself.
  • Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark (see the sketch after this list).
  • Implemented workflows using the Apache Oozie framework to automate tasks. Used Zookeeper to coordinate cluster services.
  • Deployment and administration of Splunk and Hortonworks Distribution.
  • Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
  • Involved in creating Hive tables and working on them using HiveQL.
  • Wrote various queries using SQL and used SQL Server as the database.
  • Utilized the Agile Scrum methodology to help manage and organize a team of 4 developers, with regular code review sessions.
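
A minimal sketch of the Hive-to-Spark conversion mentioned above, assuming a Spark 2.x-style SparkSession with Hive support; the sales.orders table and its columns are hypothetical:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object OrdersSummary {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("OrdersSummary")
          .enableHiveSupport()
          .getOrCreate()

        // Equivalent of: SELECT customer_id, SUM(amount) AS total_amount
        //                FROM sales.orders WHERE order_date >= '2017-01-01'
        //                GROUP BY customer_id
        val summary = spark.table("sales.orders")
          .filter(col("order_date") >= lit("2017-01-01"))
          .groupBy("customer_id")
          .agg(sum("amount").as("total_amount"))

        // Persist the result back to Hive for downstream reporting
        summary.write.mode("overwrite").saveAsTable("sales.orders_summary")
        spark.stop()
      }
    }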

Environment: Hadoop 2.X, Cloudera CDH, MapReduce, Spark 1.4+, Scala 2.1+, Java 8+, Python, HDFS, Pig, Hive 2.1, HBase, Sqoop 1.4, Flume 1.7, Kafka 1.2+, Talend, Hortonworks, Zookeeper 3.4+, Oozie 3.0+, Cloudera, Docker, Oracle, Agile, Windows, UNIX shell scripting, BigQuery, GCP, MongoDB, DynamoDB, Amazon EC2, Amazon S3, Neo4j Graph Database, Git, JIRA, Tableau.

Confidential, Seattle, Washington

Java Hadoop consultant

Responsibilities:

  • Implemented Java/J2EE Design patterns like Business Delegate and Data Transfer Object (DTO), Data Access Object.
  • Developed data pipeline using Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Worked in the Hortonworks environment.
  • Developed Pig Latin scripts to extract the data from the web server output files and load it into HDFS.
  • Developed several new MapReduce programs to analyze and transform the data to uncover insights into customer usage patterns.
  • Developed Hive UDFs to validate data against business rules before it moves to Hive tables (see the sketch after this list).
  • Developed MapReduce jobs in both Pig and Hive for data cleaning and pre-processing.
  • Developed Sqoop scripts for loading data into HDFS from DB2, with pre-processing in Pig.
  • Created Hive external tables, loaded the data into the tables, and queried the data using SQL.
  • Involved in writing Flume and Hive scripts to extract, transform, and load the data into the database.
  • Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Worked on analyzing the Hadoop cluster and various big data analytic tools, including Pig, HBase, NoSQL databases, and Sqoop.
  • Developed shell scripts to pull data from third-party systems into the Hadoop file system.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig.
  • Involved in Database design and developing SQL Queries, stored procedures on MySQL.
  • Involved in Database design with Oracle as backend.
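
A minimal sketch of a Hive UDF of the kind described above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the 10-digit account rule is a hypothetical business rule for illustration:

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Returns true only when the input looks like a valid 10-digit account number.
    // Packaged as a jar, it would be registered in Hive with ADD JAR and
    // CREATE TEMPORARY FUNCTION before being used in a query.
    class ValidAccountUDF extends UDF {
      def evaluate(account: Text): Boolean = {
        if (account == null) false
        else account.toString.matches("\\d{10}")
      }
    }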

Environment: Hadoop, MapReduce, HDFS, Sqoop, Pig, HBase, Hive, Hortonworks, Cassandra, Zookeeper, Cloudera, Oozie, DynamoDB, MongoDB, NoSQL, SQL, Oracle, UNIX/Linux, BigQuery, GCP, Amazon EC2, Amazon S3.

Confidential

Hadoop Administrator

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC) of the application, including requirement gathering, design, analysis, and code development.
  • Prepared Use Cases, sequence diagrams, class diagrams and deployment diagrams based on UML to enforce Rational Unified Process using Rational Rose.
  • Developed a prototype of the application and demonstrated it to business users to verify the application functionality.
  • Developed and implemented the MVC architectural pattern using the Struts framework, including JSP, Servlets, EJB, Form Beans, and Action classes.
  • Developed JSPs with custom tag libraries for control of the business processes in the middle tier and was involved in their integration.
  • Developed the user interface using Spring, the Struts html, logic, and bean tag libraries, JSP, JavaScript, HTML, and CSS.
  • Designed and developed backend java Components residing on different machines to exchange information and data using JMS.
  • Developed the war/ear files using Ant scripts and deployed them to the WebLogic Application Server.
  • Used parsers such as SAX and DOM for parsing XML documents (see the sketch after this list).
  • Implemented Java/J2EE Design patterns like Business Delegate and Data Transfer Object (DTO), Data Access Object.
  • Used Rational ClearCase for version control.
  • Wrote stored procedures, triggers, and cursors using Oracle PL/SQL.
  • Worked with QA team for testing and resolving defects.
  • Used ANT automated build scripts to compile and package the application.
  • Used JIRA for bug tracking and project management.
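
A minimal sketch of DOM-based XML parsing of the kind mentioned above, using only the JDK's javax.xml.parsers API; the orders.xml file and its element names are hypothetical:

    import java.io.File
    import javax.xml.parsers.DocumentBuilderFactory
    import org.w3c.dom.Element

    object OrderXmlParser {
      def main(args: Array[String]): Unit = {
        // Parse the document into a DOM tree
        val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        val doc = builder.parse(new File("orders.xml"))
        doc.getDocumentElement.normalize()

        // Walk every <order id="..."> element and print its id attribute
        val orders = doc.getElementsByTagName("order")
        for (i <- 0 until orders.getLength) {
          val order = orders.item(i).asInstanceOf[Element]
          println(s"order id = ${order.getAttribute("id")}")
        }
      }
    }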

Environment: J2EE, JSP, JDBC, Spring Core, Struts, Hibernate, Design Patterns, XML, WebLogic, Apache Axis, ANT, ClearCase, JUnit, UML, Web Services, SOAP, XSLT, JIRA, Oracle, PL/SQL Developer, and Windows.
