Java Hadoop Consultant Resume
Franklin Lakes
SUMMARY
- Over 5.2 years of professional IT experience, including over 5 years of Hadoop/Spark experience in the ingestion, storage, querying, processing and analysis of big data, and 5 years of Java development.
- Experienced with real-time data processing mechanisms in the Big Data ecosystem, such as Apache Kafka and Spark Streaming.
- Experienced with the Spark Scala and Spark Python APIs for transferring, processing and analyzing data in different formats and structures.
- Knowledge of unit testing with ScalaCheck, ScalaTest, JUnit and MRUnit; also used JIRA for issue tracking, Jenkins for continuous integration and A/B testing for certain projects.
- Good exposure to Java web and client-server development, with knowledge of all phases of the software development life cycle (SDLC) including requirement analysis, design, coding, testing, deployment, change and configuration management, process definitions and documentation.
- Proficient in installation, configuration, data migration and upgrades across Hadoop MapReduce, Hive, HDFS, HBase, Sqoop, Pig, Cloudera and YARN.
- Excellent understanding/knowledge of Hadoop architecture and various components such as Job Tracker, Task Tracker, Name Node, Data Node and MapReduce programming paradigm.
- Experience leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing and reviewing Hadoop log files.
- Hands-on experience with Cloudera Hadoop environments.
- Experienced in application design using Unified Modeling Language (UML), sequence diagrams, use case diagrams, entity relationship diagrams (ERD) and data flow diagrams (DFD).
- Worked with Amazon Web Services including EC2, ELB, VPC, S3, CloudFront, IAM, RDS, Route 53, CloudWatch, SNS, SQS, Auto Scaling, Elastic Load Balancing, AMIs, EBS, SimpleDB, DynamoDB, firewalls, routing technologies and DNS.
- Good knowledge of NoSQL databases such as DynamoDB and MongoDB.
- Proficient in programming with Java IDEs such as Eclipse and NetBeans.
- Experience in database development using SQL and PL/SQL, working with databases such as Oracle, SQL Server and MySQL.
- Developed Spark jobs and Hive Jobs to summarize and transform data.
- Experience writing producers/consumers and building messaging-centric applications using Apache Kafka (see the producer sketch after this list).
- Hands-on experience with Amazon Web Services (AWS) provisioning tools such as EC2, Simple Storage Service (S3) and Elastic MapReduce (EMR).
- Extensive Java development experience using J2SE and J2EE technologies such as Servlets, Spring, Hibernate, JSP and JDBC.
- Experienced with core Java features such as the Collections framework, exception handling, multithreading and the I/O system.
- Experience in SOA using SOAP and RESTful web services.
- Knowledge of writing Hadoop jobs for analyzing data using Hive and Pig.
- Experience with NoSQL databases.
- Strong team player with the ability to work independently or in a team, adapt to a rapidly changing environment, and a commitment to continuous learning.
- Ability to blend technical expertise with strong conceptual, business and analytical skills to deliver quality solutions, with a results-oriented problem-solving approach and leadership skills.
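A minimal, illustrative sketch of the kind of Kafka producer referenced above; the broker address, topic name, key and payload are hypothetical placeholders, not taken from any specific project:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class SimpleEventProducer {
        public static void main(String[] args) {
            // Broker address and serializers; the values here are placeholders for illustration.
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // send() is asynchronous; flush() blocks until previously sent records complete.
                producer.send(new ProducerRecord<>("events", "user-123", "page_view"));
                producer.flush();
            }
        }
    }

A matching consumer would subscribe to the same topic and poll in a loop; the producer/consumer pair is what makes an application messaging-centric.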
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop (HDFS & MapReduce), Pig, Hive, HBase, Sqoop, Kafka, Apache Spark
Databases: Oracle 9i/10g/11g, SQL Server 2008, MS SQL Server, DynamoDB, MongoDB, Neo4j Graph Database (less than 1 year)
Hadoop Distributions: Cloudera, Hortonworks.
Languages: Java, SQL, JavaScript, XML
Web Technologies: JavaScript, jQuery, Bootstrap, AJAX, XML, CSS, HTML, AngularJS.
Web Services: REST, SOAP, JAX-WS, JAX-RPC, JAX-RS, WSDL, Axis2, Apache HTTP, CVS, SVN.
IDE: Eclipse, NetBeans, IntelliJ.
Operating Systems: Windows Variants, Linux, UNIX.
Cloud Computing: Amazon EC2, Amazon S3, Amazon RDS
Hadoop Ecosystem: Hadoop 2.x, Spark 1.4+, MapReduce, Hive 2.1, Impala 1.2+, Sqoop 1.4, Flume 1.7, Kafka 1.2+, HBase 1.0+, Oozie 3.0+, Zookeeper 3.4+
Cloud Platform: Google Cloud Platform (Dataproc, Compute Engine, Bucket, SQL), Amazon Web Services (EC2, S3, EMR), Databricks Cloud Community
Programming Languages: Java 8+, C++, Scala 2.1+
Operating Systems: Linux, Ubuntu, Mac OS, CentOS, Windows
Web Development: JavaScript, jQuery, AngularJS, HTML, CSS, Node.js
Databases: MySQL 5.x, Oracle 11g, PostgreSQL 9.x, Netezza 7.x, MongoDB 3.2, HBase 0.98
IDE Applications: NetBeans, Eclipse, Visual Studio Code, IntelliJ IDEA, SQL Server 2008 R2+
Data Analysis & Visualization: Python, R, Tableau, Matplotlib, D3.js
Scripting Languages: UNIX Shell, HTML, XML, CSS, JSP, SQL, Markdown
Machine Learning: Regression, Decision Tree, Random Forest, K-Means, Neural Networks, SVM, NLP
Environment: Agile, Scrum, Waterfall
Collaboration: Git, Microsoft TFS, JIRA, Jenkins.
PROFESSIONAL EXPERIENCE
Confidential, Franklin Lakes
Java Hadoop consultant
Responsibilities:
- Utilized object-oriented programming and Java for creating business logic.
- Used the GCP Console to monitor Dataproc clusters and jobs.
- Used GCP Stackdriver for monitoring and logging of Compute Engine and Dataproc.
- Worked with Docker to improve the Continuous Delivery (CD) framework and streamline releases.
- Developed Spark scripts using the Scala shell as per requirements.
- Created visualization reports using Tableau.
- Designed and built a custom, generic ETL framework as a Spark application in Scala for data loading and transformations.
- Performed development testing on different models.
- Built client-side and server-side portions of web applications using HTML, CSS, Java and the Neo4j graph database.
- Used Hive to do analysis on the data and identify different correlations.
- Wrote SQL queries to retrieve data from Database using JDBC.
- Performed file transfers using Tectia SSH Client.
- Implemented Hadoop framework to capture user navigation across the application to validate the user interface and provide analytic feedback/result to the UI team.
- Developed MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports.
- Automated all the jobs, for pulling data from the FTP server to load into Hive tables, using Oozie workflows.
- Managed and scheduled jobs on a Hadoop cluster.
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Wrote MapReduce jobs using the Java API and Pig Latin (a minimal mapper/reducer sketch follows this list).
- Wrote Pig scripts to run ETL jobs on the data in HDFS.
- Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcasts in Spark, effective and efficient joins, transformations and other optimizations during the ingestion process itself.
- Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark.
- Implemented the workflows using Apache Oozie framework to automate tasks. Used Zookeeper to co-ordinate cluster services.
- Deployment and administration of Splunk and Hortonworks Distribution.
- Used Sqoop to import data from MySQL into HDFS on a regular basis.
- Involved in creating Hive tables and working on them using HiveQL.
- Wrote various queries using SQL and used SQL server as the database.
- Utilized Agile Scrum methodology to help manage and organize a team of 4 developers, with regular code review sessions.
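A minimal sketch, for illustration only, of the style of Java MapReduce job described above: a mapper emits a date key per log record and a reducer sums per-day counts for daily report totals. The class names and the assumption that the date is the first tab-separated column are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class DailyReportJob {

        // Mapper: assumes each tab-separated log line starts with a date column (illustrative layout).
        public static class DateMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text date = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                date.set(fields[0]);   // first column assumed to be the event date
                ctx.write(date, ONE);  // emit (date, 1) per record
            }
        }

        // Reducer: sums the counts per date to produce one daily total.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "daily-report");
            job.setJarByClass(DailyReportJob.class);
            job.setMapperClass(DateMapper.class);
            job.setCombinerClass(SumReducer.class);  // reuse the reducer to cut shuffle volume
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }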
Environment: Hadoop 2.x, Cloudera CDH, Hortonworks, HDFS, MapReduce, Spark 1.4+, Scala 2.1+, Java 8+, Python, Pig, Hive 2.1, HBase, Sqoop 1.4, Kafka 1.2+, Flume 1.7, Talend, Zookeeper 3.4+, Oozie 3.0+, Docker, Oracle, Agile, Windows, UNIX shell scripting, BigQuery, GCP, MongoDB, DynamoDB, Amazon EC2, Amazon S3, Neo4j Graph Database, Git, JIRA, Tableau.
Confidential, Seattle, Washington
Java Hadoop consultant
Responsibilities:
- Implemented Java/J2EE design patterns such as Business Delegate, Data Transfer Object (DTO) and Data Access Object (DAO).
- Developed data pipeline using Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Worked on the Hortonworks environment.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed several new MapReduce programs to analyze and transform the data to uncover insights into customer usage patterns.
- Developed Hive UDFs to validate data against business rules before it moves into Hive tables (a sample UDF sketch follows this list).
- Developed MapReduce jobs in both Pig and Hive for data cleaning and pre-processing.
- Developed Sqoop scripts for loading data into HDFS from DB2 and pre-processed it with Pig.
- Created Hive external tables, loaded data into them and queried the data using HiveQL.
- Involved in writing Flume and Hive scripts to extract, transform and load the data into the database.
- Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase, NoSQL database and Sqoop.
- Developed shell scripts to pull data from third-party systems into the Hadoop file system.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig.
- Involved in Database design and developing SQL Queries, stored procedures on MySQL.
- Involved in database design with Oracle as the backend.
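A minimal sketch of the style of validation UDF described above, assuming the classic Hive UDF API (org.apache.hadoop.hive.ql.exec.UDF); the class name and the business rule (non-empty, digits-only account IDs) are purely illustrative.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.BooleanWritable;
    import org.apache.hadoop.io.Text;

    // Illustrative Hive UDF: returns true when the input looks like a valid account ID.
    // Example use in HiveQL (after ADD JAR and CREATE TEMPORARY FUNCTION is_valid_account AS '...'):
    //   SELECT * FROM staging WHERE is_valid_account(account_id);
    public class IsValidAccount extends UDF {
        public BooleanWritable evaluate(Text input) {
            if (input == null) {
                return new BooleanWritable(false);
            }
            String value = input.toString().trim();
            // Hypothetical business rule: non-empty and digits only.
            return new BooleanWritable(!value.isEmpty() && value.matches("\\d+"));
        }
    }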
Environment: Hadoop, MapReduce, HDFS, Sqoop, Pig, HBase, Hive, Hortonworks, Cassandra, Zookeeper, Cloudera, Oozie, DynamoDB, MongoDB, NoSQL, SQL, Oracle, UNIX/Linux, BigQuery, GCP, Amazon EC2, Amazon S3.
Confidential
Hadoop Administrator
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
- Prepared Use Cases, sequence diagrams, class diagrams and deployment diagrams based on UML to enforce Rational Unified Process using Rational Rose.
- Developed a prototype of the application and demonstrated to business users to verify the application functionality.
- Developed and implemented the MVC Architectural Pattern using Struts Framework including JSP, Servlets, EJB, Form Bean and Action classes.
- Developed JSPs with custom tag libraries for control of the business processes in the middle tier, and was involved in their integration.
- Developed the user interface using Spring, the Struts html, logic and bean tag libraries, JSP, JavaScript, HTML and CSS.
- Designed and developed backend Java components residing on different machines to exchange information and data using JMS (see the JMS sender sketch after this list).
- Built the WAR/EAR files using Ant scripts and deployed them to the WebLogic Application Server.
- Used parsers like SAX and DOM for parsing XML documents.
- Implemented Java/J2EE design patterns such as Business Delegate, Data Transfer Object (DTO) and Data Access Object (DAO).
- Used Rational ClearCase for version control.
- Wrote stored procedures, triggers and cursors using Oracle PL/SQL.
- Worked with the QA team on testing and resolving defects.
- Used ANT automated build scripts to compile and package the application.
- Used Jira for bug tracking and project management.
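A minimal sketch of a JMS queue sender of the kind used for the backend component messaging described above; the JNDI names, queue and message payload are hypothetical, and connection-factory lookup details depend on the application server configuration.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class OrderMessageSender {
        public static void main(String[] args) throws Exception {
            // JNDI names below are placeholders; in practice they come from the server configuration.
            InitialContext ctx = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/OrderQueue");

            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);

                // Send a simple text payload; real components would likely exchange XML or serialized DTOs.
                TextMessage message = session.createTextMessage("order:12345,status:CREATED");
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }

A receiving component would look up the same queue and register a MessageListener (or call receive()) to consume these messages.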
Environment: J2EE, JSP, JDBC, Spring Core, Struts, Hibernate, Design Patterns, XML, WebLogic, Apache Axis, ANT, ClearCase, JUnit, UML, Web Services, SOAP, XSLT, JIRA, Oracle, PL/SQL Developer and Windows.