Hadoop Administrator Resume
Princeton, NJ
SUMMARY
- Around 8 years of professional IT experience, including over 4 years of Hadoop/Spark experience in ingestion, storage, querying, processing and analysis of big data, and 4 years of Java development.
- Certified Big Data Expert, with certification in Apache Spark and Scala.
- Proficient in installation, configuration, data migration and upgrades across Hadoop MapReduce, Hive, HDFS, HBase, Sqoop, Pig, Cloudera and YARN.
- Excellent understanding of Hadoop architecture and its components, including JobTracker, TaskTracker, NameNode and DataNode, as well as the MapReduce programming paradigm.
- Knowledge of writing Hadoop jobs for analyzing data using Hive and Pig.
- Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Experience in Hadoop shell commands, writing MapReduce programs, and verifying, managing and reviewing Hadoop log files.
- Hands-on experience with Cloudera Hadoop environments.
- Experience with NoSQL databases such as HBase.
- Experienced in application design using Unified Modeling Language (UML), Sequence diagrams, Use Case diagrams, Entity Relationship Diagrams (ERD) and Data Flow Diagrams (DFD).
- Proficient in programming with Java IDEs such as Eclipse and NetBeans.
- Experience in database development using SQL and PL/SQL, working with Oracle, SQL Server and MySQL.
- Strong team player with the ability to work independently, adapt to a rapidly changing environment and stay committed to learning.
- Ability to blend technical expertise with strong conceptual, business and analytical skills to deliver quality solutions, backed by results-oriented problem solving and leadership.
TECHNICAL SKILLS
Big Data: Hadoop, MapReduce, HDFS, YARN, Hive, Pig, Sqoop, and HBase
Languages: Java, SQL, JavaScript, XML
Java/J2EE Technologies: JSP, Servlets
Web Design Tools: HTML, DHTML, JavaScript
IDEs: NetBeans, Eclipse.
Databases: Oracle 9i/10g/11g, MS SQL Server 2008, MySQL.
Operating systems: Windows Variants, Linux, UNIX
PROFESSIONAL EXPERIENCE
Confidential, Irving, TX
Java Hadoop Consultant
Responsibilities:
- Utilized object-oriented programming and Java for creating business logic.
- Used Hive to do analysis on the data and identify different correlations.
- Wrote SQL queries to retrieve data from the database using JDBC (a minimal sketch follows this list).
- Involved in testing such as Unit Testing, Integration Testing at various levels.
- Performed file transfers using Tectia SSH Client.
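
A minimal, illustrative JDBC sketch of the retrieval pattern described above; the connection URL, credentials, table and column names are hypothetical placeholders, not the project's actual schema, and the appropriate JDBC driver is assumed to be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class OrderLookup {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; real values came from the application's configuration.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection con = DriverManager.getConnection(url, "appUser", "appPassword");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT order_id, status FROM orders WHERE customer_id = ?")) {
            ps.setLong(1, 1001L);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("order_id") + " " + rs.getString("status"));
                }
            }
        }
    }
}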
Confidential, Birmingham, AL
Java Hadoop Consultant
Responsibilities:
- Implemented a Hadoop framework to capture user navigation across the application, validate the user interface and provide analytic feedback to the UI team.
- Developed Spark scripts using the Scala shell as per requirements.
- Developed MapReduce jobs on YARN-based Hadoop clusters to produce daily and monthly reports (a minimal sketch follows this list).
- Automated jobs that pull data from an FTP server and load it into Hive tables, using Oozie workflows.
- Managed and scheduled jobs on the Hadoop cluster.
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Wrote MapReduce jobs using the Java API and Pig Latin.
- Wrote Pig scripts to run ETL jobs on the data in HDFS.
- Used Hive to do analysis on the data and identify different correlations.
- Deployed and administered Splunk and the Hortonworks distribution.
- Imported data from MySQL into HDFS on a regular basis using Sqoop.
- Involved in creating Hive tables and working with them using HiveQL.
- Wrote various queries using SQL and used SQL server as the database.
- Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
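
A minimal, illustrative sketch of a daily-report MapReduce job of the kind described above, assuming hypothetical input lines that begin with a yyyy-MM-dd date; the actual report logic and input layout differed.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DailyEventCount {

    // Assumes each input line starts with a yyyy-MM-dd date prefix.
    public static class DateMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text day = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            if (line.length() >= 10) {
                day.set(line.substring(0, 10));   // emit the date as the grouping key
                context.write(day, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable v : values) {
                total += v.get();
            }
            context.write(key, new IntWritable(total));   // one count per day
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "daily event count");
        job.setJarByClass(DailyEventCount.class);
        job.setMapperClass(DateMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}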
Environment: Hadoop, MapReduce, Spark, Scala, HDFS, Pig, Hive, HBase, Sqoop, Hortonworks, Zookeeper, Cloudera, Oracle, Agile, Windows, UNIX Shell Scripting.
Confidential, Denver CO
Java Hadoop Consultant
Responsibilities:
- Implemented Java/J2EE design patterns such as Business Delegate, Data Transfer Object (DTO) and Data Access Object.
- Developed data pipeline using Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Worked on the Hortonworks environment.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed several new MapReduce programs to analyze and transform the data to uncover insights into customer usage patterns.
- Developed Hive UDFs to validate data against business rules before it is moved into Hive tables (a minimal UDF sketch follows this list).
- Developed MapReduce jobs in both Pig and Hive for data cleaning and pre-processing.
- Developed Sqoop scripts for loading data into HDFS from DB2, with pre-processing in Pig.
- Created Hive external tables, loaded data into them and queried the data using HiveQL.
- Involved in writing Flume and Hive scripts to extract, transform and load the data into Database.
- Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, HBase, NoSQL databases and Sqoop.
- Developed shell scripts to pull data from third-party systems into the Hadoop file system.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig.
- Involved in Database design and developing SQL Queries, stored procedures on MySQL.
- Involved in Database design with Oracle as backend.
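
A minimal, illustrative Hive UDF sketch of the validation pattern described above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the rule shown (a 10-digit numeric account code) is a hypothetical stand-in for the actual business rules.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical rule: a value passes only if it is exactly 10 numeric characters.
public class ValidAccountCode extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String code = input.toString().trim();
        boolean valid = code.length() == 10 && code.chars().allMatch(Character::isDigit);
        return new Text(valid ? "VALID" : "INVALID");
    }
}

The compiled jar would then be added in Hive and registered with CREATE TEMPORARY FUNCTION before being called from the validation query.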
Environment: Hadoop, MapReduce, HDFS, Sqoop, Pig, HBase, Hive, Hortonworks, Cassandra, Zookeeper, Cloudera, Oozie, MongoDB, NoSQL, SQL, Oracle, UNIX/Linux.
Confidential, Princeton, NJ
Hadoop Administrator
Responsibilities:
- Extracted data from relational databases such as SQL Server and MySQL by developing Scala and SQL code.
- Loaded the extracted data into Hive and combined new tables with existing databases.
- Developed code to pre-process large sets of various types of file formats such as Text, Avro, Sequence files, XML, JSON and Parquet.
- Configured big data workflows to run on top of Hadoop, comprising heterogeneous jobs such as Pig, Hive, Sqoop and MapReduce.
- Loaded various formats of structured and unstructured data from Linux file system to HDFS.
- Wrote Pig scripts to ETL the data into a NoSQL database for faster analysis.
- Read data from Flume and pushed batches of data to HDFS and HBase for real-time processing of the files.
- Parsed XML data into a structured format and loaded it into HDFS (a minimal parsing sketch follows this list).
- Scheduled various ETL processes and Hive scripts by developing Oozie workflows.
- Utilized Tableau to visualize the analyzed data and performed report design and delivery.
- Created POC for Flume implementation.
- Involved in reviewing both functional and non-functional aspects of the business model.
- Communicated and presented the models to business customers and executives.
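
A minimal, illustrative sketch of the XML-to-structured-format step referenced above, using the standard DOM parser; the element names (record, id, amount) and the tab-delimited output are hypothetical stand-ins for the actual feed layout.

import java.io.File;
import java.io.PrintWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XmlToDelimited {
    public static void main(String[] args) throws Exception {
        // Assumed layout: <records><record><id>..</id><amount>..</amount></record>...</records>
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new File(args[0]));
        NodeList records = doc.getElementsByTagName("record");
        try (PrintWriter out = new PrintWriter(args[1])) {
            for (int i = 0; i < records.getLength(); i++) {
                Element rec = (Element) records.item(i);
                String id = rec.getElementsByTagName("id").item(0).getTextContent();
                String amount = rec.getElementsByTagName("amount").item(0).getTextContent();
                out.println(id + "\t" + amount);   // tab-delimited line, ready to load into HDFS
            }
        }
    }
}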
Environment: Hadoop, HDFS, MapReduce, Sqoop, HBase, Shell Scripting, Pig, Hive, Oozie, Core Java, Linux
Confidential, Columbus, OH
Java Developer
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
- Prepared Use Cases, sequence diagrams, class diagrams and deployment diagrams based on UML to enforce Rational Unified Process using Rational Rose.
- Developed a prototype of the application and demonstrated to business users to verify the application functionality.
- Developed and implemented the MVC Architectural Pattern using Struts Framework including JSP, Servlets, EJB, Form Bean and Action classes.
- Developed JSPs with custom tag libraries for control of the business processes in the middle tier and was involved in their integration.
- Developed the user interface using Spring, Struts tag libraries (html, logic, bean), JSP, JavaScript, HTML and CSS.
- Designed and developed backend Java components residing on different machines to exchange information and data using JMS (a minimal sketch follows this list).
- Developed the war/ear file using an Ant script and deployed it to the WebLogic Application Server.
- Used parsers like SAX and DOM for parsing XML documents.
- Implemented Java/J2EE design patterns such as Business Delegate, Data Transfer Object (DTO) and Data Access Object.
- Used Rational ClearCase for version control.
- Wrote stored procedures, triggers and cursors using Oracle PL/SQL.
- Worked with QA team for testing and resolving defects.
- Used ANT automated build scripts to compile and package the application.
- Used Jira for bug tracking and project management.
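
A minimal, illustrative JMS sketch of the component-to-component messaging described above; the JNDI names and message payload are hypothetical, and the real connection factory and queue were defined in the application server's JMS configuration.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class OrderEventSender {
    public static void main(String[] args) throws Exception {
        // Assumes the JNDI environment is provided by the container or jndi.properties.
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/OrderConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/OrderQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage("orderId=1001;status=APPROVED");
            producer.send(message);   // hand the event to the remote component via the queue
        } finally {
            connection.close();
        }
    }
}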
Environment: J2EE, JSP, JDBC, Spring Core, Struts, Hibernate, Design Patterns, XML, WebLogic, Apache Axis, ANT, ClearCase, JUnit, UML, Web Services, SOAP, XSLT, Jira, Oracle, PL/SQL Developer and Windows.
Confidential
Java Developer
Responsibilities:
- Worked on designing and developing the Web Application User Interface and implemented its related functionality in JAVA/J2EE for the product.
- Designed and developed applications using JSP, Servlets and HTML.
- Used the Hibernate ORM module as an object-relational mapping tool for back-end operations.
- Provided the Hibernate configuration file and mapping files and was involved in integrating Struts with the Hibernate libraries.
- Extensively used Java multithreading for downloading files from URLs (a minimal sketch follows this list).
- Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
- Developed Web Service client interface for invoking the methods using SOAP.
- Created navigation component that reads the next page details from an XML config file.
- Developed applications with HTML, JSP and Tag libraries.
- Developed required stored procedures and database functions using PL/SQL.
- Developed, Tested and debugged various components in WebLogic Application Server.
- Used XML, XSL for Data presentation, Report generation and customer feedback documents.
- Involved in code review and documentation review of technical artifacts.
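
A minimal, illustrative sketch of the multithreaded URL download mentioned above, using a fixed thread pool; the URLs and pool size are placeholders.

import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelDownloader {
    public static void main(String[] args) throws Exception {
        // Placeholder URLs for the real file locations.
        List<String> urls = Arrays.asList(
                "https://example.com/reports/a.csv",
                "https://example.com/reports/b.csv");

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String u : urls) {
            pool.submit(() -> {
                String name = u.substring(u.lastIndexOf('/') + 1);
                try (InputStream in = new URL(u).openStream()) {
                    Files.copy(in, Paths.get(name), StandardCopyOption.REPLACE_EXISTING);
                } catch (Exception e) {
                    e.printStackTrace();   // keep the remaining downloads running
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);   // wait for all downloads to finish
    }
}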
Environment: Java, Servlets, JSP, XML, Tomcat, Rational Rose, Eclipse, XSL, and Windows XP.