
Hadoop Consultant Resume


Houston, TX

PROFESSIONAL SUMMARY:

  • More than 3 years of experience in Hadoop and Java development, with an extensive skill set in building web-based and backend enterprise applications using Java and J2EE technologies, and experience across the complete System Development Life Cycle (SDLC).
  • Excellent knowledge of Hadoop architecture and its ecosystem, including HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Flume, and Oozie, for data storage and analysis.
  • Excellent knowledge of multiple distributions, including Cloudera, Hortonworks, and MapR.
  • Experience in managing and reviewing Hadoop log files.
  • Hands-on experience in importing and exporting data using Sqoop.
  • Performed data analysis using Hive and Pig.
  • Experienced in writing custom Hive UDFs to incorporate business logic into Hive queries (a sketch follows this summary).
  • Experience in developing MapReduce programs.
  • Excellent knowledge of creating HBase tables and loading data into them.
  • Excellent knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra.
  • Basic knowledge of Hadoop YARN, Apache ZooKeeper, Spark, and Storm.
  • Expertise in developing both front-end and back-end applications using Java/J2EE technologies (Java, Servlets, JSP, JSF, AJAX, EJB, Struts, Spring, Hibernate, JAXB, JMS, JDBC, Web Services).
  • Good experience in using Oracle, SQL Server and MySQL databases.
  • Good experience with related web technologies such as AJAX, HTML, DHTML, JavaScript, and CSS on Windows, UNIX, Linux, and Solaris.
  • Experience with application servers such as WebLogic and WebSphere.
  • Extensive experience with the Spring and Struts frameworks and the Hibernate O/R mapping framework.
  • Solid working knowledge of Java web services, with real-world experience using SOAP, WSDL, REST, XML, JAXP, XMLBeans, and Axis.
  • Extensive experience in the design, development, and implementation of Model-View-Controller (MVC) frameworks using Struts and Spring MVC.
  • Strong experience with RDBMS technologies: SQL queries, stored procedures, triggers, and functions.
  • Basic knowledge of the Perl scripting language.
  • Good experience with Agile methodology, user stories, and iterative processes.
  • Good exposure to continuous integration, automated builds, and code-refactoring techniques.
  • Good experience in identifying actors and use cases and representing them in UML diagrams.
  • Proven expertise in distributed application development, including extensive work in Object-Oriented Analysis and Design.
  • Proficient with Java IDEs such as Eclipse, JBuilder, and WebLogic Workshop, and with Toad for database work.
  • Used Log4j for application logging and notification tracing.
  • Expertise in development of test cases using JUnit.
  • Excellent communication skills, team player and strong analytical & problem solving abilities.
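
As a minimal illustration of the custom Hive UDF work referenced above, the sketch below shows the shape of a UDF built on Hive's classic UDF base class. The class name and the normalization rule are illustrative assumptions, not from an actual engagement.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes free-form status codes before analysis.
public final class NormalizeStatus extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                              // pass NULLs through unchanged
        }
        String cleaned = input.toString().trim().toUpperCase();
        return new Text(cleaned.isEmpty() ? "UNKNOWN" : cleaned);
    }
}
```

Once packaged in a jar, such a function would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, then called inline from HiveQL like any built-in.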

PROFESSIONAL EXPERIENCE:

Confidential, Houston, TX

Hadoop Consultant

Responsibilities:

  • Obtained requirement specifications from the SMEs and Business Analysts in the BR and SR meetings for the corporate workplace project. Interacted with the business users to build sample report layouts.
  • Involved in writing the HLDs, along with RTMs tracing back to the corresponding BRs and SRs, and reviewed them with the business.
  • Loaded log data into HDFS using Flume. Worked extensively on MapReduce jobs to power data for search and aggregation (see the sketch after this list).
  • Installed and configured Apache Hadoop and the Hive and Pig ecosystems.
  • Installed and configured Cloudera Hadoop CDH4 via Cloudera Manager in pseudo-distributed and cluster modes as a proof of concept.
  • Created MapReduce jobs using Hive and Pig queries.
  • Extensively used Pig for data cleansing.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig and HiveQL.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Involved in configuring Sqoop to map SQL types to appropriate Java classes.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Collected the past five years of TPSS data from Teradata and pushed it into HDFS using Sqoop.
  • Involved in unit testing, system integration testing, and UAT after development.
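
A condensed sketch of the kind of aggregation job referenced above: it counts log events per page. The input layout (tab-separated lines with the page id in the third column) and all class names are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical aggregation job: counts log events per page id.
public class EventCount {

    public static class EventMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text page = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 2) {          // assume the 3rd column is the page id
                page.set(fields[2]);
                context.write(page, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "event-count");
        job.setJarByClass(EventCount.class);
        job.setMapperClass(EventMapper.class);
        job.setCombinerClass(SumReducer.class);  // combiner is safe because sum is associative
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```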

Environment: Hadoop, Cloudera CDH4, HiveQL, Pig Latin, MapReduce, HDFS, HBase, ZooKeeper, Oozie, Oracle, PL/SQL, SQL*Plus, Windows, UNIX, Shell Scripting.

Confidential, Houston, TX

Hadoop Developer

Responsibilities:

  • Worked on several modules and projects that involved gathering very high-level conceptual requirements.
  • Executed data migration in coordination with management and technical services personnel.
  • Maintained the existing data migration program with occasional upgrades and enhancements.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Developed simple to complex MapReduce jobs using Java, Hive, and Pig.
  • Imported retail and commercial data from various vendors into HDFS using the EDE process and Sqoop.
  • Used Pig as an ETL tool for transformations, event joins, filtering, and pre-aggregations before storing the data in HDFS.
  • Created a cross-platform framework based on JavaScript and HTML that acts as a platform to host multiple apps in a single window.
  • Built JSON-based RESTful web services for all devices (see the JAX-RS sketch after this list).
  • Involved in testing Spark for exporting data from HDFS to an external database in a POC.
  • Wrote scripts to save email content as HTML files to simplify parsing.
  • Applied Hive queries to perform data analysis on HBase, estimating per-annum claims by customers, and used Flume to stream log data from servers.
  • Involved in converting Hive/SQL queries into Spark transformations and actions using Spark SQL (RDDs and DataFrames) in Python and Scala (see the Spark sketch after this list).
  • Worked with CSV files when taking input from the MySQL database.
  • Wrote Python scripts to parse XML documents and load the data into a database.
  • Exported the analyzed data to the relational databases using Sqoop for visualization.
  • Used Maven for building and managing dependencies of the application.
  • Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
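
A minimal sketch of the JSON REST services referenced above, written as a JAX-RS resource. The /claims endpoint, the Claim fields, and the hard-coded return value are hypothetical, and a JSON provider such as Jackson is assumed on the classpath.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Hypothetical JAX-RS resource serving claims as JSON.
@Path("/claims")
public class ClaimResource {

    public static class Claim {
        public String id;
        public double amount;

        public Claim(String id, double amount) {
            this.id = id;
            this.amount = amount;
        }
    }

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Claim getClaim(@PathParam("id") String id) {
        // In practice this would look the claim up in HBase or MySQL.
        return new Claim(id, 1250.00);
    }
}
```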
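
And a sketch of the Hive-to-Spark conversion referenced above. The project work used Python and Scala; this version is in Java to keep one language across these sketches, and it uses the Spark 2.x SparkSession API. The table and column names (claims, customer_id, claim_amount) are assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical conversion of a Hive aggregation into Spark SQL.
public class ClaimsPerYear {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("claims-per-year")
                .enableHiveSupport()            // read existing Hive metastore tables
                .getOrCreate();

        // The original HiveQL, executed by Spark instead of MapReduce.
        Dataset<Row> perYear = spark.sql(
                "SELECT customer_id, year, SUM(claim_amount) AS total " +
                "FROM claims GROUP BY customer_id, year");

        // The equivalent DataFrame-style transformation.
        Dataset<Row> sameResult = spark.table("claims")
                .groupBy("customer_id", "year")
                .sum("claim_amount");

        perYear.show();
        sameResult.show();
        spark.stop();
    }
}
```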

Environment: Hadoop, Spark, Hive, Pig, MapReduce, HDFS, Sqoop, Cascading, MySQL, HTML, JavaScript, JSON, HBase, ETL, Flume, Hadoop Cluster, Linux, Maven, Agile.

Confidential, Minneapolis, MN

Hadoop Consultant

Responsibilities:

  • Moved data from Oracle to HDFS and vice versa using Sqoop.
  • Wrote Apache Pig scripts to process HDFS data.
  • Created Hive tables and loaded and analyzed data using Hive queries for reports.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Wrote MapReduce jobs using Pig Latin.
  • Responsible for data extraction and ingestion from different data sources into the Hadoop data lake, creating ETL pipelines with Pig and Hive.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Analyzed and transformed data with Hive and Pig.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Developed job flows to automate Pig and Hive workflows.
  • Collected and aggregated large amounts of data using Apache Flume, staging it in HDFS for further analysis.
  • Designed and implemented multi-level partitioning and bucketing in Hive.
  • Loaded the aggregated data from the Hadoop environment into Oracle using Sqoop for dashboard reporting.
  • Extensively involved in performance tuning of Oracle queries.
  • Good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance (see the DDL sketch after this list).
  • Used Agile methodology in developing the application, including iterative development, weekly status reports, and stand-up meetings.
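
A sketch of the multi-level partitioned, bucketed Hive table design referenced above, issued as Hive DDL over the HiveServer2 JDBC driver (kept in Java for consistency with the other sketches). The host, database, table, column names, bucket count, and storage location are all illustrative assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical DDL for a multi-level partitioned, bucketed external Hive table.
public class CreateClaimsTable {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {

            // External table: Hive manages metadata only; the data stays in HDFS.
            stmt.execute(
                "CREATE EXTERNAL TABLE IF NOT EXISTS claims ("
              + "  claim_id STRING, customer_id STRING, amount DOUBLE) "
              + "PARTITIONED BY (year INT, month INT) "        // multi-level partitions
              + "CLUSTERED BY (customer_id) INTO 32 BUCKETS "  // buckets within each partition
              + "STORED AS ORC "
              + "LOCATION '/data/warehouse/claims'");
        }
    }
}
```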

Environment: Java (JDK 1.6), HDFS, HBase, MapReduce, Apache Pig, Sqoop, Hive, Ubuntu/CentOS, Oracle 10g, Eclipse, Linux, Python.
