
Hadoop Consultant Resume

Chicago, IL

PROFESSIONAL SUMMARY:

  • Deep expertise in Analysis, Design, Development and Testing phases of Enterprise Data Warehousing solutions.
  • More than 6 years of experience in the IT industry, including 3+ years with Hadoop technologies such as Pig, Hive, HBase, Oozie, ZooKeeper and Sqoop, with hands-on experience writing MapReduce/YARN jobs.
  • Big data development experience on Google Cloud.
  • Experience working with multiple Hadoop distributions - Cloudera and Hortonworks.
  • Experience migrating data between Hadoop and relational database systems using Sqoop.
  • Expertise in Hadoop administration tasks such as managing clusters and reviewing Hadoop log files.
  • Experience leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
  • Experience with the NoSQL databases MongoDB and Cassandra.
  • Proficient in configuring ZooKeeper, Cassandra and Flume on existing Hadoop clusters.
  • Experience installing, configuring, supporting and managing Cloudera's Hadoop platform, including CDH3, CDH4 and CDH5 clusters.
  • Familiarity with real-time streaming data using Spark and Kafka.
  • Experience in ETL analytics on ingested data using scripts built with Hive, Pig, Spark and MapReduce, covering interactive, batch and real-time processing.
  • Expertise in Java/J2EE technologies such as Core Java, Spring, Hibernate, JDBC, JSON, HTML, Struts, Servlets, JSP, JBoss and JavaScript.
  • Experience using integrated development environments such as Eclipse, NetBeans, JDeveloper and MyEclipse.
  • Experience using PL/SQL to write stored procedures, functions and triggers.
  • Good experience writing complex SQL queries against databases such as DB2, Oracle 10g, MySQL and MS SQL Server 2005/2008.
  • Extensive experience developing test cases and performing unit and integration testing, using source code management tools such as Git, SVN and Perforce.
  • Strong team player with the ability to work independently, adapt to a rapidly changing environment, and a commitment to learning.
  • Ability to blend technical expertise with strong conceptual, business and analytical skills to deliver quality solutions, with results-oriented problem solving and leadership.

TECHNICAL SKILLS:

Big Data: Hadoop, MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Cassandra, MongoDB, Hortonworks, Kafka, Spark, ZooKeeper, BigQuery

Web development: HTML, JavaScript, XML, PHP, JSP, Servlets

Databases: DB2, MySQL, MS Access, MS SQL Server, Teradata, NoSQL, Vertica, Aster nCluster, SSAS, Oracle, Oracle Essbase

Languages: Java/J2EE, HTML, SQL, Spring, Hibernate, JDBC, JSON, JavaScript

Operating Systems: Mac OS, Unix, Linux (Various Versions), Windows 2003/7/8/8.1/XP/Vista

Web/Application servers: Apache Tomcat, WebLogic, WebSphere

Tools: Eclipse, NetBeans

Version Control: Git, SVN, Perforce

IDEs: IntelliJ, Eclipse, NetBeans, JDeveloper

PROFESSIONAL EXPERIENCE:

Confidential, Chicago, IL

Hadoop Consultant

Responsibilities:

  • Installed and configured multi-node Cloudera Hadoop clusters using Cloudera Manager and CDH 4.x/5.x.
  • Prepared the low-level design document and estimated project effort.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Installed and configured Hadoop MapReduce, HDFS, Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Installed, upgraded and managed Hadoop clusters on Hortonworks and within AWS.
  • Performed NoSQL data modeling with Cassandra, HBase and MongoDB.
  • Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase NoSQL database and Sqoop.
  • Developed and implemented MR2 jobs on the Hortonworks cluster and on clusters within AWS.
  • Developed Pig code for loading, filtering and storing data.
  • Developed Hive scripts (HQL) to automate joins across different sources.
  • Developed various Big Data workflows using Oozie.
  • Performed big data development on Google Cloud.
  • Developed MapReduce programs and migrated data from existing data sources using Sqoop.
  • Developed custom Python programs to load data into HBase.
  • Developed MapReduce programs using MRv1 and MRv2 (YARN).
  • Developed Spark SQL jobs that read data from the data lake using Hive, transform it and save it in HBase.
  • Applied application DBA and data modeling skills for NoSQL and relational databases.
  • Built a Java client responsible for receiving XML files via REST calls and publishing them to Kafka.
  • Built a Kafka + Spark Streaming job responsible for reading XML messages from Kafka and transforming them into POJOs using JAXB (see the sketch after this list).
  • Loaded log data into HDFS using Flume; worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Responsible for migrating workloads from Hadoop MapReduce to Spark, using in-memory distributed computing for real-time fraud detection.
  • Effectively used Oozie to develop automated workflows of Sqoop, MapReduce and Hive jobs.
  • Involved in running Hadoop jobs to process millions of text records for batch and online processes using tuned/modified SQL.
  • Responsible for designing highly scalable big data clusters to support storage and computation across Hadoop, Cassandra, MongoDB and Elasticsearch.
  • Designed and published workbooks and dashboards using Tableau Desktop/Server 6.x/7.x.
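
The sketch below is a minimal illustration of the Kafka + Spark Streaming job referenced above: it reads XML messages from a Kafka topic with the spark-streaming-kafka-0-10 integration and unmarshals each payload into a POJO with JAXB. The broker address, topic name, consumer group and Event class are placeholders assumed for the example, not details from the actual project.

import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlRootElement;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class XmlEventStream {

    // Placeholder payload type; a real job would declare fields matching the XML schema.
    @XmlRootElement(name = "event")
    public static class Event { }

    public static void main(String[] args) throws Exception {
        JavaStreamingContext jssc = new JavaStreamingContext(
                new SparkConf().setAppName("xml-event-stream"), Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker:9092");        // placeholder broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "xml-event-group");              // placeholder consumer group

        KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("xml-events"), kafkaParams))  // placeholder topic
            .map(ConsumerRecord::value)
            // Unmarshal each XML payload into a POJO (a real job would reuse one JAXBContext).
            .map(xml -> (Event) JAXBContext.newInstance(Event.class)
                    .createUnmarshaller()
                    .unmarshal(new StringReader(xml)))
            .print();                                                // a real job would write to HBase

        jssc.start();
        jssc.awaitTermination();
    }
}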

Environment: Hadoop (HDFS), HBase, MapReduce, Hive, Spark, Kafka, Oozie, Flume, Cassandra, Hortonworks, UNIX shell scripting, MongoDB, MySQL, Eclipse, Toad, and HP Vertica 6.x/7.x.

Confidential, Dallas, TX

Hadoop Consultant

Responsibilities:

  • Responsible for loading the customer's data and event logs from Kafka into HBase using REST API.
  • Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.
  • Worked on debugging, performance tuning and analyzing data using the Hadoop components Hive and Pig.
  • Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Developed and implemented MR2 jobs on the Hortonworks cluster.
  • Developed and executed Hive, Spark and Pig queries to de-normalize the data.
  • Created Hive tables from JSON data using the Avro data serialization framework.
  • Implemented generic export framework for moving data from HDFS to RDBMS and vice-versa.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Created Hive external tables, loaded data into them and queried the data using HQL (a Hive query sketch appears after this job's environment line).
  • Wrote shell scripts to automate rolling day-to-day processes.
  • Worked on loading data from the Linux file system into HDFS.
  • Created HBase tables to store various formats of PII data coming from different portfolios; implemented MapReduce jobs for loading data from an Oracle database into the NoSQL database (see the HBase write sketch after this list).
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
  • Moved data from Hadoop to Cassandra using the BulkOutputFormat class.
  • Automated all the jobs, for pulling data from FTP server to load data into Hive tables, using Oozie workflows.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Responsible for processing unstructured data using Pig and Hive.
  • Added nodes to the clusters and decommissioned nodes for maintenance.
  • Created Pig script jobs with attention to query optimization.
  • Worked on various Business Objects reporting functionalities such as slice and dice, master/detail, the User Response function and various formulas.
  • Strong experience with Apache server configuration.
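
The HBase write sketch referenced above is a minimal, hypothetical example using the standard HBase Java client API to put one event row into a table; the table name, column family and row-key scheme are assumptions made for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class EventWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // picks up hbase-site.xml from the classpath

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_events"))) {  // placeholder table

            // Row key: customer id + timestamp so a customer's events sort together.
            Put put = new Put(Bytes.toBytes("cust42_20160101T120000"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("event_type"), Bytes.toBytes("login"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes("{...}"));
            table.put(put);
        }
    }
}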

Environment: Hadoop, HDFS, HBase, Pig, Hive, Spark, Hortonworks, Oozie, MapReduce, Sqoop, Cloudera, MongoDB, Cassandra, Kafka, Linux, Java APIs, Java collections, Windows.
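
As a rough illustration of the Hive external-table and HQL work listed for this engagement, the sketch below uses the HiveServer2 JDBC driver to create an external table over files in HDFS and run an aggregate query; the endpoint, table name, schema and HDFS path are all placeholder assumptions.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveExternalTableQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");  // HiveServer2 JDBC driver

        // Placeholder HiveServer2 endpoint and credentials.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {

            // External table over files already sitting in HDFS (path and schema are illustrative).
            stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS customer_txn ("
                    + " customer_id BIGINT, amount DOUBLE, txn_date STRING)"
                    + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
                    + " LOCATION '/data/customer_txn'");

            // HQL aggregate; Hive compiles this down to cluster jobs.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT customer_id, SUM(amount) FROM customer_txn GROUP BY customer_id")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1) + " -> " + rs.getDouble(2));
                }
            }
        }
    }
}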

Confidential, Seattle, WA

Hadoop Admin/Developer

Responsibilities:

  • Supported MapReduce programs running on the cluster (a minimal MapReduce sketch follows this list).
  • Involved in using Pig Latin to analyze large-scale data.
  • Involved in loading data from the UNIX file system to HDFS.
  • Interacted with business users on a regular basis to consolidate and analyze requirements, and presented design results to them.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Involved in data visualization; provided the files required by the team by analyzing data in Hive and developed Pig scripts for advanced analytics on the data.
  • Created many user-defined routines, functions and before/after subroutines that facilitated implementing some of the complex logic.
  • Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
  • Worked on improving the performance by using various performance tuning strategies.
  • Managed the evaluation of ETL and OLAP tools and recommended the most suitable solutions depending on business needs.
  • Migrated jobs from development to test and production environments.
  • Created external tables with proper partitions for efficiency and loaded the structured data in HDFS that resulted from MapReduce jobs.
  • Involved in moving all log files generated from various sources to HDFS for further processing.
  • Used shell scripts for loading, unloading, file validation and records auditing purposes.
  • Used the Teradata Aster bulk load feature to bulk load flat files into Aster.
  • Used Aster UDFs to unload data from staging tables and client data for SCD, which resided in the Aster database.
  • Extensively used SQL and PL/SQL to develop procedures, functions, packages and triggers.
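
As a minimal illustration of the MapReduce work mentioned at the start of this list, the sketch below counts HTTP status codes in web server log files stored on HDFS. The log field layout and the input/output paths are assumptions made for the example.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Counts occurrences of each HTTP status code in web server logs on HDFS.
public class StatusCodeCount {

    public static class LogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(" ");
            if (fields.length > 8) {                      // assumes common log format; adjust per source
                context.write(new Text(fields[8]), ONE);  // fields[8] = status code
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status-code-count");
        job.setJarByClass(StatusCodeCount.class);
        job.setMapperClass(LogMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. /logs/raw
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // e.g. /logs/status_counts
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}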

Environment: Java, SQL, PL/SQL, UNIX shell scripting, XML, Teradata Aster, Hive, Pig, Hadoop, MapReduce, ClearCase, HP-UX, Windows XP Professional.

Confidential

Java Developer

Responsibilities:

  • Involved in requirements analysis, design, development and testing.
  • Involved in developing the Group Portal and Member Portal applications.
  • Developed the front end using Struts and JSP.
  • Developed web pages using HTML, JavaScript, jQuery and CSS.
  • Developed customized reports and performed unit testing using JUnit.
  • Used Java 1.6, Spring, Hibernate and Oracle to build the product suite.
  • Responsible for building projects into deployable files (WAR files and JAR files).
  • Coded Java Servlets to control and maintain session state and handle user requests (a minimal session-handling servlet sketch follows this list).
  • Involved in the development and testing phases of the project, following agile methodology.
  • Implemented the logging mechanism using log4j framework.
  • Developed Web Services.
  • Verified software errors and interacted with developers to resolve the technical issues.
  • Used Maven to build the J2EE application.
  • Wrote complex SQL queries and stored procedures.
  • Involved in maintenance of different applications.
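
A minimal sketch of the kind of session-handling servlet described above: it stores a per-user request counter in the HttpSession. The class name, attribute name and URL mapping (which would live in web.xml) are illustrative only.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Tracks how many requests a user has made within their session.
public class SessionCounterServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        HttpSession session = request.getSession(true);  // create the session if it does not exist

        Integer visits = (Integer) session.getAttribute("visits");
        visits = (visits == null) ? 1 : visits + 1;
        session.setAttribute("visits", visits);          // state survives across requests in the session

        response.setContentType("text/plain");
        response.getWriter().println("Requests in this session: " + visits);
    }
}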

Environment: Servlet, Enterprise Javabeans, Custom Tags, Stored Procedures, JavaScript, Java, Spring Framework, Struts, Web Services, Oracle.

Confidential

Java Developer

Responsibilities:

  • Involved in the designing of the project using UML.
  • Followed J2EE Specifications in the project.
  • Designed the user interface pages in JSP.
  • Used XML and XSL for mapping the fields in the database.
  • Used JavaScript for client-side validations.
  • Created the stored procedures and triggers required for the project.
  • Created functions and views in Oracle.
  • Enhanced the performance of the whole application using stored procedures and prepared statements (see the JDBC sketch after this list).
  • Responsible for updating database tables and designing SQL queries using PL/SQL.
  • Created bean classes for communicating with the database.
  • Involved in documentation of the module and project.
  • Prepared test cases and test scenarios as per business requirements.
  • Involved in bug fixing.
  • Prepared coded applications for unit testing using JUnit.
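
The JDBC sketch referenced above is a minimal, hypothetical example of running a parameterized query with a PreparedStatement and invoking a stored procedure with a CallableStatement; the connection URL, table and procedure names are placeholders, not details from the project.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class AccountDao {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; a real application would use an app-server data source.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app_user", "secret")) {

            // Prepared statement: the bind variable lets the database reuse the parsed plan.
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT account_id, balance FROM accounts WHERE customer_id = ?")) {
                ps.setLong(1, 42L);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("account_id") + " -> " + rs.getDouble("balance"));
                    }
                }
            }

            // Calling a stored procedure (name and signature are illustrative).
            try (CallableStatement cs = conn.prepareCall("{call update_account_status(?, ?)}")) {
                cs.setLong(1, 42L);
                cs.setString(2, "ACTIVE");
                cs.execute();
            }
        }
    }
}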

Environment: Java, JSP, Servlets, J2EE, EJB 3, Java Beans, Oracle, HTML, DHTML, XML, XSL, JavaScript, BEA WebLogic.
