Hadoop Consultant Resume

Chicago, IL

SUMMARY

  • Results-oriented professional with 8+ years of progressive experience in software development, including application design and development, and 3+ years of Big Data/Hadoop experience across the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Flume, Sqoop, ZooKeeper, HBase, and Spark).
  • Big data development experience with Google Cloud.
  • Experience working with various Hadoop distributions, including Cloudera and Hortonworks.
  • Experience in migrating data with Sqoop from Hadoop to relational database systems and vice versa.
  • Expertise in Hadoop administration, such as managing clusters and reviewing Hadoop log files.
  • Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
  • Experience with the NoSQL databases MongoDB and Cassandra.
  • Proficient in configuring ZooKeeper, Cassandra, and Flume on an existing Hadoop cluster.
  • Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH3, CDH4, and CDH5 clusters.
  • Familiarity with real-time streaming data using Spark and Kafka.
  • Experience in ETL analytics on ingested data using scripts built with Hive, Pig, Spark, and MapReduce, covering interactive, batch, and real-time processing.
  • Expertise in Java/J2EE technologies such as Core Java, Spring, Hibernate, JDBC, JSON, HTML, Struts, Servlets, JSP, JBoss, and JavaScript.
  • Experience using integrated development environments such as Eclipse, NetBeans, JDeveloper, and MyEclipse.
  • Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
  • Good experience in writing complex SQL queries against databases such as DB2, Oracle 10g, MySQL, and MS SQL Server 2005/2008.
  • Extensive experience in developing test cases and performing unit and integration testing, using source code management tools such as Git, SVN, and Perforce.
  • Strong team player who also works well independently, adapts to rapidly changing environments, and is committed to learning.
  • Ability to blend technical expertise with strong conceptual, business, and analytical skills to deliver quality solutions, with result-oriented problem-solving and leadership skills.

TECHNICAL SKILLS

Big Data: Hadoop, MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Cassandra, MongoDB, Hortonworks, Kafka, Spark, ZooKeeper, BigQuery

Web development: HTML, JavaScript, XML, PHP, JSP, Servlets

Databases: DB2, MySQL, MS Access, MS SQL Server, Teradata, NoSQL, Vertica, Aster nCluster, SSAS, Oracle, Oracle Essbase

Languages: Java/J2EE, HTML, SQL, Spring, Hibernate, JDBC, JSON, JavaScript

Operating Systems: Mac OS, Unix, Linux (Various Versions), Windows 2003/7/8/8.1/XP/Vista

Web/Application servers: Apache Tomcat, WebLogic, WebSphere

Tools: Eclipse, NetBeans

Version Control: Git, SVN, Perforce

IDEs: IntelliJ, Eclipse, NetBeans, JDeveloper

PROFESSIONAL EXPERIENCE

Confidential - Chicago, IL

Hadoop Consultant

Responsibilities:

  • Cloudera Hadoop installation & configuration of multiple nodes using Cloudera Manager and CDH 4.X/5.X.
  • Prepared the low-level design document and estimated effort for the project.
  • Developed UNIX scripts for loading, filtering, and storing the data.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Installed and configured Hadoop MapReduce, HDFS, Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
  • Installed, upgraded, and managed Hadoop clusters on Hortonworks and within AWS.
  • Experience in NoSQL data modeling with Cassandra, HBase, and MongoDB.
  • Involved in loading data from UNIX file system to HDFS.
  • Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Ran process improvement initiatives to reduce defects, close production issues, and improve applications.
  • Analyzed the Hadoop cluster and various big data analytics tools, including Pig, the HBase NoSQL database, and Sqoop.
  • Developed and implemented jobs on an MR2 Hortonworks cluster and on a cluster within AWS.
  • Developed Pig code for loading, filtering, and storing the data.
  • Developed Hive scripts (HQL) to automate joins across different sources.
  • Developed various Big Data workflows using Oozie.
  • Big data development with cloud experience (Google Cloud preferred).
  • Developed MapReduce programs and migrated data from existing data sources using Sqoop.
  • Developed custom writable Python programs to load data into HBase.
  • Developed MapReduce programs using MRv1 and MRv2 (YARN).
  • Developed Spark SQL jobs that read data from the data lake using Hive transforms and save it in HBase.
  • Strong application DBA skills, with data modeling skills for NoSQL and relational databases.
  • Built a Java client that receives XML files via REST calls and publishes them to Kafka.
  • Built a Kafka and Spark Streaming job that reads XML messages from Kafka and transforms them into POJOs using JAXB.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Responsible for migrating from Hadoop to Spark for in-memory distributed computing in support of real-time fraud detection.
  • Used Oozie effectively to develop automated workflows of Sqoop, MapReduce, and Hive jobs.
  • Ran Hadoop jobs that process millions of text records for batch and online processes, using tuned and modified SQL.
  • Responsible for designing a highly scalable big data cluster to support varied data storage and computation across Hadoop, Cassandra, MongoDB, and Elasticsearch.
  • Designed and published workbooks and dashboards using Tableau Dashboard/Server 6.X/7.X.

Environment: Hadoop (HDFS), HBase, MapReduce, Hive, Spark, Kafka, Oozie, Flume, Cassandra, Hortonworks, UNIX Shell Scripting, MongoDB, MySQL, Eclipse, Toad, and HP Vertica 6.X/7.X.

Confidential - Hartford

Hadoop/Spark Developer

Responsibilities:

  • Reviewed business requirements documents with the test team to provide insights into the data scenarios and test cases.
  • Analyzed the business requirements and verified the business requirements document and technical design document against them.
  • Experience in Extract, Transform, and Load (ETL) design, development, and testing.
  • Experience working with Spark and Scala.
  • Scheduled and monitored workflows. Provided proactive production support after go-live.
  • Installed and configured Hadoop MapReduce and HDFS. Developed multiple MapReduce jobs for data cleaning and preprocessing.
  • Extracted and processed data from legacy systems and stored it on HDFS.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Created Hive tables and wrote complex Hive queries to populate them.
  • Generated user reports using HQL on data stored on HDFS.
  • Tuned HQL queries to improve performance.
  • Managed and reviewed Hadoop log files.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Loaded data from the UNIX file system into HDFS.
  • Used Oozie as an automation tool for running jobs.
  • Experience working with Hadoop and related tools such as HDFS, MapReduce, Sqoop, Hive, Oozie, Kafka, Impala, and Hue.
  • Experience in Unix scripting.
  • Experience in utilizing Teradata utilities FastLoad, MultiLoad, BTEQ scripting, TPT and FastExport.
  • Identified and performed field level compression on Teradata tables.
  • Experience in testing Data Marts, Data Warehouse/ETL Applications developed in mainframe/Teradata.
  • Experience in loading from various data sources like Teradata, Oracle, Fixed Width and Delimited Flat Files.
  • Involved in Data Extraction from Teradata and Flat Files using SQL assistant.
  • Wrote several complex SQL queries for validating reports.
  • Tested several stored procedures.
  • Attended reviews and status meetings, and participated in customer interactions.
  • Debugged SQL statements and stored procedures for business scenarios.
  • Performed extensive data validation and data verification against the data warehouse.
  • Analyzed bug reports by running SQL queries against the source system(s) to perform root-cause analysis.
  • Created SQL queries to generate ad hoc reports for the business.
  • Verified the business requirements document and technical design document against requirements.
  • Worked on data profiling and data validation to ensure the accuracy of data between the warehouse and source systems.
  • Created and validated the test data environment for the staging area, loading it with data from multiple sources.
  • Created data masking rules to mask sensitive data before extracting test data from various sources and loading it into tables.
  • Created ETL test data for all transformation rules, covering all the scenarios required for implementing business logic.
  • Developed and tested various stored procedures as part of process automation in Teradata.
  • Tested the ETL process both before and after data cleansing.
  • Validated the data passed to downstream systems.
  • Generated the following Hadoop performance metrics using Cloudera Manager to portray overall cluster health on a weekly basis for senior management at the bank:
  • CPU and memory utilization for all edge nodes and DataNodes
  • Disk space utilization on all mount points for all the edge nodes
  • Disk space and memory utilization on the NameNode
  • Edge node disk utilization by application
  • JobTracker memory used
  • Average number of map and reduce tasks running
  • Average RPC (remote procedure call) processing time
  • HDFS cluster disk usage by application
  • Healthy TaskTrackers
  • Block distribution across all PROD DataNodes

Environment: Teradata, Hadoop, Unix, Spark, Scala, Subversion, Git, Bitbucket, DM Express, Mainframe, MS Visio, MS Office Suite, MS Outlook, HP Quality Center.

Confidential - Dallas, TX

Hadoop Consultant

Responsibilities:

  • Responsible for loading customer data and event logs from Kafka into HBase using a REST API.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
  • Worked on debugging, performance tuning, and analyzing data using the Hadoop components Hive and Pig.
  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Developed and implemented jobs on an MR2 Hortonworks cluster.
  • Developed and executed Hive, Spark, and Pig queries for denormalizing the data.
  • Created Hive tables from JSON data using a data serialization framework such as Avro.
  • Implemented a generic export framework for moving data from HDFS to RDBMS and vice versa.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
  • Created Hive external tables, loaded data into them, and queried the data using HQL.
  • Wrote shell scripts for rolling day-to-day processes and automated them.
  • Worked on loading data from the Linux file system into HDFS.
  • Created HBase tables to store various data formats of PII data coming from different portfolios. Implemented MapReduce jobs for loading data from an Oracle database into a NoSQL database.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
  • Moved data from Hadoop to Cassandra using the bulk output format class.
  • Automated all jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Responsible for processing unstructured data using Pig and Hive.
  • Added nodes to the clusters and decommissioned nodes for maintenance.
  • Created Pig script jobs, maintaining minimal query optimization.
  • Worked on various Business Objects reporting functionalities such as slice and dice, master/detail, the user response function, and various formulas.
  • Strong experience with Apache server configuration.

Environment: Hadoop, HDFS, HBase, Pig, Hive, Spark, Hortonworks, Oozie, MapReduce, Sqoop, Cloudera, MongoDB, Cassandra, Kafka, Linux, Java APIs, Java collection, Windows.

Confidential - Seattle, WA

Hadoop Admin/Developer

Responsibilities:

  • Supported MapReduce programs running on the cluster.
  • Used Pig Latin to analyze large-scale data.
  • Loaded data from the UNIX file system into HDFS.
  • Interacted with business users on a regular basis to consolidate and analyze requirements, and presented them with design results.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Involved in data visualization: provided the files required by the team by analyzing data in Hive, and developed Pig scripts for advanced analytics on the data.
  • Created many user-defined routines, functions, and before/after subroutines that helped implement some of the complex logical solutions.
  • Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
  • Improved performance by using various performance tuning strategies.
  • Managed the evaluation of ETL and OLAP tools and recommended the most suitable solutions based on business needs.
  • Migrated jobs from development to test and production environments.
  • Created external tables with proper partitions for efficiency and loaded the structured data produced by MR jobs into HDFS.
  • Moved log files generated from various sources to HDFS for further processing.
  • Used shell scripts for loading, unloading, file validation, and record auditing.
  • Used the Teradata Aster bulk load feature to bulk load flat files into Aster.
  • Used Aster UDFs to unload data from staging tables and client data for SCD, which resided on the Aster database.
  • Extensively used SQL and PL/SQL for development of Procedures, Functions, Packages and Triggers.

Environment: Java, SQL, PL/SQL, Unix Shell Scripting, XML, Teradata Aster, Hive, Pig, Hadoop, MapReduce, ClearCase, HP Unix, Windows XP Professional.

Confidential

Java Developer / Hadoop Developer

Responsibilities:

  • Involved in requirements analysis, design, development, and testing.
  • Involved in developing the Group portal and Member portal applications.
  • Developed the front end using Struts and JSP.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Involved in data visualization: provided the files required by the team by analyzing data in Hive, and developed Pig scripts for advanced analytics on the data.
  • Created many user-defined routines, functions, and before/after subroutines that helped implement some of the complex logical solutions.
  • Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
  • Developed web pages using HTML, JavaScript, jQuery, and CSS.
  • Developed customized reports and performed unit testing using JUnit.
  • Used Java 1.6, Spring, Hibernate, and Oracle to build the product suite.
  • Responsible for building projects into deployable files (WAR and JAR files).
  • Coded Java Servlets to control and maintain session state and handle user requests.
  • Involved in the development and testing phases of the project, following agile methodology.
  • Implemented the logging mechanism using the log4j framework.
  • Developed web services.
  • Verified software errors and worked with developers to resolve technical issues.
  • Used Maven to build the J2EE application.
  • Wrote complex SQL queries and stored procedures.
  • Involved in maintenance of different applications.

Environment: Servlet, Enterprise Javabeans, Custom Tags, Stored Procedures, JavaScript, Java, Spring Framework, Struts, Web Services, Oracle.

Confidential

Java Developer

Responsibilities:

  • Involved in designing the project using UML.
  • Followed J2EE specifications in the project.
  • Designed the user interface pages in JSP.
  • Used XML and XSL for mapping fields in the database.
  • Used JavaScript for client-side validations.
  • Created the stored procedures and triggers required for the project.
  • Created functions and views in Oracle.
  • Enhanced the performance of the whole application using stored procedures and prepared statements.
  • Responsible for updating database tables and designing SQL queries using PL/SQL.
  • Created bean classes for communicating with the database.
  • Involved in documentation of the module and project.
  • Prepared test cases and test scenarios as per business requirements.
  • Involved in bug fixing.
  • Prepared coded applications for unit testing using JUnit.

Environment: Java, JSP, Servlets, J2EE, EJB 3, Java Beans, Oracle, HTML, DHTML, XML, XSL, JavaScript, BEA WebLogic.
