Hadoop Consultant Resume Chicago, IL - Hire IT People

SUMMARY

Results-oriented professional with 8+ years of progressive experience in software development, including application design and development, and 3+ years of Big Data experience in the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Flume, Sqoop, ZooKeeper, HBase, and Spark.
Big data development experience with Google Cloud.
Experience working with various Hadoop distributions, including Cloudera and Hortonworks.
Experience migrating data with Sqoop from Hadoop to relational database systems and vice versa.
Expertise in Hadoop administration, such as managing clusters and reviewing Hadoop log files.
Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
Experience with the NoSQL databases MongoDB and Cassandra.
Proficient in configuring ZooKeeper, Cassandra, and Flume on an existing Hadoop cluster.
Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH3, CDH4, and CDH5 clusters.
Familiarity with real-time streaming data using Spark and Kafka.
Experience in ETL analytics on ingested data using scripts built with Hive, Pig, Spark, and MapReduce, covering interactive, batch, and real-time processing.
Expertise in Java/J2EE technologies such as Core Java, Spring, Hibernate, JDBC, JSON, HTML, Struts, Servlets, JSP, JBoss, and JavaScript.
Experience using integrated development environments such as Eclipse, NetBeans, JDeveloper, and MyEclipse.
Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
Good experience writing complex SQL queries against databases such as DB2, Oracle 10g, MySQL, and MS SQL Server 2005/2008.
Extensive experience developing test cases and performing unit and integration testing, and working with source code management tools such as Git, SVN, and Perforce.
Strong team player, able to work both independently and in a team, adaptable to a rapidly changing environment, and committed to learning.
Ability to blend technical expertise with strong conceptual, business, and analytical skills to provide quality solutions, along with result-oriented problem-solving and leadership skills.

TECHNICAL SKILLS

Big Data: Hadoop, MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Cassandra, MongoDB, Hortonworks, Kafka, Spark, ZooKeeper, BigQuery

Web development: HTML, JavaScript, XML, PHP, JSP, Servlets

Databases: DB2, MySQL, MS Access, MS SQL Server, Teradata, NoSQL, Vertica, Aster nCluster, SSAS, Oracle, Oracle Essbase

Languages: Java / J2EE, HTML, SQL, Spring, Hibernate, JDBC, JSON, JavaScript

Operating Systems: Mac OS, Unix, Linux (Various Versions), Windows 2003/7/8/8.1/XP/Vista

Web/Application server: Apache Tomcat, WebLogic, WebSphere

Tools: Eclipse, NetBeans

Version Control: Git, SVN, Perforce

IDEs: IntelliJ, Eclipse, NetBeans, JDeveloper

PROFESSIONAL EXPERIENCE

Confidential - Chicago, IL

Hadoop Consultant

Responsibilities:

Cloudera Hadoop installation & configuration of multiple nodes using Cloudera Manager and CDH 4.X/5.X.
Prepared the low-level design document and estimated effort for the project.
Developed UNIX scripts for loading, filtering, and storing the data.
Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
Installed and configured Hadoop MapReduce, HDFS, Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
Installed, upgraded, and managed Hadoop clusters on Hortonworks and within AWS.
Experience with NoSQL data modeling with Cassandra, HBase, and MongoDB.
Involved in loading data from UNIX file system to HDFS.
Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
Ran process improvement initiatives to reduce defects, close production issues, and improve applications.
Analyzed the Hadoop cluster and various big data analytics tools, including Pig, the HBase NoSQL database, and Sqoop.
Developed and implemented jobs on an MR2 Hortonworks cluster and on a cluster within AWS.
Developed Pig code for loading, filtering, and storing the data.
Developed Hive scripts (HQL) to automate joins across different sources.
Developed various Big Data workflows using Oozie.
Big data development with cloud experience (Google Cloud preferred).
Developed MapReduce programs and migrated data from existing data sources using Sqoop.
Developed custom writable Python programs to load the data into HBase.
Developed MapReduce programs using MRv1 and MRv2 (YARN).
Developed Spark SQL jobs that read data from the data lake using Hive transform and save it in HBase.
Strong application DBA skills, with data modeling skills for NoSQL and relational databases.
Built a Java client that receives XML files via REST calls and publishes them to Kafka.
Built a Kafka + Spark Streaming job that reads XML file messages from Kafka and transforms them into POJOs using JAXB.
Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
Responsible for migrating from Hadoop to Spark frameworks, using in-memory distributed computing for real-time fraud detection.
Effectively used Oozie to develop automatic workflows of Sqoop, MapReduce and Hive jobs.
Involved in running Hadoop jobs that process millions of text records for batch and online processes, using tuned/modified SQL.
Responsible for designing a highly scalable big data cluster to support varied data storage and computation across Hadoop, Cassandra, MongoDB, and Elasticsearch.
Designed and published workbooks and dashboards using Tableau Dashboard/Server 6.X/7.X.

Environment: Hadoop (HDFS), HBase, MapReduce, Hive, Spark, Kafka, Oozie, Flume, Cassandra, Hortonworks, UNIX Shell Scripting, MongoDB, MySQL, Eclipse, Toad, and HP Vertica 6.X/7.X.

Confidential - Hartford

Hadoop/Spark Developer

Responsibilities:

Reviewed business requirements documents with the test team to provide insights into the data scenarios and test cases.
Analyzed the business requirements and verified the business requirements document and technical design document against them.
Experience in Extract, Transform, and Load (ETL) design, development, and testing.
Experience working with Spark and Scala.
Experience in scheduling and monitoring workflows. Provided proactive production support after go-live.
Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data cleaning and preprocessing.
Extracted and processed the data from legacy systems and stored it on HDFS.
Imported and exported data into HDFS and Hive using Sqoop.
Created Hive tables and wrote complex Hive queries to populate them.
Generated user reports using HQL on the data stored on HDFS.
Experience in tuning HQL queries to improve performance.
Experienced in managing and reviewing Hadoop log files.
Loaded and transformed large sets of structured, semi-structured, and unstructured data.
Supported MapReduce programs running on the cluster.
Involved in loading data from UNIX file system to HDFS.
Used Oozie as an automation tool for running the jobs.
Experience working with Hadoop and utilities such as HDFS, MapReduce, Sqoop, Hive, Oozie, Kafka, Impala, and Hue.
Experience in Unix scripting.
Experience in utilizing Teradata utilities FastLoad, MultiLoad, BTEQ scripting, TPT and FastExport.
Identified and performed field level compression on Teradata tables.
Experience in testing Data Marts, Data Warehouse/ETL Applications developed in mainframe/Teradata.
Experience in loading data from various sources such as Teradata, Oracle, and fixed-width and delimited flat files.
Involved in data extraction from Teradata and flat files using SQL Assistant.
Wrote several complex SQL queries to validate reports.
Tested several stored procedures.
Attended reviews and status meetings and participated in customer interactions.
Debugged SQL statements and stored procedures for business scenarios.
Performed extensive data validation and data verification against the data warehouse.
Analyzed bug reports by running SQL queries against the source system(s) to perform root-cause analysis.
Created SQL queries to generate ad-hoc reports for the business.
Worked on data profiling and data validation to ensure the accuracy of the data between the warehouse and source systems.
Created and validated the test data environment for the staging area, loading it with data from multiple sources.
Created data masking rules to mask sensitive data before extracting test data from various sources and loading it into tables.
Created ETL test data for all transformation rules, covering all the scenarios required for implementing the business logic.
Developed and tested various stored procedures as part of process automation in Teradata.
Tested the ETL process both before and after the data cleansing process.
Validated the data passed to downstream systems.
Generated the following Hadoop performance metrics using Cloudera Manager to portray overall cluster health status on a weekly basis for senior management at Confidential, the bank:
CPU and Memory Utilization for all edge nodes and Data nodes
Disk Space Utilization on all mount points for all teh edge nodes
Disk Space and Memory Utilization on Name Node
Edge Node Disk Utilization by application
JobTracker memory used
Average number of running map and reduce tasks
RPC (remote procedure call) average processing time
HDFS cluster disk usage by application
Healthy TaskTrackers
Block distribution across all PROD data nodes

Environment: Teradata, Hadoop, Unix, Spark, Scala, Subversion, Git, Bitbucket, DM Express, Mainframe, MS Visio, MS Office Suite, MS Outlook, HP Quality Centre.

Confidential - Dallas, TX

Hadoop Consultant

Responsibilities:

Responsible for loading the customer's data and event logs from Kafka into HBase using a REST API.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
Worked on debugging, performance tuning, and analyzing data using the Hadoop components Hive and Pig.
Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
Developed and implemented jobs on an MR2 Hortonworks cluster.
Developed and executed Hive, Spark, and Pig queries to denormalize the data.
Created Hive tables from JSON data using a data serialization framework such as Avro.
Implemented a generic export framework for moving data from HDFS to RDBMS and vice versa.
Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
Created Hive external tables, loaded data into them, and queried the data using HQL.
Wrote shell scripts to automate rolling day-to-day processes.
Worked on loading data from the Linux file system to HDFS.
Created HBase tables to store various data formats of PII data coming from different portfolios.
Implemented MapReduce jobs for loading data from an Oracle database into a NoSQL database.
Used Cloudera Manager for installation and management of Hadoop Cluster.
Moved data from Hadoop to Cassandra using the bulk output format class.
Automated all the jobs for pulling data from the FTP server and loading it into Hive tables using Oozie workflows.
Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
Responsible for processing unstructured data using Pig and Hive.
Added nodes to the clusters and decommissioned nodes for maintenance.
Created Pig script jobs while maintaining minimal query optimization.
Worked on various Business Objects reporting functionalities such as slice and dice, master/detail, the User Response function, and different formulas.
Strong experience with Apache server configuration.

Environment: Hadoop, HDFS, HBase, Pig, Hive, Spark, Hortonworks, Oozie, MapReduce, Sqoop, Cloudera, MongoDB, Cassandra, Kafka, Linux, Java APIs, Java collections, Windows.

Confidential - Seattle, WA

Hadoop Admin/Developer

Responsibilities:

Supported MapReduce programs running on the cluster.
Used Pig Latin to analyze large-scale data.
Involved in loading data from UNIX file system to HDFS.
Interacted with business users on a regular basis to consolidate and analyze the requirements, and presented them with design results.
Developed Pig Latin scripts to extract the data from web server output files and load it into HDFS.
Involved in data visualization, provided the files required by the team by analyzing the data in Hive, and developed Pig scripts for advanced analytics on the data.
Created many user-defined routines, functions, and before/after subroutines that helped implement some of the complex logical solutions.
Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
Worked on improving performance by using various performance tuning strategies.
Managed the evaluation of ETL and OLAP tools and recommended the most suitable solutions depending on business needs.
Migrated jobs from development to test and production environments.
Created external tables with proper partitions for efficiency and loaded the structured data in HDFS that resulted from MR jobs.
Involved in moving all log files generated from various sources to HDFS for further processing.
Used shell scripts for loading, unloading, file validation, and records auditing.
Used the Teradata Aster bulk load feature to bulk load flat files into Aster.
Used Aster UDFs to unload data from staging tables and client data for SCD, which resided on the Aster database.
Extensively used SQL and PL/SQL for development of Procedures, Functions, Packages and Triggers.

Environment: Java, SQL, PL/SQL, Unix Shell Scripting, XML, Teradata Aster, Hive, Pig, Hadoop, MapReduce, ClearCase, HP Unix, Windows XP Professional.

Confidential

Java Developer / Hadoop Developer

Responsibilities:

Involved in requirements analysis, design, development, and testing.
Involved in developing the Group portal and Member portal applications.
Developed front end using Struts and JSP.
Developed Pig Latin scripts to extract the data from web server output files and load it into HDFS.
Involved in data visualization, provided the files required by the team by analyzing the data in Hive, and developed Pig scripts for advanced analytics on the data.
Created many user-defined routines, functions, and before/after subroutines that helped implement some of the complex logical solutions.
Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
Developed web pages using HTML, JavaScript, jQuery, and CSS.
Developed customized reports and performed unit testing using JUnit.
Used Java 1.6, Spring, Hibernate, and Oracle to build the product suite.
Responsible for building projects into deployable files (WAR and JAR files).
Coded Java Servlets to control and maintain the session state and handle user requests.
Involved in the development and testing phases of the project, following agile methodology.
Implemented the logging mechanism using the log4j framework.
Developed Web Services.
Verified software errors and interacted with developers to resolve the technical issues.
Used Maven to build the J2EE application.
Wrote complex SQL queries and stored procedures.
Involved in maintenance of different applications.

Environment: Servlet, Enterprise JavaBeans, Custom Tags, Stored Procedures, JavaScript, Java, Spring Framework, Struts, Web Services, Oracle.

Confidential

Java Developer

Responsibilities:

Involved in designing the project using UML.
Followed J2EE specifications in the project.
Designed the user interface pages in JSP.
Used XML and XSL for mapping the fields in the database.
Used JavaScript for client-side validation.
Created the stored procedures and triggers required for the project.
Created functions and views in Oracle.
Enhanced the performance of the whole application using stored procedures and prepared statements.
Responsible for updating database tables and designing SQL queries using PL/SQL.
Created bean classes for communicating with database.
Involved in documentation of the module and project.
Prepared test cases and test scenarios as per business requirements.
Involved in bug fixing.
Prepared coded applications for unit testing using JUnit.

Environment: Java, JSP, Servlets, J2EE, EJB 3, Java Beans, Oracle, HTML, DHTML, XML, XSL, JavaScript, BEA WebLogic.
