Hadoop Consultant Resume
Chicago, IL
SUMMARY
- Results-oriented professional with 8+ years of progressive experience in software development, including application design and development, along with 3+ years of Big Data/Hadoop experience across the Hadoop ecosystem: HDFS, MapReduce, Hive, Pig, Flume, Sqoop, Zookeeper, HBase, and Spark.
- Big Data development experience with Google Cloud.
- Experience working with various Hadoop distributions - Cloudera and Hortonworks.
- Experience in migrating data between Hadoop and relational database systems using Sqoop.
- Expertise in Hadoop administration tasks such as managing clusters and reviewing Hadoop log files.
- Experience leveraging Hadoop ecosystem components, including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling, and HBase as a NoSQL data store.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
- Experience with the NoSQL databases MongoDB and Cassandra.
- Proficient in configuring Zookeeper, Cassandra, and Flume on existing Hadoop clusters.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH3, CDH4, and CDH5 clusters.
- Familiarity with real-time streaming data using Spark and Kafka.
- Experience in ETL analytics on ingested data using scripts built with Hive, Pig, Spark, and MapReduce, covering interactive, batch, and real-time processing.
- Expertise in Java/J2EE technologies such as Core Java, Spring, Hibernate, JDBC, JSON, HTML, Struts, Servlets, JSP, JBoss, and JavaScript.
- Experience using integrated development environments such as Eclipse, NetBeans, JDeveloper, and MyEclipse.
- Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.
- Good experience in writing complex SQL queries against databases such as DB2, Oracle 10g, MySQL, and MS SQL Server 2005/2008.
- Extensive experience in developing test cases and performing unit and integration testing, using source code management tools such as Git, SVN, and Perforce.
- Strong team player with the ability to work both independently and in a team, the ability to adapt to a rapidly changing environment, and a commitment to learning.
- Ability to blend technical expertise with strong conceptual, business, and analytical skills to provide quality solutions, with results-oriented problem solving and leadership.
TECHNICAL SKILLS
Big Data: Hadoop, MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Cassandra, MongoDB, Hortonworks, Kafka, Spark, Zookeeper, BigQuery
Web Development: HTML, JavaScript, XML, PHP, JSP, Servlets
Databases: DB2, MySQL, MS Access, MS SQL Server, Teradata, NoSQL, Vertica, Aster nCluster, SSAS, Oracle, Oracle Essbase
Languages: Java/J2EE, HTML, SQL, Spring, Hibernate, JDBC, JSON, JavaScript
Operating Systems: Mac OS, Unix, Linux (various versions), Windows 2003/XP/Vista/7/8/8.1
Web/Application Servers: Apache Tomcat, WebLogic, WebSphere
Tools: Eclipse, NetBeans
Version Control: Git, SVN, Perforce
IDEs: IntelliJ, Eclipse, NetBeans, JDeveloper
PROFESSIONAL EXPERIENCE
Confidential - Chicago, IL
Hadoop Consultant
Responsibilities:
- Installed and configured multi-node Cloudera Hadoop clusters using Cloudera Manager and CDH 4.x/5.x.
- Prepared the low-level design document and estimated effort for the project.
- Developed UNIX shell scripts for loading, filtering, and storing data.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
- Installed and configured Hadoop MapReduce, HDFS, Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Installed, upgraded, and managed Hadoop clusters on Hortonworks and within AWS.
- Experience with NoSQL data modeling in Cassandra, HBase, and MongoDB.
- Involved in loading data from UNIX file system to HDFS.
- Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Ran process-improvement initiatives to reduce defects, close production issues, and improve applications.
- Analyzed the Hadoop cluster and various big data analytic tools, including Pig, the HBase NoSQL database, and Sqoop.
- Developed and implemented jobs on an MR2 Hortonworks cluster and on a cluster within AWS.
- Developed Pig scripts for loading, filtering, and storing data.
- Developed Hive scripts (HQL) to automate joins across different sources.
- Developed various Big Data workflows using Oozie.
- Performed Big Data development on Google Cloud.
- Developed MapReduce programs and migrated data from existing data sources using Sqoop.
- Developed custom Python programs to load data into HBase.
- Developed MapReduce programs using MRv1 and MRv2 (YARN).
- Developed Spark SQL jobs that read data from the data lake using Hive transforms and save it to HBase.
- Strong application DBA skills with data modeling skills for NoSQL and relational databases.
- Built a Java client responsible for receiving XML files via REST calls and publishing them to Kafka.
- Built a Kafka + Spark Streaming job responsible for reading XML messages from Kafka and transforming them to POJOs using JAXB.
- Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Responsible for migrating workloads from Hadoop to Spark, using in-memory distributed computing for real-time fraud detection.
- Effectively used Oozie to develop automatic workflows of Sqoop, MapReduce and Hive jobs.
- Involved in running Hadoop jobs that process millions of text records for batch and online processes, using tuned and modified SQL.
- Responsible for designing highly scalable big data clusters to support data storage and computation across varied big data systems - Hadoop, Cassandra, MongoDB, and Elasticsearch.
- Designed and published workbooks and dashboards using Tableau Dashboard/Server 6.x/7.x.
Environment: Hadoop (HDFS), HBase, MapReduce, Hive, Spark, Kafka, Oozie, Flume, Cassandra, Hortonworks, UNIX shell scripting, MongoDB, MySQL, Eclipse, Toad, HP Vertica 6.x/7.x.
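The batch MapReduce jobs described above follow the classic map/shuffle/reduce pattern. As a framework-free illustration (plain JDK, no Hadoop dependencies; an illustrative sketch, not project code), a word count can be expressed as:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MiniMapReduce {
    // Map phase: split each line into lowercase words (one (word, 1) pair each);
    // shuffle + reduce: group by word and sum the counts per key.
    public static Map<String, Integer> wordCount(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toMap(w -> w, w -> 1, Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("Hadoop Spark", "hadoop hive")));
    }
}
```

In Hadoop proper, the map and reduce steps run as separate distributed tasks and the shuffle happens over the network; the logic per key is the same.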
Confidential - Hartford
Hadoop/Spark Developer
Responsibilities:
- Reviewed business requirements documents with the test team to provide insights into data scenarios and test cases.
- Analyzed and understood the business requirements, and verified the business requirements document and technical design document against them.
- Experience in Extract, Transform, and Load (ETL) design, development, and testing.
- Experience working on Spark and Scala.
- Experience in scheduling and monitoring workflows. Provided proactive production support after go-live.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data cleaning and preprocessing.
- Extracted and processed the data from Legacy systems and stored it on HDFS.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Involved in creating Hive tables, writing complex Hive queries to populate Hive tables.
- Generating user reports using HQL on the data stored on HDFS.
- Experience in tuning the HQL queries to improve the performance.
- Experienced in managing and reviewing Hadoop log files.
- Load and transform large sets of structured, semi structured and unstructured data.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Used Oozie as an automation tool for running the jobs.
- Experience working on Hadoop and utilities such as HDFS, MapReduce, Sqoop, Hive, Oozie, Kafka, Impala, and Hue.
- Experience in Unix scripting.
- Experience in utilizing Teradata utilities FastLoad, MultiLoad, BTEQ scripting, TPT and FastExport.
- Identified and performed field level compression on Teradata tables.
- Experience in testing Data Marts, Data Warehouse/ETL Applications developed in mainframe/Teradata.
- Experience in loading from various data sources like Teradata, Oracle, Fixed Width and Delimited Flat Files.
- Involved in Data Extraction from Teradata and Flat Files using SQL assistant.
- Written several complex SQL queries for validating Reports.
- Tested several stored procedures.
- Attended reviews and status meetings and participated in customer interaction.
- Debugged SQL statements and stored procedures for business scenarios.
- Performed extensive Data Validation, Data Verification against Data Warehouse.
- Analyzed the bug reports running SQL queries against the source system(s) to perform root-cause analysis.
- Created SQL queries to generate ad-hoc reports for the business.
- Worked on data profiling and data validation to ensure the accuracy of the data between the warehouse and source systems.
- Created and validated the test data environment for the staging area, loading the staging area with data from multiple sources.
- Created data masking rules to mask sensitive data before extracting test data from various sources and loading it into tables.
- Created ETL test data for all transformation rules, covering all scenarios required for implementing business logic.
- Developed and tested various stored procedures as part of process automation in Teradata.
- Tested the ETL process both before and after the data cleansing process.
- Validated the data passed to downstream systems.
- Experience in generating the following Hadoop performance metrics using Cloudera Manager, portraying overall cluster health status on a weekly basis for senior management at Confidential (the bank):
- CPU and Memory Utilization for all edge nodes and Data nodes
- Disk Space Utilization on all mount points for all the edge nodes
- Disk Space and Memory Utilization on Name Node
- Edge Node Disk Utilization by application
- JobTracker memory used
- Average map & reduce tasks running
- Average RPC (remote procedure call) processing time
- HDFS cluster disk usage by applications
- Healthy TaskTrackers
- Block distribution across all PROD data nodes
Environment: Teradata, Hadoop, Unix, Spark, Scala, Subversion, Git, Bitbucket, DM Express, Mainframe, MS Visio, MS Office Suite, MS Outlook, HP Quality Centre.
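The data-masking rules mentioned above can be sketched as a small Java utility. This is a hypothetical rule (the method name, field format, and keep-last-four policy are illustrative assumptions, not the project's actual rules):

```java
public class MaskingRules {
    // Hypothetical rule: replace every digit that is followed by at least
    // four more digits with 'X', so only the last four digits stay readable.
    public static String maskSsn(String ssn) {
        return ssn.replaceAll("\\d(?=(?:\\D*\\d){4})", "X");
    }

    public static void main(String[] args) {
        System.out.println(maskSsn("123-45-6789")); // XXX-XX-6789
    }
}
```

Applying rules like this before copying production rows into a staging area keeps test data realistic without exposing sensitive values.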
Confidential - Dallas, TX
Hadoop Consultant
Responsibilities:
- Responsible for loading the customer's data and event logs from Kafka into HBase using REST API.
- Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.
- Worked on debugging, performance tuning, and analyzing data using the Hadoop components Hive and Pig.
- Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Developed and implemented jobs on an MR2 Hortonworks cluster.
- Developed and executed Hive, Spark, and Pig queries to de-normalize the data.
- Created Hive tables from JSON data using data serialization frameworks such as Avro.
- Implemented generic export framework for moving data from HDFS to RDBMS and vice-versa.
- Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Wrote and automated shell scripts for rolling day-to-day processes.
- Worked on loading data from LINUX file system to HDFS.
- Created HBase tables to store various formats of PII data coming from different portfolios. Implemented MapReduce to load data from an Oracle database into the NoSQL database.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
- Moved data from Hadoop to Cassandra using the bulk output format class.
- Automated all the jobs, for pulling data from FTP server to load data into Hive tables, using Oozie workflows.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
- Responsible for processing unstructured data using Pig and Hive.
- Adding nodes into the clusters & decommission nodes for maintenance.
- Created Pig script jobs while maintaining query optimization.
- Worked on various Business Objects reporting functionalities, such as slice and dice, master/detail, the user response function, and various formulas.
- Strong experience in Apache server configuration.
Environment: Hadoop, HDFS, HBase, Pig, Hive, Spark, Hortonworks, Oozie, MapReduce, Sqoop, Cloudera, MongoDB, Cassandra, Kafka, Linux, Java APIs, Java collections, Windows.
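The de-normalization work above amounts to joining a fact stream against a dimension lookup. A minimal in-memory sketch of that inner join (field names such as customerId are illustrative assumptions, not from the project; a Hive or Pig JOIN produces the same flattened shape at cluster scale):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DenormalizeJoin {
    // Inner-join event rows {customerId, eventType} against a
    // customerId -> name lookup, emitting flattened CSV rows.
    public static List<String> join(Map<String, String> profiles, List<String[]> events) {
        List<String> out = new ArrayList<>();
        for (String[] event : events) {
            String name = profiles.get(event[0]);
            if (name != null) { // inner join: drop events with no matching profile
                out.add(String.join(",", event[0], name, event[1]));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(join(Map.of("c1", "Alice"),
                List.of(new String[]{"c1", "login"}, new String[]{"c2", "click"})));
    }
}
```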
Confidential, Seattle, WA
Hadoop Admin/Developer
Responsibilities:
- Supported MapReduce programs running on the cluster.
- Used Pig Latin to analyze large-scale data.
- Involved in loading data from UNIX file system to HDFS.
- Interacted with business users on a regular basis to consolidate and analyze requirements, and presented them with design results.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Involved in data visualization; analyzed data in Hive to provide the files required by the team, and developed Pig scripts for advanced analytics on the data.
- Created many user-defined routines, functions, and before/after subroutines that facilitated implementing some of the complex logical solutions.
- Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
- Worked on improving the performance by using various performance tuning strategies.
- Managed the evaluation of ETL and OLAP tools and recommended the most suitable solutions depending on business needs.
- Migrated jobs from development to test and production environments.
- Created external tables with proper partitions for efficiency and loaded the structured data resulting from MR jobs into HDFS.
- Involved in moving all log files generated from various sources to HDFS for further processing.
- Used shell scripts for loading, unloading, file validation, and records auditing.
- Used the Teradata Aster bulk-load feature to bulk load flat files into Aster.
- Used Aster UDFs to unload data from staging tables and client data for SCD, which resided on the Aster database.
- Extensively used SQL and PL/SQL to develop procedures, functions, packages, and triggers.
Environment: Java, SQL, PL/SQL, Unix Shell Scripting, XML, Teradata Aster, Hive, Pig, Hadoop, MapReduce, Clear Case, HP Unix, Windows XP professional.
Confidential
Java Developer / Hadoop Developer
Responsibilities:
- Involved in requirements analysis, design, development, and testing.
- Involved in developing the Group portal and Member portal applications.
- Developed the front end using Struts and JSP.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Involved in data visualization; analyzed data in Hive to provide the files required by the team, and developed Pig scripts for advanced analytics on the data.
- Created many user-defined routines, functions, and before/after subroutines that facilitated implementing some of the complex logical solutions.
- Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
- Developed webpages using HTML, JavaScript, jQuery, and CSS.
- Developed customized reports and performed unit testing using JUnit.
- Used Java 1.6, Spring, Hibernate, and Oracle to build the product suite.
- Responsible for building projects in deployable files (WAR files and JAR files).
- Coded Java Servlets to control and maintain session state and handle user requests.
- Involved in the development and testing phases of the project, following Agile methodology.
- Implemented the logging mechanism using log4j framework.
- Developed Web Services.
- Verified software errors and interacted wif developers to resolve the technical issues.
- Used Maven to build the J2EE application.
- Wrote complex SQL queries and stored procedures.
- Involved in maintenance of different applications.
Environment: Servlets, Enterprise JavaBeans, Custom Tags, Stored Procedures, JavaScript, Java, Spring Framework, Struts, Web Services, Oracle.
Confidential
Java Developer
Responsibilities:
- Involved in the designing of the project using UML.
- Followed J2EE Specifications in the project.
- Designed the user interface pages in JSP.
- Used XML and XSL for mapping the fields in the database.
- Used JavaScript for client-side validations.
- Created the stored procedures and triggers required for the project.
- Created functions and views in Oracle.
- Enhanced the performance of the whole application using the stored procedures and prepared statements.
- Responsible for updating database tables and designing SQL queries using PL/SQL.
- Created bean classes for communicating with the database.
- Involved in documentation of the module and project.
- Prepared test cases and test scenarios as per business requirements.
- Involved in bug fixing.
- Prepared coded applications for unit testing using JUnit.
Environment: Java, JSP, Servlets, J2EE, EJB 3, Java Beans, Oracle, HTML, DHTML, XML, XSL, JavaScript, BEA WebLogic.