Hadoop Spark Consultant Resume Charlotte, NC - Hire IT People

PROFESSIONAL SUMMARY:

8+ years of professional IT experience in Big Data, Hadoop, Java/J2EE, and Cloud technologies in the Financial, Retail, and Healthcare domains.
4+ years of experience working with large data sets and distributed computing on Hadoop and Big Data technologies.
Experienced in building high-performance, scalable solutions using Hadoop ecosystem components such as YARN, Hive, Sqoop, Pig, Spark, NiFi, and Kafka.
Handled data movement, data transformation, analysis, and visualization across the Data Lake by integrating it with various tools.
Extensively worked on Spark 1.6 and 2.1 and its components, such as Spark SQL and Spark Streaming, for data manipulation, preparation, and cleansing.
Defined extract-transform-load (ETL) and extract-load-transform (ELT) processes for the Data Lake; experienced working with ETL tools like Talend.
Experienced in writing Spark applications with the Spark SQL library, using built-in and user-defined functions.
Extensively worked on Hive, Sqoop, and Pig for analysis, transformation, and management of Structured and Semi-Structured data.
Very good understanding and working knowledge of Object-Oriented Programming (OOP) and the programming languages Scala and Python.
Worked on the major Hadoop distributions: Cloudera (CDH4, CDH5) and Hortonworks (HDP 2.4, 2.6).
Good knowledge of analyzing streaming data; defined real-time data streaming solutions across the cluster using Spark Streaming, Apache Storm, Kafka, and Flume.
Configured AWS EC2 instances, S3 buckets, and cloud services, and architected the flow of data to and from AWS.
Transformed and aggregated data for analysis by implementing workflow management of Sqoop, Hive, and scripts.
Experienced in ETL operations from RDBMS databases like Oracle and Teradata to Hadoop and in performing the required analysis on the data.
Experience working with file formats like Avro, Parquet, ORC, and SequenceFile, and compression codecs like Gzip, LZO, and Snappy in Hadoop.
Good experience with Tableau and Spotfire; enabled JDBC/ODBC data connectivity from them to Hive tables.
Well versed in SQL, PL/SQL, and Oracle Database for writing queries, stored procedures, triggers, and functions.
Expertise in Unix/Linux environments in writing scripts and scheduling or executing jobs.
Experience in developing Applications using Java, J2EE, JSP, MVC, Servlets, Struts, Hibernate, JDBC, JSF, EJB, XML, AJAX, and web-based development tools.
Expertise in Web Technologies like HTML, CSS, PHP, XML.
Worked on various Tools and IDEs like Eclipse, IBM Rational, Apache Ant-Build Tool, MS-Office, PLSQL Developer, SQL*Plus.
Highly motivated, with the ability to work independently or as an integral part of a team, and committed to the highest levels of the profession.

TECHNICAL SKILLS:

Big Data / Hadoop: HDFS, MapReduce, HBase, Kafka, Pig, Hive, Sqoop, Flume, Spark, Zeppelin, and NiFi

Realtime/Stream Processing: Apache Storm, Apache Spark

Cloud Technologies: Amazon Web Services, EC2, S3, EMR, Redshift

Operating Systems: Windows, Unix, and Linux

Programming Languages: C, Java, Scala, J2EE, SQL

Databases: Oracle 9i/10g, SQL Server, Teradata

Web Technologies: HTML, XML, JavaScript

Development & Build Tools: Eclipse, IntelliJ IDE, SBT and Gradle

Methodologies: Agile, Scrum, and Waterfall

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Hadoop Spark Consultant

Responsibilities:

Involved in design and development of Marketing Data Application in Hadoop and Spark Environment.
Ingested data from sourcing layers such as Teradata and Oracle databases into HDFS and Hive using Sqoop import.
Developed Spark on Scala ETL data pipelines for data movement from landing HDFS to the target HDFS directory after transformation and analysis.
Performed and implemented Change Data Capture (CDC) model on customer behavioral and Account related daily and historical data by using Spark SQL on Scala.
Archival of Data of the Data sourcing layer and separation of required analytical and production data from the entire Raw data in Hive.
Worked extensively with Apache Spark features such as RDDs, DataFrames, and Datasets, and with various performance tuning techniques.
Created Hive Views, External and Managed tables for published and production data for usage for Data Analytics.
Wrote Hive scripts for target table creation and maintained the optimum and necessary partition and bucket strategies for performance improvement.
Used Apache Pig and wrote Pig scripts for series of data operations like cleansing and research on Raw Data and iterative data processing.
Used Apache NiFi to automate and manage data movement from the base data layer to the target layer.
Involved in finalizing several performance tuning methods to optimise the Spark jobs.
Wrote shell scripts for the execution and deployment of the developed Spark-Scala code.
Used Apache Zeppelin notebook for data discovery, visualizations, and analysis.
Used Gradle as the build tool in IntelliJ IDEA and SVN as the version control tool for the code.
Rectified defects during the testing phase and supported QA during testing to identify the source of defects.

Environment: Hadoop 2.6, HDFS, YARN, Hive, Sqoop, Pig, Spark 2.1, Scala, SVN, IntelliJ, Gradle, AutoSys, UNIX, and Teradata.

Confidential, Sunnyvale, CA

Hadoop Consultant

Responsibilities:

Ingested data from various RDBMS database sources like Oracle and Teradata into HDFS and Hive using Sqoop import.
Developed partition and bucket strategies for managed and external tables to optimize Hive performance.
Handled large datasets in Spark using broadcast joins, repartitioning, persistence, and degree of parallelism to optimise performance.
Involved in data flow management, data recovery, and secure data flow with prioritized queuing using Apache NiFi.
Wrote customized UDFs in Java to extend Hive and Pig Latin functionality.
Used REST APIs to establish communication between external applications and Hadoop, and implemented RESTful services to pull data from the database tables.
Developed a Data pipeline using Kafka and Storm to store data into HDFS and performed real-time analysis of the incoming data.
Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
Involved in scheduling Oozie workflows to run multiple Hive, Pig, and Spark jobs.
Wrote shell scripts for the execution and deployment of the developed Spark code.
Rectified defects during the testing phase and supported QA during testing.

Environment: Hadoop, YARN, HDFS, Pig, Hive, Sqoop, AWS, Linux, UNIX, Spark 1.6, Kafka, HBase, and IntelliJ.

Confidential, Atlanta, GA

Sr. Hadoop Developer

Responsibilities:

Worked on performance analysis and improvements for Hive and Pig scripts at MapReduce job tuning level.
Used Sqoop to load data from RDBMS databases like Oracle and MySQL into Hadoop File System.
Worked on several POCs to validate and fit several Hadoop ecosystem tools on CDH and Hortonworks distributions.
Designed and Implemented Error-Free Data Warehouse-ETL and Hadoop Integration.
Proficient in data modeling with Hive partitioning, bucketing, and other optimization techniques.
Developed Python scripts to automate and provide control flow to Pig scripts that extract data and load it into HDFS.
Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
Set up standards and processes for Hadoop based application design and implementation.
Wrote Shell Scripts for several day-to-day processes and worked on their automation and deployment.
Collected log data from web servers and integrated it into HDFS using Flume.
Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
Worked on establishing connectivity between Tableau and Spotfire.

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, Flume, Linux, UNIX.

Confidential, King of Prussia, PA

Hadoop Developer

Responsibilities:

Responsible for building scalable distributed data solutions using Hadoop.
Collected and downloaded data generated by sensors tracking patients' body activity into HDFS.
Performed necessary transformations and aggregation to build the common learner data model in NoSQL store (HBase).
Used Pig, Hive, and MapReduce to analyze health insurance data and patient information.
Developed workflow in Oozie to orchestrate a series of Pig scripts to remove, merge, and compress files using Pig pipelines in the data preparation stage.
Moved all log files generated from various sources to HDFS through Flume for further processing.
Extensively used Pig to communicate with Hive and HBase using HCatalog and handlers.
Involved in transforming data from legacy tables to HDFS and HBase tables using Sqoop.
Implemented test scripts to support the test-driven development and continuous integration.
Exported analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Good understanding of ETL tools and their application to Big Data environment.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Java, HBase, Flume, Oracle 10g, UNIX Shell Scripting.

Confidential, St Petersburg, FL

Java Developer

Responsibilities:

Designed the application in J2EE architecture and developed dynamic and browser compatible User Interfaces for online account management, order and payment processing.
Used Hibernate Object-relational mapping (ORM) to achieve data persistence.
Developed Servlets and JSPs based on MVC pattern using Spring Framework.
Developed required helper classes following Core Java multi-threaded programming.
Developed the presentation layer using JSP, Tag libraries, HTML, CSS and client validations using JavaScript.
Designed and developed Web services based on SOAP and WSDL for handling transaction history.
Developed web applications using Spring MVC, jQuery and implemented Spring Dependency Injection mechanism.
Developed data access classes using JDBC and created SQL queries and used PL/SQL procedures with Oracle Database.
Used LOG4J & JUnit for debugging, testing and maintaining the system state and tested the website with older and latest versions/releases on multiple browsers.
Implemented test cases for Unit testing of modules using JUnit and used ANT for building the project.
Provided production support for two of the applications involving the Swing and Struts frameworks.

Environment: JDK 1.6, JSP, HTML, JavaScript, JSON, XML, jQuery, Servlets, Spring MVC, Hibernate, Web Services, SOAP, NetBeans.

Confidential, Charlotte, NC

Java Developer

Responsibilities:

Worked with business analysts and product owners to analyze and understand the requirements and to provide estimates.
Implemented J2EE design patterns such as Singleton, DAO, DTO, and MVC.
Developed this web application to store all system information in a central location using Spring MVC, JSP, Servlet, and HTML.
Designed and developed database objects like Tables, Views, Stored Procedures, User Functions using PL/SQL, SQL Developer and used them in WEB components.
Developed JavaScript and jQuery functions for all client-side validations.
Developed JUnit test cases for unit testing and used Maven as the build and configuration tool.
Used shell scripting to create jobs that run on a daily basis.
Debugged the application using Firebug and traversed through the nodes of the tree using DOM functions.
Monitored the error logs using Log4j and fixed the problems.
Used Eclipse IDE and deployed the application on WebLogic Server.
Responsible for configuring and deploying the builds on WebSphere Application Server.

Environment: Java, J2EE, JavaScript, XML, JDBC, Spring Framework, Hibernate, RESTful Web Services, WebLogic Server, Log4j, JUnit, ANT, SoapUI, Oracle 11g.

Confidential, Plano, TX

Java Developer

Responsibilities:

Designed and developed Java classes using object-oriented methodology.
Worked on a system built with Java, JSP, and Servlets.
Developed Java classes and methods for handling data from the database.
Created and modified web pages using HTML and CSS with JavaScript validation.
Used JDBC/Jconnect for Oracle.
Created SQL scripts to create and drop database objects such as tables, views, indexes, constraints, sequences, and synonyms.
Developed efficient queries and views to deliver customer satisfaction.
Created Servlets and JSPs for the administration module.
Created Unix shell scripts for sequential execution of Java scripts, including data extraction, loading, and Oracle stored procedure execution.
Developed many KSH scripts for data file movement and scheduling.
Attended and conducted user meetings for requirement analysis and project reporting.
Performed testing and bug fixing and provided production support.

Environment: Windows XP, Oracle 9i database, EJB 2.1, JSP, Struts Framework, BEA WebLogic 8.1, HTML, JavaScript, and Eclipse.

Confidential

Java Developer

Responsibilities:

Collected and understood the user requirements and functional specifications.
Developed the GUI using HTML, CSS, JSP, and JavaScript.
Created components for isolated business logic.
Deployed the application in J2EE architecture.
Used Oracle 8i as the database server.
Designed EJB 2.0 components with design patterns such as Service Locator and Business Delegate.
Finalized the design specifications for the new system.
Involved in the design, development, and maintenance of the application.
Performed unit, integration, and performance testing, with continuous interaction with the Quality Assurance group.
Provided on-call support based on the priority of the issues.

Environment: Java, JSP, SQL, MS-Access, JavaScript, HTML.
