- Around 8 years of professional IT experience, including developing, configuring, and implementing Hadoop and Big Data ecosystems on various platforms.
- Experience with Apache Hadoop components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Spark, and Flume for Big Data and Big Data analytics.
- Experience in analyzing data using HiveQL (HQL), Pig Latin.
- Experienced in managing and reviewing Hadoop log files.
- Worked with Sqoop to import/export data between relational databases and Hadoop, and used Flume to collect data and populate HDFS.
- Experience in working with versions of Hadoop 1.0 and Hadoop 2.0 (YARN).
- In-depth understanding of Hadoop architecture and its components, including HDFS, NameNode, DataNode, JobTracker, TaskTracker, and MapReduce concepts.
- Worked on data ingestion from SQL Server into the data lake using Sqoop and shell scripts.
- Transferred data between HDFS and RDBMS using Sqoop, in both directions.
- Excellent knowledge on Spark Core architecture.
- Hands-on expertise in writing RDD transformations and actions using Scala.
- Created DataFrames and performed analysis using Spark SQL.
- Good knowledge of creating Autosys and Beatle jobs.
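The Spark experience above can be sketched in a minimal Scala example; the job name, HDFS path, and column layout are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object SalesAnalysis {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SalesAnalysis")
      .getOrCreate()
    import spark.implicits._

    // RDD transformations and an action
    val lines = spark.sparkContext.textFile("hdfs:///data/sales.csv") // hypothetical path
    val amounts = lines.map(_.split(","))
                       .filter(_.length == 3)          // transformation: keep well-formed rows
                       .map(cols => cols(2).toDouble)  // transformation: extract the amount column
    val total = amounts.sum()                          // action: triggers the computation

    // DataFrame creation and Spark SQL analysis
    val df = amounts.toDF("amount")
    df.createOrReplaceTempView("sales")
    spark.sql("SELECT COUNT(*) AS n, AVG(amount) AS avg_amount FROM sales").show()

    spark.stop()
  }
}
```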
Big Data Tools: HDFS, MapReduce, YARN, Hive (HQL), Pig, Sqoop, Flume, Oozie, Kafka, Spark, Hortonworks.
Hadoop Distribution: Cloudera Distribution of Hadoop (CDH).
Programming Languages: SQL, Scala.
Database: MySQL, NoSQL, HBase, Oracle.
Operating Systems: UNIX, Linux, Windows Variants.
Tools: Eclipse, IntelliJ, SBT, SQL Server Management Studio, GitHub.
Confidential, Charlotte, NC
- Involved in migration from Spark 1.x to Spark 2.x.
- Implemented Spark DataFrames using the Java API from different source systems.
- Wrote AutoSys jobs, including box job and JIL file creation.
- Used the HDFS FileSystem API to move files between the local file system and HDFS.
- Performed transformations and partitioning on incoming data, created external Hive tables to store the processed results, and stored the data in Parquet format.
- Performed data validation and schema validation, and validated data between staging and final tables in HULC.
- Managed and monitored Hadoop, Spark, and Hive development issues and bugs.
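A minimal Scala sketch of the pattern described above, partitioning incoming data, writing Parquet, and exposing it as an external Hive table; the paths, table, and columns are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object IngestToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("IngestToHive")
      .enableHiveSupport()
      .getOrCreate()

    // Read incoming data, apply a transformation, and write partitioned Parquet
    val incoming = spark.read.option("header", "true").csv("hdfs:///landing/events") // hypothetical path
    incoming.filter("event_type IS NOT NULL")
            .write
            .partitionBy("event_date")
            .parquet("hdfs:///warehouse/events_parquet")

    // Expose the processed results as an external Hive table over the Parquet files
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS events (
        event_id STRING,
        event_type STRING
      )
      PARTITIONED BY (event_date STRING)
      STORED AS PARQUET
      LOCATION 'hdfs:///warehouse/events_parquet'
    """)
    spark.sql("MSCK REPAIR TABLE events") // register the partitions written above

    spark.stop()
  }
}
```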
Confidential, Charlotte, NC
- Created schema documents for Hive tables, and created Hive tables in the data lake bronze layer, both with and without partitions.
- Created HQL files to load data from staging to permanent tables.
- Loaded Hive tables from source files through the file ingestion process.
- Loaded Hive tables from external database server tables through the Sqoop ingestion process.
- Created config files for the data loading process.
- Created and scheduled Autosys jobs for the load process.
- Performed unit testing of the loading process.
- Attended daily standup calls and provided updates on current task status.
- Attended sprint planning meetings and provided estimates for upcoming tasks.
- Supported the application in the production environment and provided fixes for issues.
- Developed enhancements to the existing application.
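The staging-to-permanent load described above can be sketched as HQL run through a Scala Spark session; the schema and table names are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object StagingToPermanent {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("StagingToPermanent")
      .enableHiveSupport()
      .getOrCreate()

    // Allow partitions to be derived from the data itself
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Load from the staging table into the partitioned permanent (bronze) table
    spark.sql("""
      INSERT OVERWRITE TABLE bronze.customers PARTITION (load_date)
      SELECT customer_id, name, email, load_date
      FROM staging.customers
    """)

    spark.stop()
  }
}
```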
Confidential, Boston, MA
- Helped business processes by developing, installing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
- Assessed existing and available data warehousing technologies and methods to ensure the data warehouse/BI architecture met the needs of the business unit and the enterprise and allowed for business growth.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, and loaded data into HDFS.
- Created Hive queries (HQL) that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Worked with Spark and Scala.
- Responsible for end-to-end design and development with Spark SQL to meet requirements.
- Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on formats such as text and CSV files.
- Queried data using Spark SQL on top of the Spark engine, implementing Spark RDDs in Scala.
- Performed DataFrame and Dataset operations on RDDs.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Developed Hive queries for the analysts.
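The MapReduce-to-Spark conversion mentioned above can be illustrated with the classic word count, where the map phase becomes flatMap/map and the shuffle-and-reduce phase becomes reduceByKey; the input and output paths are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("WordCount").getOrCreate()
    val sc = spark.sparkContext

    // MapReduce word count expressed as Spark RDD transformations
    val counts = sc.textFile("hdfs:///input/logs")  // hypothetical path
      .flatMap(_.split("\\s+"))                     // map phase: emit words
      .map(word => (word, 1))                       // map phase: key-value pairs
      .reduceByKey(_ + _)                           // reduce phase: sum counts per word

    counts.saveAsTextFile("hdfs:///output/wordcounts")
    spark.stop()
  }
}
```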
Tools Used: Hadoop, HDFS, Hive (HQL), Sqoop, Oozie, Spark, Spark SQL, Cloudera, PL/SQL, SQL*Plus, Windows NT, UNIX Shell Scripting.
- Responsible for building scalable distributed data solutions using Hadoop.
- Checked Hadoop daemon services and responded to any warning or failure conditions.
- Deployed Hadoop clusters in different modes: standalone, pseudo-distributed, and fully distributed.
- Wrote shell scripts to monitor the health of Hadoop daemon services and to respond to any warning or failure conditions.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Replaced Hive's default Derby metastore with MySQL.
- Executed queries using Hive and developed Map-Reduce jobs to analyze data.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Implemented best practices to create Hive tables with appropriate partitioning methods and to keep data processing consistent with enterprise standards; developed scripts and batch jobs to schedule various Hadoop programs.
- Developed Hive queries for the analysts and for data analysis to meet business requirements.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Developed MapReduce, Hive, and Pig scripts to process data.
- Imported data from RDBMS to HDFS using Sqoop.
- Integrated Hive and HBase using the HBase storage handler.
- Developed a custom Flume event sink responsible for collecting data in real time and storing it in a cache for analysis.
- Analyzed data using Hadoop ecosystem tools such as Hive and Flume.
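A custom Flume sink like the one described above is usually written in Java against Flume's Sink API; this is a simplified Scala sketch of the same idea (the class name, property key, and caching behavior are hypothetical):

```scala
import org.apache.flume.{Channel, Context, Event}
import org.apache.flume.Sink.Status
import org.apache.flume.conf.Configurable
import org.apache.flume.sink.AbstractSink
import scala.collection.mutable

// A simplified custom sink that drains events from the channel and keeps
// the most recent payloads in an in-memory cache for downstream analysis.
class CachingEventSink extends AbstractSink with Configurable {
  private val cache = mutable.Queue[String]()
  private var cacheSize = 1000

  override def configure(context: Context): Unit = {
    cacheSize = context.getInteger("cacheSize", 1000) // hypothetical property
  }

  override def process(): Status = {
    val channel: Channel = getChannel
    val txn = channel.getTransaction
    txn.begin()
    try {
      val event: Event = channel.take()
      if (event != null) {
        cache.enqueue(new String(event.getBody, "UTF-8"))
        while (cache.size > cacheSize) cache.dequeue() // evict oldest entries
        txn.commit()
        Status.READY
      } else {
        txn.commit()
        Status.BACKOFF // channel empty; back off before polling again
      }
    } catch {
      case e: Exception =>
        txn.rollback()
        throw e
    } finally {
      txn.close()
    }
  }
}
```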
Tools Used: Hadoop, Cloudera, Hive, HBase, SQL, HQL, Flume, Kafka, Oozie, Sqoop, Linux, MapReduce, HDFS, MapR, Java, MySQL, Hortonworks.
- Developed personal projects with guidance from database veterans as mentors.
- Created databases.
- Good knowledge of and experience with relational database management systems, including normalization, stored procedures, constraints, querying, joins, keys, indexes, complex views, dynamic SQL, triggers, and cursors.
- Expertise with DDL and DML statements, RDBMS concepts, data dictionaries, and normal forms.
- Wrote stored procedures using temporary tables, views, indexes, and triggers where required, as well as complex queries including correlated subqueries and queries with complex joins and aggregate functions.
- Experience using the TRY...CATCH block introduced in SQL Server 2005.
- Experience writing complex T-SQL queries using inner joins, outer joins, and cross joins.
- SQL Server administration skills, including backups, disaster recovery, database maintenance, user authorization, and database creation.
- Experienced with cross-browser compatibility, having worked with browsers such as Google Chrome, Mozilla Firefox, Internet Explorer, and Safari.
Tools Used: SQL Server 2000/2005, SSIS, Microsoft Office, Windows 2003/XP.