Hadoop Developer Resume
Phoenix, AZ
SUMMARY:
- Over 6 years of overall IT experience, including 5 years in application integration and management across Cloud and Big Data environments.
- Expertise in Hadoop, HDFS, MapReduce and the Hadoop ecosystem, including Hive, HBase, HBase-Hive integration, Pig, Sqoop, Flume, Oozie and ZooKeeper, with working knowledge of the MapReduce/HDFS framework.
- Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, ApplicationMaster, ResourceManager, NodeManager and the MapReduce programming paradigm.
- Knowledge of NoSQL databases such as MongoDB and Cassandra.
- Experience in shell scripting, used extensively with Spark for data processing.
- Developed cross-platform products working with Hadoop file formats such as SequenceFile, RCFile, ORC, Avro and Parquet.
- Experience in Apache Spark, Spark Streaming, Spark SQL and NoSQL databases such as Cassandra and HBase.
- Analyzed data using HiveQL, Pig Latin and MapReduce programs in Java; extended Hive and Pig core functionality by implementing custom UDFs.
- Experience in AWS services like EMR, EC2 and S3.
- Good exposure to and experience with Spark, Scala, Big Data and the AWS stack.
- Hands-on experience creating RDDs and DataFrames from the required input data and performing data transformations using Spark and Scala.
- Hands-on experience with Cloudera and MapR Hadoop environments.
- Experience in writing Maven and SBT scripts to build and deploy Java and Scala Applications.
- Good understanding of Hadoop administration with Cloudera & MapR.
- Used Talend Open Studio to load files into Hive tables and performed ETL aggregations in Hive.
- Experience in performance tuning and monitoring Hadoop clusters by gathering and analyzing metrics from the existing infrastructure using Cloudera Manager.
- Involved in production monitoring using Workflow Monitor; experienced in both development and support environments.
- Experienced with Waterfall, Agile and Scrum software development models.
- Strong knowledge of version control systems like SVN & GIT.
TECHNICAL SKILLS:
Hadoop: HDFS, Spark, Flume, Kafka, Oozie.
Hadoop Distributions: Cloudera, Hortonworks, Azure HDInsight.
Languages: Java, Scala, Python, Linux Shell Scripting, Azure PowerShell.
Scripts: JavaScript, Shell Scripting.
Database: Oracle 10g, MySQL, MSSQL.
NoSQL Databases: HBase, Cassandra, MongoDB.
Web Servers: Apache Tomcat.
Operating Systems: Windows, Linux (CentOS, Ubuntu).
PROFESSIONAL EXPERIENCE:
Confidential, Phoenix, AZ
Hadoop Developer
Roles & Responsibilities:
- Developed various main and service classes in Scala using Spark SQL for requirement-specific tasks (a simplified sketch follows this section).
- Worked on a scalable cluster with 40 GB bandwidth nodes, following a two-step process of data ingestion followed by data processing.
- Hands-on coding in Scala, leveraging Apache Spark through its Scala APIs.
- Performed data preprocessing and cleaning using Hive and Pig.
- Populated HBase tables through automation and used the Spark shell for querying the database.
- Strong familiarity with Hive joins; used HiveQL for querying the databases, eventually building complex Hive UDFs.
- Handled the veracity of data from various sources: SequenceFile, Avro, text, Hive tables, batch streams and log files.
- Wrote wrapper scripts in Unix shell for automation.
- Build Tools used were Maven and SBT.
- Involved in data validation and fixing discrepancies in coordination with the Data Integration team.
- Implemented Spark batch jobs.
- Developed tasks and set up the required environment for running Hadoop in the cloud on various instances.
- Automated complex workflows using the Oozie workflow scheduler.
Environment: MapR Hadoop Distribution, Hive, Python, HBase, Sqoop, Maven builds, Spark, Spark SQL, Oozie, Linux/Unix, Shell Scripting, GIT.
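Illustrative sketch (not the original project code): a minimal Scala driver class of the kind described above, using Spark SQL with Hive support to load, transform and persist data. The database, table and column names are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Minimal Spark SQL driver class in the spirit of the requirement-specific tasks above.
// Database, table and column names are illustrative placeholders.
object CustomerSpendJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CustomerSpendJob")
      .enableHiveSupport() // read and write Hive tables managed on the cluster
      .getOrCreate()

    // Load raw records from a Hive table populated during the data ingestion step
    val raw = spark.sql(
      "SELECT customer_id, txn_amount, txn_date FROM staging_db.transactions")

    // Requirement-specific transformation: aggregate spend per customer per day
    val dailySpend = raw
      .groupBy(col("customer_id"), col("txn_date"))
      .agg(sum("txn_amount").alias("daily_spend"))

    // Persist the result back to Hive for downstream consumers (e.g., HBase loaders)
    dailySpend.write.mode("overwrite").saveAsTable("curated_db.daily_customer_spend")

    spark.stop()
  }
}
```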
Confidential, Irving, TX
Hadoop Developer
Roles & Responsibilities:
- Worked in a team on a 30-node cluster and grew it by adding nodes; additional DataNodes were configured through the Hadoop commissioning process.
- Monitored cluster status and health daily using the Hue UI.
- Created data import pipelines from MySQL and Oracle into HDFS using Sqoop.
- Stored data in AWS S3 in an HDFS-like fashion and performed EMR operations on the data stored in S3.
- Scripted AWS environments using a secured VPC, various data pipelines and a Redshift cluster.
- Wrote Hadoop MapReduce jobs in Java for processing data on HDFS.
- Wrote Spark applications in Scala utilizing Spark Core, DataFrames and Spark SQL (see the sketch after this section).
- Imported data from HDFS/HBase into Spark RDDs.
- Created Hive tables, implemented partitioning, dynamic partitions and bucketing, and created external tables to optimize performance.
- Extensively involved in loading data from the UNIX file system to HDFS.
- Involved in evaluating business requirements and preparing detailed specifications that follow project guidelines for program development.
- Performed CRUD operations in HBase.
- Developed Hive queries to process the data.
- Used Oozie to automate data ingestion and processing events.
- Generated aggregations, groupings and visualizations using Tableau.
Environment: HDFS, MapReduce, Sqoop, Hive, Pig, Oozie, HBase, Yarn, Spark, Tableau, Cloudera Manager.
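A simplified sketch of the Spark/Scala pattern referenced above: reading Sqoop-landed files from HDFS into a DataFrame and appending them to a dynamically partitioned Hive table. The HDFS path, schema and table names are assumptions for illustration only.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object OrdersHdfsToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("OrdersHdfsToHive")
      .enableHiveSupport()
      .config("hive.exec.dynamic.partition.mode", "nonstrict") // allow dynamic partitions
      .getOrCreate()

    // Read delimited files that Sqoop imported from MySQL/Oracle into HDFS (placeholder path)
    val orders = spark.read
      .option("header", "false")
      .option("inferSchema", "true")
      .csv("hdfs:///data/landing/orders/")
      .toDF("order_id", "customer_id", "order_amount", "order_date")

    // Append into a partitioned Hive table so queries scan only the relevant dates
    orders.write
      .mode(SaveMode.Append)
      .partitionBy("order_date")
      .format("parquet")
      .saveAsTable("warehouse_db.orders")

    spark.stop()
  }
}
```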
Confidential, Bentonville, AR
Hadoop Developer
Roles & Responsibilities:
- Involved in all phases of installation and upgrades of the Hadoop big data platform; implemented security for the platform.
- Designed the sequence diagrams to depict the data flow into Hadoop.
- Involved in importing and exporting data between HDFS and Relational Systems like Oracle, MySQL and DB2 using Sqoop.
- Prepared SOPs for product installations, upgrades and other new processes; analyzed encryption methodologies and implemented them in the environment.
- Set up best practices for monitoring; analyzed hardware and software requirements for projects.
- Helped application and operations teams troubleshoot performance issues.
- Implemented partitioning, dynamic partitions and bucketing in Hive for efficient data access.
- Created final tables in Parquet format and used Impala to create and manage the Parquet tables.
- Implemented real-time data ingestion and cluster handling using Apache Kafka.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (an indicative sketch follows this section).
Environment: Hadoop, HDFS, Pig, Hive, Spark, MapReduce, Java.
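An indicative Scala sketch of the Spark-on-YARN analytics described above: querying a Hive table and writing the aggregated output as Parquet files that Impala can also manage and query. Table, column and path names are placeholders, not the original code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Typically submitted with spark-submit --master yarn; cluster settings come from the Cloudera deployment.
object ClickstreamAnalytics {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ClickstreamAnalytics")
      .enableHiveSupport()
      .getOrCreate()

    // Analyze a Hive table (placeholder name) populated by the Kafka ingestion pipeline
    val clicks = spark.table("web_db.clickstream")

    // Count page views per URL
    val pageHits = clicks
      .filter(col("event_type") === "page_view")
      .groupBy(col("page_url"))
      .agg(count(lit(1)).alias("hits"))

    // Store the result as Parquet so Impala can manage and query the same files
    pageHits.write.mode("overwrite").parquet("hdfs:///data/curated/page_hits/")

    spark.stop()
  }
}
```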
Confidential
Hadoop Engineer
Roles & Responsibilities:
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required, applying ETL & BI concepts and testing methodologies.
- Involved in analyzing system failures, identifying root causes and recommending courses of action.
- Designed a data warehouse using Hive.
- Used Hive and MapReduce, and loaded data into HDFS.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Extensively worked on Sqoop for importing metadata from Oracle.
- Extensively used Pig for data cleansing.
- Created partitioned tables in Hive.
- Worked with business teams and created Hive queries for ad hoc access.
- Evaluated usage of Oozie for Workflow Orchestration.
- Mentored analysts and the test team in writing Hive queries.
- Wrote MapReduce programs with the Java API to cleanse structured and unstructured data.
- Worked on loading the data from MySQL to HBase where necessary using Sqoop.
- Launched Amazon EC2 cloud instances using Amazon Machine Images (Linux/Ubuntu) and configured the launched instances for specific applications.
Environment: Hadoop, Map Reduce, HDFS, Hive, Pig, Sqoop, AWS, Java, Oozie, MySQL
Confidential
Associate Engineer
Roles & Responsibilities:
- Created database objects like tables, views, indexes, stored-procedures, triggers, and user defined functions
- Wrote stored procedures and SQL scripts in SQL Server to implement business rules for various clients
- Wrote T-SQL queries for data retrieval
- Wrote and debugged T-SQL stored procedures, views and user-defined functions
- Performed data migration (import & export via BCP) from text files to SQL Server
- Implemented error handling using TRY-CATCH blocks
- Performed normalization and de-normalization of tables
- Developed backup and restore scripts for SQL Server 2008
- Installed and configured SQL Server 2008 with latest service packs
- Customized the stored procedures and database triggers to meet the changing business rules
- Implemented indexes for performance tuning.
- Wrote triggers, stored procedures and T-SQL queries to capture updated and deleted data from OLTP systems
- Designed data models using Erwin; developed physical data models and created DDL scripts to build the database schema and objects
- Wrote T-SQL queries using inner, outer, self and merge joins, and implemented removal of duplicate records using CTEs and ranking functions.
Environment: MS SQL Server 2008, T-SQL, DTS, SSIS/SSRS, SQL Server Enterprise Manager, SQL Profiler, MS Excel, MS Office 2007, Windows 7.