Hadoop Developer Resume
Phoenix, AZ
SUMMARY:
- 5+ years of experience in Information Technology, including analysis, design, development and testing of complex applications.
- Strong working experience with Big Data and the Hadoop ecosystem, including HDFS, Pig, Hive, HBase, YARN, Sqoop, Flume, Oozie, Hue, MapReduce and Spark.
- Extensive experience in analyzing data using HiveQL, Pig Latin and MapReduce programs in Java.
- Implemented POCs on migrating to Spark Streaming to process live data.
- Experienced in Apache Spark for implementing advanced procedures such as text analytics and processing, using its in-memory computing capabilities and Scala.
- Hands-on with real-time data processing using distributed technologies such as Storm and Kafka.
- Used different Spark modules such as Spark Core, RDDs, DataFrames and Spark SQL.
- Converted various Hive queries into the required Spark transformations and actions.
- Experience in importing and exporting data between HDFS and Relational Database Management systems using Sqoop.
- Worked on analyzing Hadoop clusters and various big data analytics tools, including the HBase database and Sqoop.
- Worked with data serialization formats for converting complex objects into sequences of bits, using Avro, Parquet, JSON and CSV.
- Good knowledge of Oracle databases and excellent skills in writing SQL queries and scripts.
- Experience in implementing Kerberos authentication protocol in Hadoop for data security.
- Worked with cloud services like Amazon Web Services (AWS) and involved in ETL, Data Integration and Migration.
- Experience in setting up clusters on Amazon EC2 and S3, including automating the setup and extension of clusters in the AWS cloud.
TECHNICAL SKILLS:
Big Data Technologies: HDFS, Hive, MapReduce, Pig, Sqoop, Oozie, Flume, Kafka, YARN and Spark
Scripting Languages: Shell, Python
Programming Languages: Java, Scala, Python, SQL, C
Hadoop Distributions: Cloudera, Hortonworks, MapR
NoSQL databases: HBase, Cassandra
Tools: SVN, GitHub, Jenkins, Tableau
Operating systems: UNIX, LINUX, Mac OS and Windows
Databases: Oracle, SQL Server, MySQL.
PROFESSIONAL EXPERIENCE:
Confidential, Phoenix AZ
Hadoop Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) to develop the application.
- Developed a Kafka consumer component in Java for near real-time data processing (a sketch follows this section).
- Worked on loading and transforming large sets of structured data using Spark Streaming.
- Loaded data into Spark RDDs and DataFrames and performed in-memory computation for faster response times.
- Took part in designing and developing a custom Java application to pull data from source systems and publish the results to a specific Kafka topic.
- Developed Spark jobs using Java and Spark SQL to migrate SQL procedures.
- Created Kafka connectors to pull data from the database and publish it to Kafka.
- Imported real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports.
- Migrated historical ingestion data from the data warehouse into Hive on the big data platform.
- Involved in loading data in different formats from the Unix file system into HDFS.
- Worked with different data sources such as XML files, JSON files, SQL Server and DB2 to load data into Hive tables and HDFS.
- Created external and staging tables, joined tables as per requirements, and built multiple data pipelines.
- Worked on performance tuning of Hive and Spark jobs.
- Used HiveQL for data analysis and to import structured data into specified tables for reporting.
- Monitored and tracked the issues within the team using JIRA.
- Developed the application following Agile methodology.
Environment: Hadoop, Java, Linux, MapR, SQL, Kafka, Hive, Spark, Oracle, DB2, Netezza, Oozie, Informatica, JIRA, Rally.
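A minimal sketch of the kind of Java Kafka consumer described above; the broker address, consumer group, topic name and downstream handler are illustrative placeholders, not the actual project code.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class TransactionConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");        // placeholder broker
            props.put("group.id", "near-realtime-processors");     // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("source-events")); // placeholder topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Hand each message to the downstream transformation / load step
                        process(record.value());
                    }
                }
            }
        }

        private static void process(String message) {
            // Placeholder for the actual near real-time processing logic
            System.out.println(message);
        }
    }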
Confidential, New Jersey NJ
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Loaded data into Spark RDDs and performed in-memory computation for faster response times.
- Developed Spark jobs and Hive Jobs to transform data.
- Developed Spark scripts in Python, writing custom RDD transformations and performing actions on RDDs.
- Worked on the Oozie workflow engine for job scheduling; imported and exported data into HDFS and Hive using Sqoop.
- Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of the data by date.
- Developed Kafka consumer component for Real-Time data processing in Java and Scala.
- Used Impala to query Hive tables for faster query response times.
- Imported real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports.
- Created Partitioned and Bucketed Hive tables in Parquet and Avro File Formats with Snappy compression and then loaded data.
- Wrote Hive queries using Spark SQL, which integrates with the Spark environment.
- Developed MapReduce programs to parse the raw JSON data and store the refined data in tables.
- Used Kafka to load data into HDFS and move data into HBase.
- Captured the data logs from web server into HDFS using Flume for analysis.
- Worked on moving data pipelines from CDH cluster to run on AWS EMR.
- Involved in moving data from HDFS to AWS Simple Storage Service (S3) and extensively worked with S3 bucket in AWS.
- Developed a Spark application to filter JSON source data in an AWS S3 location and store it in HDFS with partitions, and used Spark to extract the schema of the JSON files (a sketch follows this section).
- Responsible for migrating the code base from the Cloudera platform to Amazon EMR and evaluated Amazon ecosystem components such as Redshift.
Environment: Linux, Hadoop, Python, Scala, CDH, SQL, Sqoop, HBase, Hive, Spark, Oozie, Cloudera Manager, Oracle, Windows, YARN, Spring, Sentry, AWS, S3.
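A minimal sketch (in Java, for consistency with the other examples here) of the kind of Spark job described above: reading JSON from S3, letting Spark extract the schema, filtering, and writing partitioned output to HDFS. The bucket, paths, predicate and partition column are hypothetical placeholders.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class JsonToHdfs {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("json-filter")            // placeholder app name
                    .getOrCreate();

            // Read raw JSON from S3; Spark infers (extracts) the schema automatically
            Dataset<Row> raw = spark.read().json("s3a://source-bucket/events/");  // placeholder path
            raw.printSchema();

            // Filter the records of interest and write them to HDFS, partitioned by date
            raw.filter("status = 'ACTIVE'")                                       // placeholder predicate
               .write()
               .partitionBy("event_date")                                         // placeholder column
               .parquet("hdfs:///data/refined/events/");                          // placeholder path

            spark.stop();
        }
    }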
Confidential, Richardson, TX
Hadoop Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) to develop the application.
- Loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
- Generated Java APIs for retrieval and analysis on the NoSQL Cassandra database.
- Helped with the sizing and performance tuning of the Cassandra cluster.
- Developed Hive queries to process the data and generate the results in a tabular format.
- Handled importing of data from multiple data sources using Sqoop, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Worked on extracting data from CSV and JSON files and storing it in Avro and Parquet formats.
- Implemented partitioning and bucketing concepts in Hive and designed both managed and external tables.
- Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications to be implemented in the project.
- Loaded and transformed large sets of structured and semi-structured data using Hive.
- Involved in importing real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports.
- Worked on creating Kafka topics and partitions and writing custom partitioner classes.
- Worked on converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala (a sketch follows this section).
- Used Spark-Streaming APIs to perform necessary transformations and actions on the data.
- Monitored and controlled local file system disk space usage and log files, cleaning log files with automated scripts.
- Involved in writing Oozie jobs for workflow automation.
- Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.
Environment: Unix, Linux, Hortonworks, Scala, HDFS, MapReduce, Hive, Flume, Sqoop, Ganglia, Ambari, Oracle, Ranger, Python, Apache Hadoop, Cassandra.
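A minimal sketch of expressing a HiveQL query as Spark transformations, of the kind described above. It is written in Java with the DataFrame API for consistency with the other examples here (the project itself used RDDs, Python and Scala); the table, columns and HiveQL statement are hypothetical placeholders.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.sum;

    public class HiveToSpark {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-to-spark")     // placeholder app name
                    .enableHiveSupport()
                    .getOrCreate();

            // Original HiveQL (hypothetical):
            //   SELECT region, SUM(amount) FROM sales WHERE year = 2016 GROUP BY region
            // expressed as DataFrame transformations over the same Hive table
            Dataset<Row> sales = spark.table("sales");   // placeholder table
            Dataset<Row> byRegion = sales
                    .filter(col("year").equalTo(2016))
                    .groupBy("region")
                    .agg(sum("amount").alias("total_amount"));

            byRegion.show();
            spark.stop();
        }
    }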
Confidential
JAVA Developer
Responsibilities:
- Performed Requirement Gathering & Analysis by actively soliciting, analyzing and negotiating customer requirements and prepared the requirements specification document for the application using Microsoft Word.
- Developed Use Case diagrams, business flow diagrams, Activity/State diagrams.
- Developed presentation layer using Java Server Faces (JSF) MVC framework.
- Used JSP, HTML and CSS, jQuery as view components in MVC.
- Developed custom controllers for handling requests using Spring MVC.
- Used JDBC to invoke stored procedures and for database connectivity to the SQL database.
- Deployed the applications on WebLogic Application Server.
- Developed RESTful web services using JSON.
- Created and managed Spring Boot microservices that create, update, delete and retrieve data (a sketch follows this section).
- Used Oracle database for tables creation and involved in writing SQL queries using Joins and Stored Procedures.
- Developed JUnit test cases for unit testing the code.
- Worked with configuration management groups for providing various deployment environments set up including System Integration testing, Quality Control testing etc.
Environment: Java/J2EE, SQL, Oracle, JSP, JSON, JavaScript, WebLogic, HTML, JDBC, Spring, Hibernate, XML, JMS, Log4j, JUnit, Servlets, MVC, Eclipse.
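A minimal sketch of a CRUD-style Spring Boot microservice in the spirit described above; the Customer resource, the /customers paths and the in-memory store (standing in for the Oracle-backed persistence) are illustrative placeholders, not the actual project API.

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.*;

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    @SpringBootApplication
    @RestController
    @RequestMapping("/customers")
    public class CustomerServiceApplication {

        // In-memory placeholder store; a real service would use JDBC/Hibernate against Oracle
        private final Map<Long, String> store = new ConcurrentHashMap<>();

        public static void main(String[] args) {
            SpringApplication.run(CustomerServiceApplication.class, args);
        }

        @PostMapping("/{id}")
        public String create(@PathVariable Long id, @RequestBody String name) {
            store.put(id, name);
            return "created";
        }

        @GetMapping("/{id}")
        public String get(@PathVariable Long id) {
            return store.get(id);
        }

        @PutMapping("/{id}")
        public String update(@PathVariable Long id, @RequestBody String name) {
            store.put(id, name);
            return "updated";
        }

        @DeleteMapping("/{id}")
        public String delete(@PathVariable Long id) {
            store.remove(id);
            return "deleted";
        }
    }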