
Hadoop Developer Resume


SUMMARY:

  • Almost 2 years of professional IT experience, with expertise in Hadoop, HDFS, HBase, Hive, Pig, Sqoop, Flume, Spark, Linux, SQL, and NoSQL.
  • Certified Hadoop Developer.
  • Passionate about working on Big Data/Hadoop Technologies.
  • Implemented innovative solutions for big data analysis using ecosystem components such as Sqoop, Flume, Hive, Pig, HBase, Spark, and ZooKeeper.
  • Strong understanding and extensive knowledge of Hadoop architecture and its components, including NameNode, DataNode, JobTracker, TaskTracker, and MapReduce.
  • Hands-on experience with Hive Query Language and debugging Hive issues.
  • Expertise in data loading tools like Sqoop and Flume.
  • Experience in analyzing and processing streaming data into HDFS using Kafka with Spark.
  • Experience in developing MapReduce programs in Java to process and perform analytics on Big Data.
  • Excellent understanding of NoSQL databases, including Cassandra, HBase, and MongoDB.
  • Wrote custom UDFs for Hive and Pig in Java to analyze data efficiently (a minimal sketch of one such UDF follows this list).
  • Experience in using Sqoop to import and export data between RDBMS and HDFS.
  • Experience in the full Software Development Life Cycle, including analysis, design, implementation, deployment, and maintenance of web applications.
  • Worked with software development models including the Waterfall model and the Agile methodology.
  • Excellent interpersonal and problem-solving skills; a good team player, quick to learn new concepts, and a hard worker.
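
As referenced in the UDF bullet above, the following is a minimal, hypothetical sketch of a custom Hive UDF in Java; the class name and the normalization it performs are illustrative assumptions, not taken from a specific project.

    // Hypothetical Hive UDF in Java: trims and lower-cases a string column
    // so that values can be grouped consistently during analysis.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class NormalizeText extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;          // pass nulls through unchanged
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Packaged into a JAR, such a UDF would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from a query.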

TECHNICAL SKILLS:

Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, ZooKeeper, and YARN.

Programming Languages: Java, SQL, Scala, C/C++, Python.

Operating Systems: Linux, Unix, Windows.

Databases: SQL Server, MySQL, Oracle, Cassandra, HBase, MongoDB.

Other Concepts: OOP, Data Structures, Algorithms, ETL, Data Analytics.

PROFESSIONAL EXPERIENCE:

Confidential

Hadoop Developer

Responsibilities:
  • Wrote MapReduce code in Java for data cleaning and preprocessing (a minimal sketch appears after the Environment line below).
  • Imported and exported data between RDBMS, HDFS, and Hive using Sqoop.
  • Loaded data from the UNIX file system into HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Worked with multiple input formats such as TextInputFormat, KeyValueTextInputFormat, and SequenceFileInputFormat.
  • Created HBase tables to store PII data arriving in variable formats from different portfolios.
  • Provided cluster coordination services through ZooKeeper.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Created Hive tables and worked on them using HiveQL.
  • Wrote MapReduce jobs using Pig Latin.
  • Analyzed large data sets using Pig scripts and Hive queries.
  • Developed Spark code in Scala and in the Spark SQL environment for faster testing and processing of data.
  • Loaded data into Spark RDDs and performed in-memory computations to generate output responses with less memory usage.
  • Developed workflows in Oozie to automate loading data into HDFS and pre-processing it with Pig.
  • Imported and processed structured, semi-structured, and unstructured data using MapReduce, Hive, and Pig.
  • Created Hive external and internal tables, loaded data, and wrote Hive queries.
  • Analyzed and processed streaming data into HDFS using Kafka with Spark.
  • Worked on NoSQL databases such as Cassandra, MongoDB, and HBase.
  • Used SequenceFile, RCFile, Avro, and HAR file formats.

Environment: Hadoop, HDFS, MapReduce, YARN, Java, Pig, Spark, Hive, Sqoop, Flume, Cassandra, Cloudera, MongoDB, HBase, Linux, SQL, MySQL, Oozie.
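
The following is a minimal sketch of the kind of data-cleaning Mapper referenced in the first responsibility above; the expected column count and the tab delimiter are assumptions made purely for illustration.

    // Illustrative data-cleaning Mapper in Java: drops malformed records and
    // emits trimmed, tab-separated fields. Column count and delimiter are assumed.
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CleaningMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 10;   // assumed schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);
            if (fields.length != EXPECTED_FIELDS) {
                return;                                   // skip malformed records
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append('\t');
                }
                cleaned.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(cleaned.toString()));
        }
    }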

Confidential

Data Analyst

Responsibilities:
  • The YouTube data set consists of more than 25 columns separated by '\t'.
  • Loaded the data from the local file system into HDFS.
  • Wrote a Pig script to find the top 5 categories with the maximum number of videos uploaded (an equivalent of this aggregation is sketched after this list).
  • Wrote a Pig script to find the top 10 rated videos on YouTube.
  • Used a Pig script to find which video has the maximum number of views.
  • Used a Pig script to find which video has the maximum number of comments.
  • Used Flume to collect the streaming data.
  • Configured the source, sink, and channel in the Flume configuration file to collect the streaming data.
  • Started the Flume agent, which fetches data from Twitter and sends it into HDFS.
  • After some time, stopped fetching data using 'Ctrl + C'.
  • Checked the sink HDFS folder to verify whether the data was stored.
  • Because the data from Twitter is in JSON format, a Pig JSON loader is required.
  • Registered the required JARs for the JSON loader.
  • The tweets are in nested JSON format and contain map data types, so I used the Elephant Bird JsonLoader, which supports maps, to load them.
  • Wrote a complex Pig script to find the time zone and the average rating per topic.
  • The Titanic data set consists of 12+ columns separated by commas.
  • Loaded the data into HDFS.
  • Loaded the data in the Pig Grunt shell using the LOAD command.
  • Wrote a Pig script to find the average age of males and females who died in the Titanic tragedy.
  • Wrote a Pig script to find how many people died and how many survived in each class, along with their genders and ages.
  • Stored the results in HDFS.
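
The resume describes the category analysis above as a Pig script; purely as an illustration in Java, the sketch below computes the same "top 5 categories by number of uploaded videos" result. The input file name and the position of the category column are assumptions.

    // Illustrative Java equivalent of the top-5-categories Pig script.
    // "youtube_data.tsv" and category column index 3 are assumed for this example.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;

    public class TopCategories {
        public static void main(String[] args) throws IOException {
            Map<String, Long> counts = new HashMap<>();
            // Count uploaded videos per category (tab-separated input).
            for (String line : Files.readAllLines(Paths.get("youtube_data.tsv"))) {
                String[] cols = line.split("\t", -1);
                if (cols.length > 3) {
                    counts.merge(cols[3].trim(), 1L, Long::sum);
                }
            }
            // Print the five categories with the most uploads.
            counts.entrySet().stream()
                  .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                  .limit(5)
                  .forEach(e -> System.out.println(e.getKey() + "\t" + e.getValue()));
        }
    }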

Mobile Contacts: Using C++ and data structures concepts, I wrote a menu-driven program for managing mobile contacts. With this code, one can view, add, delete, update, and search contacts.
