Hadoop Developer Resume
SUMMARY:
- Nearly 2 years of professional IT experience, with expertise in Hadoop, HDFS, HBase, Hive, Pig, Sqoop, Flume, Spark, Linux, SQL, and NoSQL.
- Certified Hadoop Developer.
- Passionate about working on Big Data/Hadoop Technologies.
- Implemented innovative solutions for big data analysis using Hadoop ecosystem components such as Sqoop, Flume, Hive, Pig, HBase, Spark, and ZooKeeper.
- Strong understanding and extensive knowledge of HDFS architecture and its components, including NameNode, DataNode, JobTracker, TaskTracker, and MapReduce.
- Hands-on experience with Hive Query Language (HiveQL) and debugging Hive issues.
- Expertise in data loading tools like Sqoop and Flume.
- Experience in analyzing and processing streaming data into HDFS using Kafka with Spark.
- Experience in developing MapReduce programs in Java to process and perform analytics on Big Data.
- Excellent understanding of NoSQL databases, including Cassandra, HBase, and MongoDB.
- Wrote custom UDFs for Hive and Pig in Java to analyze data efficiently.
- Experience in using Sqoop to import and export data between RDBMS and HDFS.
- Experience in Full Software Development Life Cycle including Analysis, Design, Implementation, Deployment and Maintenance of web applications.
- Worked with both the Waterfall model and Agile software development methodologies.
- Excellent interpersonal and problem-solving skills; a hard-working team player with the ability to learn new concepts quickly.
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, ZooKeeper, and YARN.
Programming Languages: Java, SQL, Scala, C/C++, Python.
Operating Systems: Linux, Unix, Windows.
Databases: SQL Server, MySQL, Oracle, Cassandra, HBase, MongoDB.
Other Concepts: OOP, Data Structures, Algorithms, ETL, Data Analytics.
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Developer
Responsibilities:
- Wrote MapReduce code in Java for data cleaning and preprocessing (see the illustrative sketch below).
- Imported and exported data between RDBMS, HDFS, and Hive using Sqoop.
- Loaded data from the UNIX file system into HDFS.
- Installed and configured Hive and wrote custom Hive UDFs (see the UDF sketch below).
- Worked with multiple input formats such as TextInputFormat, KeyValueTextInputFormat, and SequenceFileInputFormat.
- Created HBase tables to store PII data arriving in variable formats from different portfolios.
- Provided cluster coordination services through ZooKeeper.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
- Created Hive tables and worked with them using HiveQL.
- Wrote MapReduce jobs using Pig Latin.
- Analyzed large data sets using Pig scripts and Hive queries.
- Developed Spark code in Scala and used Spark SQL for faster testing and processing of data.
- Loaded data into Spark RDDs and performed in-memory computations to generate output with lower memory usage.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
- Imported and processed structured, semi-structured, and unstructured data using MapReduce, Hive, and Pig.
- Created Hive external and internal tables, loaded data into them, and wrote Hive queries.
- Analyzed and processed streaming data into HDFS using Kafka with Spark.
- Worked on NoSQL databases such as Cassandra, MongoDB, and HBase.
- Used SequenceFile, RCFile, Avro, and HAR file formats.
Environment: Hadoop, HDFS, MapReduce, YARN, Java, Pig, Spark, Hive, Sqoop, Flume, Cassandra, Cloudera, MongoDB, HBase, Linux, SQL, MySQL, Oozie.
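The following is a minimal, illustrative sketch of the kind of Java data-cleaning mapper described in the first responsibility above; the column count, delimiter, and cleaning rule are assumptions for illustration, not the actual project code.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical cleaning mapper: drops malformed records and trims whitespace
// so that downstream Hive/Pig analysis sees only well-formed rows.
public class CleaningMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private static final int EXPECTED_FIELDS = 10; // assumed column count

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t", -1);
        if (fields.length != EXPECTED_FIELDS) {
            return; // skip malformed rows
        }
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                cleaned.append('\t');
            }
            cleaned.append(fields[i].trim());
        }
        context.write(new Text(cleaned.toString()), NullWritable.get());
    }
}

A mapper like this would typically run as a map-only job (zero reducers) whose output feeds the Hive and Pig steps listed above.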
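Likewise, a minimal sketch of a custom Hive UDF of the sort mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the masking behavior and the class name are hypothetical examples rather than the production UDF.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that masks all but the last four characters of a value,
// e.g. for PII columns of the kind noted above.
public class MaskLastFourUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String s = input.toString();
        if (s.length() <= 4) {
            return new Text(s);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < s.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(s.substring(s.length() - 4));
        return new Text(masked.toString());
    }
}

After packaging the class into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL queries.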
Confidential
Data Analyst
Responsibilities:
- Worked with a YouTube data set consisting of more than 25 columns separated by '\t' (tab).
- Loaded the data from the local file system into HDFS.
- Wrote a Pig script to find the top 5 categories with the maximum number of videos uploaded.
- Wrote a Pig script to find the top 10 rated videos on YouTube.
- Wrote a Pig script to find which video has the maximum number of views.
- Wrote a Pig script to find which video has the maximum number of comments.
- Used Flume to collect streaming data from Twitter.
- Configured the source, sink, and channel in the Flume configuration file to collect the streaming data.
- Started the Flume agent, which fetched data from Twitter and sent it into HDFS.
- Stopped the data fetching after some time using Ctrl+C.
- Checked the HDFS sink folder to verify whether the data was stored.
- Because the Twitter data is in JSON format, a Pig JSON loader was required.
- Registered the JARs required for the JSON loader.
- The tweets are in nested JSON format and contain map data types, so I used the Elephant Bird JSON loader, which supports maps, to load them.
- Wrote a complex Pig script to find the time zone and average rating per topic.
- Worked with a Titanic passenger data set consisting of 12+ comma-separated columns.
- Loaded the data into HDFS.
- Loaded the data into the Pig Grunt shell using the LOAD command.
- Wrote a Pig script to find the average age of males and females who died in the Titanic tragedy.
- Wrote a Pig script to find how many people died and how many survived in each class, along with their genders and ages.
- Stored the results into HDFS.
Mobile Contacts: Using C++ and data structures concepts, I wrote a menu-driven program for managing mobile contacts. With this program, contacts can be viewed, added, deleted, updated, and searched.