
Sr. Hadoop Developer Resume


Lawrenceville, NJ

SUMMARY

  • Around 7 years of IT experience across a variety of industries, including 4+ years of hands-on experience in Big Data analytics and development
  • Expertise with the tools in Hadoop Ecosystem including HDFS, MapReduce, Hive, Pig, Impala, Sqoop, Spark, Yarn, Oozie
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm
  • Experienced with major Hadoop ecosystem projects such as Pig, Hive, and HBase, and with monitoring them using Cloudera Manager
  • Extensive experience in using Hive Query Language for data analytics
  • Experience in migrating data from HDFS to relational database systems and vice versa using Sqoop
  • Good knowledge of Oozie for job scheduling and monitoring
  • Experience in designing and developing applications in Spark using Python to compare the performance of Spark with Hive and SQL/Oracle
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured data and semi-structured data
  • Have extensive experience in building and deploying applications on Web/Application Servers like IBM WebSphere
  • Strong experience on Hadoop distributions like Cloudera
  • Experience working on a 35-node cluster at 10 TB-2 PB scale running CDH 5.9.x
  • Strong experience in database design, writing complex SQL Queries and Stored Procedures
  • Worked closely with product management and development teams to rapidly translate customer data and requirements into products and solutions
  • Excellent communication skills and experienced in client interaction while providing technical support and knowledge transfer
  • Strong problem-solving, communication, and interpersonal skills; a good team player
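
As an illustration of the MapReduce programming paradigm referenced above, here is a minimal pure-Python sketch of its map, shuffle, and reduce phases (a word count over hypothetical input; real jobs would run distributed on Hadoop):

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (word, 1) pairs from each input line."""
    for line in records:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group values by key, as Hadoop does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts emitted for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

# Hypothetical input lines standing in for HDFS blocks
lines = ["big data analytics", "big data development"]
counts = reduce_phase(shuffle(map_phase(lines)))
```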

TECHNICAL SKILLS

Big Data Technologies: Hadoop, HDFS, MapReduce, Spark, Hive, Sqoop, Pig, Impala, Oozie, Yarn, HBase, Basics of HTML.

Programming Languages: Linux programming, Python (basics), HQL, MySQL, C

Microsoft Suite: Advanced Excel (V-lookup, Pivot Tables), PowerPoint, Project

Tools: CDH 5.9.1, Cloudera Navigator, Cloudera Manager (basics), IBM WebSphere Application Server, VMware, Wireshark.

KEY COURSES: Big Data Analytics, Data Warehousing Concepts, Multivariate Data Analysis, Data Mining, Database Management Systems, Software Project Management, Computer Network Design and Analysis, Computer Communication Networks

PROFESSIONAL EXPERIENCE

Confidential - Lawrenceville, NJ

Sr. Hadoop Developer

Responsibilities:

  • Monitoring day-to-day job activity in the Production and Staging environments.
  • Working with Spark and Hadoop components and managing analytics services for digital media systems, including identifying and implementing products and services for web-based delivery.
  • Working on transferring data from the Acheron and Hades APIs to the Mercury endpoint, with the end goal of deprecating the Analytics Kafka cluster.
  • Demonstrating competence in full-lifecycle management of REST and web service APIs, including specification and system deployment.
  • Working with Google Cloud and Amazon EC2 instances, tracking real-time data using Spark and Apache Kafka and storing it in the backend Hadoop Distributed File System.
  • Optimizing Hadoop MapReduce code, Java code, Scala code, and shell scripts for better scalability, reliability, and performance, and scheduling jobs in Task Forest.
  • Deploying software components into cloud-based infrastructures, caching and scaling out using cloud APIs.
  • Developing code in Scala, Java, Spark SQL, Oracle PL/SQL, and MySQL, using the IntelliJ IDEA IDE and Databricks.

Environment: Spark 1.6 and 2.x, Hadoop 2.x, Hive, Airflow, GCS, Linux, VMware 14.0.0, Cloudera 5.9.1, MySQL

Confidential, Woonsocket, RI

Sr. Hadoop developer

Responsibilities:

  • Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, the HBase database, and Sqoop.
  • Worked with the data management team on data integration, focusing on data quality and data profiling to integrate, transform, and provision data.
  • Designed and Implemented real-time Big Data processing to enable real-time analytics, event detection and notification for Data-in-Motion.
  • Involved in importing data from Weblog and Apps log using Flume.
  • Involved in Map Reduce and Hive Optimization.
  • Involved in importing data from Oracle to HDFS using Sqoop.
  • Involved in writing MapReduce programs and Hive queries to load and process data in the Hadoop file system.
  • Involved in creating Hive tables, loading them with data, and extensively writing Hive queries.
  • Worked on performance tuning of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Loaded data into Spark RDDs and performed in-memory computation to generate output responses.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Experience in scripting for automation and monitoring using Python.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
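
A typical Sqoop RDBMS-to-HDFS import of the kind described above can be sketched in Python as command-line assembly for an automation script; the JDBC URL, table, and target directory below are placeholders, not values from this engagement:

```python
def build_sqoop_import(jdbc_url, username, table, target_dir, num_mappers=4):
    """Assemble a `sqoop import` command line for an RDBMS-to-HDFS load.

    Password handling (e.g. --password-file) is omitted for brevity.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # e.g. jdbc:oracle:thin:@//host:1521/ORCL
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,         # HDFS directory for the imported files
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]

# Placeholder connection details for illustration only
cmd = build_sqoop_import(
    "jdbc:oracle:thin:@//dbhost:1521/ORCL", "etl_user",
    "ORDERS", "/user/etl/orders",
)
```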

Environment: Linux, Hadoop 2.x and 3.x HDFS cluster with Cloudera Manager 5.10.x, HDFS, Hive, Pig, MapReduce, Spark, Sqoop, Oozie, Flume, Teradata, Scala, HBase, SQL, Unix.

Confidential

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop
  • Involved in loading data from Oracle database into HDFS using Sqoop queries
  • Developed Spark scripts using Python in Jupyter notebooks as per requirements.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries
  • Worked on tuning the performance of Hive queries
  • Implemented Partitioning, Dynamic Partitions, Buckets in HIVE
  • Responsible to manage data coming from different sources
  • Configured Time Based Schedulers that get data from multiple sources parallel using Oozie work flows
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
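
The Hive partitioning and bucketing work mentioned above usually comes down to DDL of the following shape; this sketch assembles the HiveQL as a Python string (the table and column names are hypothetical, not from this engagement):

```python
def partitioned_bucketed_ddl(table, columns, partition_col, bucket_col, num_buckets):
    """Build HiveQL for a table partitioned on one column and
    clustered (bucketed) on another, stored as ORC."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"PARTITIONED BY ({partition_col} STRING) "
        f"CLUSTERED BY ({bucket_col}) INTO {num_buckets} BUCKETS "
        f"STORED AS ORC"
    )

# Hypothetical example: daily-partitioned events, bucketed by user id
ddl = partitioned_bucketed_ddl(
    "events", [("user_id", "BIGINT"), ("action", "STRING")],
    "event_date", "user_id", 32,
)
```

Bucketing on a high-cardinality column such as a user id spreads rows evenly across files, while the date partition lets queries prune whole directories.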

Environment: Hadoop 1.x, MapReduce, HDFS, Pig, Hive, Python, Oozie, Java, Linux, Cloudera 5.9.1, MySQL
