
Big Data Engineer Resume


San Jose, CA

SUMMARY:

  • Around 7 years of experience across Big Data and Java, including extensive work on Big Data technologies and development of web applications in multi-tiered environments using Hadoop, Spark, Scala, HBase, Java, Sqoop, and Kafka.
  • Hands-on experience with Hadoop, HDFS, MapReduce, and Hadoop ecosystem components such as HBase.
  • Experience working with various RDBMSs, with good hands-on experience in MySQL.
  • Experience writing SQL queries based on business requirements.
  • Experience developing data pipelines through the Kafka-Spark API.
  • Experience loading data into Spark SchemaRDDs/DataFrames and querying them using Spark SQL (see the sketch after this list).
  • Good at writing custom RDDs in Scala and implementing design patterns to improve performance.
  • Excellent understanding of the Hadoop Distributed File System and experienced in developing efficient MapReduce jobs to process large datasets.
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop.
  • Experienced with different file formats such as fixed-width, CSV, text, sequence files, XML, JSON, and Avro.
  • Worked with NoSQL databases such as HBase to store structured and unstructured data.
  • Experience with object-oriented programming in Java, including Core Java.
  • Worked within Agile/Scrum software development frameworks for managing product development.
  • Strong analytical and problem-solving skills. Willingness and ability to quickly adapt to new environments and learn new technologies.
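
As one illustration of the Spark SQL bullet above, here is a minimal Scala sketch; the input path and the sellerId/amount fields are hypothetical, not taken from any of the projects below:

```scala
import org.apache.spark.sql.SparkSession

object OrdersQuerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-sql-sketch")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()

    // Hypothetical JSON input with sellerId and amount fields.
    val orders = spark.read.json("hdfs:///data/orders.json")
    orders.createOrReplaceTempView("orders")

    // Query the registered view with Spark SQL.
    spark.sql(
      """SELECT sellerId, SUM(amount) AS total
        |FROM orders
        |GROUP BY sellerId
        |ORDER BY total DESC""".stripMargin)
      .show()

    spark.stop()
  }
}
```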

TECHNICAL SKILLS:

Hadoop Ecosystem: Kafka, Spark, Sqoop, MapReduce, HDFS, ZooKeeper

Databases: Oracle, MySQL

Methodologies: Agile, Scrum

NoSQL: HBase

Languages: Java, Scala, HTML, SQL, Python

Others: Eclipse, Maven, JIRA, Git, Linux

PROFESSIONAL EXPERIENCE:

Big Data Engineer

Confidential, San Jose, CA

Responsibilities:

  • Developed a Spark job to consume data from a Kafka topic and validate it before pushing it into HBase and Oracle databases.
  • Developed Spark jobs to perform various analytics on the Oracle database.
  • Developed a Spark Streaming and Spark SQL job with windowing functions to find the highest monthly revenue per seller (see the sketch after this list).
  • Developed exception-handling code that pushes failed records into an exception Kafka topic.
  • Involved in requirement analysis, development, and documentation.
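
A minimal sketch of a job like the windowed revenue aggregation above, using Spark Structured Streaming; the sales topic, broker address, CSV payload layout, and the fixed 30-day window (an approximation of a calendar month) are all assumptions:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object SellerRevenueSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("seller-revenue-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical "sales" topic carrying CSV payloads: sellerId,amount,eventTime
    val sales = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "sales")
      .load()
      .selectExpr("CAST(value AS STRING) AS line")
      .select(split($"line", ",").as("f"))
      .select(
        $"f".getItem(0).as("sellerId"),
        $"f".getItem(1).cast("double").as("amount"),
        $"f".getItem(2).cast("timestamp").as("eventTime"))

    // Revenue per seller per 30-day window; a true calendar month would need a
    // derived month column, since window() only supports fixed durations.
    val monthly = sales
      .withWatermark("eventTime", "1 day")
      .groupBy(window($"eventTime", "30 days"), $"sellerId")
      .agg(sum($"amount").as("revenue"))

    monthly.writeStream
      .outputMode("update")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```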

Environment: Scala, Apache Kafka, Apache Spark, Spring Framework, Spring Boot, Hive, HBase

Big Data Engineer

Confidential, Tampa, FL

Responsibilities:

  • Designed a Kafka producer client using Confluent Kafka and produced events into a Kafka topic (see the sketch after this list).
  • Subscribed to the Kafka topic with a Kafka consumer client and processed the events in real time using Spark.
  • Developed RESTful APIs using the Spring framework.
  • Generated code automatically from Avro schemas using the Apache Avro plug-in.
  • Used the Avro serializer and deserializer when developing the Kafka clients.
  • Good knowledge of defining Avro schemas.
  • Good knowledge of microservice architecture.
  • Involved in Agile methodologies, daily Scrum meetings, and sprint planning.
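
A minimal sketch of a Confluent Kafka producer with the Avro serializer, as described above; the PaymentEvent schema, topic name, and Schema Registry URL are hypothetical:

```scala
import java.util.Properties
import org.apache.avro.{Schema, SchemaBuilder}
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

object AvroProducerSketch {
  // Hypothetical event schema; real schemas would live in the Schema Registry.
  val schema: Schema = SchemaBuilder.record("PaymentEvent").fields()
    .requiredString("id")
    .requiredDouble("amount")
    .endRecord()

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
      "io.confluent.kafka.serializers.KafkaAvroSerializer")
    props.put("schema.registry.url", "http://localhost:8081")

    val producer = new KafkaProducer[String, GenericRecord](props)
    val event: GenericRecord = new GenericData.Record(schema)
    event.put("id", "txn-001")
    event.put("amount", 42.50)

    // The serializer registers/looks up the schema and writes Avro binary.
    producer.send(new ProducerRecord("payments", event.get("id").toString, event))
    producer.flush()
    producer.close()
  }
}
```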

Environment: Apache Kafka, Scala, Spring Framework, Spring Boot, Bitbucket, Spark

Big Data Engineer

Confidential

Responsibilities:

  • Involved in requirement analysis, development, and documentation.
  • Developed scripts to schedule various Sqoop jobs.
  • Developed MapReduce programs to clean and aggregate the data.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience in Hadoop development using HDFS, MapReduce, Hive, and Sqoop.
  • Hands-on coding and scripting (automation) experience using OO languages such as Java and Python.
  • Created relational and dimensional models for online services such as online banking and automated bill pay.
  • Defined and deployed monitoring, metrics, and logging systems on AWS.
  • Worked on Apache Spark with the Scala programming language to transfer data in a much faster and more efficient way.
  • Experienced in ingesting real-time data into HBase from Kafka through Spark Streaming (see the sketch after this list).
  • Developed Spark Streaming applications for real-time processing.
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile, and network devices, and pushed it to HDFS.
  • Developed an HBase data model on top of HDFS data to perform real-time analytics using Java.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Worked extensively on development and maintenance of Hadoop applications using Java and MapReduce.
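
A minimal sketch of the Kafka-to-HBase ingestion pattern above, using the DStream-based Spark Streaming API; the weblogs topic and table, the "d" column family, and the broker address are assumptions:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object KafkaToHBaseSketch {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("kafka-to-hbase"), Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "weblog-ingest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("weblogs"), kafkaParams))

    // One HBase connection per partition; one Put per record.
    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("weblogs"))
        records.foreach { rec =>
          // Fall back to a random row key when the Kafka message key is null.
          val rowKey = Option(rec.key()).getOrElse(java.util.UUID.randomUUID().toString)
          val put = new Put(Bytes.toBytes(rowKey))
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("raw"), Bytes.toBytes(rec.value()))
          table.put(put)
        }
        table.close()
        conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```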

Environment: Hadoop, HDFS, HBase, Spark, Kafka, Java, MapReduce, Python, Sqoop, ZooKeeper, Cloudera, NoSQL

Software Engineer

Confidential

Responsibilities:

  • Developed Hadoop jobs to parse raw Hadoop logs and convert them into the easier-to-work-with Avro format.
  • Developed MapReduce jobs that read the Avro data, aggregate it per hour, and write the results back out in Avro format (see the sketch after this list).
  • Collaborated with the team to tune and optimize the MapReduce jobs.
  • Coordinated with the team every week to review progress and update goals.
  • Developed MRUnit test cases for the project.
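
A minimal sketch of the per-hour aggregation above, reduced to a plain-text mapper for brevity (the actual jobs read and wrote Avro); the tab-delimited epoch-millis line format is hypothetical:

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Mapper

// Maps each log line to its hour bucket so a summing reducer can
// aggregate per hour. Assumes lines of the form "<epochMillis>\t<rest>".
class HourBucketMapper extends Mapper[LongWritable, Text, Text, LongWritable] {
  private val one = new LongWritable(1)
  private val hourKey = new Text()

  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, LongWritable]#Context): Unit = {
    val millis = value.toString.split("\t", 2)(0).toLong
    val hour = Instant.ofEpochMilli(millis).truncatedTo(ChronoUnit.HOURS)
    hourKey.set(hour.toString)  // e.g. "2015-06-01T13:00:00Z"
    ctx.write(hourKey, one)     // reducer sums counts per hour bucket
  }
}
```

A mapper like this is the kind of unit MRUnit's MapDriver can exercise in isolation, which matches the MRUnit test cases mentioned above.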

Environment: Java, Python, Hadoop, HDFS, Avro
