Sr. Bigdata Engineer Resume

SUMMARY

14+ years of extensive IT experience in analysis, architecture, design, development, testing, maintenance, and user training of software applications, including 4+ years working with the Apache Hadoop ecosystem and Apache Spark and 10+ years in Java/J2EE.
Hands-on experience developing and deploying enterprise-based applications using major components of the Hadoop ecosystem, such as Hadoop 2.x, YARN, Hive, Pig, MapReduce, Sqoop, Spark, Scala, Kafka, and Oozie.
Good knowledge of handling messaging services using Apache Kafka.
Experienced in using Spark to improve the performance of and optimize existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
Expertise in using Spark SQL with various data sources such as JSON, Parquet, and Hive.
Experience with Hadoop distributions such as Cloudera and Hortonworks.
Experience in transferring data from RDBMS to HDFS and Hive tables using Sqoop.
Experience in creating tables, partitioning, bucketing, loading, and aggregation using Hive.
Experience migrating code from Hive to Spark/PySpark in Scala/Python using Spark SQL and Spark window functions.
Extensive experience in Spring Core, Spring IoC, Spring MVC, Spring Web Flow, Spring Batch, Spring Security, Spring Boot for microservices, the Hibernate framework, iBatis, and AJAX.
Experience with Apache Maven for builds, Log4j for logging, and JUnit for unit testing.
Extensive experience in developing Use Cases, Activity Diagrams, Sequence Diagrams and Class Diagrams using Visio.
Sun Certified Java Programmer (SCJP 1.5).
Work with business analysts to understand problems and provide optimal architectural solutions.
Experience working in environments using Agile (Scrum), BDD (Behavior-Driven Development), and Test-Driven Development methodologies.
Excellent team player with good communication, people and leadership skills.

PROFESSIONAL EXPERIENCE

Confidential

Sr. Bigdata Engineer

Responsibilities:

Responsible for developing a data pipeline using Spark, Scala, and Apache Kafka to ingest data from the CSL source and store it in a protected HDFS folder.
Implemented many Kafka ingestion jobs to consume data for real-time and batch processing.
Used HBase to store Kafka topic names, partition numbers, and offset values. Also used the Phoenix JAR to connect to HBase tables.
Used PySpark to create a batch job that merges multiple small files (Kafka stream files) into single larger files in Parquet format.
Implemented the Protegrity API in all Spark/PySpark jobs for writing and reading PCI/PII data to and from HDFS locations or Hive tables.
Implemented multiple functions in PySpark programs, such as the 'UnionAll' function, to combine two Datasets and remove duplicates.
Implemented custom functions in Spark using Scala/Java for map objects.
Developed Autosys scripts to schedule the Kafka streaming and batch jobs.
Involved in creating Hive tables, loading data, and analyzing it using Hive queries.
Used Ambari to monitor node health and job status in Hadoop clusters.
Used Rally for user-story/bug tracking and Bitbucket to check in and check out code changes.

Confidential

Bigdata Developer & Java Tech Lead

Responsibilities:

Responsible for building scalable distributed data solutions using the Hadoop ecosystem and Spark.
Developed Spark applications in Scala for all batch processing.
Imported data from different sources into Spark RDDs for processing. Developed custom aggregate functions using Spark SQL and performed interactive querying.
Utilized Spark DataFrames and the Spark SQL API extensively for all processing.
Integrated Kafka with Spark Streaming for real-time data processing.
Experience in managing and reviewing Hadoop log files. Experience in Hive partitioning, bucketing, and performing joins on Hive tables.
Imported data from relational databases into HDFS and exported the analyzed data back to relational databases using Sqoop.
Developed new libraries with a microservices architecture using REST APIs, Spring Boot, Pivotal Cloud Foundry, and AWS.
Created and configured continuous delivery pipelines for deploying microservices using Jenkins.
