Big Data Engineer Resume

Sunnyvale, CA

SUMMARY

  • Believer in data-driven decisions, with a strong interest in big data engineering and a penchant for solving business problems with technology
  • 2.5+ years of professional experience using Java, SQL, MySQL, and the Big Data ecosystem, including Hive, Pig, MapReduce, and Sqoop
  • Strong knowledge of Apache Spark, Hadoop, Object Oriented Programming, Data Structures, Design Patterns, Algorithms

TECHNICAL SKILLS

  • Big Data Analytics: Hadoop, MapReduce, Spark, Kafka, NiFi, HDFS, NoSQL, Zookeeper, Hive, Pig, HBase, YARN, Flume, Sqoop, Teradata
  • Programming Languages: Java, Scala, Python, Shell Scripting, C, C++
  • Database: MySQL, PL/SQL (Oracle), Cassandra, HBase, NoSQL, MongoDB, SQL Server, SQLite
  • Familiarity: Amazon Redshift, Weka, MLlib, MRUnit, JIRA, Spring Boot, REST APIs, Jackson, JSON, GSON, XML, Avro, Parquet
  • Methodologies: Agile (Scrum, Lean, XP, Crystal, Kanban, FDD, DSDM) and Waterfall
  • Certifications: Spark Scala - CCA175 (in progress), Microsoft DBA

PROFESSIONAL EXPERIENCE

Big Data Engineer

Confidential, Sunnyvale, CA

Responsibilities:

  • Implementing Spark-Scala UDFs for concrete use cases; visualizing User Agent data using Tableau
  • Developing reports by fetching and transforming streaming data pushed through Kafka pipelines for analysis of client data
  • Using Spark SQL to load and transform the data, then identifying and isolating both frequent and consistent changes in it
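As a hedged illustration of the change-isolation step described above (not the original Spark SQL/Scala code), the same logic can be sketched in plain Python over successive record snapshots; the `user_agent` field, `id` key, and the one-change threshold are all hypothetical:

```python
from collections import Counter

def isolate_changes(snapshots, key="user_agent"):
    """Count how often each record's `key` field changes across snapshots,
    then split record ids into 'frequent' (changed more than once) and
    'consistent' (changed exactly once). Field names are illustrative."""
    changes = Counter()
    previous = {}
    for snap in snapshots:
        for rec in snap:
            rid = rec["id"]
            if rid in previous and previous[rid] != rec[key]:
                changes[rid] += 1
            previous[rid] = rec[key]
    frequent = {rid for rid, n in changes.items() if n > 1}
    consistent = {rid for rid, n in changes.items() if n == 1}
    return frequent, consistent

# Example: id 1 changes twice (frequent), id 2 changes once (consistent)
snaps = [
    [{"id": 1, "user_agent": "UA-a"}, {"id": 2, "user_agent": "UA-x"}],
    [{"id": 1, "user_agent": "UA-b"}, {"id": 2, "user_agent": "UA-x"}],
    [{"id": 1, "user_agent": "UA-c"}, {"id": 2, "user_agent": "UA-y"}],
]
frequent, consistent = isolate_changes(snaps)
```

In Spark SQL the same split would typically be expressed as a windowed aggregation over a change count per key.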

Software Developer Intern

Confidential, Binghamton, NY

Responsibilities:

  • Worked with Cassandra (NoSQL) to store, retrieve, update, and manage employee-scheduling details using CQL scripts
  • Implemented a service layer for data processing on top of Cassandra using core Java, Java Swing, and Git
  • Designed and tested the app with functional test cases using a TDD approach, cutting employee-scheduling time by 10% and optimizing cost
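A minimal sketch of the TDD style described above, assuming a hypothetical `next_available_slot` scheduling helper (the original work was in Java with JUnit; this uses Python's `unittest` to show the same test-first shape):

```python
import unittest

def next_available_slot(slots, booked):
    """Return the first slot not already booked, or None.
    A hypothetical scheduling helper, not the original codebase."""
    for slot in slots:
        if slot not in booked:
            return slot
    return None

class NextAvailableSlotTest(unittest.TestCase):
    # In TDD these tests are written first and drive the implementation.
    def test_returns_first_free_slot(self):
        self.assertEqual(next_available_slot(["9am", "10am"], {"9am"}), "10am")

    def test_returns_none_when_fully_booked(self):
        self.assertIsNone(next_available_slot(["9am"], {"9am"}))

if __name__ == "__main__":
    unittest.main(exit=False, verbosity=0)
```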

Summer Research Assistant

Confidential, Binghamton, NY

Responsibilities:

  • Built custom ETL pipelines to populate the data warehouse and performed pattern mining on the data stored in MySQL
  • Optimized data transfer and retrieval by designing a new logical data model for lake information using data modeling concepts
  • Used advanced SQL (views, indexes) on OLAP and OLTP systems to generate reports for the data scientists
  • Met the target deadline by working in a cross-functional team for the release of the Lake Observer app
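A hedged sketch of the view-and-index reporting pattern described above, using an in-memory SQLite database in place of the original MySQL warehouse; the `readings` table, its columns, and the lake names are all illustrative:

```python
import sqlite3

# In-memory SQLite stands in for the MySQL warehouse; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (lake TEXT, depth_m REAL, temp_c REAL);
    INSERT INTO readings VALUES
        ('Otsego', 1.0, 18.5), ('Otsego', 5.0, 12.0), ('Oneida', 1.0, 20.1);
    -- An index speeds up per-lake lookups in report queries.
    CREATE INDEX idx_readings_lake ON readings(lake);
    -- A view gives analysts a stable, pre-aggregated reporting interface.
    CREATE VIEW lake_avg_temp AS
        SELECT lake, AVG(temp_c) AS avg_temp_c
        FROM readings GROUP BY lake;
""")
rows = conn.execute(
    "SELECT lake, ROUND(avg_temp_c, 2) FROM lake_avg_temp ORDER BY lake"
).fetchall()
```

The view decouples report consumers from the underlying table layout, so the logical model can change without breaking downstream queries.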

Software Engineer

Confidential

Responsibilities:

  • Analyzed the data and ensured the data warehouse was populated with only quality entries from crash-test-dummy data
  • Configured the analyzed data and resulting patterns to help upper management make decisions on the structure of the deliverables
  • Set up and implemented a new backend architecture using design patterns such as Singleton and Factory for reusability and scalability
  • Tested newly designed features and bug fixes using core Java, JUnit, Maven, and Eclipse, and documented with Javadocs
  • Developed SQL queries to extract data from existing sources and check format accuracy; used simple Excel charts for data visualization
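The Singleton and Factory patterns mentioned above can be sketched compactly; the original backend was Java, and the class names here (`ConfigRegistry`, the exporters) are hypothetical stand-ins:

```python
class ConfigRegistry:
    """Singleton: one shared instance per process (name is illustrative)."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance

class CsvExporter:
    def export(self):
        return "csv"

class JsonExporter:
    def export(self):
        return "json"

def exporter_factory(fmt):
    """Factory: callers depend on the interface, not on concrete classes."""
    exporters = {"csv": CsvExporter, "json": JsonExporter}
    try:
        return exporters[fmt]()
    except KeyError:
        raise ValueError(f"unknown format: {fmt}")

# The singleton always returns the same object; the factory picks the class.
registry_a, registry_b = ConfigRegistry(), ConfigRegistry()
```

New exporter types can be added to the factory without touching call sites, which is the reusability/scalability argument for the pattern.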

Software Developer

Confidential

Responsibilities:

  • Mastered the basics of Big Data, Hadoop, and ETL; built and maintained a 25-node Hadoop cluster using Cloudera
  • Prototyped a prediction model for the BI team on a large dataset from Dublin city's smart meters, ingesting data into HDFS using Sqoop
  • Generated actionable insights by writing queries initially in SQL and Pig (including Pig UDFs), then extending them to Hive
  • Improved efficiency by integrating Python libraries such as NumPy, matplotlib, pandas, and ggplot for plotting graphs for visualization
  • Fine-tuned the Hadoop cluster, improving performance of the Java MapReduce jobs by 60%
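As a hedged illustration of the MapReduce model behind those Java jobs (not the original code), here is the classic word-count expressed as explicit map, shuffle, and reduce phases in plain Python:

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the grouped counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data", "big cluster"]
counts = reduce_phase(shuffle(chain.from_iterable(map_phase(l) for l in lines)))
```

On a real cluster the shuffle is the network-heavy phase, which is why tuning (combiners, partitioning, memory settings) yields the kind of job speedups described above.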
