Big Data Engineer Resume
Sunnyvale, CA
SUMMARY
- Believer in data-driven decisions, with a strong interest in big data engineering and a penchant for solving business problems with technology
- 2.5+ years of professional experience with Java, SQL, MySQL, and the Big Data ecosystem, including Hive, Pig, MapReduce, and Sqoop
- Strong knowledge of Apache Spark, Hadoop, Object Oriented Programming, Data Structures, Design Patterns, Algorithms
TECHNICAL SKILLS
- Big Data Analytics: Hadoop, MapReduce, Spark, Kafka, NiFi, HDFS, NoSQL, ZooKeeper, Hive, Pig, HBase, YARN, Flume, Sqoop, Teradata
- Programming Languages: Java, Scala, Python, Shell Scripting, C, C++
- Databases: MySQL, PL/SQL (Oracle), Cassandra, HBase, MongoDB, SQL Server, SQLite
- Familiarity: Amazon Redshift, Weka, MLlib, MRUnit, JIRA, Spring Boot, REST APIs, Jackson, JSON, GSON, XML, Avro, Parquet
- Methodologies: Agile (Scrum, Lean, XP, Crystal, Kanban, FDD, DSDM) and Waterfall
- Certifications: Spark Scala CCA175 (ongoing), Microsoft DBA
PROFESSIONAL EXPERIENCE
Big Data Engineer
Confidential, Sunnyvale, CA
Responsibilities:
- Implementing Spark Scala UDFs for concrete use cases; visualizing user-agent data using Tableau
- Developing reports by fetching and transforming streaming data and pushing it through Kafka pipelines for analysis of client data
- Using Spark SQL to load and transform data, then identify and isolate both frequent and consistent changes within it
Software Developer Intern
Confidential, Binghamton, NY
Responsibilities:
- Worked with Cassandra (NoSQL) to store, retrieve, update, and manage employee-scheduling details using CQL scripts
- Implemented a service layer for data processing on top of Cassandra using core Java, Java Swing, and Git
- Designed and tested the app with functional test cases using a TDD approach, reducing employee-scheduling time by 10% and optimizing cost
Summer Research Assistant
Confidential, Binghamton, NY
Responsibilities:
- Built custom ETL pipelines to populate the data warehouse and performed pattern mining on data stored in MySQL
- Optimized data transfer and retrieval by designing a new logical model for storing lake data, applying data-modeling concepts
- Used advanced SQL (views, indexes) on OLAP and OLTP systems to generate reports for data scientists
- Met the target deadline by working in a cross-functional team to release the Lake Observer app
Software Engineer
Confidential
Responsibilities:
- Analyzed crash-test-dummy data and ensured the data warehouse was populated with only quality entries
- Presented analyzed data and resulting patterns to help upper management make decisions on the structure of the deliverables
- Set up and implemented a new backend architecture using design patterns such as Singleton and Factory for reusability and scalability
- Tested newly designed features and bug fixes using core Java, JUnit, Maven, and Eclipse, and documented the code using Javadoc
- Developed SQL queries to extract data from existing sources and check format accuracy; used Excel for simple data visualization
Software Developer
Confidential
Responsibilities:
- Mastered the basics of Big Data, Hadoop, and ETL; built and maintained a 25-node Hadoop cluster using Cloudera
- Prototyped a prediction model for the BI team on a large dataset from Dublin's smart meters, ingesting the data into HDFS using Sqoop
- Generated actionable insights by writing queries initially in SQL, Pig, and Pig UDFs, then extending them to Hive
- Improved efficiency by integrating Python libraries such as NumPy, Matplotlib, pandas, and ggplot to plot graphs for visualization
- Fine-tuned the Hadoop cluster, improving performance of Java MapReduce jobs by 60%