Languages: Java, Scala, Shell Scripting, C, C++, C#
Web: HTML, XML, Bootstrap, .NET, RDF, Microdata
Big Data Technologies: Hadoop, Hive, Pig, Kafka, Spark, MapReduce, YARN, Flume, Oozie, Hue, Sqoop
Visualization tools: Tableau, Zeppelin, d3.js, Datadog
Databases: MySQL, Microsoft SQL Server, PostgreSQL, Oracle DB, Hive, PrestoDB, GraphDB
Big Data Engineer
- Works closely with the CTO and Product Manager to understand retailers' requirements and onboard them efficiently to the digital marketing platform.
- Implements and maintains an ETL pipeline for processing raw sales and campaign data using Java 8, shell scripting, Hive, and Spark. Automated integral parts of the process with cron and Oozie to improve productivity and provide analytical support to the business team. Resolved critical bugs in a timely manner.
- Implemented Hive UDFs to simplify manipulation of intermediate results during data processing.
- Developed a Java program to transform JSON and XML files into a standardized CSV format for easier data extraction.
- Implemented a workflow for warehousing clients' lookup metadata and automated it to update frequently.
- Worked with big data technologies such as Kafka, Spark, MapReduce, and Hive to create a streaming data pipeline from an e-commerce framework to PrestoDB and HDFS.
- Implemented data visualizations on the admin page using Zeppelin and d3.js to portray the past and present scope of sales.
- Participated in the development and testing of backend Java code using JUnit.
- Performed a comparative analysis of an H-1B petitions dataset and visualized the trends using Tableau.
- Used the comparative analysis to achieve better execution times.
- Designed a website using MVC and the Bootstrap framework to search for local vacancies and provide a renter-to-broker portal for service requests and payments.
- Implemented effective data access to GraphDB nodes and to triples from a Turtle file using SPARQL.
- Performed analysis on a Confidential dataset of vehicle collisions spanning 5 years using MapReduce, Hive, and AWS EMR.
- Visualized insights using Tableau.
- Presented a poster at Faculty Research Day 2017 and won second prize.
- Developed a standalone Windows application to store and retrieve medicine information from a relational database.
- Extended the functionality with an object-oriented approach using design patterns.