Enthusiastic, goal-oriented learner and logical analyst with 2 years of experience and strong mathematical and statistical skills, coupled with machine learning knowledge, seeking an opportunity for career advancement in data analytics
Data Analytics Tools/Technologies: R, Python, Tableau, Weka, MS Excel (Pivot Tables, Power Pivot, Macros (VBA)), Jupyter Notebook, Google Analytics, Informatica, MATLAB, Qlik Sense, SSIS, SSAS, SSRS, Predictive Analytics, Classification, Clustering, Ensemble Learning
Database Systems: SQL, PL/SQL, PostgreSQL, MySQL
Microsoft Office: MS Word, MS PowerPoint, MS Access
Associate Software Engineer
- Performed preprocessing and exploratory data analysis on raw data to gain initial insights
- Developed a platform that predicts whether an employee will leave the company, with 89% accuracy
- Developed several base models and combined them into ensemble models to predict housing prices
- Evaluated the models using cross-validation and compared them on performance metrics
- Implemented an Entity-Relationship model of 3 entities (Doctor, Nurses, and Pharmacy) and merged the databases to create an enterprise data warehouse modeled using a snowflake schema
- Performed ETL using Informatica to retrieve the required data with SQL and PL/SQL, and generated reports
- Explored the data and modeled association rules to identify customer behavior, showing which products are bought together on which day of the week
- This helped the company increase sales and keep products stocked in the store on time
- Analyzed a dataset of 10 attributes following the Cross-Industry Standard Process for Data Mining (CRISP-DM)
- Addressed class imbalance by undersampling, identified the relevant features using the Boruta algorithm, implemented decision trees, and evaluated the model using a confusion matrix
- Predicted students' future earnings to help them decide whether attending a college is a good investment that leads to a job good enough to repay their loans
- Measured predicted earnings from Random Forest Regressor and Multiple Linear Regression based on Confidential
- Implemented a recommender system that processes ratings to provide recommendations for a new user
- Also computed a similarity metric (cosine similarity) for each pair of movies to identify similar movie titles
- Attained approximately 70% accuracy in predicting whether a shark attack occurred, and gained deeper insights into the data through exploratory data analysis