Data Scientist - Intern Resume
SUMMARY:
- Solid Data Science background with data engineering skill sets applied in industry
- Machine Learning (regression, classification, clustering), ANOVA, A/B testing, dimensionality reduction (PCA, feature selection), Time Series, Inferential Statistics, Prescriptive Analytics
- Model Selection (random forests, neural networks, support vector machines, gradient boosting, bagging, generalized linear models (GLMs), ensemble methods)
TECHNICAL SKILLS:
Business Intelligence Skills: ETL (DataStage, Informatica, MSBI (SSIS, SSAS, SSRS), Matlab, Pentaho, KNIME, Ab Initio); Database (Teradata, SQL Server, Oracle, MongoDB, T-SQL, PL/SQL, NoSQL); Reporting (Tableau, Power BI, Hyperion); Project Management (Agile, SDLC, SCRUM, Kanban, Waterfall, CRISP-DM, TAM); Documentation (Requirement Analysis, Test-Case Documents, Implementation Plan, PEP, RTT, RCA)
Skills: R-Programming (e1071, dplyr, ggplot2, shiny, reshape2), Python (numpy, pandas, scikit-learn, tkinter), Scala, Java, C, C++, Shell Scripting, crontab, ERwin, Microsoft Office (advanced), SAS E-Miner, SAS Base, H2O.ai
Big Data Skills: Hadoop, Spark, HDFS, MapReduce, Impala, Pig, Hive, AWS
PROFESSIONAL EXPERIENCE:
Data Scientist - Intern
Confidential
Responsibilities:
- Predict and rank suitable candidates for a job from the available data using Natural Language Processing and other Machine Learning (ML) techniques
- Methods/Techniques: Term Frequency, Topic Modeling, Latent Semantic Analysis, Lesk Algorithm, Text Pre-Processing, Document Parsing
Research Assistant
Confidential
Responsibilities:
- Data and Text Mining on Social Media using Python and R-Programming
- Natural Language Processing using Latent Semantic Analysis (LSA) on social media posts and user comments/reviews
- Marketing and Content Analysis on Social Media posts and their impact on customer segmentation
Business Analyst
Confidential
Responsibilities:
- Data and Text Mining of competitor data from publicly available data sources and subscriptions
- Design and Implementation of a new framework (ETL jobs, scripting) to process the competitor data
- Data Modelling, Mapping and Definition using Teradata and ERwin for the new framework
- Design and Implementation of a complex scoring model for rating individual semiconductor parts
EIM (Enterprise Information Management) Operations, DataStage (ETL)
Confidential
Responsibilities:
- Research and Creation of a technical trade-off report on Tableau, Hyperion and Ab Initio to meet the project requirements for Freescale Semiconductors
- Redevelopment and Migration of complex dashboards from Hyperion to Tableau
- Managed Enterprise Information Management operations and support activities
- ETL development and Data Warehouse Integration on Unix systems