Data Scientist Resume

SUMMARY:

Highly motivated Data Scientist with 13+ years of experience providing enterprise solutions in Data Science, Big Data Analytics, Business Intelligence, Data Mining & Warehousing. Building robust Machine Learning models in Regression, Classification, Clustering, Anomaly Detection and Text Mining through the CRISP-DM methodology. Empowering business Decision Support Systems (DSS) by delivering data-driven solutions, visualizations and statistical inferences to maximize Return on Investment (ROI), minimize costs, mitigate customer churn and accelerate business processes.
Supervised Learning - Regression (Linear & Stepwise); Classification (Logistic Regression, K-Nearest Neighbor, Decision Tree, Random Forest, SVM); Time Series Analysis & Forecasting
Unsupervised Learning - K-means Clustering, Hierarchical Clustering, Association Rules
Natural Language Processing (NLP) - Naïve Bayes and text mining
Leveraging the Hadoop framework (MapReduce & HDFS) to ingest, store, analyze and process large structured & unstructured datasets using Sqoop, Flume, Kafka, Hive
Exploratory Data Analysis of large datasets, descriptive statistical analysis
Data Preprocessing - cleansing, blending, transformation, imputation, aggregation, feature scaling, resolving data inconsistencies, filtering, handling outliers
Dimensionality reduction using feature engineering & feature selection, Confidential, Factor Analysis, Stepwise Regression (backward & forward) to maximize information gain
Hypothesis testing - formulating null and alternative hypotheses
Statistical programming using R libraries (ggplot2, plyr, dplyr, tidyr, Shiny, caret) and Python packages (NumPy, Pandas, Matplotlib)
Model evaluation based on predictive performance - Confusion Matrix, accuracy, misclassification rate, sensitivity, specificity, precision, type I & II error, R-squared, MSE, ROC Curve, k-fold cross validation, p-values, statistical significance, t-test, ANOVA
Visualizations, e.g. histograms, scatterplots, box plots, correlation matrices, pie charts, line graphs
Business Intelligence Analytics - data acquisition and integration (ETL), data warehousing, relational databases, star schema modeling, conformed dimensions modeling. Delivering interactive dashboards, visualizations, Customer 360 & Product 360 business metrics, ad hoc queries, drill-down reports, KPI reports, Operations metric reports

TECHNICAL SKILLS:

BI Analytical Tools: OBIEE 12c/11g/10g, Tableau, Microsoft Power BI, Alteryx, Excel, Hive

Big Data Technology: Hadoop Ecosystem, HDFS, MapReduce, Spark, Hive, Sqoop, Flume, Kafka, Cloudera Manager, HUE, YARN

Classification Models: Logistic Regression, Decision Trees, K-Nearest Neighbor, Naive Bayes Classifier, Random Forest, SVM

Regression Models: Linear & Multiple Regression, Stepwise

Unsupervised Machine Learning: K-Means Clustering, Confidential, Hierarchical Clustering, Association Rules, Time Series Analysis

Delivered Use Cases: Customer Segmentation, Churn Analysis, Predictive Analytics, Text Mining, Anomaly Detection, Market Basket Analysis

Programming: SQL, HiveQL, R, Python, WEKA

ETL: Informatica 10/9, PowerCenter Client

Databases: MS SQL Server, MySQL, Oracle 12c/11g/10g

Agile Methodology: Kanban, Jira

PROFESSIONAL EXPERIENCE:

Confidential

Data Scientist

Responsibilities:

CRISP-DM approach to understanding the business objectives, business problems, gaps and areas of growth & optimization
K-means algorithm for Customer Segmentation based on geography, similar purchasing behavior, products & services, utilization, high value accounts, risk of default and attrition
Customer Churn analysis leveraging predictive models to increase customer retention efforts by 3% by analyzing and predicting customers' likelihood to churn based on current and historical behavioral patterns
Speech & Text Analytics to mine consumer sentiment through call center recorded calls, correspondence, surveys, social media platforms and media publications
Association rule mining of frequent itemsets using the Apriori algorithm to derive product associations (> 0.60 %), measured by support, confidence and lift
Hypothesis testing, comparing and selecting the best performing algorithmic model based on the target variable measurement, generalization performance, training speed, statistical significance of predictors
Model evaluation based on the R-squared, Adjusted R-Squared, Mean Squared Error (MSE), Mean Absolute Error (MAE), Confusion Matrix; accuracy, misclassification rate, type I & type II errors, specificity, sensitivity, precision vs recall balance, PPV, NPV values, ROC curve, k-fold cross validation, Null Deviance and Residual Deviance
Collaborating with colleagues and SMEs in peer reviews to ensure data products, statistical inferences and recommendations are accurate, generalizable and production-ready
Automation and deployment of robust models into production and performance tuning
Communicating and providing visualizations to empower business stakeholders to make informed, strategic decisions based on the statistical inferences, insights, trends and forecasts to maximize revenue, optimize risk, accelerate business processes and increase the overall quality of the customer lifecycle

Confidential

Sr Software Developer

Responsibilities:

OBIEE 11.1.1.7.1 installation and configuration on Windows OS
T-shaped contributor rotating across all 3 roles: BI Analyst, Developer and UAT Tester
Semantic layer modeling using the Administration Tool; Physical, BMM, Presentation Layers based on OBIEE best practices
Establishing physical joins, star schema dimensional modeling, defining logical tables, aggregation levels, content level, hierarchies, building subject areas, Initialization blocks and repository variables
Report development using OBIEE Analytics, BI Publisher and Tableau to build interactive dashboards, ad hoc reports, Performance Tiles, Scatterplots, Heat Maps, Bar charts, Line charts, Box Plots, Geographic maps, Tree maps, Highlight maps, Gantt charts, Bubble charts etc.
Implementing data level security, configuration of users, groups & Application Roles, performance tuning, monitoring system metrics, pinging server availability, usage tracking, managing Catalog root directories, unit testing, version control and object migrations
BI Administrative functions - catalog migration, repository versioning, deployments, code migration, monitoring system health, restarting BI services, performance tuning, debugging, cache management, provisioning user access
OBIEE Training and development of both onshore and offshore resources
Providing deliverables within SLA guidelines to both internal and external Stakeholders
