Data Scientist Resume

Columbus, GA

SUMMARY:

  • About 10 years of strong IT experience in the field of Data Analytics & Data Science, focused on processing and analyzing large amounts of data using Hadoop (Mahout, Hive, Pig), R, MS Excel 2010, MS Access 2010, MS SQL 2010, SAS, and Matlab.
  • Proficient in Machine Learning, Data/Text Mining, Statistical Analysis & Predictive Modeling.
  • Efficient in data acquisition, storage, analysis, integration, predictive modeling, logistic regression, decision trees, data mining methods, forecasting, factor analysis, cluster analysis, ANOVA, and other advanced statistical techniques.
  • Strong experience in Data Visualization with QlikView & Tableau, creating line and scatter plots, bar charts, histograms, pie charts, dot charts, box plots, time series, error bars, multiple chart types, multiple axes, subplots, etc.
  • Adept in Data Quality Management to acquire, clean, process, and cross-verify data across multiple sources. Skilled R user with knowledge of other statistical programming languages like SAS and SPSS.
  • Good knowledge and understanding of data mining techniques like classification, clustering, regression techniques and random forests.
  • Experience creating MapReduce programs, running SQL on Hadoop using Hive, and building ETL using Pig scripts. Skilled in R, Java, Python, C#, SQL Server, Matlab, and SAS.
  • Willing to relocate: Anywhere

TECHNICAL SKILLS:

Statistical Software: R, Matlab, SAS Studio, MS Excel 2010

Statistical Techniques: Linear Regression, Logistic Regression, Random Forests

Big Data Ecosystems: HDFS, MapReduce, Mahout, Hive, Pig, Sqoop, Flume

Database: SQL Server, MS Access 2010, MySQL

Languages: Java, Python, C#, SQL, HTML, JavaScript

WORK EXPERIENCE:

Data Scientist

Confidential - Columbus, GA

Responsibilities:

  • Analyzed individual customer behavior.
  • Segmented customers based on spending activities.
  • Categorized risky customers based on the days past due parameter.
  • Categorized active and inactive customers based on their utilization.
  • Designed, developed and deployed statistical data models in R.
  • Utilized machine learning techniques for predictions & forecasting based on the data.
  • Developed data mining algorithms using Machine learning (Random Forest, Regression, Clustering) for decision making using R, Mahout on Hadoop.
  • Partnered with ETL team to extract data from Hadoop environment.
  • Prepared dashboards using calculations and parameters in QlikView.
  • Created and assisted users in Tableau dashboard development.

Environment: R, Hadoop, Mahout, QlikView, Excel.

Data Scientist

Confidential - Basking Ridge, NJ

Responsibilities:

  • Utilized machine learning techniques for predictions & forecasting based on the sales data.
  • Executed overall data aggregation/alignment & process improvement reporting within the sales department.
  • Managed data quality & integrity using skills in Data Warehousing, Databases & ETL.
  • Monitored and maintained high levels of data analytic quality, accuracy, and process consistency.
  • Assisted sales management in data modeling.
  • Ensured on-time execution and implementation of sales planning analysis and reporting objectives.
  • Worked with the sales management team to refine predictive methods & the sales planning analytical process.
  • Executed and monitored the accuracy and efficiency of sales forecasts & reporting.
  • Prepared dashboards using calculations and parameters in QlikView.
  • Supported consistent implementation of company reporting and sales process initiatives.

Environment: R, Excel, SAS, QlikView, MS SQL Server 2010.

Data Scientist

Confidential - Nashville, TN

Responsibilities:

  • Responsible for predictive analysis of credit scoring to predict whether credit extended to a new or existing applicant is likely to result in profit or loss. Primarily used R packages for the data mining tasks.
  • Participated in all phases of data mining: data collection, data cleaning, model development, validation and visualization.
  • Data for modeling was collected using SQL by querying several tables. The extracted tables were further appended or merged to create tables for modeling using R.
  • Applied principal component analysis. Where applicable, missing values were replaced with the group average using PROC MEANS.
  • Computed Credit Risk Parameters such as Probability of Default, Loss Given Default, and Exposure at Default.
  • Used logistic regression, clustering and multivariate modeling to provide valuable analytical insights.
  • Used R for generating various graphs and charts for analyzing the different features.
  • Used k-fold cross-validation to guard against overfitting.
  • Used the Kolmogorov-Smirnov test to measure the quality of the models.
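
The last two validation steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used on the project (which relied on R packages): the function names `k_fold_splits` and `ks_statistic` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the KS statistic is computed here as the largest gap between the empirical score distributions of two groups (for example, good and bad accounts).

```python
from bisect import bisect_right


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: rows are already in random order; real use would shuffle first.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def ks_statistic(scores_good, scores_bad):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    good = sorted(scores_good)
    bad = sorted(scores_bad)
    max_gap = 0.0
    # The empirical CDFs only change at observed score values.
    for s in set(good) | set(bad):
        cdf_good = bisect_right(good, s) / len(good)
        cdf_bad = bisect_right(bad, s) / len(bad)
        max_gap = max(max_gap, abs(cdf_good - cdf_bad))
    return max_gap


if __name__ == "__main__":
    for train, test in k_fold_splits(5, 2):
        print("train:", train, "test:", test)
    # Illustrative scores only, not project data.
    print(ks_statistic([0.1, 0.4], [0.3, 0.9]))
```

A KS value near 1 means the model's scores separate the two groups almost completely, while a value near 0 means the score distributions overlap.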

Environment: R, MS SQL, Hadoop, Hive, Pig, Mahout.

Data Analyst/Data Scientist

Confidential - Nashville, TN

Responsibilities:

  • Worked on large data sets of structured, semi-structured, and unstructured data.
  • Developed data analysis solutions based on predictive, behavioral or other models via statistical analysis and use of relevant modeling techniques.
  • Identified cost-saving opportunities for customers using data models.
  • Developed data mining algorithms using Machine learning (Decision Tree, Regression, Clustering) on historical big data for decision making using R, Mahout on Hadoop.
  • Designed, developed and deployed statistical data models in R and Python.

Environment: R, Hadoop, Mahout, Python, MS SQL Server 2010.

BI Analyst

Confidential

Responsibilities:

  • Utilized skills in software applications such as R, Excel, and SAS. Used decision tree analysis and regression analysis.
  • As an ETL team member, was involved in Business Analysis, Business Process and Technical Design sessions.
  • Worked with business and technical staff to develop requirements document and specifications.
  • Responsible for extracting data from MS SQL Server, MS Access, and flat files, as well as data warehousing and database design, ETL, data reporting and querying, and preparing data for analytics and project management.
  • Created database from representative project data and modified procedures, tables, views and constraints.
  • Generated reports for managers using dimensional modeling and reporting tools.
  • Edited raw data and created R data sets for statistical analysis to support project/business decisions. Used R packages for statistical analysis. Created and maintained bulk data load & extract processes.

Environment: R, Excel, SAS, MS SQL Server 2008.

Business Analyst - Mobile portal commercials

Confidential

Responsibilities:

  • Worked as both developer and problem solver to develop both client-side & server-side code. Worked closely with Business Analysts to understand the requirements.
  • Wrote Core Java classes, JSP and HTML files.
  • Worked with the team to develop interactive and user-friendly web pages using JSP, CSS, HTML, and JavaScript.
  • Involved in injecting dependencies into code using Spring core module.
  • Involved in developing code for obtaining beans in the Spring framework using Dependency Injection (DI) or Inversion of Control (IoC).

Environment: Java/ J2EE, JSP, MySQL, HTML, CSS, JavaScript, Spring.
