Data Scientist Resume

GA

SUMMARY

  • 7+ years of experience developing predictive machine learning models and building end-to-end data pipelines that extract, transform and handle large volumes of incoming data to discover hidden insights, with a strong focus on improving business processes, addressing business problems and ensuring customer satisfaction
  • Experienced in data mining, merging, cleaning and analyzing structured, semi-structured and unstructured data
  • Strong track record of contributing to successful end-to-end analytic solutions (clarifying business objectives and hypotheses, communicating project deliverables and timelines, and informing action based on findings)
  • Expertise in Data Integration, Data Cleaning, Data Analysis and Profiling, Data Import and Export using multiple ETL tools such as SQL Server, SSIS and SSAS
  • Proficient in Machine Learning, Statistical and Predictive Analytics using R, Python, SQL and SAS
  • Expertise in writing production-quality code in SQL, R, Python and Spark. Hands-on experience building regression and classification models and unsupervised learning algorithms with large datasets in distributed systems and resource-constrained environments
  • Expert knowledge of supervised and unsupervised learning algorithms such as Ensemble Methods, Random Forests, Linear/Logistic Regression, SVMs, Deep Neural Networks, Extreme Gradient Boosting, Decision Trees, K-Means, Gaussian Mixture Models, Hierarchical models, and time series models (ARIMA, GARCH, VARCH, etc.)
  • Proficient in leveraging data from multiple sources to create reports and dashboards as a part of Data Storytelling
  • Experienced in creating actionable interactive dashboards using Tableau, PowerBI, Python (Matplotlib, Seaborn), R (ggplot2, R Shiny), D3.js and Plotly
  • Extensive knowledge of Natural Language Processing and Sentiment Analysis. Developed custom sentiment scripts in R
  • Good knowledge of Hadoop architecture, MapReduce, Pig, Hive, MongoDB
  • Hands-on experience in implementing predictive analytics with Spark in Azure HDInsight
  • Good exposure to Deep Learning with TensorFlow, Keras, Theano
  • Strong business sense and abilities to communicate data insights to both technical and non-technical clients
  • Microsoft Certified Professional with certifications in C# programming and in querying/development of SQL Server
  • Software Engineer with knowledge of building end to end products and systems using JavaScript, Python and Django
  • Involved in all stages of Software Development Life Cycle and Data Analytics Life Cycle
  • Proficient in developing REST APIs using NodeJS and ASP.NET Core
  • Knowledge of HTTP standards, API best practices, REST, web security and authentication, basics of building scalable solutions
  • Hands-on experience with a variety of cloud services such as Azure, AWS (EC2, S3) and DigitalOcean

TECHNICAL SKILLS

Programming: Python, R, HTML, CSS, JavaScript, AngularJS, C#, API development

Frameworks: ASP.NET, ASP.NET Core, Node and Express JS, Django

Database & Query: MySQL, SQL Server, SSIS, SSAS, NoSQL, MongoDB, Hadoop, MapReduce, Pig, Hive

BI & Tools: Tableau, PowerBI, SAS, Plotly, D3.JS, Rshiny, Git, Jupyter, Zeppelin

Machine Learning: R - caret, glmnet, xgboost, dplyr, survival, rpart, ggplot; Python - numpy, pandas, scikit-learn, scipy, matplotlib, seaborn, tensorflow, PySpark

Data Science/Statistics: Exploratory Data Analysis using R and Python

Supervised learning: Linear/Logistic Regression, Lasso, Elastic Nets, Decision Trees, Random Forest, Support Vector Machines, Gradient Boosting, Bayesian models, Ensemble Methods

Unsupervised learning: Principal Component Analysis, Association Rules, Factor Analysis, K-means clustering, Hierarchical clustering, Gaussian Mixture Models, Market Basket Analysis, Collaborative Filtering and Low Rank Matrix Factorization

Statistical tests: T-tests, Chi-square/table analysis, Correlation tests, A/B testing, Normality tests, Residual diagnostics, ANOVA

Feature Engineering: Recursive Feature Elimination, Filter Methods, Relative Importance, Wrapper and Embedded methods

Time Series: ARIMA, Holt-Winters, Exponential Smoothing, Bayesian Structural Time Series

Methodologies: Software Development Life Cycle, Agile, Data Analytics Life Cycle, IT Project management

PROFESSIONAL EXPERIENCE

Data Scientist

Confidential, GA

Responsibilities:

  • Worked as Data Scientist and developed and deployed predictive models for analyzing customer churn and retention
  • Performed Data Extraction, Data Manipulation and Data Analysis on TBs of structured and unstructured data
  • Developed machine learning models using Logistic Regression, Naïve Bayes, Random Forest and KNN
  • Performed Data Imputation using scikit-learn package of Python
  • Created interactive analytic dashboards using Tableau and Plotly
  • Analyzed customer consumption behavior and assessed customer value with RFM (Recency, Frequency, Monetary) analysis; segmented customers with clustering algorithms such as K-Means Clustering and Hierarchical Clustering
  • Collaborated with data engineers and operation team to implement ETL process, wrote and optimized SQL queries to perform data extraction to fit the analytical requirements
  • Performed sentiment analysis on feedback forms, categorizing customers as positive, negative, angry or happy
  • Ensured that the model has a low False Positive Rate; performed text classification and sentiment analysis on unstructured and semi-structured data
  • Used Principal Component Analysis in feature engineering to analyze high-dimensional data
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from SQL Server, and used ETL for data transformation
  • Developed MapReduce pipeline for feature extraction using Hive and Pig
  • Used MLlib, Spark's machine learning library, to build and evaluate different models
  • Communicated findings to team members, leadership and stakeholders to ensure models were well understood and incorporated into business processes

Data Analyst / BI Analyst

Confidential, Atlanta GA

Responsibilities:

  • Studied and clarified business requirements through communication with Business Analysts
  • Created use cases, activity reports and logical components to capture the business process flows and workflows involved in the project, using UML, Rational Rose and Microsoft Visio
  • Analyzed existing databases and data management and suggested redesign to the data management for improving performance and providing maximum insights through admin dashboards
  • Analyzed business requirements, system requirements and data mapping requirement specifications, and communicated them effectively to developers
  • Supported ad-hoc and standard reporting
  • Created dashboards and data stories using Tableau, Power BI and Microsoft Excel
  • Developed SQL scripts, stored procedures and views for data processing, maintenance and other database operations
  • Troubleshot test scripts, SQL queries, ETL jobs and data warehouse/data mart/data store models
  • Acquired data from various sources such as RDBMS systems, Excel sheets, CSVs and APIs using Python, R and Beautiful Soup
  • Performed data wrangling and manipulation using R, tidyr and dplyr
  • Designed and implemented cross-validation and statistical tests, including hypothesis testing, ANOVA and autocorrelation, to verify the models’ significance
  • Performed statistical analysis like Linear/Logistic Regression, K-means Clustering, Table Analysis, etc., using SAS Enterprise Miner
  • Collaborated with Data Scientists in developing statistical models for analyzing sales and customer behavior. Participated in brainstorming sessions to find useful features
  • Extensively used SQL and T-SQL to write functions and triggers
  • Involved in development and implementation of SSIS, SSAS and SSRS application solutions for various business units across the organization
  • Applied data modeling techniques based on Data Warehouse concepts such as star/snowflake schemas
  • Participated in data acquisition with the Data Engineering team to extract historical and real-time data using Hadoop, MapReduce and HDFS
  • Developed multiple MapReduce jobs in Python for data cleaning and pre-processing
  • Involved in loading data from RDBMS and web logs into HDFS using Sqoop and Flume
  • Developed Hive queries for analysis, and exported the result set from Hive to MySQL using Sqoop after processing the data
  • Worked on improving performance of existing Pig and Hive Queries
  • Used JIRA, Taiga and various project management tools

Software Engineer

Confidential

Responsibilities:

  • Gathered, analyzed, documented and translated application requirements into data models; supported standardization of documentation and the adoption of standards and practices related to data and applications
  • Involved in the design of data transformation components to support ETL processes
  • Data Extraction using MongoDB and SQL Server
  • Queried MySQL and SQL Server databases for internal reporting
  • Developed web and mobile applications using AngularJS, Ionic and NodeJS
  • Designed and developed normalized MySQL database models for multi-platform applications
  • Used Python Django to develop the backend and handle authentication, user security and the database for the Social Events project
  • Developed and used REST APIs using Python, JavaScript and ASP.NET
  • Created and managed private Git repositories for version control
  • Worked actively as a part of team with managers and other staff to meet the goals of the project in the stipulated time