Data Scientist Resume

Atlanta, GA

PROFESSIONAL SUMMARY:

  • Data scientist with 5+ years of experience implementing machine learning and artificial intelligence algorithms to improve businesses in multiple industries, combining creative problem solving and mathematical ability with statistical methods that improve decision-making among managers and top-level decision makers. Naturally curious and eager to learn, with a genuine love of collaboration and teamwork.
  • Skilled in Machine Learning, Statistical Modeling, and Big Data
  • Creative problem-solver with strong analytical, leadership, and communication skills
  • Proficient in Python, R, Scala, Java, SQL, and C
  • Experience in Machine Learning, Data mining with large datasets of Structured and Unstructured Data, Data Acquisition, Data Validation, and Predictive Modeling
  • Data Science Specialties include: Machine Learning, Sequential Modeling, Natural Language Processing (NLP)
  • Use of Analytical Skills: Bayesian Analysis, Inference, Time-Series Analysis, Regression Analysis, Linear models, Multivariate analysis, Sampling methods, Forecasting, Segmentation, Clustering, Sentiment Analysis, Part of Speech Tagging, and Predictive Analytics
  • Experience in stochastic optimization and regression with machine learning algorithms
  • Experience formulating and solving discrete and continuous optimization problems
  • Able to research statistical machine learning, supervised learning, and classification methods
  • Strong mathematical, statistical modeling, and computer programming skills, applied in innovative ways
  • Use of Various Analytics Tools: Classification and Regression Trees (CART), Support Vector Machine (SVM), Random Forest, Gradient Boosting Machine (GBM), Principal Component Analysis (PCA), Regression, Naïve Bayes
  • Data integration and reporting with SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS)
  • Data Query: Azure, Google, Amazon Redshift, and various SQL and NoSQL databases and data warehouses
  • Experience with AWS cloud computing, Spark, and capable of working with large datasets
  • Deep Learning: Machine perception, Data Mining, Machine Learning algorithms, Neural Networks, TensorFlow, Keras
  • Able to deliver presentations and highly technical reports, collaborate with stakeholders and cross-functional teams, and advise on how to leverage analytical insights
  • Development of clear analytical reports which directly address strategic goals
  • Able to identify and learn applicable new techniques independently as needed
  • Able to work comfortably and effectively within an interdisciplinary research environment
  • Experience with validation of machine learning ensemble classifiers
  • Used online datasets to build prototype machine learning models with SparkML

TECHNICAL SKILLS:

Programming Languages: R, Python (NumPy, Pandas, Scikit-Learn), SQL, HiveQL, Spark, C++, Hadoop MapReduce

Knowledge of: both supervised and unsupervised learning algorithms

Machine Learning: TensorFlow, PCA, RNN, Clustering, Random Forest, Naïve Bayes, Support Vector Machine, Logistic Regression, Linear Regression, Decision Tree, Gradient Boosting, etc.

Experience using: k-Means Clustering within a feature space to engineer new features to enhance supervised learning projects, or to identify hidden structures within seemingly unstructured data sets
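
As a rough illustration of the k-means feature-engineering idea named above, the sketch below clusters a toy dataset and appends each row's nearest-cluster id and its distance to every centroid as new features. It is a minimal, standard-library-only sketch and not the implementation used on any project; the function names (`kmeans`, `cluster_features`) and the sample data are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster mean
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids


def cluster_features(point, centroids):
    """New features for one row: nearest-cluster id, then distance to each centroid."""
    distances = [math.dist(point, c) for c in centroids]
    return [distances.index(min(distances))] + distances


# Toy data: two well-separated blobs (values invented for illustration).
rows = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids = kmeans(rows, k=2)

# Each augmented row = original features + cluster id + centroid distances.
augmented = [list(r) + cluster_features(r, centroids) for r in rows]
```

The augmented rows could then be passed to any supervised learner in place of the original features; whether the extra columns help is something to check with held-out data.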

Knowledge of: relational database theory

Expert level: Python developer (7+ years)

Ability to: productionize PoC algorithms using TensorFlow and plumber

Expert level: SQL developer (7+ years)

Deep working knowledge in: practical machine learning considerations, such as outlier detection and regularization, nuisance correlation, dimensionality reduction, and generalization

Very strong in all flavors of: SQL, including MySQL, PostgreSQL, Microsoft SQL Server, and HQL/HiveQL

Expert in: Statistics and Optimization techniques

Knowledge of: theoretical underpinnings of many common machine learning algorithms, as well as professional experience in successful development, testing, and implementation of said algorithms

Expert in: Deep Learning and Neural Networks, including hyperparameter tuning, regularization, and dropout to build reliable and powerful prediction models

Experienced in: Agile Scrum project processes

Skilled in: productionizing and deploying models using Python and TensorFlow

Ability to: handle high volume, high velocity data using distributed system architectures using Spark

Expert in: building, interrogating, and interpreting Statistical Models to gain better insight into the business processes they approximate

Development Tools: Git Version Control, Jupyter Notebook, IPython Notebook, R Markdown, Unix

Visualization Tools: Tableau, Cognos, ggplot2 (R), SAS, Power BI, Matplotlib, MS Excel

Strong professional experience with: the Software Development Life Cycle and quickly delivering results

Analytical methods: K-means clustering, RFM Analysis, DBSCAN, Affinity Propagation, Principal Component Analysis, Support Vector Machines, Naïve Bayes, Auto Regression & Moving Averages

Statistical methods: ARIMA, ANOVA, Regression Analysis, Hypothesis Testing

PROFESSIONAL EXPERIENCE:

DATA SCIENTIST

Confidential, Atlanta, GA

Responsibilities:

  • Led communication efforts with other engineers and project managers to gather requirements, deadlines, and other criteria
  • Led weekly scrum meetings to assign tasks and deadlines to data science team members, and tracked tasks using JIRA
  • Assisted data science team with any hurdles or problems during sprints to ensure high quality of work and that deadlines were always met
  • Used Python and packages such as TensorFlow, Pandas, Matplotlib, and Seaborn to perform initial data exploration and data cleaning, develop proofs of concept, and perform cross-validation and final model selection
  • Worked alongside data science team in researching and implementing cutting edge methodologies in IoT, BIM, and reinforcement learning
  • Developed and implemented anomaly detection algorithms for the detection of any sudden system failures using Python and TensorFlow.
  • Utilized domain knowledge and application portfolio knowledge to play a key role in defining the future state of large, business technology programs.
  • Participated in all phases of data mining, data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Developed MapReduce/Spark and R modules for machine learning and predictive analytics in Hadoop on AWS. Implemented an R-based distributed random forest.
  • Used Spark, Scala, Hadoop, Spark Streaming, and MLlib to apply a broad variety of machine learning methods, including classification, regression, and dimensionality reduction, and used the resulting engine to increase user lifetime by 45% and triple user conversations for target categories.
  • Developed and implemented survival regression techniques to predict impending or expected system failures to suggest timely preventative maintenance using Python
  • Developed and tested multiple recurrent neural network architectures in Python for data-driven prediction of time-parameterized target variables
  • Held regular meetings with project managers to give updates on data science team progress and, where possible, to incorporate new requirements and feedback in response to challenges encountered by other parts of the team
  • Managed Git repository and implemented unit testing to ensure integrity of source code base was maintained at all times
  • Utilized continuous implementation as the model was tested and retrained on new data

DATA SCIENTIST

Confidential, Bellevue, WA

Responsibilities:

  • Led a team in maintaining and updating pricing models, built in R, to ensure products were optimally priced for market conditions
  • Developed models for forecasting sales at all levels of the supply chain using R
  • Documented data extraction, data warehouse design, and existing ETL processes
  • Responsible for processing and maintaining data, and developing queries using SQL Server
  • Developed stored procedures, optimized and enhanced existing SQL queries
  • Led efforts in data analysis and investigation, data mining, and generation of automated reports using R and LaTeX with the knitr package
  • Automated ETL processes using SSIS and PowerShell
  • Built reporting dashboards for rapid dissemination of business intelligence to relevant decision makers using R and Shiny
  • Integrated reporting systems utilizing R with ETL systems utilizing SQL to automate data extraction and reporting end-to-end
  • Implemented and maintained tariff plans on a number of rating engines (Post pay, Prepay, and VoIP) using R
  • Documented tariff changes and generated product specification, configuration rules, and test plans in detail using R with LaTeX and markdown
  • Highly involved in bridging the gap between teams of differing specialties (specifically technical and non-technical teams)
  • Synchronized communication with teams in European countries to gather updates and information regarding the evolution of markets
  • Interpreted sales reports and converted them into data to be utilized in pricing and demand forecasting models with SQL and R, utilizing external packages for additional machine learning methodologies as necessary

DATA SCIENTIST

Confidential, Winston-Salem, NC

Responsibilities:

  • Participated in all phases of data acquisition, data cleaning, developing models, validation, and visualization to deliver data science solutions.
  • Retrieved data from SQL Server databases by writing SQL queries, stored procedures, temp tables, and views.
  • Connected the database to Jupyter Notebook for modeling and to Tableau for visualization and reporting.
  • Performed fraud detection analysis on loan applications using applicants' loan history and supervised learning methods.
  • Used Pandas, NumPy, Seaborn, Matplotlib, and scikit-learn in Python to develop various machine learning models, using algorithms such as Logistic Regression, Random Forest, Gradient Boosted Decision Trees, and Neural Networks.
  • Performed feature engineering, including PCA for high-dimensional datasets and selection of important features using tree-based models.
  • Performed model tuning and selection using cross-validation and parameter tuning to prevent overfitting.
  • Created ecosystem models (e.g. conceptual, logical, physical, canonical) that are required for supporting services within the enterprise data architecture (conceptual data model for defining the major subject areas used, ecosystem logical model for defining standard business meaning for entities and fields, and an ecosystem canonical model for defining the standard messages and formats to be used in data integration services throughout the ecosystem).
  • Used ensemble methods, including different bagging and boosting approaches, to increase model accuracy.
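
The cross-validation-based tuning mentioned in the list above can be sketched roughly as follows. This is a minimal, standard-library-only illustration under stated assumptions, not the code used in the role: the model is a one-feature ridge regression without intercept, the data are synthetic, and the names `k_fold_indices`, `fit_ridge_slope`, and `cv_error` are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_slope(xs, ys, penalty):
    """Closed-form slope for y ~ w*x with an L2 penalty on w (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


def cv_error(xs, ys, penalty, k=5):
    """Mean squared error on held-out data, averaged over k folds."""
    errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge_slope([xs[i] for i in train], [ys[i] for i in train], penalty)
        fold_mse = sum((ys[i] - w * xs[i]) ** 2 for i in held_out) / len(held_out)
        errors.append(fold_mse)
    return sum(errors) / len(errors)


# Synthetic data (assumption for illustration): y = 2x plus Gaussian noise.
rng = random.Random(1)
xs = [i / 10 for i in range(1, 41)]
ys = [2 * x + rng.gauss(0, 0.3) for x in xs]

# Pick the penalty with the lowest cross-validated error.
penalties = [0.0, 0.1, 1.0, 10.0, 100.0]
best_penalty = min(penalties, key=lambda p: cv_error(xs, ys, p))
```

Choosing the hyperparameter by held-out error rather than training error is what guards against overfitting; the same loop applies to any model with a fit step and a tunable parameter.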

DATA ANALYST

Confidential, Atlanta, GA

Responsibilities:

  • Prepared and created target market reports for franchise sales teams using R, SQL, Shiny, and LaTeX with knitr
  • Analyzed and implemented CRM enhancements and error resolutions using R
  • Designed data quality checks using R and SQL and worked with data architects and engineers to implement checks within the existing ETL systems
  • Produced profit margin reports and cost/benefit reports using R, SQLite, and Shiny
  • Performed statistical analyses of customer behavior in R to maximize sales
  • Assisted managers in identifying capabilities and processes that drive continuous improvement
  • Performed statistical analysis of competitors’ product pricing with analysis of other market factors in R to determine how competitor pricing structures might impact market conditions and recommend appropriate price adjustments
  • Coordinated and carried out validation of models in R
