
Data Scientist Resume


Boulder, CO

SUMMARY

  • Professional experience spanning 17+ years in data science, data analytics, data mining, machine learning, business analytics, business intelligence, competitive intelligence, predictive analytics, forecasting, data acquisition, data validation, predictive modeling, data visualization, and project management across the biostatistics, public sector, services, retail, BFSI, econometrics, academic, and HR domains.
  • Professional and academic experience handling large sets of structured and unstructured data.
  • Hands-on experience working with APIs and performing analytics on the data they return.
  • Hands-on experience in Natural Language Processing and Sentiment Analysis.
  • Hands-on experience in unsupervised, supervised, and reinforcement machine learning, analytics, and visualization using various R packages and core analytical Python libraries such as NumPy, SciPy, Pandas, and scikit-learn, with data visualization in Matplotlib.
  • Extensive experience performing statistical tests (ANOVA, hypothesis testing, and A/B testing), multivariate analysis, and EFA/PCA/CFA.
  • Proficient in statistical modeling, data mining, and machine learning algorithms for data science/forecasting/predictive analytics, such as Linear and Logistic Regression, LDA, Item and Discriminant Analysis, Apriori, Random Forest, K-Means, Artificial Neural Networks, Decision Trees, SVM, K-Nearest Neighbors, Bayesian methods, Hidden Markov Models, etc.
  • Experience developing dashboards using Tableau and Power BI.
  • Experience working with relational databases such as MySQL and MS Access, and NoSQL databases such as MongoDB.
  • Exposure to AI and deep learning platforms/methodologies such as TensorFlow and RNNs.
  • Experience with statistical packages such as SPSS, SAS, and Crystal Ball.
  • Experience dealing with structured and semi-structured data in HDFS and Hive.
  • Familiar with Hadoop concepts and the MapReduce framework.
  • Experience building data visualizations and report dashboards in Tableau and QlikView; familiar with Power BI.
  • Maintained version control of Python code using Git and GitHub.
  • Worked in both Linux and Windows environments.
  • Knowledge of the Software Development Life Cycle (SDLC), Agile, and Scrum.
  • A good team player, self-motivated by a passion for data science.

TECHNICAL SKILLS

Languages/Tools/Big Data: R 3.x (all major packages), Python 2.x/3.x (NumPy, Pandas, SciPy, scikit-learn, Matplotlib), Hadoop/HDFS/Hive/Pig, AWS

Machine Learning Algorithms: Linear and Logistic Regression, LDA, Item and Discriminant Analysis, Apriori, Random Forest, K-Means, Artificial Neural Networks, Decision Trees, SVM, K-Nearest Neighbors, Bayesian methods, Hidden Markov Models, etc.

Visualizations: ggplot2, googleVis, Tableau, 3D visualization, Power BI, Qlik; dashboard development in Tableau and Power BI

Statistical Tools: SPSS, Crystal Ball, SAS

PROFESSIONAL EXPERIENCE

Confidential, Boulder, CO

Data Scientist

Responsibilities:

  • Developed applications of machine learning, statistical analysis, and data visualization for challenging data processing problems in the sustainability and biomedical domains.
  • Compiled data from various public and private databases to perform complex analysis and data manipulation, producing actionable results.
  • Gathered, analyzed, documented, and translated application requirements into data models; supported standardization of documentation and the adoption of standards and practices related to data and applications.
  • Developed and implemented predictive models using Natural Language Processing Techniques and machine learning algorithms such as linear regression, classification, multivariate regression, Naive Bayes, Random Forests, K-means clustering, KNN, PCA and regularization for data analysis.
  • Designed and developed Natural Language Processing models for sentiment analysis.
  • Worked on Natural Language Processing with Python's NLTK module to develop an automated customer-response application.
  • Used predictive modeling with tools in SAS, SPSS, R, Python.
  • Applied concepts of probability, distributions, and statistical inference to the given datasets to unearth interesting findings through comparisons, t-tests, F-tests, R-squared, p-values, etc.
  • Applied linear regression, multiple regression, ordinary least squares, mean-variance analysis, the law of large numbers, logistic regression, dummy variables, residuals, the Poisson distribution, Bayes, Naive Bayes, function fitting, etc. to data using Python's scikit-learn, SciPy, NumPy, and Pandas modules (a brief sketch of this workflow follows this list).
  • Applied clustering algorithms, i.e. hierarchical and K-means, using scikit-learn and SciPy.
  • Developed visualizations and dashboards using ggplot, Tableau
  • Worked on development of data warehouse, data lake, and ETL systems using relational and non-relational (SQL and NoSQL) tools.
  • Built and analyzed datasets using R, SAS, Matlab and Python (in decreasing order of usage).
  • Applied linear regression in Python and SAS to understand the relationships between dataset attributes and the causal relationships among them.
  • Performed complex pattern recognition on financial time series data and forecast returns using ARMA and ARIMA models and exponential smoothing for multivariate time series data.
  • Pipelined (ingest/clean/munge/transform) data for feature extraction toward downstream classification.
  • Used Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Expertise in Business Intelligence and data visualization using R and Tableau.
  • Expert in Agile and Scrum Process.
  • Validated macro-economic data (e.g. BlackRock, Moody's, etc.) and performed predictive analysis of world markets using key indicators in Python and machine learning techniques such as regression, Bootstrap Aggregation (bagging), and Random Forest.
  • Worked in large-scale database environments such as Hadoop and MapReduce, with a working understanding of Hadoop clusters, nodes, and the Hadoop Distributed File System (HDFS).
  • Interfaced with large scale database system through an ETL server for data extraction and preparation.
  • Identified patterns, data quality issues, and opportunities and leveraged insights by communicating opportunities with business partners.
  • Delivered and communicated research results, recommendations, opportunities, and supporting technical designs to the managerial and executive teams, and implemented the techniques for priority projects.
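
Illustrative sketch (not project code): the regression and clustering bullets above were carried out with scikit-learn, NumPy, and Pandas. A minimal Python sketch of that kind of workflow follows; the file name, column names, and parameter values are hypothetical assumptions.

# Minimal sketch of a regularized-regression + K-means workflow in scikit-learn.
# The CSV file, column names, and hyperparameters are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge      # regularized linear regression
from sklearn.cluster import KMeans
from sklearn.metrics import r2_score

df = pd.read_csv("biomedical_measurements.csv")   # hypothetical dataset
X = df.drop(columns=["outcome"])
y = df["outcome"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler().fit(X_train)
model = Ridge(alpha=1.0).fit(scaler.transform(X_train), y_train)
print("R-squared:", r2_score(y_test, model.predict(scaler.transform(X_test))))

# Unsupervised view of the same features, as in the K-means bullets above.
df["cluster"] = KMeans(n_clusters=4, random_state=42).fit_predict(scaler.transform(X))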

Environment: Machine learning, AWS, MS Azure, Cassandra, Spark, HDFS, Hive, Pig, Linux, Python (scikit-learn/SciPy/NumPy/Pandas), R, SAS, SPSS, MySQL, Eclipse, PL/SQL, SQL connector, Tableau.

Confidential, Milwaukee, WI

Sr. Data Scientist SME

Responsibilities:

  • Applied concepts of probability, distribution and statistical inference on given dataset to unearth interesting findings through use of comparison, T-test, F-test, R-squared, P-value etc.
  • Applied linear regression, multiple regression, ordinary least squares, mean-variance analysis, the law of large numbers, logistic regression, dummy variables, residuals, the Poisson distribution, Bayes, Naive Bayes, function fitting, etc. to data using Python's scikit-learn, SciPy, NumPy, and Pandas modules.
  • Applied clustering algorithms, i.e. hierarchical and K-means, using scikit-learn and SciPy.
  • Built and analyzed datasets using R, SAS, Matlab and Python (in decreasing order of usage).
  • Applied linear regression in Python and SAS to understand the relationships between dataset attributes and the causal relationships among them.
  • Performed complex pattern recognition on financial time series data and forecast returns using ARMA and ARIMA models and exponential smoothing for multivariate time series data.
  • Pipelined (ingest/clean/munge/transform) data for feature extraction toward downstream classification.
  • Performed NLP and sentiment analytics on text retrieved through social media API calls (see the sketch after this list).
  • Performed data mining and data analytics: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Performed K-means clustering, Kalman filtering, Multivariate analysis and Support Vector Machines in Python and R.
  • Developed Clustering algorithms and Support Vector Machines that improved Customer segmentation and Market Expansion.
  • Worked on NoSQL databases such as MongoDB and Cassandra.
  • Experienced in Agile methodologies and SCRUM process.
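
Illustrative sketch (not project code): the NLP/sentiment bullet above refers to scoring social-media text. A minimal sketch using NLTK's VADER analyzer follows; the sample posts are placeholder data standing in for text pulled from a social media API.

# Minimal sentiment-scoring sketch with NLTK's VADER analyzer.
# The posts below are hypothetical placeholders for API-retrieved text.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")   # one-time lexicon download

posts = [
    "The new product rollout was fantastic!",
    "Support response times have been disappointing lately.",
]

sia = SentimentIntensityAnalyzer()
for post in posts:
    scores = sia.polarity_scores(post)   # neg/neu/pos/compound scores
    print(f"{scores['compound']:+.2f}  {post}")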

Environment: Hadoop, Map-Reduce, HDFS, SQL, Pig, R, Python.

Confidential

Team Leader, Data Analyst

Responsibilities:

  • Performed multivariate business and resource forecasting using machine learning algorithms.
  • Applied linear regression, multiple regression, ordinary least squares, mean-variance analysis, the law of large numbers, logistic regression, dummy variables, residuals, the Poisson distribution, Bayes, Naive Bayes, function fitting, etc. to data using Python's scikit-learn, SciPy, NumPy, and Pandas modules.
  • Applied clustering algorithms to market data to study the underlying patterns; methodologies used included PCA, factor analysis, hierarchical clustering, and K-means via scikit-learn/SciPy and R for market projection.
  • Built and analyzed datasets using R, and Python.
  • Applied linear regression in Python and SAS to understand the relationships between dataset attributes and the causal relationships among them.
  • Performed complex pattern recognition on financial time series data and forecast returns using ARMA and ARIMA models and exponential smoothing for multivariate time series data (a brief ARIMA sketch follows this list).
  • Developed ETL based systems for data acquisition and data consumption by stakeholders.
  • Performed data mining and data analytics: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Performed K-means clustering, Kalman filtering, Multivariate analysis and Support Vector Machines in Python and R.
  • Developed Clustering algorithms and Support Vector Machines that improved Customer segmentation and Market Expansion.
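
Illustrative sketch (not project code): the ARMA/ARIMA bullet above describes forecasting returns from time series data. A minimal statsmodels sketch follows; the series is synthetic placeholder data and the model order (1, 1, 1) is an assumed, untuned choice.

# Minimal ARIMA forecasting sketch with statsmodels on synthetic data.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0, 1, 200).cumsum(),
                    index=pd.date_range("2015-01-01", periods=200, freq="B"))

model = ARIMA(returns, order=(1, 1, 1)).fit()   # order is an assumed placeholder
print(model.forecast(steps=10))                 # 10 business days ahead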

Environment: Hadoop, Map-Reduce, SQL, R, Python, in-house ETL tools.

Confidential

Data Analyst

Responsibilities:

  • Performed multivariate biostatistical and environmental data analytics.
  • Applied linear regression, multiple regression, ordinary least squares, mean-variance analysis, the law of large numbers, logistic regression, dummy variables, residuals, the Poisson distribution, Bayes, Naive Bayes, function fitting, etc. to data using Python's scikit-learn, SciPy, NumPy, and Pandas modules.
  • Applied decision tree and neural network based systems for forecasting and predictive analytics; methodologies used included RNN, KNN, SVM, PCA, factor analysis, hierarchical clustering, and K-means via scikit-learn/SciPy and R (see the classifier sketch after this list).
  • Built and analyzed datasets using R, and Python.
  • Applied linear regression in Python and SAS to understand the relationships between dataset attributes and the causal relationships among them.
  • Pipelined (ingest/clean/merge/transform) data for feature extraction toward downstream classification.
  • Performed data mining and data analytics: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
  • Performed K-means clustering, Kalman filtering, Multivariate analysis and Support Vector Machines in Python and R.
  • Developed Clustering algorithms and Support Vector Machines that improved Customer segmentation and Market Expansion.
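
Illustrative sketch (not project code): the decision tree and SVM bullets above describe classification for predictive analytics. A minimal scikit-learn sketch follows; scikit-learn's breast-cancer toy dataset stands in for the original biostatistical data, which is not available here.

# Minimal decision-tree and SVM classification sketch in scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)   # placeholder for the project data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, clf in [("decision tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
                  ("SVM", SVC(kernel="rbf", C=1.0))]:
    clf.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, clf.predict(X_test)))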

Environment: MS SQL, SAS, R, Support Vector Machines, Python.

Confidential

Lead Consultant, Database Management

Responsibilities:

  • Developed the program management and program monitoring and evaluation framework for 27 mission-mode projects covering data acquisition, data dissemination, and data interoperability.
  • Built statistical analysis and resource forecasting systems using statistical tools and packages such as SPSS and SAS.
  • Applied linear regression, multiple regression, ordinary least squares, mean-variance analysis, the law of large numbers, logistic regression, dummy variables, residuals, the Poisson distribution, Bayes, Naive Bayes, function fitting, etc. to data using Python's scikit-learn, SciPy, NumPy, and Pandas modules.
  • Applied clustering algorithms to market data to study the underlying patterns; methodologies used included PCA, factor analysis, hierarchical clustering, and K-means via scikit-learn/SciPy and R for market projection (a brief PCA/factor-analysis sketch follows this list).
  • Built and analyzed datasets using R, and Python.
  • Projected cause-effect relationships among the various components of the project using open-source tools.
  • Performed complex pattern recognition on time series data for analytical purposes.
  • Developed ETL based systems for data acquisition and data consumption by stakeholders.
  • Performed data mining and data analytics: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.
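
Illustrative sketch (not project code): the PCA/factor-analysis bullet above describes dimensionality reduction on market indicators. A minimal scikit-learn sketch follows; the data is synthetic placeholder material and the number of components is an assumed choice.

# Minimal PCA / factor-analysis sketch in scikit-learn on synthetic data.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)
market = rng.normal(size=(500, 12))            # 500 observations, 12 indicators
scaled = StandardScaler().fit_transform(market)

pca = PCA(n_components=3).fit(scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)

fa = FactorAnalysis(n_components=3).fit(scaled)
print("factor loadings shape:", fa.components_.shape)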

Environment: Red Hat, MS, SAS, SPSS, SQL, Data Warehousing, R, Python, MS Access, in-house ETL tools.

Confidential

Data Technology Analyst

Responsibilities:

  • Developed BI/CI solutions for senior stakeholders.
  • Built statistical analysis and resource forecasting systems using statistical tools and packages such as SPSS and SAS.
  • Developed forecasting and resource projection solutions using open-source and proprietary tools.
  • Projected cause-effect relationships among the various components of the project using open-source tools.
  • Performed complex pattern recognition on time series data for analytical purposes.
  • Developed ETL based systems for data acquisition and data consumption by stakeholders.
  • Performed data mining and data analytics: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis.

Environment: Red Hat, MS, SAS, SPSS, SQL, Data Warehousing & OLAP tools.
