DATA SCIENTIST Resume Atlanta, GA

SUMMARY

Over 7 years of experience in Machine Learning, Data Mining with large data sets of structured and unstructured data, Data Acquisition, Data Validation, Predictive Modeling, Data Visualization, Web Crawling, Web Scraping, Statistical Modeling and Natural Language Processing (NLP)
Adept in statistical programming languages like Python and R and in Big Data technologies like Hadoop and Hive
Proficient in managing the entire data science project life cycle and actively involved in all of its phases, including data acquisition, data cleaning, feature scaling, feature engineering, statistical modeling (Decision Trees, Regression Models, Neural Networks, Support Vector Machines (SVM), Clustering), dimensionality reduction using Principal Component Analysis and Factor Analysis, testing and validation using ROC plots and k-fold cross validation, and Data Visualization
Deep understanding of Statistical Modeling, Multivariate Analysis, model testing, problem analysis, model comparison and validation
Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing Data Mining and reporting solutions that scale across massive volumes of structured and unstructured data
Skilled in performing Data Parsing, Data Manipulation and Data Preparation with methods including describing data contents, computing descriptive statistics, regex, split and combine, remap, merge, subset, reindex, melt and reshape
Experience using various Python and R packages such as ggplot2, caret, dplyr, RWeka, gmodels, RCurl, tm, C50, twitteR, NLP, reshape2, rjson, plyr, pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, Beautiful Soup, and rpy2
Extensive experience in Text Analytics, generating Data Visualizations using Python and R, and creating dashboards using tools like Tableau
Hands-on experience with Big Data tools like Hadoop, Spark, Hive, Pig, Impala, PySpark, and Spark SQL
Hands-on experience in implementing LDA and Naive Bayes, and skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, and Principal Component Analysis
Good knowledge of Proofs of Concept (PoCs) and gap analysis; gathered necessary data for analysis from different sources and prepared data for exploration using data munging
Good industry knowledge, analytical and problem-solving skills, and ability to work well within a team as well as individually
Highly creative, innovative, committed, intellectually curious, business savvy with good communication and interpersonal skills
Deep understanding of MapReduce with Hadoop and Spark. Good knowledge of the Big Data ecosystem, including Hadoop 2.0 (HDFS, Hive, Pig, Impala) and Spark (Spark SQL, Spark MLlib, Spark Streaming)
Excellent performance in building and publishing customized interactive reports and dashboards with customized parameters and user filters using Tableau
Good understanding of web design based on HTML5, CSS3, and JavaScript
Excellent understanding of Systems Development Life Cycle (SDLC), Agile, Scrum and Waterfall
Experience with version control tool - Git
Effective team player with strong communication and interpersonal skills, possess a strong ability to adapt and learn new technologies and new business lines rapidly
Extensive experience in Data Visualization including producing tables, graphs, listings using various procedures and tools such as Tableau

TECHNICAL SKILLS

Languages: Python, R, C, C++

Machine Learning: Linear Regression, Logistic Regression, Naïve Bayes, SVM, Decision Trees, Random Forest, Boosting, K-means, Bagging, etc.

Machine Learning Library: pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-Learn.

Deep Learning Frameworks: TensorFlow, Keras

Data Analysis and Visualization: NumPy, pandas, Matplotlib, Seaborn, Scikit-Learn, Excel, Tableau

Databases: Oracle, MySQL, SQL Server Management Studio

Front End Technologies: CSS, HTML, XML, JSON and jQuery

Environments: Jupyter, RStudio, Anaconda, Spyder, Python Console, PyCharm

PROFESSIONAL EXPERIENCE

Confidential - Atlanta, GA

DATA SCIENTIST

Responsibilities:

Analyzed and prepared data and identified patterns in the dataset by applying historical models; collaborated with Senior Data Scientists to understand the data
Performed data manipulation, data preparation, normalization, and predictive modeling; improved efficiency and accuracy by evaluating models in Python and R
Focused on customer segmentation through machine learning and statistical modeling, including building predictive models and generating data products to support customer segmentation
Used Python and R to improve the models and upgraded the full set of models to improve the product
Developed a pricing model for various bundled product and service offerings to optimize and predict gross margin
Built a price elasticity model for various bundled product and service offerings
Under the supervision of a Sr. Data Scientist, performed data transformation for rescaling and normalizing variables
Developed a predictive causal model using annual failure rate and standard cost basis for the new bundled service offering
Designed and developed analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping to production deployment, including product recommendation and allocation planning
Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and R with a broad variety of machine learning methods, including classification, regression, and dimensionality reduction
Partnered with the sales and marketing teams and collaborated with a cross-functional team to frame and answer important data questions, prototyping and experimenting with ML/DL algorithms and integrating them into production systems for different business needs
Worked on multiple datasets containing two billion values of structured and unstructured data about web application usage and online customer surveys
Good hands-on experience with the Amazon Redshift platform
Cleaned data by applying backward and forward filling methods to handle missing values in the dataset
Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for user behavior prediction and support multiple marketing segmentation programs
Segmented the customers based on demographics using K-means Clustering
Explored different regression and ensemble models in machine learning to perform forecasting
Presented dashboards to higher management for more insights using Power BI
Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user referring
Applied boosting to the predictive model to improve its efficiency
Designed and implemented end-to-end systems for Data Analytics and Automation, integrating custom visualization tools using R, Tableau, and Power BI
Collaborated with project managers and business owners to understand their organizational processes and help design the necessary reports

Environment: MS SQL Server, R/RStudio, SQL Enterprise Manager, Python, Redshift, MS Excel, Power BI, Tableau, T-SQL, ETL, MS Access, XML, MS Office, Outlook, SAS E-Miner

Confidential - West Chester, PA

DATA SCIENTIST

Responsibilities:

Used various approaches to collect the business requirements and worked with the business users for ETL application enhancements by conducting various Joint Requirements Development (JRD) sessions to meet the job requirements
Performed exploratory data analysis like calculation of descriptive statistics, detection of outliers, assumptions testing, factor analysis, etc., in Python and R
Built models based on domain knowledge and customer business objectives
Extracted data from the database using Excel/Access, SQL procedures and created Python and R datasets for statistical analysis, validation and documentation
Developed an extensive understanding of BI and analytics focused on the consumer and customer space
Innovated and leveraged machine learning, data mining and statistical techniques to create new, scalable solutions for business problems
Performed data profiling to assess data quality using SQL across a complex internal database
Improved sales and logistics data quality through data cleaning with NumPy, SciPy, and pandas in Python
Designed data profiles for processing, including running SQL and PL/SQL queries and using Python and R for Data Acquisition and Data Integrity, which consists of dataset comparison and dataset schema checks
Used R to generate regression models to provide statistical forecasting
Conducted data/statistical analysis and generated Transaction Performance Reports on a monthly and quarterly basis for all transactional data from the U.S., Canada, and Latin America markets using SQL Server and BI tools such as Reporting Services and Integration Services (SSRS and SSIS)
Used drill-downs, filter actions and highlight actions to develop dashboards in Tableau
Implemented Key Performance Indicator (KPI) Objects, Actions, Hierarchies and Attribute Relationships for added functionality and better performance of SSAS Warehouse
Applied Clustering Algorithms such as K-Means to categorize customers into certain groups
Performed data management, including using SQL Server Reporting Services to develop reusable code and an automatic reporting system, and designed user acceptance tests to give end users an opportunity to provide constructive feedback
Used Tableau to design various charts and tables for data analysis and to create analytical dashboards that showcase the data to managers
Created a model to forecast revenue
Applied association rule mining and a chain model to identify hidden patterns and rules in remedy ticket analysis to aid decision making
Segmented the ABO population and developed a demographic profile for each segment
Isolated customer behavioral patterns by analyzing millions of customer data records over a period of time and correlating multiple customer attributes
Empowered decision makers with data analysis dashboards using Tableau and Power BI

Environment: R/RStudio, SAS, SSRS, SSIS, Oracle Database 11g, Oracle BI tools, Tableau, MS Excel, Python, Naive Bayes, SVM, K-means, ANN, Regression, MS Access, SQL Server Management Studio, SAS E-Miner

Confidential - Dallas, TX

DATA SCIENTIST

Responsibilities:

Involved in the complete Software Development Life Cycle (SDLC) process by analyzing business requirements and understanding the functional workflow of information from source systems to destination systems
A highly immersive Data Science program involving Data Manipulation & Visualization, Web Scraping, Machine Learning, Python programming, SQL, Unix Commands, NoSQL, Hadoop
Used pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, NLTK in Python for developing various machine learning algorithms
Worked with different data formats such as JSON and XML and applied machine learning algorithms in Python
Analyzed sentiment data and detected trends in customer usage and other services
Analyzed and prepared data and identified patterns in the dataset by applying historical models
Collaborated with Senior Data Scientists for understanding of data
Used Python and R scripting to implement machine learning algorithms that predict and forecast the data for better results
Used Python and R scripting to visualize the data and implemented machine learning algorithms
Experience in developing packages in R with a Shiny interface
Used predictive analysis to create models of customer behavior that are correlated positively with historical data and use these models to forecast future results
Predicted user preference based on segmentation using Generalized Additive Models, combined with feature clustering, to understand non-linear patterns between user segmentation and related monthly platform usage features (time series data)
Performed data manipulation, data preparation, normalization, and predictive modeling
Improved efficiency and accuracy by evaluating models in Python and R
Used Python and R scripts to improve the model
Applied various machine learning algorithms and statistical models such as Decision Trees, Random Forest, Regression Models, neural networks, SVM, and clustering to identify volume using the scikit-learn package
Cleaned data by applying backward and forward filling methods to handle missing values in the dataset
Developed a predictive model and validated a Neural Network Classification model to predict the feature label
Applied boosting to the predictive model to improve its efficiency
Presented Dashboards to Higher Management for more Insights using Power BI and Tableau
Hands-on experience using Hive, Hadoop, HDFS and related Big Data technologies

Environment: R/RStudio, Python, Tableau, Hadoop, Hive, MS SQL Server, MS Access, MS Excel, Outlook, Power BI

Confidential

Data Analyst

Responsibilities:

Participated in the test environment setup and in ensuring that the facilities, test tools and scripts were in place to successfully perform the required testing effort
Acted as a liaison between the Oracle deployment team and the business finance group.
Interviewed various personnel including broker dealers and traders to understand the current process and the future requirements
Tested user interface and navigation controls of the application using QuickTest Pro
Handled exceptional situations in test scripts using Recovery Scenario Manager in QuickTest Pro
Developed Base-line scripts in VBScript for performing regression testing on future releases of the application
Developed test scripts in VBScript for data-driven testing. Executed the test scripts and analyzed the results
Developed PL/SQL Functions, Procedures, Oracle PL/SQL Programs
Verified the application’s functionality on different Configurations with QuickTest Pro
Handled dynamic objects using regular expressions in QuickTest Pro
Performed manual testing and developed automated test scripts using VBScript in QuickTest Pro
Maintained various versions of Test Scripts and performed various testing strategies
Performed backend testing using database checkpoints in QuickTest Pro
Created and maintained SQL Scripts and Unix Shell scripts to perform back-end testing
Responsible for communicating with a team of 10 people working offshore
Involved in Data Analysis, Data Modeling and Logical Data Specification
Involved in data movement between systems and validated the business requirements
Experienced working with Agile Scrum and Waterfall Models

Environment: Oracle, Windows, UML, MS-Visio, Toad, QC 11.0, QTP 11.0, VS 2008, HP ALM
