Data Scientist Resume

Dallas, TX

PROFESSIONAL SUMMARY:

  • Data Scientist with 6+ years of experience transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data, with experience in a variety of industries including Energy, Retail, Airline, and e-commerce.
  • Expert in Data Science process life cycle: Data Acquisition, Data Preparation, Modeling (Feature Engineering, Model Evaluation) and Deployment.
  • Experienced in statistical techniques including hypothesis testing, Principal Component Analysis (PCA), ANOVA, sampling distributions, chi-square tests, time-series analysis, discriminant analysis, Bayesian inference, and multivariate analysis.
  • Efficient in preprocessing data, including data cleaning, correlation analysis, imputation, visualization, feature scaling, and dimensionality reduction, using Python data science packages (Scikit-Learn, Pandas, NumPy).
  • Expertise in building various machine learning models using algorithms such as Linear Regression, Logistic Regression, Naive Bayes, Support Vector Machines (SVM), Decision trees, KNN, K-means Clustering, Ensemble methods (Bagging, Gradient Boosting).
  • Experience in Text mining, Topic modeling, Natural Language Processing (NLP), Content Classification, Sentiment analysis, Market Basket Analysis, Recommendation systems, Entity recognition etc.
  • Applied text pre-processing and normalization techniques such as tokenization, POS tagging, and parsing. Expertise using NLP techniques (TF-IDF, Word2Vec) and toolkits such as NLTK.
  • Experienced in tuning models using Grid Search, Randomized Search, and K-Fold Cross Validation.
  • Strong understanding of artificial neural networks, convolutional neural networks, and deep learning.
  • Skilled in using statistical methods including exploratory data analysis, regression analysis, regularized linear models, time-series analysis, cluster analysis, goodness of fit, Monte Carlo simulation, sampling, cross-validation, ANOVA, A/B testing, etc.
  • Working experience in Natural Language Processing (NLP) and a deep understanding of statistics, linear algebra, calculus, and optimization algorithms such as gradient descent.
  • Familiar with key data science concepts (statistics, data visualization, machine learning, etc.). Experienced in Python, R, MATLAB, and SAS programming for statistical and quantitative analysis.
  • Knowledge of time-series analysis using AR, MA, and ARIMA models.
  • Experience building and deploying production-quality, large-scale applications based on natural language processing and machine learning algorithms.
  • Exposure to AI and deep learning platforms such as TensorFlow, PyTorch, and Keras.
  • Working knowledge of Big Data tools in the Hadoop ecosystem such as HiveQL, Sqoop, and Pig Latin.
  • Extensive experience working with RDBMS such as SQL Server and MySQL, and NoSQL databases such as MongoDB.
  • Generated data visualizations using tools such as Tableau, Matplotlib and Seaborn (Python), and R.
  • Experience working in Agile environments, including the Scrum process, using project management tools such as Jira and version control tools such as Git/GitHub.

TECHNICAL SKILLS:

Languages: Python, R, SQL, HTML, JavaScript, Scala

Operating Systems: Microsoft Windows, Linux

Databases: SQL Server, MongoDB

Development Tools: Anaconda, Jupyter, RStudio, SSIS, Hive, Sqoop, Pig, SAS

Productivity Software: Microsoft Excel, Word, PowerPoint, STATA, ERWin

Visualization Platforms: Tableau

Supervised: Linear Regression, Logistic Regression, Decision Trees, Random Forest, KNN, Support Vector Machine

Unsupervised: K-means, PCA, Hierarchical clustering

Deep Learning, Natural Language Processing

PROFESSIONAL EXPERIENCE:

Confidential, Dallas, TX

Data Scientist

Responsibilities:

  • Responsible for analyzing the data and partnering with the business to generate best-in-class outcomes.
  • Tackled highly imbalanced datasets using sampling techniques such as under-sampling with NearMiss and over-sampling with SMOTE, using Python scikit-learn and imbalanced-learn (imblearn).
  • Responsible for data mining from disparate data sources and finding insights to promote business metrics.
  • Involved in developing and fine-tuning machine learning models.
  • Organized and controlled regular check data, performed comprehensive analysis of the data, and facilitated on-time delivery of reports to clients.
  • Performed ETL for different e-commerce websites for data gathering using SQL Server Integration Services.
  • Able to work in a team-oriented environment, with a strong aptitude for problem solving and collaboration.
  • Assisted teams in gathering and organizing unstructured data to help assess and improve systems.
  • Worked closely with cross-functional partners to develop the right training sets for new features.

Confidential, Dallas, TX

Data Scientist

Responsibilities:

  • Identified the data sources necessary for projects and ensured they were accurately imported and joined.
  • Built and evaluated machine learning models on various types of datasets, using algorithms and statistical modeling techniques such as clustering, classification, regression, decision trees, support vector machines, and anomaly detection from the Python library scikit-learn.
  • Leveraged statistical methods including exploratory data analysis, regression analysis, regularized linear models, goodness of fit, Monte Carlo simulation, bootstrapping, sampling, cross-validation, and ANOVA.
  • Effectively communicated with Business and IT partners to plan and achieve project initiatives.
  • Performed regular and ad-hoc analysis of data to optimize response accuracy and prioritize identified improvements.
  • Gained knowledge of A/B testing while working with developers and testers.
  • Identified appropriate analytic and visualization methodologies and ensured analytic efforts were executed correctly.
  • Ensured all projects had proper documentation, considering potential regulatory, legal, or business concerns.
  • Led analysts functionally, including dividing tasks and resources across projects and ensuring analysts were productive and growing.
  • Led communication of project results and challenges to business partners in terms they could understand.
  • Ensured project tasks were fully planned, including an understanding of how implementation affects other units, how to measure results, and how to achieve the expected benefit.

Confidential

Senior Data Analyst

Responsibilities:

  • Performed statistical analysis for customer and application interactions on different platforms.
  • Responsible for aggregating data, creating different views from disparate data sources, and finding insights to promote business metrics.
  • Gained knowledge of manufacturing, sales, and financial data by running SQL operations on them.
  • Exposed to ETL from different data sources for data gathering using SQL Server Integration Services.
  • Promoted safe monitoring and quick decision making by adding parameter trend visualizations to daily reports and dashboards using Tableau.
  • Able to work in a team-oriented environment, with a strong aptitude for problem solving and collaboration.
  • Exposed to basic NoSQL technologies like MongoDB while working with a team of data scientists.
  • Effectively communicated with Business and IT partners to plan and achieve project initiatives.
  • Performed regular and ad-hoc analysis of data to optimize response accuracy and prioritize identified improvements.
  • Involved in designing, modeling, validating, and testing multiple machine learning models against various data sets, including behavioral data, and helped deploy models in the backend.
  • Collaborated with the business analyst on the requirements of the project and explored the data in the database using SQL querying techniques.

Confidential

Production Data Analyst

Responsibilities:

  • Promoted safe monitoring and quick decision making by adding parameter trend visualizations to daily reports using Tableau.
  • Delivered accurate production time-series analysis while ensuring timely gas condensate processing in line with international standards.
  • Increased on-field preparedness by 200% through production rate monitoring and predictive analysis in MS Excel, resulting in smooth human resources allocation at each station.
  • Enhanced monitoring and data analysis by 500% by pioneering customization of graphics control on Centum VP (Distributed Control Systems by Confidential Co.).
  • Involved in designing, modeling, validating, and testing multiple machine learning models against various data sets, including behavioral data, and deployed models in the backend.
  • Collaborated with the business analyst on the requirements of the project and explored the data in the database using SQL querying techniques.
  • Exposed to the entire data science life cycle and actively involved in all phases of the project life cycle, including data acquisition, data cleaning, data engineering, feature scaling, feature engineering, and statistical modeling.
  • Read data in different formats, such as JSON (from APIs), XML, CSV, Rich Text Format (.rtf), OpenDocument Text (.odt), HTML (.htm, .html), Parquet, and Avro.
  • Utilized Sqoop to fetch field data for querying in Hive to generate local reports for management.

Confidential

Data Analyst

Responsibilities:

  • Prepared a probabilistic model of quote conversion using logistic regression in SAS.
  • Used decile analysis on the revenue model to prepare new segments.
  • Cleaned dictionary and JSON-format files and imported them into Hive tables using a SerDe.
  • Reported insights after running relevant queries for scoring.
  • Extracted manufacturing and revenue data from the database and transformed it into the relevant format.
  • Reported insights after running models on sales data.
  • Supported systems administration for Linux systems, including system upgrades, user account setup and security administration, and file permissions and access; created an SSH key pair for Linux (Ubuntu) virtual machines on Windows 10 using PuTTY.
