Data Scientist Resume

Dallas, TX

SUMMARY:

  • Advanced data analytics professional with 6+ years of experience across all phases of diverse technology projects, specializing in data modeling, predictive analytics, data visualization, ETL, data quality, machine learning, unit and A/B testing, Python, R, SQL, NoSQL, Hadoop, Hive, SparkML, Sqoop and Tableau.
  • Team builder with excellent communication, time and resource management, and client relationship development skills.
  • Involved in designing, modeling, validating and testing multiple machine learning models against various data sets, including behavioral data, and deploying the models in the backend.
  • Experienced in ETL using both SSIS and Sqoop on big data platforms; performed data mining on both SSAS and SparkML platforms to feed the data pipeline architecture.
  • Built and evaluated machine learning models on various types of datasets using algorithms and statistical modeling techniques such as clustering, classification and regression.
  • Promoted safe monitoring and quick decision making by adding parameter trend visualizations, segmentation and anomaly detection, along with daily reports, to Tableau dashboards.
  • Executed the entire data science life cycle, with active involvement in all project phases including data acquisition, data cleaning, data engineering, feature scaling, feature engineering and statistical modeling.
  • Obtained better predictive performance metrics using ensemble methods such as bootstrap aggregation (bagging) and boosting (boosted decision trees).
  • Applied predictive analytics and machine learning algorithms, especially supervised (SVM, logistic regression, boosting, random forests) and unsupervised (K-Means, LDA, EM) methods.

TECHNICAL SKILLS:

Languages: Python, R, SQL, C++, HTML, JavaScript, Scala

Operating Systems: Microsoft Windows, Linux

Databases: SQL Server, MongoDB

Development Tools: Anaconda, Jupyter, RStudio, SSIS, Hive, Sqoop, Spark, Hadoop, Pig, SAS

Productivity Software: Microsoft Excel, Word, PowerPoint, STATA, ERWin

Visualization Platforms: Tableau, Power BI

Supervised Learning: Linear Regression, Logistic Regression, Decision Trees, Random Forest, KNN, Support Vector Machine

PROFESSIONAL EXPERIENCE:

Confidential - Dallas, TX

Data Scientist

Responsibilities:

  • Responsible for analyzing the data and partnering with the business to generate best-in-class outcomes.
  • Tackled a highly imbalanced safety dataset using undersampling and oversampling techniques such as NearMiss and SMOTE with Python's scikit-learn and imbalanced-learn (imblearn).
  • Responsible for data mining from disparate data sources and finding insights to promote business metrics.
  • Involved in machine learning model development on big data platforms with Spark MLlib.
  • Worked with the Hadoop ecosystem, including HDFS, MapReduce, Hive and Spark, to manage data processing and storage for big data applications running on clustered systems.
  • Organized and controlled regular check data, performed comprehensive analysis of the data, and facilitated on-time delivery of reports to clients.
  • Performed ETL for different e-commerce websites for data gathering using SQL Server Integration Services.
  • Worked effectively in a team-oriented environment, with a strong aptitude for problem solving and collaboration.
  • Assisted teams in gathering and organizing unstructured data to help assess and improve systems.
  • Worked closely with cross-functional partners to develop the right training sets for new features.

Confidential - Dallas, TX

Data Scientist

Responsibilities:

  • Identified the data sources necessary for projects and ensured they were accurately imported and joined.
  • Built and evaluated machine learning models on various types of datasets using algorithms and statistical modeling techniques such as clustering, classification, regression, decision trees, support vector machines, anomaly detection and text mining with Python libraries (scikit-learn).
  • Gained working experience with deep learning and neural networks using NLTK, Keras and TensorFlow in Python.
  • Effectively communicated with Business and IT partners to plan and achieve project initiatives.
  • Performed regular and ad-hoc analysis of data to optimize response accuracy and prioritize identified improvements.
  • Attained knowledge of A/B testing while working with developers and testers.
  • Identified proper analytic and visualization methodologies and ensured analytic efforts were executed correctly.
  • Ensured all projects had proper documentation, considering potential regulatory, legal or business concerns.
  • Led analysts functionally, including dividing tasks and resources across projects and ensuring analysts were productive and growing.
  • Led communication of project results and challenges to business partners in ways they could understand.
  • Ensured project tasks were fully planned, including an understanding of how implementation affects other units, how to measure results, and how to achieve the expected benefit.

Confidential

Senior Data Analyst

Responsibilities:

  • Performed statistical modeling for customer and application interactions on different platforms.
  • Responsible for data mining from disparate data sources and finding insights to promote business metrics.
  • Worked with the Hadoop ecosystem, including HDFS, MapReduce, Hive and Spark, to manage data processing and storage for big data applications running on clustered systems.
  • Performed ETL for data gathering from different IoT probes on the pipeline using SQL Server Integration Services.
  • Promoted safe monitoring and quick decision making by adding parameter trend visualizations, along with daily reports, using Tableau.
  • Worked effectively in a team-oriented environment, with a strong aptitude for problem solving and collaboration.
  • Exposed to basic NoSQL technologies like MongoDB while working with a team of data scientists.
  • Effectively communicated with Business and IT partners to plan and achieve project initiatives.
  • Performed regular and ad-hoc analysis of data to optimize response accuracy and prioritize identified improvements.
  • Involved in designing, modeling, validating and testing multiple machine learning models against various data sets, including behavioral data, and deploying the models in the backend.
  • Collaborated with the business analyst on the project requirements and explored the data using SQL database querying techniques.

Confidential

Production Data Analyst

Responsibilities:

  • Promoted safe monitoring and quick decision making by adding parameter trend visualizations, along with daily reports, using Tableau.
  • Delivered accurate production time series analysis while ensuring timely gas condensate processing in line with international standards.
  • Increased on-field preparedness by 200% through production rate monitoring and predictive analysis in MS Excel, resulting in smooth human resources allocation at each station.
  • Enhanced monitoring and data analysis by 500% by pioneering customization of graphics control on Centum VP (Distributed Control System by Confidential Co.).
  • Involved in designing, modeling, validating and testing multiple machine learning models against various data sets, including behavioral data, and deploying the models in the backend.
  • Collaborated with the business analyst on the project requirements and explored the data using SQL database querying techniques.
  • Executed the entire data science life cycle, with active involvement in all project phases including data acquisition, data cleaning, data engineering, feature scaling, feature engineering and statistical modeling.
  • Worked with the Hadoop ecosystem, including HDFS, MapReduce, Hive and Spark, to manage data processing and storage for big data applications running on clustered systems.
  • Read different data formats, including API (JSON), XML, CSV, Rich Text Format (.rtf), OpenDocument Text (.odt), HTML (.htm, .html), Parquet and Avro.
  • Utilized Sqoop to fetch field data for querying on Hive to generate local reports for management.

Confidential

Analyst

Responsibilities:

  • Prepared a probabilistic model of quote conversion using logistic regression in SAS.
  • Used decile analysis on the revenue model to prepare new segments.
  • Cleaned dictionary and JSON format files and imported them into Hive tables using a SerDe.
  • Reported insights after running relevant queries for scoring.
  • Extracted manufacturing and revenue data from the database and transformed it into the relevant format.
  • Reported insights after running models on sales data.
  • Supported systems administration for Linux systems, including system upgrades, user account setup, security administration, and file permissions and access; created SSH key pairs for Linux (Ubuntu) virtual machines on Windows 10 using PuTTY.
