
Data Scientist / Data Analyst Resume


Oak Ridge, NJ

PROFESSIONAL SUMMARY:

  • Overall 12 years of experience in IT consulting services across various industries, including 6+ years specializing in data science and quantitative analytics: building interpretable machine learning models and end-to-end data pipelines that extract, transform and combine incoming data to uncover hidden insights. Also experienced in business analysis for complex IT projects in life/annuity insurance products, and in project management focused on improving business processes, solving business problems and reducing costs by building strong relationships with clients, peers and senior management.
  • Experience with Hadoop ecosystem HDF, Hive and Apache Spark.
  • Experience with A/B test design, deployment and evaluation.
  • Involved in various phases of Software Development Life Cycle (SDLC) such as requirements gathering, modeling, analysis, design and development with experience in Agile methodologies and SCRUM process.
  • Proficient in writing functional specifications, translating business requirements into technical specifications, and creating, maintaining and modifying database design documents with detailed descriptions of logical entities and physical tables.
  • Good experience with Python libraries including NumPy, Pandas, SciPy, Matplotlib, scikit-learn, Beautiful Soup, statsmodels, TensorFlow and Keras.
  • Proficient in data visualization tools such as Tableau, Python Matplotlib, Python Seaborn, R Shiny, R ggplot2 to create visually powerful and actionable interactive reports and dashboards.
  • Effective interpersonal skills to interact professionally with a diverse group, including executives, managers, and subject matter experts.

TECHNICAL SKILLS:

Programming: Python (NumPy, SciPy, Matplotlib, scikit-learn, Beautiful Soup, statsmodels, TensorFlow and Pandas), MySQL, Hive, PySpark, R, SAS

Databases: SQL Server, Oracle DB, Access, DBA/DB2, Sybase, MongoDB, CSV

Business Intelligence Tools: Tableau, SAP BO, SAP BW

Methodologies: Agile, SCRUM, Waterfall, SDLC

Cloud Platform: Google, AWS, Databricks

PROFESSIONAL EXPERIENCE:

Confidential, Oak Ridge, NJ

Data Scientist / Data Analyst

Responsibilities:

  • Gathered, analyzed, documented, and translated application requirements into data models and supported standardization of documentation and the adoption of standards and practices related to data and applications.
  • Merged required features from multiple database sources and cleansed more than 500k records, converting them to the data types accepted by the model algorithms.
  • Developed Spark Python modules for machine learning and predictive analytics, and implemented Python-based distributed algorithms via PySpark.
  • Developed and implemented predictive models using machine learning algorithms such as logistic regression, classification, Naive Bayes, Random Forests, K-means clustering, KNN, PCA, and regularization for data analysis.
  • Evaluated model accuracy using ROC, MSE and learning curves to identify overfitting or underfitting, and to find an appropriate approach to correct and improve model accuracy.
  • Generated reports based on predictive analytics using Python and Tableau, including visualizations of model performance and prediction results.
  • Interacted with other departments to understand and identify data needs and requirements, and worked with other members of the IT organization to deliver data visualization and reporting solutions to address those needs.
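
The evaluation step described above (ROC, MSE, and checking for overfitting or underfitting) could be sketched roughly as follows. This is only an illustration, not the code used on the project: the function names, the 0.05 gap tolerance and the 0.70 AUC floor are hypothetical values chosen for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_diagnosis(train_auc: float, valid_auc: float,
                  gap_tol: float = 0.05, floor: float = 0.70) -> str:
    """Rough heuristic (thresholds are assumptions, tune per problem)."""
    if train_auc - valid_auc > gap_tol:
        return "overfitting"   # much better on training data than on held-out data
    if train_auc < floor:
        return "underfitting"  # weak even on the data the model was trained on
    return "ok"
```

A large gap between training and validation scores points toward regularization or more data; low scores on both point toward a richer model or better features.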

Confidential, Dallas, TX

Data Scientist / Data Analyst

Responsibilities:

  • Queried and retrieved conference attendance and history from an Oracle database to get the sample dataset.
  • In the preprocessing phase, merged the conference attendance table with table on ID, used Pandas to remove or replace all missing data, and balanced the dataset by over-sampling the minority label class (conference attendance as a member) and under-sampling the majority label class.
  • Scraped demographic data and public data on competitors and the industry from websites using Beautiful Soup as a secondary dataset to develop an understanding of member behavior, location p, demographics and lifecycle. Presented data that helped guide company decisions.
  • Implemented Twitter sentiment analysis with Tweepy to mine opinions about the company's conference and similar conferences held by competitors, and to identify the main concerns of conference participants. The findings provided actionable insights for the company's web marketing campaign.
  • Used scikit-learn PCA together with feature engineering, feature normalization and label encoding to reduce high-dimensional data (>150 features) that merged entire history, conference registration history, and demographic data.
  • Experimented with predictive models including Logistic Regression, Support Vector Machine (SVC), Gradient Boosting and Random Forest using Python scikit-learn to predict whether a member might register for a conference.
  • Improved data mining processes, resulting in a 15% decrease in time needed to infer insights from data used to develop marketing strategies.
  • Used Agile methodology and Scrum process for project development.
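
The class-balancing step mentioned above (over-sampling the minority label and under-sampling the majority label with Pandas) might look something like the sketch below. The column names `member_id` and `attended`, the helper `balance_classes` and the target size are all hypothetical; the original project's schema is not given.

```python
import pandas as pd


def balance_classes(df: pd.DataFrame, label_col: str,
                    target_per_class: int, seed: int = 0) -> pd.DataFrame:
    """Resample every class to the same size.

    Classes smaller than the target are over-sampled (with replacement);
    larger classes are under-sampled (without replacement).
    """
    parts = []
    for _, group in df.groupby(label_col):
        needs_oversampling = len(group) < target_per_class
        parts.append(group.sample(n=target_per_class,
                                  replace=needs_oversampling,
                                  random_state=seed))
    # Shuffle so the classes are not stored in contiguous blocks.
    return pd.concat(parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)


# Toy data (made up): 2 attendees vs. 8 non-attendees.
toy = pd.DataFrame({"member_id": range(10), "attended": [1, 1] + [0] * 8})
balanced = balance_classes(toy, "attended", target_per_class=5)
```

Resampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.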

Confidential, Dallas, TX

Data Scientist / Data Analyst

Responsibilities:

  • Collaborated with Business Analysts across departments to gather business requirements and identify workable items for further development.
  • Maintained and developed complex SQL queries, stored procedures, views, functions and reports that met customer requirements.
  • Used different feature engineering methods in Python to cleanse high-dimensional datasets and prepared the datasets for data modeling.
  • Developed customer segmentation, elasticity modelling and Market Basket Analysis to provide marketing insight and maximize revenue.
  • Worked with the engineering team to design, deploy and evaluate A/B tests and other hypothesis-based experiments, using statistical significance testing to determine whether the additional online purchase feature would be robust.
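
As an illustration of the significance check behind an A/B test like the one above, here is a minimal two-proportion z-test using only the standard library. The resume does not say which test was used, so the choice of test, the function name and the conversion counts in the usage note are assumptions.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> Tuple[float, float]:
    """Two-sided z-test for a difference in conversion rate between groups A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; no variation in outcomes")
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `two_proportion_z_test(100, 1000, 150, 1000)` compares a made-up control group (100 purchases from 1,000 visitors) with a made-up variant (150 from 1,000); a p-value below the chosen threshold, commonly 0.05, would suggest the difference is unlikely to be chance alone.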

Confidential

Data Analyst / Business Analyst

Responsibilities:

  • Analyzed structural requirements for a new life business system, and cleaned and mapped the data types in the legacy system to the new system.
  • Analyzed the data flow of complex business systems and the data migration requirements, and helped technical staff test and improve the performance of the data migration engine.
  • Validated the migrated data with testing tools.
  • Worked with the data team to develop and implement predictive models to research the customer p using machine learning algorithms such as linear regression, classification, multivariate regression, Naive Bayes, Random Forests, K-means clustering, KNN, PCA, and regularization for data analysis.
  • Set up a routine data extraction process to pull data from different sources for reporting by the sales, management and marketing departments, reducing data mining time by 20%.

Confidential

Data/Business Analyst

Responsibilities:

  • Met with business users to understand their requirements and analyzed them to identify the tables needed to build the report for the Life/Annuity product module.
  • Worked with the SAP developer to build the SAP Universe and WebI report.
  • Tracked and analyzed production issues reported by users and coordinated with technical staff to fix them.
  • Created and delivered operational and financial models, dashboards and management reports that translated functional and business requirements, together with project delivery KPIs, including resource utilization reports, P&L reports, revenue forecasting vs. recognition, financial variance reports, and EAC/ETC/DCM models.

Confidential

PMO Lead

Responsibilities:

  • Delivered effective, accurate and consistent verbal and written communication to project directors, project teams, senior management, external clients and vendors, including, but not limited to, project status reporting, timelines, task logs, risk logs, billable expenditures, budgets, and project closure documentation.
  • Extracted, transformed and cleansed data from the transactional systems (e.g. SAP, Hyperion, and COGNOS), and reviewed and governed data quality on a timely basis.
  • Managed vendor selection and the procurement process for the projects.
  • Successfully delivered services under SLA by managing a project team of 10 members spread across different locations/countries and maintaining a good cooperative relationship with the client.
  • Defined project scope, managed change requests, prioritized tasks and delivered multiple deliverables ahead of deadline, keeping quarterly customer satisfaction survey scores above 90%.
  • Achieved 85% department resource utilization by forecasting and allocating resources with specific skill sets to projects and by providing effective, timely resource tracking reports to regional resource/project owners.
