Data Scientist Resume

San Jose, CA

SUMMARY:

  • Data Scientist experienced in building multivariate linear and logistic regression models, classification trees, and other machine learning algorithms in Python. Proficient in writing SQL.
  • Experienced in providing end-to-end analytical solutions, from data acquisition and manipulation through exploratory analysis and building and validating predictive models, to generating actionable insights from the analysis.

SKILLS:

Machine Learning: Regression analysis, Ridge and Lasso Regression, K-NN, Decision Trees, Support Vector Machines (SVM), Ensemble methods such as Bagging, Boosting, and Stacking, K-Means clustering, Deep Learning.

Python Libraries: Jupyter Notebook, NumPy, Pandas, scikit-learn, SciPy, TensorFlow, NLTK, Keras, re.

Platforms: Unix, Windows, MacOS.

Databases: MySQL, SQL Server, Oracle, Informix.

Programming: Python, SQL, Hive, GitHub/Git

Data Pipeline / ETL Tools: Airflow

Cloud Services: AWS (S3, EC2, RDS, ECS)

Reporting & Visualization Tools: Seaborn, Matplotlib, ggplot2, Tableau Desktop, Tableau Online.

Statistics: Hypothesis Testing, ANOVA, Chi-Square, Confidence Intervals, MLE, Principal Component Analysis (PCA), Cross-Validation, Correlation.

EXPERIENCE:

Data Scientist

Confidential, San Jose, CA

Responsibilities:

  • Developed machine learning tools to provide industrial solutions.
  • Implemented data models and algorithms for machine learning solutions as a member of the data analyst/data scientist team.
  • Conducted exploratory analysis and feature engineering to fit the best models using scikit-learn and Python.
  • Optimized clients’ product portfolios by evaluating market segmentation data queried from SQL Server.
  • Proposed strategies to increase clients’ market shares by performing brand competition data analysis.
  • Explored the factors that influence the cancellation of client subscriptions.
  • Performed feature engineering, handled missing values, and created new features from existing ones.
  • Used the Matplotlib and Seaborn libraries in Python for visualization.
  • Used PySpark to initialize Spark, load large datasets, retrieve RDD information, and sort, filter, and sample the data.
  • Applied ANOVA and chi-square statistical tests and ensemble methods to understand the predictive power of, and relationships among, the categorical features.
  • Selected influential features to develop less complex but more robust machine learning models.
  • Designed Logistic Regression, Random Forest, SVC, and Neural Network models, compared the models’ performance, and tuned hyperparameters to get better results.
  • Developed a classification model to predict whether a client will cancel the service.
  • Predicted the turnout at Fenerbahce games using various (implicit) surveys and historical ticket sales data.
  • Optimized the seating plan of Fenerbahce Stadium and raised the club's annual earnings by predicting combined ticket sales.
  • Optimized the fan club merchandise offering and raised club revenue by classifying supporters according to their social status and attitudes.
  • Used linear regression, random forest regression, and SVM in Python for prediction.
  • Developed ensemble techniques to predict the value of transactions for potential customers.

Data Engineer

Confidential

Responsibilities:

  • Participated in Turkey's largest historical archives digitization project, whose Phase 1 was successfully completed in 18 months, including 3 months of preliminary work. This phase covered 30M pages and used 4.5 petabytes of storage.
  • Designed and built data-driven archive applications aimed at optimizing business and operational efficiency.
  • Performed database design, tuning and optimization on MS SQL Server.
  • Optimized complex SQL queries, indexes, and database objects.
  • Processed, cleansed and verified the integrity of data used for analysis.
  • Developed machine learning algorithms for document classification. Tools: Python, scikit-learn, NLTK, SVM.
  • Followed agile methodologies to manage the life-cycle of the project.

Data Architect & Project Manager

Confidential

Responsibilities:

  • Designed and built data-driven archive applications aimed at optimizing business and operational efficiency.
  • Managed numerous web projects that have won national and international awards.
  • Provided vision to the company on product improvement matters (document digitization, e-commerce, B2B, B2C, web content management, e-government, archive management, open innovation platform).
  • Turkey's first major historical and top-secret document digitization project, targeting digitization of 5,000,000 pages and 1 petabyte of data, was completed on time.
  • The Turkish Presidency’s project had numerous stages: document ontology, classification, document scanning, digitization, data entry, data tagging, digital signature, a secure sharing platform, a searching platform, a new NLP-based technical approach to character recognition for handwritten Ottoman documents, and Turkish character recognition.
  • Served in the project management and data architecture design role on this project.
  • Developed and managed revisions of the Content Policy Document with the client team.

Software Development Specialist

Confidential

Responsibilities:

  • Performed system development activities to set up information technology systems at a newly established airport.
  • Handled technical and administrative management of the planning, establishment, and operation of systems such as IGCS, SITA, FPL, BRS, METAR, FIDS, ATIS, ATCS, AODB, Finance Systems, and Intranet.
  • Developed Intranet, Airport Operational Database Application and Aircraft Maintenance Application.

Tools: Oracle, ASP, Delphi, IIS, Windows.
