Data Scientist Intern Resume

SUMMARY:

Seeking a Data Scientist position.

SKILLS:

Programming Languages: Python (SciPy/NumPy/scikit-learn/pandas/PySpark), Java, SQL (proficient), Bash, MATLAB, NoSQL, R (familiar), C, C++, Ruby (working knowledge)

Tools and Techniques: Hadoop, Hive, Spark, MapReduce, Git, Docker, MongoDB, TensorFlow, Keras, AWS

WORK & INTERNSHIP EXPERIENCE:

Data scientist intern

Confidential

Responsibilities:

  • Built models to predict GMV and item volume by SKU category and improved the accuracy by 8%
  • Used Spark, Hive and pandas for feature engineering and promotion analysis; Pyplot and Seaborn for data visualization; XGBoost, LR, RNN and ensembles for modeling; and cross-validation for evaluation and tuning
  • Researched a feature-learning approach, “SKU2vec”, that uses the relations between items in users' purchase histories.

Software engineer intern

Confidential

Responsibilities:

  • Developed an easy-to-use CPU design test and benchmark web application with various benchmark test cases, adaptive process control, and result uploading and archiving, using Python, shell scripts and C.

Data scientist

Confidential

Responsibilities:

  • Developed a Confidential that supports basic conversations, trained on data from Reddit, Microsoft and others
  • Implemented a seq2seq (sequence-to-sequence) model with word2vec word embeddings, a multi-layer bidirectional LSTM and an additive attention mechanism using TensorFlow
  • Used question classification to ensemble models for different situations.
  • Developed a movie recommender system that provides personalized recommendations suited to users' tastes
  • Implemented MapReduce jobs to process the large movie rating dataset from Netflix using Hadoop and Java
  • Used item-based collaborative filtering, merging the user rating matrix with the item co-occurrence matrix to produce recommendation results.

Data scientist

Confidential

Responsibilities:

  • Built latent-factor models and applied SVM, PCA and AdaBoost methods for prediction, ranking in the top 15% of the class
  • Processed more than 8 GB of Amazon review and item data
  • Utilized ANOVA, chi-square tests and Pearson correlation statistics in data analysis and feature selection.

Data scientist

Confidential

Responsibilities:

  • Constructed an n-gram word library from a Wikipedia corpus using MapReduce and connected the model to interfaces to implement real-time word auto-complete on a webpage.
  • Built the language model based on relative n-gram frequencies and stored it in a MySQL database
  • Led the design and development of a wearable device that records and analyzes human motion data and coaches users' motion (2 published papers and a patent)
  • Developed the interfaces between the TI MCU and the sensors and screen, and implemented a Kalman filter algorithm in C.
