
Data Scientist Intern Resume

Austin, TX


Programming Languages: Java, C, C++, R, Python, Scala

Big Data Tools: Hadoop MapReduce, Pig, Hive, Apache Spark, Spark MLlib

Data Mining: SQL, T-SQL, Teradata, Oracle PL/SQL, Cassandra, HBase, MongoDB

ETL & Reporting Tools: Integration Services (SSIS), Reporting Services (SSRS), Tableau, QlikView, Matplotlib, MS Excel

Web Technologies: HTML, XML, CSS, JSON, jQuery, JavaScript, PHP, Ajax


Data Scientist Intern

Confidential, Austin, TX


  • Used Natural Language Processing techniques such as LDA, LSA, and k-means clustering on top of Word2vec vectors to extract meaningful topics from a large set of documents, and performed sentiment analysis to produce multi-aspect review ratings.
  • Used per-document topic probabilities as features and trained a random forest model to detect bad actors.
  • Performed data analysis on crew salary data using Teradata and Python, and visualized the results in Tableau.
  • Built an SSIS package to load train schedules and paths into the database on time, scheduled it to run daily, and used Python to optimize blocking sequences for the trains based on shortest-path algorithms.
  • Developed various SSIS (Integration Services) and SSRS (Reporting Services) packages and jobs to handle large volumes of data received from different sources and generate reports accordingly.
  • Created tables, views, triggers, stored procedures, cursors, and other complex T-SQL statements for various applications, and was involved in query optimization and performance tuning of SQL queries and procedures.
  • Installed a standalone Hadoop cluster and performed sentiment analysis on Twitter data extracted using the Streaming API.
