Jr. Data Scientist Resume

Sunnyvale, CA

SUMMARY:

  • Certified Data Scientist with 1+ years of experience in data science, including artificial intelligence, machine learning, deep learning, data mining, data analytics, and data visualization.
  • Experience designing, modeling, validating, and testing statistical algorithms using Python and R against various real-world data sets, including behavioral data.
  • Experience developing, building, and testing analytics applications using iterative, agile-like development practices such as test-driven development and continuous integration.
  • Working experience with machine learning algorithms such as Linear Regression, Logistic Regression, Decision Trees, K-Means Clustering, and Association Rules.
  • Experience with analyzing online user behavior, Conversion Data (A/B Testing) and customer journeys.
  • Experience using technology such as scripting, data cleansing tools, and statistical software packages to work efficiently with datasets.
  • Knowledge of writing Packages, Stored Procedures, Functions, and Views using SQL.
  • Working experience with statistical analysis using R, MATLAB, and Excel.
  • Proficient in integrating various data sources, including relational databases such as Oracle and MS SQL Server as well as flat files, into the staging area.
  • Good knowledge of implementing deep learning models and numerical computation with data flow graphs using TensorFlow.
  • Good experience in text mining to transform words and phrases in unstructured data into numerical values.
  • Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables and OLAP reporting.
  • Good knowledge in statistics, mathematics, machine learning, recommendation algorithms and analytics with excellent understanding of business operations and analytics tools for effective analysis of data.

TECHNICAL SKILLS:

Programming & Scripting Languages: R, Python.

Database: SQL, MS Excel, Oracle.

Tools: TensorFlow, Keras.

Development Tools: RStudio, MS Office, Notepad++, MS Excel.

Visualization: Tableau

Techniques: Machine learning, Regression, Clustering, Data mining, Text mining.

EXPERIENCE:

Confidential, Sunnyvale, CA

Jr. Data Scientist

Responsibilities:

  • Developed machine learning, statistical analysis, and data visualization applications for data processing problems in the sustainability and finance domains.
  • Compiled data from various sources, including public and private databases, to perform complex analysis and data manipulation for actionable results.
  • Explored the raw data through exploratory data analysis (classification, splitting, cross-validation).
  • Performed predictive modeling using Python tools.
  • Used NLP methods for information extraction, topic modeling, parsing, and relationship extraction.
  • Worked with the NLTK library for NLP data processing and pattern finding.
  • Applied NLP and ML techniques to analyze Twitter feeds and streaming news in order to identify product reviews.
  • Worked on development of data warehouse and ETL systems using relational and non-relational tools like SQL.
  • Built and analyzed datasets using R, MATLAB and Python.
  • Worked on bootstrap aggregation (bagging) methods such as Random Forests, built on decision trees, to reduce variance.
  • Developed visualizations and dashboards using ggplot.
  • Applied linear regression in Python to understand the relationships, including potential causal relationships, between different attributes of the dataset.
  • Applied concepts of probability, distributions, and statistical inference to the given dataset to uncover interesting findings using comparisons, t-tests, F-tests, R-squared, p-values, etc.
  • Interfaced with a large-scale database system through an ETL server for data extraction and preparation.
  • Used plots such as histograms, bar plots, pie charts, scatter plots, and box plots to assess the condition of the data.
  • Created pivot tables and charts using worksheet data and external resources; modified pivot tables, sorted items, grouped data, and refreshed and formatted pivot tables.
  • Administered users, user groups, and scheduled instances for reports in Tableau.
  • Converted metric insight reports to Tableau reports.
  • Used VLOOKUP to match source and destination addresses in the user data.
  • Wrote SQL queries for Data Manipulation.
  • Applied clustering algorithms such as K-means and hierarchical clustering with the help of scikit-learn and SciPy.
  • Created pivot tables and ran VLOOKUPs in MS Excel as part of data validation.
  • Applied various machine learning algorithms and statistical models, such as decision trees, regression models, SVM, and clustering, to identify volume using the scikit-learn package in Python.
  • Extracted random samples and compared measurements taken on samples of the dataset.
  • Achieved 50% cost savings and advanced commercial product development by building and optimizing machine learning models using XGBoost, TensorFlow, and Keras.

Environment: Machine learning, HDFS, Linux, Python (scikit-learn/SciPy/NumPy/pandas), R, SQL, MS Excel.

Confidential, Santa Monica, CA

Jr. Data Scientist

Responsibilities:

  • Performed statistical modeling to derive value from customer data and reduce churn.
  • Prepared regular data reports by collecting samples of data sets using Excel spreadsheets.
  • Cleaned data by identifying and eliminating duplicates, inaccurate data, and outliers using R.
  • Compared data with source documents and re-entered data in verification format to detect errors.
  • Generated reports by running SQL queries against current databases to conduct data analysis.
  • Evaluated and optimized model performance by tuning parameters with K-fold cross-validation.
  • Analyzed transaction data to cluster users into segments and develop different marketing strategies for each cluster.
