Data Scientist Resume
MA
SUMMARY
- 16+ years' experience, including 7 years as a customer-facing Data Scientist (Deep Learning, Imaging, algorithm development, ML for drone development), 5 years as a Data Engineer, and 3 years as a TV Correspondent.
- Machine Learning algorithms including Classification, Clustering, Time Series, Text Analytics, and Deep Learning models using TensorFlow. Code in Python, Scala, R, JavaScript, Rust, Spark, and Hadoop, including build servers, data pipelines, and ETL; work with cloud services from AWS and IBM.
- Work with Senior Directors, Directors, and VPs at various institutions (hired by the CIO at Confidential) to build Data Science implementations, at times from scratch; understand operational bottlenecks and ML use cases, and interpret results for management and non-technical audiences.
PROFESSIONAL EXPERIENCE
Confidential, MA
Data Scientist
Responsibilities:
- For customers' financial wellness ranking, marketing campaigns, website traffic analysis, and rollover (attrition into or out of Confidential), perform model building, model interpretation, and visualization for business audiences.
- Build models in AWS SageMaker and finalize them based on model evaluation; use Python to source data from various databases onto a single server in the Snowflake cloud environment, pre-process data, perform ETL and feature engineering, and present models to business stakeholders (a sourcing sketch follows this list).
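As one illustration of the Snowflake sourcing and feature-engineering step above, a minimal Python sketch; the connection parameters, table, and column names are hypothetical placeholders, and snowflake-connector-python stands in for whichever client was actually used.

```python
# Minimal sketch: source customer data from Snowflake and engineer features
# ahead of SageMaker model building. Credentials, table, and column names
# are hypothetical placeholders.
import numpy as np
import pandas as pd
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",   # placeholder
    user="example_user",         # placeholder
    password="***",              # placeholder
    warehouse="ANALYTICS_WH",    # placeholder
    database="CUSTOMER_DB",      # placeholder
)

# Consolidate customer activity into a single frame from the Snowflake server.
df = pd.read_sql(
    "SELECT customer_id, balance, logins_90d, rollover_flag "
    "FROM marketing.customer_activity",
    conn,
)

# Simple feature engineering before handing off to model training.
df["log_balance"] = np.log1p(df["balance"])
df["is_active"] = (df["logins_90d"] > 0).astype(int)
```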
Confidential, NJ
Data Scientist
Responsibilities:
- Work with management teams and stakeholders across lines of business to understand problems, present ML use cases, perform feasibility analysis, and, if approved, develop algorithms.
- Use Machine Learning for loss reduction in the Incorrect Claims Center (Classification); to reduce fraud in Prior Authorization of claims requiring a doctor's signature (TensorFlow CNN for PDF image segmentation); for Bridge Shipment of a drug ahead of the main drug's shipment (Classification), thereby reducing monetary losses and streamlining operations; and for Drug Pricing (Time Series using RNN).
- Build Random Forest, Boosting, RNN, and CNN models and finalize them based on model validation using PySpark, Spark Scala, Python, and R (a minimal classification sketch follows this list). Use Clustering and Anomaly Detection algorithms to identify groups of customers based on disease (ICD codes), medical provider, cost, and region.
- Build data visualizations for communication with Directors, Managers, the Operations team, and non-technical, non-machine-learning-savvy members of various teams.
- After the first use case succeeded in production as a batch application, work with DevOps, the Hadoop and Unix platform teams, DBAs, and the front-end team to build a near-real-time implementation on Spark Streaming and Kafka (sketched after this list).
- Work with my Director to convince other teams' Directors and management about potential use cases and the benefits of ML and DL for their strategic objectives, operational improvements, and pain points.
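A minimal PySpark sketch of the kind of Random Forest classification pipeline described above, with holdout validation; the table and column names (claims.features, incorrect_claim, and so on) are hypothetical placeholders.

```python
# Minimal sketch: Random Forest claim classifier with holdout validation.
# Table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("claims-rf").getOrCreate()
claims = spark.table("claims.features")  # hypothetical Hive table

# Assemble raw columns into the feature vector Spark ML expects.
assembler = VectorAssembler(
    inputCols=["claim_amount", "provider_risk", "prior_denials"],
    outputCol="features",
)
train, test = assembler.transform(claims).randomSplit([0.8, 0.2], seed=42)

rf = RandomForestClassifier(labelCol="incorrect_claim", featuresCol="features")
model = rf.fit(train)

# Validate on the holdout split before finalizing the model.
auc = BinaryClassificationEvaluator(labelCol="incorrect_claim") \
    .evaluate(model.transform(test))
print(f"Holdout AUC: {auc:.3f}")
```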
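The near-real-time implementation could be sketched with Spark Structured Streaming reading from Kafka; the broker address, topic, schema, and checkpoint path are assumptions, and the actual model-scoring step is elided.

```python
# Minimal sketch: near-real-time claim ingestion with Structured Streaming + Kafka.
# Broker address, topic, schema, and checkpoint path are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("claims-stream").getOrCreate()

schema = StructType() \
    .add("claim_id", StringType()) \
    .add("claim_amount", DoubleType())

# Subscribe to the incoming-claims topic.
raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "claims-in")
       .load())

# Parse the Kafka message value into typed columns.
claims = (raw.select(from_json(col("value").cast("string"), schema).alias("c"))
          .select("c.*"))

# Scoring step elided: in practice the batch-trained model would be applied here.
query = (claims.writeStream.format("console")
         .option("checkpointLocation", "/tmp/claims-ckpt")
         .start())
query.awaitTermination()
```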
Confidential, NY
Data Scientist
Responsibilities:
- Hired by the CIO at Confidential and by Directors at other institutions. Work with Senior Directors, Architects, and VPs to understand use cases for Machine Learning, implementation timelines, feasibility, benefits, and potential risks and challenges of ML implementation for a given line of business; define the high-level technology stack and Data Pipeline Architecture; and discuss ML solutions with non-technical stakeholders.
- Write algorithms for Time Series, Text Analytics, Clustering, and Classification using Python and R.
- Forecast and visualize 30-day sales trends of securities based on the Bank's historical sales data. Decompose trends, spot and visualize seasonal and non-seasonal components, and forecast using time series models (a decomposition sketch follows this list).
- For Client Analytics, perform Sentiment Analysis and discover patterns in client meeting notes and minutes. Classify meetings into High, Medium, and Low priority using Python NLTK, Scikit-Learn, Named Entity Recognition, Recurrent Neural Networks, and WordCloud in R (a sentiment sketch follows this list). Discover similarly situated clients of the Bank based on features using K-Means and Hierarchical Clustering; classify clients into Diamond, Gold, and Silver partners and classify new incoming clients.
- Experimented with Deep Learning using TensorFlow, Keras, and PyTorch for Text Analytics; read image data using OpenCV with Convolutional Neural Networks (POC); perform data cleaning and pipelining in Spark, store data in HDFS, and run exploratory data analysis. Pre-process data and brainstorm a Data Lake using Python and R. Visualize client data with histograms and barplots using R ggplot and Python seaborn/matplotlib; work with Power BI myself and with the team.
- Use the ETL tool Pentaho to migrate data from RDBMS to cloud services (AWS S3, AWS EMR, EC2) to run PySpark and SparkR, with IDEs like Jupyter and Zeppelin. Use AWS's Python SDK and the SAS, SPSS, and R APIs.
- Create Hive tables, ingest data into them, and use PySpark for ETL, storing data in S3 (a minimal sketch follows this list). Use Git, Sourcetree, and Bitbucket to store scripts and share code for production; import from NoSQL databases and RDBMS into HDFS, using RODBC and RJDBC for connections; build Red Hat servers.
- Write Python-based unit tests for data cleaning, data preprocessing, and algorithm building, and validate them. Based on team review and feedback, migrate the code into Development, Testing, and Production environments.
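The trend-decomposition and 30-day forecast work above might look roughly like this in Python; the CSV and column names are hypothetical, and Holt-Winters stands in for whichever time-series model was actually used.

```python
# Minimal sketch: decompose daily securities sales and forecast 30 days ahead.
# File and column names are hypothetical placeholders.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.holtwinters import ExponentialSmoothing

sales = (pd.read_csv("securities_sales.csv", parse_dates=["date"])
         .set_index("date")["amount"]
         .asfreq("D")
         .ffill())  # fill calendar gaps so decomposition has a regular series

# Separate trend from weekly seasonality for visualization.
decomposition = seasonal_decompose(sales, period=7)
decomposition.plot()

# Holt-Winters forecast of the next 30 days.
model = ExponentialSmoothing(sales, trend="add", seasonal="add",
                             seasonal_periods=7).fit()
forecast = model.forecast(30)
print(forecast.head())
```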
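The meeting-note sentiment step could be sketched with NLTK's VADER analyzer; the notes are placeholder strings and the High/Medium/Low thresholds are illustrative, not the resume's actual classification rules.

```python
# Minimal sketch: score meeting notes with NLTK's VADER sentiment analyzer
# and bucket them High/Medium/Low. Notes and thresholds are illustrative.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

notes = [
    "Client pleased with portfolio performance, wants to expand mandate.",
    "Client raised concerns about fees; follow-up required.",
]

for note in notes:
    score = sia.polarity_scores(note)["compound"]
    bucket = "High" if score > 0.3 else "Medium" if score > -0.3 else "Low"
    print(f"{bucket:>6}  {score:+.2f}  {note}")
```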
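A minimal sketch of the Hive-to-S3 ETL pattern: read a Hive table with PySpark, apply a light transform, and write Parquet to S3. Database, table, column, and bucket names are hypothetical placeholders.

```python
# Minimal sketch: read a Hive table, clean it, and write Parquet to S3.
# Database, table, and bucket names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = (SparkSession.builder.appName("hive-to-s3")
         .enableHiveSupport()
         .getOrCreate())

clients = spark.sql("SELECT * FROM analytics.client_meetings")

cleaned = (clients
           .withColumn("meeting_date", to_date(col("meeting_ts")))
           .dropDuplicates(["meeting_id"]))

cleaned.write.mode("overwrite").parquet("s3a://example-bucket/client_meetings/")
```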
Confidential
Developer
Responsibilities:
- ML and DL algorithms using TensorFlow, PyTorch, and Keras for Convolutional and Recurrent Neural Networks; Spark and Big Data work on the Cloudera and Hortonworks stacks, plus H2O.
- POC for Blockchain using Solidity, JavaScript, NodeJS, and Rust on Ethereum and Polkadot.
- Data / Systems Consultant (Mastech & Randstad): Confidential Bank, Pittsburgh; Confidential Global Advisors, Boston; T. Rowe Price, Owings Mills; State Farm Insurance, Bloomington; Confidential, Baltimore; Confidential, Washington DC; Confidential, VA. 03/10 - 06/15
- Data analysis for trade compliance, surveillance, portfolio risk, mortgage, anti-money-laundering risk, and commercial banking; analyze data and ETL across multiple systems, run Unix scripts, and trace errors and data lineage.
- Migrate data to cloud services and store it in Confidential Cloud; run PySpark and Scala Spark Big Data scripts and Python ETL scripts using Jupyter, PyCharm, or Zeppelin depending on team needs.
- Write Python-based unit tests for data cleaning, data pre-processing, and logic in Hive; validate the unit tests against requirements and seek feedback (a test sketch follows this list). Based on team review and feedback, migrate the code into Development, Testing, and Production environments.
- Proof-of-concept algorithms and visualizations using data from RDBMS and the Hadoop ecosystem; exchange data between RDBMS and HDFS; source-to-target data mapping; liaise with business and technical teams globally.
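A Python unit test for a data-cleaning step could look like this; clean_claims and its rules are hypothetical stand-ins for the real preprocessing logic under test.

```python
# Minimal sketch: unit test for a data-cleaning step.
# clean_claims and its rules are hypothetical stand-ins for the real logic.
import unittest
import pandas as pd

def clean_claims(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows missing a claim id and strip whitespace from ICD codes."""
    out = df.dropna(subset=["claim_id"]).copy()
    out["icd_code"] = out["icd_code"].str.strip()
    return out

class CleanClaimsTest(unittest.TestCase):
    def test_drops_missing_ids_and_strips_codes(self):
        raw = pd.DataFrame({"claim_id": [1, None],
                            "icd_code": [" E11 ", "I10"]})
        cleaned = clean_claims(raw)
        self.assertEqual(len(cleaned), 1)
        self.assertEqual(cleaned.iloc[0]["icd_code"], "E11")

if __name__ == "__main__":
    unittest.main()
```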