Data Scientist Resume

PROFESSIONAL SUMMARY:

  • 6.5 years of experience in Machine Learning, Predictive Analytics, Data Mining and ETL Development
  • Architected solutions for problems in Machine Learning, Advanced Predictive Analytics, Business Optimization and Text Mining
  • Strong understanding of Big Data concepts: HDFS, MapReduce and the Hadoop ecosystem
  • Good knowledge of SQL and PL/SQL
  • Knowledge of Object-Oriented Programming concepts
  • Proven ability as a quick learner of new skills and technologies
  • Effective team player with excellent communication skills and the judgment to set priorities, schedule work and meet critical deadlines
  • Ability to rapidly troubleshoot and resolve complex technical issues
  • Strong analytical, problem solving, programming and debugging skills

TECHNICAL SKILLS

Languages: R, Python, Core Java, C, PL/SQL

Platform/Technologies: HDFS, MapReduce, Hive, HBase, Mahout, NLTK

Tools: Talend 4.1.2, QuickRec, TortoiseSVN, PuTTY, Cloudera Manager

Essential Skills: Text Mining, Data Mining, Data Modeling, Predictive Analytics, Statistical Modeling, Big Data, Advanced Analytics Algorithms

PROFESSIONAL EXPERIENCE

Confidential

Data Scientist

Responsibilities:

  • Data collection; dividing the data into Training, Testing and Evaluation sets
  • Designing, Architecting and Choice of Algorithms
  • Model Building using Random Forest and Gradient Boosting methods such as XGBoost
  • Feature Selection from the fitted model
  • Evaluation of the Model built
  • Visualization of the Output
  • Analyzing the False Negatives, False Positives

Environment: R

Confidential

Analytics Engineer

Responsibilities:

  • Data collection: Scraping the Client Web site for posts
  • Analyzing the data, performing Text Pre-processing Steps
  • Model Building
  • Cross-check the end results
  • Share the results with the customer
  • Visualize the output

Environment: Revo R, DeployR

Confidential

Analytics Engineer

Responsibilities:

  • Build the flow of the Project
  • Coordinate with the Business Analyst to understand the requirements
  • Preprocess data into a format the model can read
  • Build the Prediction Model
  • Score the Model and identify Risky Customers
  • Visualize the Output

Environment: R, Java, Revo R, DeployR

Confidential

Analytics Engineer

Responsibilities:

  • Deterministic Chain Ladder
  • Chain Ladder as Weighted Mean
  • Chain Ladder using weighted linear regression
  • Poisson Regression
  • Quasi Poisson Regression
  • Bootstrapped Chain Ladder
  • Mack Chain Ladder
  • Log Linear Model
  • Clark LDF Method
  • Build the flow of the Project
  • Coordinate with the Business Analyst to understand the requirements
  • Preprocess data into a format the model can read
  • Build the Prediction Model using different Chain Ladder Techniques
  • Build the Prediction Model using Time Series Method
  • Compare and Visualize the Output

Text Classification

Confidential

Responsibilities:

  • First approach: a probability-based Naïve Bayes classifier, which uses Bayes' theorem to predict the probability that a given feature set belongs to a particular label.
  • Second approach: a linguistic, knowledge-based model that identifies context and then applies a Nearest-Neighbor classifier for text classification. The model represents text as synsets in WordNet, a lexical knowledge base of English words and their semantic relations. WordNet similarity is measured between each word in the input tweet and the words in a manually prepared medical dictionary. Once the relatedness of a tweet to the medical terms in that dictionary has been captured, the data is ready for the K-Nearest-Neighbor classifier.
  • Data collection; dividing the data into Training, Testing and Evaluation sets
  • Manual labeling of the class for Training and Testing
  • Designing, Architecting and Choice of Algorithms
  • Model Building in Python
  • Evaluation of the Model built
  • Visualization of the Output in Python
  • Analyzing the False Negatives, False Positives
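
The probability-based approach listed above can be illustrated with a minimal sketch. This is not the project's actual code: the function names, the toy tweets, the "medical"/"other" labels, and the use of add-one (Laplace) smoothing are all assumptions made for illustration. It shows how Bayes' theorem turns per-label word counts into a probability for each label.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count documents per label and word occurrences per label.

    labeled_docs is a list of (tokens, label) pairs (hypothetical format).
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict_proba(model, tokens):
    """Return P(label | tokens) for every label seen in training."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        # Prior: share of training documents carrying this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Likelihood with add-one smoothing (an assumption of this sketch).
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalise the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    unnormalised = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    norm = sum(unnormalised.values())
    return {lab: v / norm for lab, v in unnormalised.items()}


# Toy, hand-labeled training tweets (invented for this example).
train = [
    ("fever and cough all week".split(), "medical"),
    ("doctor prescribed antibiotics for fever".split(), "medical"),
    ("great football match tonight".split(), "other"),
    ("new phone arrived today".split(), "other"),
]
model = train_naive_bayes(train)
probs = predict_proba(model, "cough and fever again".split())
predicted_label = max(probs, key=probs.get)
```

In practice a library classifier (for example, one from NLTK, which is listed under the technical skills) would normally replace the hand-rolled counting shown here.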

Confidential

ETL Developer

Roles & Responsibilities:

  • Offshore lead role for E2E AML RECON
  • Involved in requirement gathering and analysis of the raw data
  • Involved in Development of the ETL job in Talend for application of rules on raw transaction files for various Product Processors
  • Involved in Unit testing of the ETL jobs created
  • Involved in preparing Unit Test Case documents for each of the ETL jobs created
  • Involved in running the QuickRec tool to verify the differences in the Expected and the Actual results
  • Involved in Break Analysis of the output to determine whether the correct rules have been applied
  • Interact with the clients on Break analysis

Technology - Core Java, SQL, UNIX, Talend 4.1.2, QuickRec, TortoiseSVN, PuTTY