Data Scientist Resume

Cleveland, OH

SUMMARY:

  • 10+ years of IT experience with a strong background in object-oriented programming, data warehousing, and SQL.
  • 4+ years of experience in data science using data mining, machine learning, and advanced statistical techniques.
  • Strong knowledge of computer science fundamentals, robust algorithm design, and data structure concepts; developed reusable application objects in several programming languages.
  • Worked on a variety of analytical products and models, including patient matching and risk prediction, customer retention analysis, sentiment analysis of product reviews, and a job recommendation engine (collaborative filtering).
  • Expertise in supervised and unsupervised learning techniques (clustering, classification, PCA, decision trees, KNN, SVM), predictive analytics, optimization methods, natural language processing (NLP), and time series analysis.
  • Experience with basic and advanced statistical modeling techniques such as ARIMA, hypothesis testing, nonparametric testing, regression, logistic regression, missing value analysis, and survival analysis using Kaplan-Meier and Cox proportional hazards models.
  • Hands-on experience with Python data science libraries such as Pandas, NumPy, SciPy, scikit-learn, Matplotlib, Seaborn, BeautifulSoup, Orange, Rpy2, LibSVM, neurolab, and NLTK.
  • Hands-on experience with R packages and libraries such as ggplot2, Shiny, h2o, dplyr, reshape2, plotly, RMarkdown, ElemStatLearn, and caTools.
  • Hands-on experience with Hadoop ecosystem components such as Hive, Pig, Sqoop, MapReduce, Flume, and Oozie, and an excellent understanding of HDFS and the Spark framework.
  • Experience performing operations such as union, join, group, filter, and collate join, and grouping data in bags using Pig; retrieved data from HBase using Hive and Impala queries.
  • Extensive experience using Unix and writing shell scripts to schedule and automate jobs.
  • Sound knowledge of data warehousing and database concepts. Expertise in performance tuning, optimization, data integrity, and statistics using SQL Profiler.
  • Extensive experience in data analysis, data mapping, and extracting, transforming, and loading (ETL) data using various business intelligence technologies.
  • Extensive experience with SQL Server, Oracle, Sybase, and DB2 databases, including writing complex SQL queries, procedures, and triggers.
  • Experience in the healthcare, retail, insurance, corporate banking, and ERP domains.

TECHNICAL SKILLS:

Operating System: Windows, Unix, Mac

Machine Learning (Python): pandas, numpy, scikit-learn, scipy, statsmodels, ggplot2

Databases: Vertica, Oracle, MS SQL Server, MySQL, Sybase ASE 12.5, DB2

Big Data & Analytical: R, Python, SAS EMiner, NodeXL, XLMiner, Stata, Analytical Solver, Hadoop Pig, Hive, Impala, Ruby

Visualization: Tableau, ggplot2, matplotlib, seaborn

Business Intelligence: SSIS, SSRS

PROFESSIONAL EXPERIENCE:

Confidential, Cleveland, OH

Data Scientist

Responsibilities:

  • Create new and maintain existing statistical and machine learning models for patient matching, risk prediction, and grouping.
  • Develop algorithms for data manipulation, abstraction, and standardization.
  • Use logistic regression on EMR patient data with over ten thousand variables automatically extracted from the EMR.
  • Use a 10-year risk estimation model based on multivariate regression equations, with calibration measures to assess how accurately the model estimates the absolute level of risk.
  • Use the likelihood ratio test to measure model fit.
  • Perform risk reclassification analysis to assess the utility of risk prediction models.
  • Train and evaluate models using a variety of cases; perform hypothesis testing for patient matching models.
  • Use the Python NumPy, SciPy, and Pandas packages for dataset manipulation.
  • Use R to write multivariate regression models and risk prediction algorithms.
  • Make heavy use of complex SQL queries to retrieve and validate data from HBase and Vertica databases.
  • Follow agile methodology, with weekly sprints and daily scrums for status updates on tasks.

Tools and Technologies used: R, Python, Hadoop/Hive/Pig/Impala, Vertica, Statistics & Machine Learning

Confidential, Cincinnati, OH

Data Scientist

Responsibilities:

  • Performed exploratory data analysis on reviews, removing unwanted symbols, tags, punctuation, and extra spaces, and stemming words.
  • Identified industry-specific positive and negative lexicons and created a data dictionary to analyze people's opinions.
  • Generated the document-term matrix and term frequency matrix for the corpus.
  • Generated word clouds and other plots to analyze sentiment trends.

Tools and Technologies used: Text Mining, R, SQL, Excel

Confidential

Data Scientist

Responsibilities:

  • Performed exploratory data analysis on job seeker and job posting datasets.
  • Analyzed clusters using plots such as entropy plots, uncertainty plots, and cluster density plots.
  • Used collaborative filtering to recommend appropriate jobs to job seekers.
  • Used cosine similarity to measure the similarity between skill pairs and created a job seeker behavior matrix.
  • Wrote algorithms in R and generated interactive plots using ggplot2.
  • Wrote complex SQL queries using MySQL, Hive, and Impala to analyze data.

Environment: R, Machine Learning, Data Mining, Hadoop, Hive, Impala

Confidential

Data Scientist

Responsibilities:

  • Identified and cleaned sample data, consolidating it into a single view of the customer to start the analysis.
  • Applied clustering techniques to identify customer segments.
  • Built a propensity model to predict the likelihood of an outcome, such as how likely a customer is to default, churn, or lapse.
  • Calculated a customer value metric to prioritize customers based on probability and value.
  • Created Tableau dashboards to present the analysis.

Tools and Technologies used: Python, SQL Server, Tableau, Excel

Confidential

Data Analyst

Responsibilities:

  • Involved in analyzing user specifications for workability, completeness and business flow.
  • Extracted data from various sources such as SQL Server 2008/2005, DB2, CSV, Excel, and text files, from client servers and via FTP.
  • Developed, deployed and monitored SSIS Packages for data warehouse created for project.
  • The packages included a variety of transformations, for example Slowly Changing Dimension, Lookup, Aggregate, Derived Column, Conditional Split, Fuzzy Lookup, Multicast, and Data Conversion.
  • Reviewed and tested packages, reports, and cubes, fixing any bugs using SQL Server 2008 Business Intelligence Development Studio.
  • Created cubes and dashboards to show Zurich asset trends to higher management.

Environment: Windows XP, SQL 2008, SSIS, SSRS

Confidential

Data Warehousing Consultant

Responsibilities:

  • Converted existing Base SAS packages into SSIS 2005.
  • Analyzed and documented the SAS code logic and prepared the database design for SSIS packages.
  • Developed SSIS packages to pull data from various heterogeneous sources such as Excel, flat files, DB2, Netezza, and SQL Server.
  • Converted reports generated during the process into SSRS technology.
  • Developed Stored Procedures, Triggers, and SQL scripts for performing automation tasks.
  • Performed system testing of bug fixes and enhancements and supported the RcardETL application.

Environment: Windows XP, SQL 2005, SSIS 2005, SSRS 2005

Confidential

ETL Developer

Responsibilities:

  • Evaluated existing manual processes in the application and proposed automation to the business to reduce manual intervention and customer cost.
  • Enhanced and maintained the application using PowerBuilder 10.
  • Estimated, designed, developed, and implemented new ETL processes.
  • Developed Stored Procedures, Views, and T-SQL scripts.
  • Developed SSIS packages to pull data from mainframe flat files.
  • Developed reports in SSRS and in Excel format, generated as part of the ETL process.
  • Performed system testing of bug fixes.

Environment: Windows XP, SQL 2005, SSIS, SSRS

Confidential

Consultant - Application Development

Responsibilities:

  • Migrated the application front end from PowerBuilder 6.5 to 12.5 and the back end from Sybase ASE 12.5 to MS SQL Server 2008.
  • Converted mainframe/Unix jobs to SSIS.
  • Individually managed all migration and development activities of the project.
  • Presented demos and prototypes of migrated or newly developed application components to the customer from time to time.
  • Participated in Client review and status meetings.
  • System testing of bug fixes / enhancements.
  • Production Support of the application.

Environment: Windows XP, SQL Server 2008, SSIS, Sybase ASE 12.5

Confidential

Senior Software Engineer

Responsibilities:

  • Involved in Database design and development.
  • Developed various reports using SSRS and customized existing reports according to the functional specifications.
  • Tuned SQL queries, making procedures run faster and more efficiently.
  • Client communication.
  • Developed SSIS packages to extract data from various sources, including Oracle, flat files, and SQL Server.

Environment: Windows XP, SQL 2005, SSIS

Confidential

Senior Software Engineer

Responsibilities:

  • Performed requirements analysis, design, and development.
  • Involved in database design and development.
  • Developed complex reports using SSRS.
  • Created SSIS packages to extract data from an Oracle database and developed complex SSRS reports.

Environment: Windows XP, SQL 2005, SSRS, SSIS

Confidential

Software Engineer

Responsibilities:

  • Developed complex stored procedures, triggers, functions, and SQL scripts.
  • Troubleshot database-related issues in a timely fashion.

Environment: Windows XP, SQL 2005
