
Data Scientist Resume


Baltimore, MD

SUMMARY

  • Over 6 years of experience in Machine Learning, Deep Learning, Datamining with large datasets of Structured and Unstructured Data, Data Acquisition, Data Validation, Predictive Modeling.
  • Experience with AWS cloud computing, Spark (especially AWS EMR), Kibana, Node.js, Tableau, Looker.
  • Experience in the healthcare domain.
  • Strong technical communication skills; both written and verbal.
  • Ability to understand and articulate the “big picture” and simplify complex ideas.
  • Strong problem solving and structuring skills.
  • Ability to identify and learn applicable new techniques independently as needed.
  • Ability to create new methods of solutions through a combination of foundational research and collaboration with ongoing initiatives.
  • Experience in stochastic optimization, with results adopted by commercial applications and open-source algorithms.
  • Experience formulating and solving discrete and continuous optimization problems.
  • Experience developing and applying novel methods of stochastic optimization and optimization under uncertainty algorithms for large scale problems, including mixed integer type problems.
  • Expertise with design optimization methods with computational efficiency considerations.
  • Able to research statistical machine learning methods including forecasting, supervised learning, classification, and Bayesian methods.
  • Conduct complex, advanced research projects in areas of interest to Business Units.
  • Develop new, cutting-edge techniques and algorithms.
  • Transfer and implement results and technology in hardware and software prototypes and demo systems relevant to the businesses.
  • Survey relevant technologies and stay abreast of the latest developments.
  • Draft and submit papers and patents based on research.
  • Contributed to several research projects combining new data sources and computational tools.
  • Wrote efficient code for working with large datasets.
  • Exceptional mathematical and statistical modeling and computer programming skills
  • Use of mathematical and statistical modeling and computer programming skills in an innovative manner.
  • Ability to work comfortably and effectively within an interdisciplinary research environment.
  • Able to advance the technical sophistication of solutions using machine learning and other advanced technologies.
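The stochastic optimization experience above can be illustrated with a minimal sketch: stochastic gradient descent with a decaying step size, minimizing a toy quadratic under noisy gradient estimates (the objective and all parameters here are invented for illustration, not taken from any project listed in this resume).

```python
import random

# Illustrative sketch: stochastic gradient descent minimizing
# f(x) = (x - 3)^2 using noisy gradient estimates.
def noisy_grad(x, noise=0.1):
    # True gradient 2*(x - 3), perturbed by Gaussian noise
    return 2 * (x - 3) + random.gauss(0, noise)

def sgd(x0=0.0, lr=0.1, steps=500):
    x = x0
    for t in range(steps):
        # Decaying step size tames the gradient noise over time
        x -= lr / (1 + 0.01 * t) * noisy_grad(x)
    return x

x_star = sgd()
print(round(x_star, 1))  # close to the minimizer, 3.0
```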

TECHNICAL SKILLS

Data Science Specialties: Natural Language Processing, Machine Learning, Internet of Things (IoT) analytics, Social Analytics, Predictive Maintenance

Analytic Skills: Bayesian Analysis, Inference, Models, Regression Analysis, Linear models, Multivariate analysis, Stochastic Gradient Descent, Sampling methods, Forecasting, Segmentation, Clustering, Sentiment Analysis, Predictive Analytics

Analytic Tools: Classification and Regression Trees (CART), Support Vector Machine, Random Forest, Gradient Boosting Machine (GBM), TensorFlow, PCA, RNN, Regression, Naïve Bayes
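To make the CART entry above concrete, here is a minimal pure-Python sketch of the core CART step: choosing the single-feature threshold that minimizes weighted Gini impurity (toy data invented for the sketch; real work would use a library such as scikit-learn).

```python
# Minimal CART-style decision stump: pick the threshold on one
# feature that minimizes the weighted Gini impurity of the split.
def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of positive (1) labels
    return 1.0 - p * p - (1 - p) * (1 - p)

def best_split(xs, ys):
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)  # 3 splits the classes perfectly (impurity 0.0)
```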

Analytic Languages and Scripts: R, Python, HiveQL, Spark, Spark SQL, Scala, Impala, MapReduce

Languages: Java, Python, R, JavaScript, SQL, MATLAB

Version Control: GitHub, Git, SVN

IDE: Jupyter, Spyder

Data Query: Azure, Google Cloud, Amazon Redshift, Kinesis, EMR; HDFS, RDBMS, data warehouses, data lakes, and various SQL and NoSQL databases.

Deep Learning: Machine perception, Data Mining, Machine Learning algorithms, Neural Networks, TensorFlow, Keras
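A minimal sketch of the neural network fundamentals behind the Deep Learning entry above: a single logistic unit trained by gradient descent to learn the AND function (a toy example invented for illustration; production work would use TensorFlow or Keras as listed).

```python
import math

# Minimal single-neuron "network": logistic unit trained with
# gradient descent on log-loss to learn the AND function.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0
lr = 0.5

for _ in range(5000):
    for (x1, x2), y in data:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - y                  # gradient of log-loss w.r.t. the logit
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b -= lr * err

preds = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for (x1, x2), _ in data]
print(preds)  # [0, 0, 0, 1]
```

AND is linearly separable, so a single unit suffices; anything like XOR would need a hidden layer.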

Soft Skills: Able to deliver presentations and highly technical reports; collaborate with stakeholders and cross-functional teams; advise on how to leverage analytical insights. Develop clear analytical reports that directly address strategic goals.

PROFESSIONAL EXPERIENCE

Data Scientist

Confidential, Baltimore, MD

Responsibilities:

  • Applied advanced analytics skills, with proficiency at integrating and preparing large, varied datasets, architecting specialized database and computing environments, and communicating results.
  • Developed analytical approaches to strategic business decisions.
  • Performed analysis using predictive modeling, data/text mining, and statistical tools.
  • Collaborated cross-functionally to arrive at actionable insights.
  • Synthesized analytic results with business input to drive measurable change.
  • Assisted in continual improvement of AWS data lake environment.
  • Identified, gathered, and analyzed complex, multi-dimensional datasets utilizing a variety of tools.
  • Performed data visualization and developed presentation material utilizing Tableau.
  • Responsible for defining the key business problems to be solved while developing and maintaining relationships with stakeholders, SMEs, and cross-functional teams.
  • Used Agile approaches, including Extreme Programming, Test-Driven Development, and Agile Scrum.
  • Provided knowledge and understanding of current and emerging trends within the analytics industry.
  • Participated in product redesigns and enhancements to determine how changes would be tracked and to suggest product direction based on data patterns.
  • Applied statistics to organize large datasets of both structured and unstructured data.
  • Used algorithms, data structures, and performance optimization.
  • Worked with applied statistics and applied mathematics.
  • Facilitated the data collection to analyze document data processes, scenarios, and information flow.
  • Determined data structures and their relations in supporting business objectives and provided useful data in reports.
  • Promoted enterprise-wide business intelligence by enabling report access in SAS BI Portal and on Tableau Server.

Data Scientist

Confidential, San Jose, CA

Responsibilities:

  • Strong experience in Software Development Life Cycle (SDLC) including Requirements Analysis, Design Specification and Testing as per cycle in both Waterfall and Agile methodologies.
  • Worked in Git development environment.
  • Experienced in Data Integration Validation and Data Quality controls for ETL process and Data Warehousing using MS Visual Studio, SSIS, SSAS, SSRS.
  • Adept at using SAS Enterprise suite, Python, and Big Data related technologies including knowledge in Hadoop, Hive, Sqoop, Oozie, Flume, Map-Reduce
  • Proficient in Predictive Modeling, Data Mining Methods, Factor Analysis, ANOVA, Hypothesis Testing, and Normal Distribution.
  • Preprocessed content (normalization, POS tagging, and parsing) in the realm of natural language processing, and performed named entity recognition, opinion mining, and event extraction.
  • Utilized spaCy for industrial strength natural language processing.
  • Solved problems related to social media analysis with NLP techniques involving supervised techniques with word-level features, sometimes combined with social media and social network metadata.
  • Transformed business requirements into analytical models, designed algorithms, built models, and developed data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
  • Professional competency in Statistical NLP /Machine Learning, especially Supervised Learning- Document classification, information extraction, and named entity recognition in-context.
  • Worked with Proofs of Concept (PoCs) and gap analysis; gathered necessary data for analysis from different sources and prepared it for exploration using data wrangling.
  • Implemented neural networks; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, and Principal Component Analysis, with good knowledge of Recommender Systems.
  • Strong SQL Server and Python programming skills, with experience working with functions.
  • Efficient in developing Logical and Physical Data models and organizing data per business requirements using Sybase PowerDesigner and ER/Studio in both OLTP and OLAP applications.
  • Experience in designing star schema and Snowflake schema for Data Warehouse and ODS architecture.
  • Experience and Technical proficiency in Designing, Data Modeling Online Applications, Solution Lead for Architecting Data Warehouse/Business Intelligence Applications.
  • Worked with languages like Python and Scala and software packages such as Stata, SAS and SPSS to develop neural network and cluster analysis.
  • Designed visualizations using Tableau software and publishing and presenting dashboards, Storyline on web and desktop platforms.
  • Used dplyr in R and Pandas in Python for performing Exploratory data analysis.
  • Used statistical programming languages like R and Python along with Big Data technologies including Hadoop 2, Hive, HDFS, MapReduce, and Spark.
  • Used Spark 2, Spark SQL, and PySpark.
  • Responsible for Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables and OLAP reporting.
  • Interacted with data from Hadoop for basic analysis and extraction of data in the infrastructure to provide data summarization.
  • Created dashboards using visualization tools such as Tableau, ggplot2, d3.js, Plotly, and R Shiny.
  • Worked with and extracted data from various database sources like Oracle, SQL Server, and DB2.
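The document-classification work described above can be sketched with a minimal multinomial Naive Bayes classifier (toy training data invented for illustration; the actual work used spaCy and supervised NLP pipelines as described):

```python
from collections import Counter
import math

# Illustrative multinomial Naive Bayes for document classification,
# with Laplace smoothing. Toy data invented for the sketch.
train = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("project status report", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def predict(text):
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        log_p = math.log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():
            # +1 Laplace smoothing avoids zero probability for unseen words
            log_p += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = log_p
    return max(scores, key=scores.get)

print(predict("cash prize"))      # spam
print(predict("status meeting"))  # ham
```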

Data Scientist

Confidential, McLean, VA

Responsibilities:

  • Identified and executed process improvements; hands-on in various technologies such as Oracle, Informatica, and Business Objects.
  • Developed large datasets from structured and unstructured data and performed data mining.
  • Partnered with modeling experts to develop data frame requirements for projects.
  • Performed Ad-hoc reporting/customer profiling, segmentation using R/Python.
  • Created statistical models for the collected data, with exploratory analysis and pre-processing, to provide conclusions and decision guides.
  • Programmed a utility in Python that used multiple packages (SciPy, NumPy, pandas).
  • Implemented Classification using supervised algorithms like Logistic Regression, Decision trees, KNN, Naive Bayes.
  • Validated the machine learning classifiers using ROC Curves and Lift Charts.
  • Extracted data from HDFS and prepared data for exploratory analysis using data munging.
  • Updated Python scripts to match training data with our database stored in AWS Cloud Search, so that we would be able to assign each document a response label for further classification.
  • Assisted with exploring the use of NLP to enhance the analytics frameworks and to gain a better understanding of their clients and their broader operational environments.
  • Identified use of NLP to verify the consistency between company reports and financial statements.
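The ROC-based validation mentioned above can be sketched without external libraries: AUC is the probability that a randomly chosen positive example is scored above a randomly chosen negative one (toy scores invented for illustration; real validation would use sklearn.metrics).

```python
# Illustrative ROC validation: AUC via pairwise ranking of
# positive vs. negative examples, counting ties as half a win.
def roc_auc(labels, scores):
    pairs = 0
    wins = 0.0
    for li, si in zip(labels, scores):
        for lj, sj in zip(labels, scores):
            if li == 1 and lj == 0:
                pairs += 1
                if si > sj:
                    wins += 1
                elif si == sj:
                    wins += 0.5
    return wins / pairs

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(labels, scores))  # 8 of 9 positive/negative pairs ranked correctly
```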

Data Analyst

Confidential, Atlanta, GA

Responsibilities:

  • Generated SQL scripts to retrieve data from multiple tables and to load data into UAT and production environment.
  • Handling large databases in Dev, UAT and Production environments.
  • Maintained high security while sharing data within internal and external teams.
  • Performed data analysis to measure individual performance and analyzed priorities of tickets to draw insights.
  • Developed reports and dashboards for daily/weekly/monthly performance metrics using SQL, MS Excel, MS PowerPoint, and SharePoint.
  • Recognized the connection between business operations and analytics to influence business strategies and solutions.
  • Recommended reporting and analytic views to the business; simplified the process, which helped clear backlogs and reduced SLA breaches by 36%.
  • Gathered and prepared data from multiple sources to support information analytics.
  • Monitored and tracked data reporting details for all data sources.
  • Documented requests for data enhancements.
  • Proactively reviewed data for areas requiring improvement, change, or infill.
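The SQL-driven performance reporting described above can be sketched with the standard-library sqlite3 module: an aggregation query of the kind that feeds weekly metrics dashboards (the ticket table and data are invented for illustration).

```python
import sqlite3

# Illustrative reporting query: weekly ticket counts per priority,
# the kind of aggregate behind daily/weekly/monthly metrics reports.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, priority TEXT, week TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(1, "high", "2020-W01"), (2, "high", "2020-W01"),
     (3, "low", "2020-W01"), (4, "high", "2020-W02")],
)

rows = conn.execute(
    "SELECT week, priority, COUNT(*) FROM tickets "
    "GROUP BY week, priority ORDER BY week, priority"
).fetchall()
for row in rows:
    print(row)
# ('2020-W01', 'high', 2)
# ('2020-W01', 'low', 1)
# ('2020-W02', 'high', 1)
```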
