Data Science, Machine Learning Resume
SUMMARY
- Analytical and resourceful professional with experience working with data and applying machine learning, predictive analytics, mathematical modeling, statistical inference, operations research and engineering. Experienced with a range of information technology tools on projects across multiple industries.
- Experience in applying data analysis methodologies to extract knowledge from operational data to support decision making for enterprise business development. Business development interests include identifying business problems that can be solved with advanced analytics by extracting valuable information from data to inform and optimize decision making.
- Process includes decomposing the business problem into pieces; using artificial intelligence, machine learning, and other methodologies (operations research, statistics, mathematical modeling, databases, etc.) to process and analyze the data and extract valuable information; and identifying business solutions and value.
- Research interests include: public health, environmental sciences, infectious diseases and chronic diseases; water resources, water quality, solid waste management, infrastructure resilience, and energy consumption and efficiency; optimized shipping and transportation, predictive maintenance.
- Demonstrated abilities in the following competencies:
- Experience with Big Data platforms and analytics: AWS, Hadoop, Spark.
- Strong mathematical background and analytical and problem-solving skills.
- Experience in the implementation of machine learning algorithms and data mining systems.
- Knowledge of Operations Research techniques to analyze complex decision and solution spaces in multi-objective problems to find optimal and near-optimal solutions.
- Knowledge of Statistical Inference to leverage synergy between statistics and data mining and transform data mining into knowledge discovery.
- Use of statistical analysis, simulation and optimization to create predictive models that extend the benefits of descriptive analysis and statistics.
- Knowledge of System Dynamics to explain the characteristics and interdependencies of the different components in a business process and to analyze business dynamics.
PROFESSIONAL EXPERIENCE
Data Science, Machine Learning
Confidential
Responsibilities:
- Led data science team to build a model for Confidential to predict the levels of mercury in premium products (naphtha and natural gas) before shipment to help the trader target the right customers and negotiate better deals. Built a predictive model for the target variable, mercury contamination in tanks containing natural gas. Available predictors included variables from upstream sources and from the refinery.
- Built a recommender system for a brick-and-mortar retailer using collaborative filtering. Item-to-item datasets were created by calculating the frequency of Point-of-Sale (POS) transactions containing any two items. Advanced matrix decomposition techniques included Stochastic Singular Value Decomposition algorithms.
- Created models to help a retailer re-rank results returned from queries entered by users looking for a specific product. The retailer used commonly available products to index retail store product catalogs and to search for an item based on user-entered queries. Documents returned from a query are ranked by scores generated by functions of features derived from the terms (e.g. TF-IDF and derivations). The new models added to this functionality by providing new and interesting recommendations.
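The item-to-item counting step described for the recommender system above can be sketched as follows. This is a minimal illustration only, not the implementation used on the project: the function name `item_pair_counts` and the sample baskets are hypothetical, and the matrix decomposition step is not shown.

```python
from collections import Counter
from itertools import combinations


def item_pair_counts(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair has a single canonical key.
        items = sorted(set(basket))
        for pair in combinations(items, 2):
            counts[pair] += 1
    return counts


# Hypothetical POS baskets, invented for illustration only.
transactions = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "eggs"],
    ["milk", "milk", "eggs"],
]
pair_counts = item_pair_counts(transactions)
```

Counts of this kind form a sparse item-to-item matrix, which is the sort of input a matrix decomposition method would then factorize.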
Operations Research
Confidential
Responsibilities:
- Led data science team to help Confidential optimize decision making in product trading. Tradeco is responsible for trading Confidential products after domestic demand has been met. This mandate includes identifying trade opportunities and arranging shipment schedules that optimize profitability. A preliminary design was identified with many variables (e.g. ship speed/cost calculations, port waiting times, third-party leasing prices, tanker capacities).
- Led data science efforts to improve route planning in Maersk Inland procurement. Created a solution that allows a Maersk Inland procurement representative to input the origin and destination of a route and receive a list of intermodal transport routing options, optimized by price, for a set of user-specified criteria. Helped Maersk address non-integrated systems, dispatcher inefficiency caused by manual processes for selecting routes and updating systems, and lack of transparency in existing routing systems.
- Developed a quantitative framework to aid decision making for integrated municipal solid waste (MSW) management. The MSW Decision Support Tool (MSW DST) uses a flexible framework to represent many site-specific issues and considerations, and it incorporates both cost and environmental objectives. The environmental objectives are defined in terms of life cycle inventories of energy and emissions (carbon monoxide, fossil- and biomass-derived carbon dioxide, etc.) associated with MSW management strategies. The MSW DST has an optimization module that selects the best group of technology options based on cost or environmental criteria. Developed the mathematical model that forms the optimization core of the tool. This model is a set of linear equations that serves as the input to a linear programming (LP) solver. The first version of the MSW DST uses the commercial LP solver CPLEX. The MSW DST includes multi-objective optimization capabilities for choosing among competing objective functions such as cost, environmental emissions, energy consumption and recycling levels.
- Led research team to create a probability-based model to help a brick-and-mortar retailer associate purchases with events triggered in a mobile application while the customer was shopping in-store. The model established a relationship between the mobile app events and the actual purchases, helping answer the question of whether a given mobile app event is related to a Point-of-Sale event.
- Performed experimental designs for a project to study the metalorganic chemical vapor deposition (MOCVD) growth of ultrathin (≤ 300 nm) Bi0.1Sb1.9Te3 films. This semiconductor was being explored for its potential as an efficient thermoelectric material for refrigeration or portable power generation. In this project, a series of statistically designed experiments (SDEs) was conducted to optimize the Bi0.1Sb1.9Te3 growth process. In these experiments, several material properties were tracked (mobility, resistivity, carrier concentration, Seebeck coefficient, film thickness, growth rate, elemental percentage, surface morphology); however, the primary focus was power factor (measured in µW/K²-cm). Tools used included cluster analysis and multivariate regression models.
- Led team to develop a reliability model and accelerated life testing (ALT) methodologies for predicting the lifetime of integrated solid-state lighting (SSL) luminaires. Standard SSL test methods were used to evaluate luminaire and component performance. An initial reliability model based on assumed Arrhenius behavior was built. Temperature, relative humidity, particle ingress, and atmospheric pollutant exposure were used as environmental stressors. Statistically valid sample sets, based on the assumed Arrhenius behavior, were used in this initial study. A multivariable reliability model was then created from statistical analysis of the experimental data, based on measured statistical distributions of experimental values and degradation factors; it includes the effects of environmental stressors on system reliability. Designed and developed reliability models, Kaplan-Meier models, and Arrhenius models. Used multivariate regression models, statistical learning and cluster analysis.
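The assumed Arrhenius behavior mentioned above is commonly expressed as an acceleration factor between a stress temperature and a use temperature. The sketch below shows that standard textbook relation only; it is not the project's reliability model, and the function name and the 0.7 eV activation energy are hypothetical values chosen for illustration.

```python
import math

# Boltzmann constant in eV/K.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Return how much faster a thermally activated failure mechanism
    proceeds at the stress temperature than at the use temperature.

    ea_ev:      activation energy in eV (an assumed input)
    t_use_c:    use temperature in degrees Celsius
    t_stress_c: stress (test) temperature in degrees Celsius
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Hypothetical example: 0.7 eV activation energy, 25 C use, 85 C stress.
af = arrhenius_acceleration_factor(0.7, 25.0, 85.0)
```

Under this relation, a test duration at the stress temperature multiplied by the acceleration factor gives the equivalent duration at the use temperature, provided the same failure mechanism dominates at both temperatures.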
TECHNICAL SKILLS
- Programming languages: Java, Visual Basic .NET, Python, R
- RDBMS: Oracle, MySQL, PostgreSQL, SQL Server, MS Access
- HPC/Big Data: AWS, Hadoop, Spark, MATLAB