- 16+ years of overall software development experience in quantitative data analysis, machine learning, and artificial intelligence across the finance, technology, retail, and product-based industries
- Almost 6 years of experience in Data Science and Artificial Intelligence
- Experience extracting and manipulating large data sets
- Experience translating unstructured, structured, and semi-structured data into actionable insights for decision-making
- Knowledge of an array of research methods and the ability to analyze/interpret complex research findings
- Experience writing code (SQL, Python) to extract, clean, analyze and visualize data
- Statistical knowledge of regression models, chi-squared tests, forecasting, and predictive modeling
- Strong understanding of the Agile development process
- Sound knowledge of building and deploying models and applications in the Confidential Cloud environment
- Good working experience with BigQuery, BigQuery ML, Kubeflow Pipelines, SQL, and Python in a GCP environment
- Python/Java developer with demonstrated capacity to implement innovative web and business solutions
- Experience in Business Intelligence tools
- Written and verbal communication skills with the ability to translate complex issues for a wide audience
- Data Science
- Machine Learning
- Data Analysis
- Decision Science
- Model Development & Validation
- Digital Analytics
- Portfolio Management
- Financial Analytics
- Data Visualization
- Python Programming
Data Scientist/ML Engineer
Confidential, Dallas, TX
- Prioritized data from a variety of sources and created a PPE time series forecasting model that supports data-driven decisions at the county level in California. Used data-generated insights to deliver expedited, accurate PPE demand forecasts based on a 14-day moving average. Worked with team members to set up the GCP/Python environment at the client site
- Used state-of-the-art Confidential Cloud services and analytics platform, including a “what-if” visualization dashboard.
- Established the GCP infrastructure and analytics foundation, including GCP infrastructure support, ingestion pipelines for a subset of public and PPE-specific data sets, an ML-driven predictive forecasting model for the 14-day sliding average of PPE demand, and BigQuery tables and views for visualizations and dashboards
- Wrote code to run data pipelines, aggregate customer-provided data (including customer inventory/supply chain data and COVID-19 forecasts), generate predictions, and write all data and PPE demand predictions to a flat file or a queryable format ingestible by the customer
Environment: Confidential Cloud Platform, dbt, BigQuery ML, Kubeflow Pipelines, Python, SQL, Time Series Forecasting (ARIMA, Facebook Prophet, XGBoost), univariate time series analysis on BQML
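The 14-day moving average mentioned in this role can be sketched roughly as follows. This is a minimal illustration only, assuming daily demand counts held in a plain Python list; the function names `moving_average` and `naive_forecast` are hypothetical and are not taken from the project code, which ran on BigQuery ML and Kubeflow Pipelines.

```python
from collections import deque
from typing import Iterable, List


def moving_average(values: Iterable[float], window: int = 14) -> List[float]:
    """Return the trailing moving average for every full window in `values`."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)  # holds the most recent `window` observations
    running = 0.0
    averages = []
    for value in values:
        if len(buf) == window:
            running -= buf[0]  # subtract the value about to leave the window
        buf.append(value)      # deque(maxlen=...) evicts the oldest item itself
        running += value
        if len(buf) == window:
            averages.append(running / window)
    return averages


def naive_forecast(values: Iterable[float], window: int = 14) -> float:
    """Forecast the next value as the mean of the last `window` observations."""
    recent = list(values)[-window:]
    if len(recent) < window:
        raise ValueError("not enough history for the requested window")
    return sum(recent) / window
```

A baseline like this is often used as a reference point when comparing richer models such as ARIMA or Prophet.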
Confidential, Edison, NJ
- Collaborated with data engineers and the operations team to implement the ETL process; wrote and optimized SQL queries to extract data that fit the analytical requirements.
- Performed data analysis using HiveQL to retrieve data from the Hadoop cluster and SQL to retrieve data from Redshift.
- Explored and analyzed customer-specific features using Spark SQL.
- Performed univariate and multivariate analysis to identify underlying patterns in the data and associations between variables.
- Performed data imputation and feature engineering using the scikit-learn package in Python.
- Used Python 3.x (NumPy, SciPy, pandas, scikit-learn, seaborn) and Spark 2.0 (PySpark, MLlib) to develop a variety of models and algorithms for analytic purposes.
- Analyzed customer consumption behavior and assessed customer value with RFM analysis; applied customer segmentation with clustering algorithms such as K-means and hierarchical clustering.
- Built regression models, including Lasso, Ridge, SVR, and XGBoost, to predict customer lifetime value.
- Built classification models, including logistic regression, SVM, decision tree, and random forest, to predict customer churn.
- Used F-score, ROC/AUC, confusion matrix, MAE, and RMSE to evaluate the performance of different models.
- Designed and implemented recommender systems that used collaborative filtering techniques to recommend courses to different customers, and deployed them to an AWS EMR cluster.
- Designed rich data visualizations to present data in human-readable form with Tableau and Matplotlib.
Environment: AWS Redshift, EC2, EMR, S3, Spark (PySpark, MLlib, Spark SQL), Python 3.x (scikit-learn/SciPy/NumPy/pandas/Matplotlib/seaborn), Tableau Desktop (9.x/10.x), Tableau Server (9.x/10.x), Machine Learning (Regressions, KNN, SVM, Decision Tree, Random Forest, XGBoost, Collaborative filtering, Ensemble), Git 2.x, Agile/SCRUM
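The RFM (recency, frequency, monetary) analysis referenced in this role can be illustrated with a small sketch. It assumes transactions arrive as `(customer_id, purchase_date, amount)` tuples; the `RFM` type, the `rfm_table` function, and the sample data layout are hypothetical and do not reproduce the original Spark/Redshift implementation.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, NamedTuple, Tuple


class RFM(NamedTuple):
    recency_days: int  # days since the customer's most recent purchase
    frequency: int     # number of purchases
    monetary: float    # total amount spent


def rfm_table(
    transactions: Iterable[Tuple[str, date, float]], as_of: date
) -> Dict[str, RFM]:
    """Aggregate raw transactions into one RFM record per customer."""
    last_purchase: Dict[str, date] = {}
    counts: Dict[str, int] = defaultdict(int)
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, purchased_on, amount in transactions:
        previous = last_purchase.get(customer_id)
        if previous is None or purchased_on > previous:
            last_purchase[customer_id] = purchased_on
        counts[customer_id] += 1
        totals[customer_id] += amount
    return {
        cid: RFM((as_of - last_purchase[cid]).days, counts[cid], totals[cid])
        for cid in last_purchase
    }
```

The three values per customer could then serve as input features for a segmentation step such as K-means, typically after scaling.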
Senior Software Engineer
- Responsible for business growth through statistical, data-driven strategy development
- Aligned analytical efforts with the company's strategic objective of building a customer-centric solution
- Analytical efforts in 2015 generated 300K+ incremental accounts and first-year asset inflows of $100M in the first half of the year, with a focus on affluent customers
- Improved ROI by helping the company shift from traditional paid media to a digital marketing focus
- Provided expert technical guidance to the team on modeling, data analysis, and programming
- Managed and mined the direct-to-client customer database for one of the large retail clients for insights to drive better business decisions and optimize marketing ROI
- Maintained the customer database in partnership with the IT/Platforms Director, ensuring database growth (new customer information captured through online and DTC channels), completeness, and accuracy
- Used statistical analysis and modeling to uncover insights from large, complex data sets (i.e., customer database, web and channel analytics), tell data stories, and implement new data-driven strategies
- Analyzed customer data to drive in-depth understanding of demographic, purchase, and behavioral trends and profiles to inform key business and marketing decisions
- Built customer cohorts based on key purchase behavior, relevant product affinity, and other relevant groupings to inform targeting programs
- Collaborated with the Digital Marketing team to develop segments, cohorts, and audiences to effectively target relevant groups for prospecting and maximize RFM (recency, frequency, monetary spend)
- Developed and executed customer retention strategies to enhance customer growth and loyalty
- Managed website tagging and tracking to inform ecommerce performance and strategies
- Partnered with IT/Platforms Director to develop relevant business and marketing intelligence reporting for ecommerce properties
- Developed and supervised development of models focused on pricing and offer optimization
- Designed experiments to measure prospect behavior across different products and offers
- Supported model implementation, validation, and scoring of prospects for marketing campaigns
- Worked closely with portfolio managers to mitigate risk, implemented new segmentation tools that generated superior risk-adjusted performance, and collaborated with the risk management team on testing
- Architected an XML-based, highly customizable product
- Led product integration with WebSphere Portal
- Oversaw product customization for a number of Fortune 100 clients
- Adopted Extreme Programming and other agile practices to cut the product release cycle by 20% and cost by 15%
- Advised the VP of Engineering on offshoring processes and distributed development practices
- Architected a Struts/EJB-based three-tier architecture
- Led product integration with WebSphere Portal
- Designed database model for multi-tenant deployment
- Advocated the use of numerous productivity improvement tools such as CruiseControl, Clover, and Code checker
- Designed the stock trading interface for mobile devices
- Authored design patterns on a leading IT information portal
- Led a team of two for application performance tuning and analysis
- Designed a custom adapter to interface open systems with the mainframe
- Designed a proprietary versioning system to maintain service backward compatibility
- Developed a reporting system to measure and monitor customer service request time