Lead Data Scientist/Consultant Resume
Tulsa, OK
PROFESSIONAL SUMMARY:
- Total of 14+ years of IT experience with a proven track record of developing and implementing large-scale algorithms in predictive analytics and EPM that have significantly impacted business revenues and user experience across a diverse set of industries: Financial Services, Banking, Retail, Healthcare, Oil & Gas, and Utilities.
- 5+ years of experience in predictive analytics spanning algorithm development, AI research projects, data modeling, statistical methods, and data visualization tools. Proficient in solving real-time problems by applying machine learning, deep learning, NLP, and statistical methods. Hands-on experience in Artificial Intelligence and Machine Learning, including statistical methods, Reinforcement Learning, and Deep Learning (RNN/LSTM, Autoencoder, and CNN). Deep familiarity with statistical/ML and deep learning models and methods (time-series analysis, regression analysis, experimental design, hypothesis testing, etc.). Used the Amazon SageMaker platform (AWS) to train, test, and deploy deep neural network/ML models.
- Led successful implementation cycles of full Hyperion environments and multiple Planning/Essbase applications (including EIS/Essbase Studio applications) and HFM, as well as supporting tools such as Financial Reports (FR), MSBI, and ETL.
TECHNICAL SKILLS:
Advanced analytics skills: SciKit-Learn, Pandas, NumPy, SciPy, Keras, Matplotlib, Seaborn, TensorFlow, OpenCV, Grid search, PyTorch, NLP
Classic Algorithms: Linear/Logistic Regression, CART, SVM, KNN, LDA/QDA, Naive Bayes, Random Forest, Gradient Boosting, K-means Clustering, Hierarchical Clustering, PCA, Feature Selection, Collaborative Filtering, Neural Networks, Ensemble Methods, and NLP.
Deep learning: CNN, RNN (LSTM, GRU), CUDA, OpenCV, Keras, Autoencoders and Decoders, Reinforcement Learning.
Models Experience: Classification, Regression, Time Series (Forecasting), Market-Basket Analysis, NLP, Convolutional and Recurrent Neural Networks.
Stats Methods: Over- and Under-Sampling, Dimensionality Reduction, Statistical Features, Hypothesis Testing, Bayesian Statistics, Probability Distributions.
EPM Skills: Planning, Essbase, Reporting and EPM Infrastructure.
Reporting Tools: Power BI, OBIEE and Financial Reporting.
Programming/Scripting Languages: Python, SQL, Shell scripting, PowerShell, JavaScript, C, C++, HTML
Big Data/ETL Tools: Hadoop, Hive, Sqoop, Spark, Informatica and IBM DataStage.
Database: Oracle 9i, 10g, 11g, 12c, Teradata, MySQL, DB2
Operating Systems: Windows, Linux, UNIX, GPU/CUDA
Cloud experience: Azure, AWS SageMaker, AWS S3, AWS EMR
PROFESSIONAL EXPERIENCE:
Confidential, Tulsa, OK
Lead Data Scientist/Consultant
Responsibilities:
- Created a classification model for pipeline leakage detection with 98% accuracy on true positives and false negatives, saving approximately $1M by avoiding regular routine checks (see the classification sketch after this list).
- Created a classification model for meter-freeze detection with 97% accuracy, saving almost $1.5M per annum.
- Created a WHS (Williams drivers/workers health and safety) predictive model to predict the number of accidents per month by field office.
- Created a supply-chain forecasting model to forecast monthly sales versus supply.
- Created expense and revenue forecasts using SARIMA and RNN time-series models with 97% accuracy, saving almost $2M per annum (see the SARIMA sketch after this list).
- Worked on anomaly detection problems by applying different sampling techniques.
- Led Wipro AI research engagements such as building commercial AI chatbots and other confidential AI research projects.
- Grouped commonly purchased items using market-basket analysis with unsupervised techniques (see the market-basket sketch after this list).
- Performed data profiling and preliminary data analysis; handled anomalies such as missing values, duplicates, and outliers, and imputed or removed irrelevant data.
- Removed outliers using proximity-distance and density-based techniques.
- Used RMSE, confusion matrix, ROC, cross-validation, and A/B testing to evaluate model performance in both simulated and real-world environments (illustrated in the classification sketch after this list).
- Developed interactive dashboards and created various ad hoc reports for users in Power BI by connecting to various data sources.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for various user-behavior predictions.
- Segmented customers based on demographics using K-means clustering and used classification techniques, including Random Forest and Logistic Regression, to quantify the likelihood of each user referring (see the segmentation sketch after this list).
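
A minimal sketch of the classification-and-evaluation workflow referenced above (leakage/meter-freeze detection and the confusion-matrix/ROC/cross-validation checks); the file sensor_readings.csv, the leak_flag label, and the model settings are hypothetical placeholders, not the actual project data or tuned model.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import confusion_matrix, roc_auc_score

# Hypothetical sensor extract; column names are placeholders
df = pd.read_csv("sensor_readings.csv")
X, y = df.drop(columns=["leak_flag"]), df["leak_flag"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=300, random_state=42)
clf.fit(X_train, y_train)

# Evaluate with cross-validation, a confusion matrix, and ROC AUC
print(cross_val_score(clf, X_train, y_train, cv=5, scoring="roc_auc").mean())
print(confusion_matrix(y_test, clf.predict(X_test)))
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))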
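
A hedged illustration of the SARIMA portion of the expense/revenue forecasting bullet; the monthly_expenses.csv series and the (p, d, q)(P, D, Q, s) orders are illustrative assumptions, not the production configuration.

import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical monthly expense series
expenses = pd.read_csv("monthly_expenses.csv", index_col="month",
                       parse_dates=True)["amount"]

model = SARIMAX(expenses,
                order=(1, 1, 1),              # non-seasonal (p, d, q), assumed
                seasonal_order=(1, 1, 1, 12)  # yearly seasonality on monthly data
                ).fit(disp=False)

forecast = model.get_forecast(steps=12)       # forecast the next 12 months
print(forecast.predicted_mean)
print(forecast.conf_int())                    # interval usable for budget ranges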
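
A small market-basket sketch using the apriori/association-rules approach mentioned above; the mlxtend library, the toy transactions, and the thresholds are assumptions for illustration only.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Made-up item baskets standing in for real transaction data
transactions = [["valve", "gasket", "sealant"],
                ["valve", "gasket"],
                ["meter", "regulator"]]

te = TransactionEncoder()
basket = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

frequent = apriori(basket, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="lift", min_threshold=1.0)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])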
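
A minimal segmentation-and-scoring sketch corresponding to the K-means/Logistic Regression bullet; customers.csv, its columns, and the cluster count are hypothetical.

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

customers = pd.read_csv("customers.csv")      # hypothetical customer extract
demo = StandardScaler().fit_transform(
    customers[["age", "income", "tenure_months"]])

# Demographic segments via K-means
customers["segment"] = KMeans(n_clusters=5, random_state=42).fit_predict(demo)

# Likelihood of each customer referring, with the segment as an extra feature
X = customers[["age", "income", "tenure_months", "segment"]]
y = customers["referred"]
likelihood = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]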
Confidential, NY
Lead Data Scientist
Responsibilities:
- Managed the team in integrating Hadoop, Python, and Tableau.
- Designed, developed, and maintained predictive models to identify fraudulent customers claiming benefits.
- Designed and analyzed claims analytics on structured and unstructured data for decision support.
- Used machine learning techniques such as CART, Random Forest, and deep learning to predict fraud on large, noisy datasets.
- Built a fraudulent-claims detection model using logistic regression and other machine learning techniques, tuned to reduce the false-positive rate (see the sketch after this list).
- Pulled data from the big data platform through the Hue IDE; developed, tested, and deployed the model in Python.
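
A minimal sketch of the logistic-regression fraud model with a decision threshold adjusted to cut false positives; claims.csv, its columns, and the 0.90 precision target are assumptions, not the actual claims data or tuned values.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve

claims = pd.read_csv("claims.csv")            # hypothetical claims extract
X, y = claims.drop(columns=["is_fraud"]), claims["is_fraud"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

model = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)

# Choose the first threshold whose precision reaches 0.90, trading a little
# recall for a lower false-positive rate
prec, rec, thr = precision_recall_curve(y_te, model.predict_proba(X_te)[:, 1])
threshold = thr[(prec[:-1] >= 0.90).argmax()]
flagged = model.predict_proba(X_te)[:, 1] >= threshold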
Confidential, PA
Senior Data Scientist
Responsibilities:
- Convinced management of the need for open source during meetings, presenting an overview of machine learning with diagrams; built and implemented open-source machine learning, big data, Python, and AWS tools.
- Used classification, regression, and clustering algorithms in Python for demand forecasting, text analytics on log data, and segregating fixed-income data into groups via clustering. Visualized client data with histograms and bar plots using Pandas, Seaborn, and Matplotlib. Used TensorFlow and Keras for deep learning (RNN).
- Used Python to connect to SQL Server, Oracle, and Sybase to import data from RDBMS sources for building machine learning algorithms. Used the ETL tool Pentaho for data cleaning, pipelining in Spark, storing data in Hive, and Power BI visualization.
- Performed data cleaning and data analysis using Python and AWS S3, connecting S3 to AWS EMR and EC2 to run PySpark/Spark. Connected AWS S3 to IDEs such as Jupyter and Zeppelin to run PySpark and Python scripts, using the Boto3 Python SDK (see the sketch after this list).
- Used the ETL tool Pentaho to migrate data from RDBMS sources to cloud services; created Hive tables, ingested data into them, and used PySpark for ETL on AWS EMR, storing results in S3. Used Git, SourceTree, and Bitbucket to store and share scripts.
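
A brief sketch of the S3-to-PySpark flow described above; the bucket name, object key, and output path are illustrative assumptions.

import boto3
from pyspark.sql import SparkSession

# Pull a raw file from S3 with the Boto3 SDK (bucket/key are placeholders)
s3 = boto3.client("s3")
s3.download_file("analytics-bucket", "raw/fixed_income.csv", "/tmp/fixed_income.csv")

# Clean the data with PySpark on EMR
spark = SparkSession.builder.appName("cleaning").getOrCreate()
df = spark.read.csv("/tmp/fixed_income.csv", header=True, inferSchema=True)
cleaned = df.dropna().dropDuplicates()

# Write the curated result back to S3 as Parquet
cleaned.write.mode("overwrite").parquet("s3://analytics-bucket/curated/fixed_income/")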
Confidential
Principal Consultant
Responsibilities:
- Involved in requirement gathering and analysis; built interface tables and an SSIS package to process HR data into the Hyperion environment.
- Worked on complex SQL and views to process metadata into EPMA and built hierarchies in the Shared Library for the Planning application.
- Re-engineered the Workforce application per new requirements and created new web forms and business rules.
Confidential
Senior Technical Analyst
Responsibilities:
- Responsible for various aspects of infrastructure architecture setup: project planning, requirements gathering, scope management, change control, execution oversight, issue management, and risk mitigation for all On Demand customers.
- Involved in day-to-day activities such as RFC execution and working escalated SRs; owned each SR/RFC through to resolution, updating it along the way and following all processes to avoid defects.
Confidential
Application Specialist
Responsibilities:
- Implemented a complex, very large data model (cube) for Williams' financial systems, providing key insights.
- Implemented a custom integration process from cloud applications to on-premises applications.
- Re-engineered the operational planning expense cube and fine-tuned dense/sparse settings, caches, outline formulas, and calc scripts.
- Created complex SQL and PL/SQL code to extract data/metadata from Oracle Financials applications to the DWH. Designed star schemas for financial, HR, and sales data marts.
Confidential
Associate Consultant
Responsibilities:
- Implemented slowly changing dimensions (SCD Type 2) in Informatica for all dimensions and then implemented change data capture during data loads.
- Developed mappings in Informatica to load facts and dimensions from various sources into the data warehouse, using transformations such as Source Qualifier, Java, Expression, Lookup, Aggregator, Update Strategy, and Joiner.
- Created customized data load rule files in Essbase to build Account, Product, Year, Cost Center, and Operating hierarchies.
Confidential
Associate
Responsibilities:
- Created custom dimensions, data sources, applications, and web forms for Planning.
- Moved metadata from the Planning application to the Essbase cube and updated web forms offline using Hyperion Smart View.
- Created complex business rules for the Planning applications.
- Developed MaxL scripts to automate loading and swapping the ASO cubes.