
Data Science Consultant/Tableau Architect/Tableau Admin Resume


San Francisco, CA

SUMMARY:

  • Over 10 years of experience in Information Technology with an emphasis on Data Science and Machine Learning using Python, Scala, R, and SQL, and on Business Intelligence tools such as Tableau 2018.1, 10.5/10.1/9.3/9.1/8.1, SAP BO, and ETL
  • Involved in installation and configuration of Tableau Software on AWS cloud services, configuring VPC, ELB, EC2, SES, S3, and inbound rules.
  • Experience in statistical modelling, analyzing data and building data visualizations.
  • Expertise in extracting, manipulating, and validating data and applying statistical algorithms to unearth opportunities that affect the business.
  • Extensive experience in Statistics, Machine Learning, and Quantitative Analysis in the Services and Finance sectors.
  • Statistical and Machine Learning algorithms: linear/logistic regression, PCA, SVD, Naïve Bayes, Neural Networks, Support Vector Machines, Decision Trees, Random Forest, Bagging, Boosting, CART, k-means, Collaborative Filtering, Association Rules, Reinforcement Learning, Hidden Markov Models, MDPs, etc.
  • Extensive experience with Deep Learning artificial neural networks. Experience with diverse neural network architectures (Feedforward/MLP, RNN, LSTM, GRU, autoencoders, RBM, DBN, Convolutional, NLP, ensembles, seq2seq, etc.)
  • Experience with Reinforcement Learning: value and policy-gradient based methods, Q-learning, deep Q-learning, actor-critic methods, A3C, GPU optimized A3C.
  • Dimensionality reduction and feature engineering techniques using advanced machine learning algorithms like Clustering, Weight of Evidence (WOE), Information Value (IV), PCA etc.
  • Performed Exploratory Data Analysis, Predictive Modeling and segmentation to identify, analyze and interpret the patterns/trends in the data
  • Working knowledge of applying statistical models to identify patterns in business data, utilizing modelling techniques such as clustering and SVM.
  • Installed and Configured Tableau in AWS using ELB, VPC, Security Groups and S3.
  • Experience in communicating the data driven insights to sales, marketing and product teams.
  • Work effectively both independently and as a team player, maintaining clear communication with team members to complete projects on time.
  • Detailed expertise in algorithmic trading practices, exchanges, market microstructure, data feeds and technologies.
  • Proven track record of leveraging analytics and large amounts of data to drive significant business impact.
  • Extensive Tableau Experience in Enterprise Environment and Tableau Administrator.
  • Involved in Installation and configuration of Tableau Software.
  • Implemented permission rules to control access to specific content on a site
  • Created extracts, published data sources to Tableau Server, and refreshed extracts on Tableau Server from Tableau Desktop.
  • Experience in migrating Tableau from older versions to 2018.1.
  • Experience in Tableau Administration for configuration, adding users, managing licenses, and data connections, scheduling tasks, embedding views by integrating with other platforms.
  • Deployed Tableau Server in a clustered environment and performed upgrades
  • Built clusters corresponding to developed, developing, and underdeveloped countries based on the metrics used as the clustering criteria.
  • Experience in setting up different authentication modes SAML/SSL in Tableau server.
  • Experience in Integrating Tableau with SAP BI using WDC
  • Experience Implementing Proactive Tableau Environment Health Checks and Performance Threshold Alerting
  • Experience with Tableau application performance monitoring, capacity planning, and tuning for multiple members on multiple projects
  • Proficient in building compelling, interactive dashboards in Tableau that answer key business questions.
  • Worked on creation of users, groups, projects, workbooks and the appropriate permission sets for Tableau server logins and security checks.
  • Expertise in Data Analysis, Data Extraction, and Data Marts. Granted and revoked table access for Database Analysts, Business Analysts, Operational Analysts, and Project Managers
  • Experience in TABCMD & TABADMIN, Email alerts, scheduling, data backup, SAML and drivers setup activities

TECHNICAL SKILLS:

Programming & Scripting languages: R (packages: stats, zoo, Matrix, data.table, openssl), Python, SQL, C, C++, Java, JCL, COBOL, HTML, CSS, JSP, JavaScript, Scala

Database: SQL, MySQL, TSQL, MS Access, Oracle, Hive, MongoDB, Cassandra, PostgreSQL

Statistical Software: SPSS, R, SAS

Algorithms Skills: Machine Learning, Neural Networks, Deep Learning, NLP, Bayesian Learning, Optimization, Prediction, Pattern Identification, Data / Text mining, Regression, Logistic Regression, Bayesian Belief, Clustering, Classification, Statistical modeling

Data Science/Data Analysis Tools & Techniques: Generalized Linear Models, Logistic Regressions, Boxplots, K-Means, Clustering, SVN, PuTTY, WinSCP, Redmine (Bug Tracking, Documentation, Scrum), Neural networks, AI, Teradata, Tableau

Development Tools: R Studio, Notepad++, Python, Jupyter, Spyder IDE

Python Packages: NumPy, SciPy, Pandas, scikit-learn, Matplotlib, seaborn, statsmodels, Keras, TensorFlow, Theano, NLTK, Scrapy

Techniques: Machine learning, Regression, Clustering, Data mining

Machine Learning: Naïve Bayes, Decision trees, Regression models, Random Forests, Time-series, K-means

Cloud Technologies: AWS (EC2, S3, RDS, EBS,VPC, IAM, Security Groups), Microsoft Azure, Rackspace

Operating Systems: Windows, Linux, Unix, macOS, Red Hat

PROFESSIONAL EXPERIENCE:

Confidential, San Francisco, CA

Data Science Consultant/Tableau Architect/Tableau Admin

Responsibilities:

  • Ride-Share: Built an interpretable logistic regression model to discover the factors most predictive of customer retention. Presented possible actions to improve customer retention.
  • Fraud Detection: Developed an end-to-end data product to detect fraudulent events. Defined a fraudulent event. Built a gradient-boosted tree model to predict the likelihood of fraud. Used MongoDB to store incoming JSON data. Created a web app with Flask to display live predictions.
  • Retail Analytics: Designed a predictive modelling framework in Python to understand the likelihood of a customer
  • Leveraged disparate data sources that provide deep customer insight including online transactional data, webdata, payment and orders history and marketing campaigns exposure data.
  • Performed data discovery and built a stream that automatically retrieves data from a multitude of sources (SQL databases, external data such as social network data, user reviews) to generate KPIs using Tableau.
  • Used Pandas, NumPy, seaborn, SciPy, Matplotlib, Scikit-learn, NLTK in Python for developing various machine learning algorithms and utilized machine learning algorithms such as linear regression, multivariate regression, naive Bayes, Random Forests, K-means, & KNN for data analysis.
  • Installed and configured PostgreSQL databases and optimized postgresql.conf for the performance improvement.
  • Created multiple visualizations (scatter plots, bar charts, histograms, and line plots) in Python using Matplotlib, ggplot, and geoplotlib
  • Handled class imbalance using re-sampling techniques. Utilized Logistic/Polynomial regression in Python to identify the factors
  • Setup the Tableau in AWS EC2 environment with Failover mechanism
  • Upgraded the Tableau environment from 10.5.1 to 2018.1
  • Configured ELB load balancing
  • Created VPCs, security groups, and related networking resources
  • Installed Tableau in Multi-node and Enterprise Deployments
  • Extensively used Tabadmin and Tabcmd commands in creating backups and restoring backups of Tableau repository.
  • Implemented SSO/SSL/SAML on QA and Prod
  • Experience with BI Solutions using Data Appliances like Teradata, IBM Netezza, HP Vertica, Oracle Exadata and Big Data implementations using Hadoop Cloudera Impala clusters
  • Involved in publishing of various kinds of live, interactive data visualizations, dashboards, reports and workbooks from Tableau Desktop to Tableau servers.
  • Developed predictive statistical models using statistical analysis and other predictive modeling techniques.
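The fraud-detection product described above (a gradient-boosted tree scoring the likelihood of fraud) could be sketched roughly as follows; the data and feature semantics here are synthetic assumptions, not the original dataset:

```python
# Illustrative sketch only: synthetic events stand in for the real JSON data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))  # hypothetical event features (price, account age, ...)
# Hypothetical fraud label driven by the first two features plus noise
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Fraud likelihood per event: in the described pipeline, scores like these
# would be the live predictions surfaced by the Flask app.
proba = model.predict_proba(X_te)[:, 1]
```
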

Environment: Python, Scala, AWS, Tableau 2018.1 and 10.5 Server and Desktop, Hive, Presto, Redshift.

Confidential - San Francisco, CA

Data Scientist

Responsibilities:

  • Gathered, analyzed, documented and translated application requirements into data models, supported standardization of documentation and the adoption of standards and practices related to data and applications.
  • Queried and aggregated data from Amazon Redshift to get the sample dataset.
  • Identified patterns, data quality issues, and leveraged insights by communicating with BI team.
  • In the preprocessing phase, used Pandas to remove or replace all missing data, and applied feature engineering to eliminate unrelated features.
  • Balanced the dataset by over-sampling the minority label class and under-sampling the majority label class.
  • In the data exploration stage, used correlation analysis and graphical techniques to gain insights about the claim data.
  • Applied machine learning techniques to tap into new markets and new customers, and put forth recommendations to top management, which resulted in a 5% increase in the customer base and a 9% increase in the customer portfolio.
  • Built multi-layer neural networks to implement deep learning using TensorFlow and Keras.
  • Performed hyperparameter tuning via distributed cross-validation in Spark to speed up the computation process.
  • Exported trained models to Protobuf to be served by TensorFlow Serving, and performed integration with the client's application.
  • Analyzed customer master data for the identification of prospective business, to understand their business needs, built client relationships and explored opportunities for cross-selling of financial products. 60% (Increased from 40%) of customers availed more than 6 products.
  • Improved fraud prediction performance by using random forest and gradient boosting for feature selection with Python Scikit-learn.
  • Tested classification algorithms such as Logistic Regression, Gradient Boosting and Random Forest using Pandas and Scikit-learn and evaluated the performance.
  • Worked extensively with data governance team to maintain data models, Metadata and dictionaries.
  • Developed advanced models using multivariate regression, Logistic regression, Random forests, decision trees and clustering.
  • Applied predictive analysis and statistical modeling techniques to analyze customer behavior and offer customized products, reduce delinquency rate and default rate. Lead to fall in default rates from 5% to 2%.
  • Applied various machine learning algorithms and statistical modeling techniques, such as decision trees, regression models, neural networks, SVM, and clustering, to identify volume, using the scikit-learn package in Python and MATLAB.
  • Implemented, tuned and tested the model on AWS EC2 with the best algorithm and parameters.
  • Set up data preprocessing pipeline to guarantee the consistency between the training data and new coming data.
  • Deployed the model on AWS Lambda and collaborated with the development team to build the business solutions.
  • Collected feedback after deployment and retrained the model to improve performance.
  • Discovered flaws in the methodology being used to calculate weather peril zone relativities; designed and implemented a 3D algorithm based on k-means clustering and Monte Carlo methods.
  • Observed groups of customers being neglected by the pricing algorithm; used hierarchical clustering to improve customer segmentation and increase profits by 6%.
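The class-balancing step mentioned above (over-sampling the minority label class and under-sampling the majority) can be sketched as follows; the class proportions and target size are made-up assumptions:

```python
# Minimal sketch of over/under-sampling; the 10% minority rate and the
# target size of 300 per class are illustrative assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature": rng.normal(size=1000),
    "label": (rng.random(1000) < 0.1).astype(int),  # ~10% minority class
})

minority = df[df["label"] == 1]
majority = df[df["label"] == 0]

# Over-sample the minority (with replacement) and under-sample the majority
# down to a common per-class target, then shuffle the combined frame.
target = 300
balanced = pd.concat([
    minority.sample(target, replace=True, random_state=0),
    majority.sample(target, replace=False, random_state=0),
]).sample(frac=1, random_state=0)
```
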

Confidential - Atlanta, GA

Data Scientist

Responsibilities:

  • Identified and developed optimization products
  • Defined and executed the optimization product roadmap
  • Built data sets from multiple data sources, both internal and external.
  • Partnered with Information Technology to optimize and enhance the database environment for optimal efficiency and best practices
  • Provided technical leadership to Delta's business units
  • Leveraged emerging technologies and identified efficient and meaningful ways to disseminate data and analysis in order to satisfy the business's needs
  • Assisted with leadership of process improvement and project management engagements for both individual business units and cross-divisional initiatives.
  • Interfaced with business units to develop and maintain internal customer relationships
  • Practiced a safety-conscious work environment, supporting employee safety and well-being.

Confidential - Atlanta, GA

Lead Data Science and Operation Strategy (Consultant)

Responsibilities:

  • Analyzed customers' historic call data and other network metrics measured every two hours to find correlations between customer trouble calls and the truck rolls they trigger to individual customer homes or business locations, using advanced statistical R packages such as time series, GLM, and NNET; the delivered results made it possible to save $60,000.00 daily on truck rolls.
  • Worked with the team to identify gaps within the data while working on node congestion, one of the common issues in the telecommunications industry; detected discrepancies and made recommendations for future data usage in the analytics, improving data quality by 30%
  • Led presentations of prepared dashboards to upper management in Tableau and PowerPoint
  • Provided expert advice to the Sr. Director on the analytics results and the implementation of recommendations.
  • Helped build Hive and HBase tables in Hadoop to improve the data analytics process.
  • Performed NLP (text mining and analysis, topic modeling, n-grams, and emotion analysis) on survey data to understand customer reactions; related the findings to build an attrition model whose recommendations helped improve customer relationships by 25%.
  • Analytics skills and tools: Big data technologies: Hadoop ecosystem, HDFS, MapReduce, Hive, HiveQL, Pig, Sqoop, text analytics with NLP, NoSQL; Enterprise Architecture; Data Modeling with R, SAS, Alteryx, SPSS, Excel, Minitab, Access, Oracle ERP, Hadoop, Cloudera, and Python.

Confidential

Data Analyst

Responsibilities:

  • Collaborated with database engineers to implement ETL process, wrote and optimized SQL queries to perform data extraction and merging from SQL server database.
  • Gathered, analyzed, and translated business requirements, and communicated with other departments to collect client business requirements and assess available data.
  • Responsible for data cleaning, feature scaling, and feature engineering using NumPy and Pandas in Python.
  • Conducted Exploratory Data Analysis using Python Matplotlib and Seaborn to identify underlying patterns and correlation between features.
  • Used information value, principal component analysis, and chi-square feature selection techniques to identify predictive features.
  • Applied resampling methods such as the Synthetic Minority Over-sampling Technique (SMOTE) to balance the classes in large data sets.
  • Designed and implemented customized Linear regression model to predict the sales utilizing diverse sources of data to predict demand, risk and price elasticity.
  • Experimented with multiple classification algorithms, such as Logistic Regression, Support Vector Machine (SVM), Random Forest, AdaBoost, and Gradient Boosting, using Python Scikit-Learn, and evaluated performance on customer discount optimization across millions of customers.
  • Used F-score, AUC/ROC, confusion matrix, and RMSE to evaluate the performance of different models
  • Performed data visualization and designed dashboards with Tableau, and generated complex reports, including charts, summaries, and graphs, to interpret the findings for the team and stakeholders.
  • Used Keras for implementation and trained with a cyclic learning-rate schedule.
  • Resolved overfitting issues with batch normalization; dropout also helped overcome the issue.
  • Conducted in-depth analysis and predictive modelling to uncover hidden opportunities; communicate insights to the product, sales and marketing teams.
  • Built models using Python and PySpark to predict the probability of attendance for various campaigns and events.
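The evaluation step above (F-score, AUC/ROC, confusion matrix, RMSE) can be sketched as follows; the labels and scores are toy values:

```python
# Toy predictions standing in for a real model's output.
import numpy as np
from sklearn.metrics import (f1_score, roc_auc_score,
                             confusion_matrix, mean_squared_error)

y_true  = np.array([0, 0, 1, 1, 1, 0, 1, 0])          # ground-truth labels
y_pred  = np.array([0, 1, 1, 1, 0, 0, 1, 0])          # hard predictions
y_score = np.array([0.1, 0.6, 0.8, 0.9, 0.4, 0.2, 0.7, 0.3])  # probabilities

f1   = f1_score(y_true, y_pred)                 # harmonic mean of P and R
auc  = roc_auc_score(y_true, y_score)           # ranking quality of scores
cm   = confusion_matrix(y_true, y_pred)         # rows = true class
rmse = np.sqrt(mean_squared_error(y_true, y_score))  # error of raw scores
```
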
