- Consolidating structured and unstructured data from disparate data sources to build data products and eventually deploy or integrate the solution with other applications in production systems.
- Tuning, fitting and optimizing models; feature engineering and deploying models via web services or BI tools.
- Designing, architecting & developing analytic solutions to solve business problems.
- Implementing these solutions and training analysts and developers to use the analytic tools and/or data products.
- Exploring data sets to generate patterns using time series models, cross-correlations with time lags, signal processing & filtering techniques, spectral analysis and correlograms.
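As an illustration of the lagged cross-correlation technique mentioned above, a minimal pure-Python sketch (the series here are toy data, not from any engagement; in practice NumPy/statsmodels would be used):

```python
# Minimal sketch: Pearson cross-correlation of two series at integer time lags.
# Toy data only; illustrative of the technique, not a production implementation.

def cross_corr(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag]."""
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = (sum((a - mx) ** 2 for a in xs) *
           sum((b - my) ** 2 for b in ys)) ** 0.5
    return num / den if den else 0.0

# y is a copy of x delayed by 2 steps, so the correlation peaks at lag = 2.
x = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
y = [9, 9] + x[:-2]
best = max(range(-3, 4), key=lambda k: cross_corr(x, y, k))
```

Scanning a range of lags and keeping the argmax is how a lead/lag relationship between two signals is typically detected.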
- Configuring Hive on HDFS, Hortonworks and Impala for big data analytics.
- Using Python scripts to read & move flat files, and scraping web APIs to extract data in JSON & XML.
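A minimal sketch of the JSON & XML extraction described above, using only the Python standard library (the payloads are hypothetical stand-ins for real API responses):

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads standing in for real API responses.
json_payload = '{"readings": [{"id": 1, "value": 4.2}, {"id": 2, "value": 3.9}]}'
xml_payload = ('<readings><reading id="1">4.2</reading>'
               '<reading id="2">3.9</reading></readings>')

# JSON: parse into native dicts/lists, then pull out the fields of interest.
data = json.loads(json_payload)
json_values = [r["value"] for r in data["readings"]]

# XML: parse into an element tree and extract the text of each element.
root = ET.fromstring(xml_payload)
xml_values = [float(el.text) for el in root.iter("reading")]
```

The same parsing step applies whether the payload comes from a file, an HTTP response body, or a message queue.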
- Cleaning, rescaling and munging data using R, SAS & Python; dimensionality reduction & normalization.
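Two of the rescaling steps mentioned above can be sketched in plain Python (illustrative only; in practice this was done with pandas/scikit-learn or their R equivalents):

```python
# Minimal sketch of two common rescaling/normalization steps.

def min_max_scale(xs):
    """Rescale values to the [0, 1] range."""
    lo, hi = min(xs), max(xs)
    span = hi - lo
    return [(x - lo) / span for x in xs] if span else [0.0] * len(xs)

def z_score(xs):
    """Standardize values to zero mean and unit variance."""
    m = sum(xs) / len(xs)
    sd = (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - m) / sd for x in xs]

raw = [10.0, 20.0, 30.0, 40.0]  # toy data
scaled = min_max_scale(raw)     # endpoints map to 0.0 and 1.0
z = z_score(raw)                # standardized values sum to ~0
```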
- Building machine learning models using several techniques to solve business problems, including k-nearest neighbors, Naïve Bayes, simple linear regression, multiple regression, logistic regression, decision trees & neural networks for supervised learning.
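The simplest of these techniques, k-nearest neighbors, can be sketched in a few lines of plain Python (toy clusters, not real client data):

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points.
    `train` is a list of ((x, y), label) pairs; distance is Euclidean."""
    dist = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    nearest = sorted(train, key=lambda pt: dist(pt[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set: two well-separated clusters.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
pred_a = knn_predict(train, (0.5, 0.5))
pred_b = knn_predict(train, (5.5, 5.5))
```

Production work used library implementations (e.g. scikit-learn) rather than hand-rolled code; this sketch only shows the idea behind the algorithm.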
- Utilizing clustering algorithms for unsupervised learning and reinforcement learning models.
- Refactoring MapReduce code in Python and Java to optimize query performance.
- Using enhanced techniques, including random forests, to reduce overfitting or underfitting.
- Leveraging natural language processing, recommender systems & network analysis to build custom data products and solve complex business problems.
- Utilizing NumPy, pandas, scikit-learn and other Python and R libraries to build data products and decision engines.
- Using statistical and probability techniques, linear algebra, gradient descent, hypothesis testing and inference to build machine learning models.
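Gradient descent, one of the techniques listed above, can be illustrated by fitting a simple linear regression on toy data (a minimal sketch; real models used library solvers):

```python
# Minimal sketch: fitting y = w*x + b by batch gradient descent
# on mean squared error. Toy data, illustrative only.

def fit_line(xs, ys, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Data generated from y = 2x + 1, so the fit should recover w ≈ 2, b ≈ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
```

The learning rate and step count here are tuned to this toy example; on real data both require validation.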
- Building data products by extracting data from IoT devices and performing complex event processing for decision engines, predictive models and live streaming monitoring dashboards.
- Rapid prototyping of products & solutions after analyzing the business problem and iterating through simulations of possible solutions; thinking outside the box and challenging status-quo problem-solving techniques.
- Critical thinking & domain knowledge in several industries including: Banking, Finance, Telecommunications, Oil & Gas, Pharmaceuticals, Healthcare technology, Supply chain & logistics, Marketing, Consulting, Professional Services and Information Technology.
Tools: VMware ESX, VSphere, Windows Server 2008, Windows XP to Windows 8 Troubleshooting, Active Directory, and Microsoft Exchange & Microsoft Office 2003 - 2013. Configuring LAN/WAN technologies, VLANs, DHCP, DNS, VPN, Routing, Switching and RAS.
Pentaho Business Analytics / Platfora / Zoomdata / Arcadia Data
- Building use cases, proofs of concept and opportunity assessments for big data business intelligence tools including Platfora, Zoomdata, Arcadia, Looker, Bime, Burst, IBM Watson Analytics, Pentaho Business Analytics, Spotfire, Tableau, Qlik Sense, Power BI & similar tools.
- Leveraging new big data business intelligence visual analytics tools designed to handle big data with simplicity and near-real-time analytic capabilities.
- Connecting to Hadoop (Hive & Pig) and Kafka subscriptions integrated with Spark & Storm to deliver real-time dashboards and data products.
- Utilizing appropriate charts and custom visualizations in dashboards and reports to answer business questions and tell user stories.
- Installing, configuring and deploying the Pentaho server cluster.
- Setting up shared data sources (structured & unstructured) on the Pentaho repository.
- Building reports and dashboards using Dashboard Designer, Analyzer, interactive reports and dashboard reports.
- Setting up folders in the repository library and performing environment-to-environment migrations.
- Performing all administrative tasks: access management, LDAP integration, ticket resolution and troubleshooting.
- Training developers to use the tools and features and setting up a Pentaho Business Analytics Center of Excellence.
- Embedding dashboards in other applications using APIs.
- Designing & implementing self-service and data discovery business intelligence and analytics semantic layer.
- Proficient with advanced custom expressions.
- Scripting OVER, Statistical, Spatial, Ranking, Math, Logic, Binning, Conversion, Date & Time, Text and Property functions.
- Creating and registering custom data functions in R and/or S-Plus. Running SAS and MATLAB scripts through Spotfire.
- Embedding Spotfire analytic data products in web portals, websites and SharePoint.
- Data visualization best practices, interactive dashboards and guided analytics.
- Advanced geomapping configuration with multilayer integration.
- Building elements, joins, procedures & information links in the information model layer.
- Proficiency with library administration, Information Designer and Administration Manager.
- Managing licenses and setting up and configuring Spotfire users (5,000+).
- Deploying clusters of Spotfire servers, including Web Player, load-balancing, Automation Services, Statistical Services and Spotfire servers.
- Upgrading, patching, monitoring, LDAP integration, installations and all other administration duties.
- Configuring Spotfire Application Data Services for multiple environments (Composite) including Netezza, Teradata, MS PDW, Oracle and other big data sources.
- Scheduling updates and Automation Services XML jobs.
- Server monitoring using Geneos and Splunk, and creating alerts for exceptions.
- Spotfire infrastructure design and platform configuration for clusters and high availability of Web Player servers.
- Configuring the Spotfire information model by designing and developing back-end stored procedures & complex queries for Spotfire Server information links.
- Deploying web server-based dashboards in Spotfire 4.x, 5.x, 6.x and 7.x.
- Domain knowledge of Spotfire Center of Excellence standards.
- Training Analysts, building knowledge base, and documentation.
- Connecting to data; using the Tableau interface to create powerful, effective visualizations.
- Creating calculations including string manipulation, advanced arithmetic, custom aggregations and ratios, date math, logic statements and quick table calculations.
- Building advanced chart types and visualizations: bar-in-bar charts, bullet graphs, box-and-whisker plots and Pareto charts.
- Building complex calculations to manipulate data and using statistical techniques to analyze it.
- Using parameters and input controls to give users control over certain values.
- Implementing advanced geographic mapping techniques, and using custom images and geocoding to build spatial visualizations of non-geographic data.
- Combining data sources by joining multiple tables and using data blending.
- Maximizing visualization performance through correct use of the data engine, extracts and connection methods.
- Building better dashboards using guided analytics techniques, interactive dashboard design and visual best practices; implementing efficiency tips and tricks.
- Using groups, bins, hierarchies, sorts, sets, and filters to create focused and effective visualizations.
- Using Measure Names and Measure Values fields to create visualizations with multiple measures and dimensions.
- Tableau administration: monitoring Tableau Servers externally via Windows Server tools and internally using the Tableau administrative views workbook.
- Tableau directory service integration using Active Directory.
- Full utilization of TABCMD and TABADMIN for server-side auditing and administration of groups, users, sites and server status through batch scripting or simple command-prompt commands.
- Implementing end-to-end workbook, database and trusted security strategies by leveraging ISMEMBEROF, FULLNAME, etc. to achieve the level of security required by end users.
- Implementing user or core-based licensing strategies.
- Design and deployment of high availability, failover, and distributed Tableau configurations across multiple domains.
- SAML implementation wif reverse proxy.
- TabJolt for Tableau server stress testing.
- Configuration of Tableau VIZQL, Background, and Data Engine processes to adjust for performance across distributed configurations.
- Using F5 load-balancing for very active Tableau servers.
- Dashboard performance recording and tuning; DirectX and browser compatibility tuning to improve user desktop performance.
- Full utilization of Tableau's built-in PostgreSQL repository database to monitor user, browser and server activity.
Principal Data Scientist / Business Intelligence Consultant-SME
- Antares Capital: Deployed Azure machine learning models on data residing in a Hadoop cluster. Designed, architected and implemented the Spotfire environment and developed operational insights dashboards for go-live. Migrated reports from Power BI to Spotfire across different lines of business. Set up information links and data connections to disparate data sources including Hive, SQL Server, Redshift, Teradata and other external data sources.
- HMS: Migrated reports from on-premises BI tools (Tableau, Crystal Reports, Power BI) to Platfora and Spotfire on a hybrid cloud platform. Designed and enhanced reports and data products to match the data-intensive, data-driven nature of the business.
- McKesson: Implemented a contract and provider predictive analytics data product. This included a series of data products tracking contract and provider metrics, KPIs, regression analysis with contract coverage, clinical studies, SLAs, drug manifestations, etc., using the Pentaho Analytics platform & Spotfire.
- Bristol Myers Squibb: Designed and developed operational business insights and data products for real-world research data on Pentaho Business Analytics and a hybrid cloud platform. End-to-end implementation of the product: data modeling, data mining, ETL, database environment deployment and administration, data wrangling, and data product consolidation and release with modeling algorithms for predictive and prescriptive analytics. Ensured compliance and governance with HIPAA and other healthcare industry codes such as ICD-9/10, CPT, etc.
- Monsanto: Designed and developed a global supply chain perfect-order data product and other finance, marketing & research data products. Redesigned and configured the Spotfire Server and Web Player Server clustered high-availability platform to accommodate a 1,000+ user base across North America, South America, Europe, Asia & Africa. This included a hybrid architecture scaling up and scaling out with cloud integration.
- Redesigned and developed a global supply chain dashboard and guided analytics operational intelligence tool using Spotfire.
- Designed a new reporting data model from the data warehouse (Teradata), other big data sources (structured & unstructured) and a data virtualization layer (Spotfire information model) to feed the Spotfire analytic data products and dashboards with optimum performance.
- Set up and configured Automation Services for data refreshes and migrations to different Spotfire server environments, and deployed models to production.
- Trained developers and analysts, documented the knowledge base, and set up a Spotfire BI standards Center of Excellence with well-documented best-practice use cases.
- Collaborated with the data science team to integrate machine learning algorithms and predictive analytics data products within the Spotfire platform.
- Planned and executed Spotfire patches and upgrades for 7.0 & 7.5.
- ConocoPhillips: Designed and developed real-time operations analytic data products and dashboards for production engineering support, finance, water disposal, production optimization and consolidated data products using Spotfire for the business unit.
- USDA: Designed and developed a predictive analytics data product using Tableau for managing business loans to small businesses, farmers, etc. The product included weather data, census data, loan status data, etc.
- Frontier Communications: Redesigned and architected the Spotfire platform, integrating it with AS/400 systems via a data virtualization platform.
- Bank of America: Integrated SAS & R into Spotfire & Tableau to build dashboards and analytic products for different lines of business. Developed automated workflows and end-to-end data products for the governance and compliance department. Built robust sanity-check workflows and automated tests using Veritas Data Insight and other custom in-house tools for file usage audits and security. Also worked with offshore administration teams maintaining Spotfire and Tableau server environments. Pioneer member of the team that established the ETL and BI Center of Excellence. Transitioned to the data science team, building machine-learning algorithms, text mining and other data products to support different LOBs and enable the bank to meet SLAs in client agreements. Served as production DBA for Oracle and MS SQL reporting databases.