- A collaborative, results-driven Big Data and Data Science professional with hands-on experience using algorithms to solve complex business problems, support strategic initiatives, and deliver real-time analytics.
- Strong hands-on coding and application architecture experience using Hadoop 2.x/1.x and AWS, including integration with Big Data platforms, data warehouses, AWS DevOps, and Lambda and Kappa architectures.
- 10+ years of experience in MPP-based parallel computing application architecture, coding, testing, and deployment using IBM DataStage.
- Architected and implemented a 100+ TB healthcare analytics data warehouse.
- Proven strengths in problem analysis, critical and analytical thinking, and mentoring development teams to realize project objectives.
- Cloudera CDH 5.4/4.0
- Apache Hadoop 2.x/1.x
- Spark MLlib
- Spark Streaming
- SAS Visual Analytics
- OpenStack
- Amazon EC2
Sr. Bigdata Architect & Data Scientist Consultant
- Consulted with clients on Big Data architecture and implementation for data analytics needs, extended existing data warehouses with Big Data, and advocated implementation strategy. Designed and implemented AWS-based cloud strategies and solutions for US Government and research analytics clients.
- Big Data Architect experience in research analytics using Spark, Hive, Pig, Kafka, Elasticsearch (ELK), Cassandra, Parquet, Avro, Python, and MapReduce for ETL transformations and metrics generation in batch and real time using Lambda architecture. Performed data analysis and data munging using R, MATLAB, and GraphX, with visualizations to develop new-generation metrics.
- Implemented machine-learning-based research and literature data products, including a recommendation engine that suggests research articles based on user search history, using Spark MLlib and Mahout, with knowledge graphs for research network nodes. Techniques included tf-idf, Levenshtein distance, SVM, LDA, neural networks, Bayesian methods, regression, classification and clustering, deep learning, supervised learning, and decision tree algorithms/maps, with circular clusters and word clouds.
- Designed Big Data applications and led DevOps with a continuous development and integration methodology in AWS using Jenkins, Netflix Asgard, and Netflix OSS components in a Hadoop ecosystem environment. Developed and performance-tuned Spark, Spark SQL, and Hive code, including tuning for SAS Visual Analytics. Integrated SAS with Cloudera in the AWS cloud.
- Ran POCs, developed migration strategies, and implemented the migration of an existing data warehouse from Oracle Exadata to Hadoop, Elasticsearch, and Kudu.
- Installed and configured a multi-node, fully distributed Cloudera 5.x/4.x Hadoop cluster with HDFS for analyzing multi-format structured and unstructured data from different sources.
Sr. Technical Manager/Architect
- Managed a team of BI and ETL developers and data modelers. Handled staff hiring, performance management, resource management, mentoring, risk management, and stakeholder management, and directed teams toward program objectives and goals. Managed data migration and feed development to enable the platform for rapid custom development cycles.
- Responsible for retail business analytics, user activity metrics, and marketing and campaign applications.
- Responsible for ETL application architecture, leveraging reusable building blocks and end-to-end integration. Developed ETL best practices and handled capacity planning, data analysis, data quality and remediation, dimensional data modeling, ETL performance tuning, and ETL audit.
- Delivered data analytics and ETL projects on time, on budget, and with a high degree of quality and consistency.
- Managed stakeholders to maintain project integrity and quality and to reduce delivery risks.
Sr. ETL, MDM & DW Architect, Manager/Tech Lead Consultant
- Consulted, led ETL teams, and managed projects in ETL, DW, and BI solution architecture and implementation. Provided consulting services in data integration and quality, data modeling, high-performance parallel computing, and replication technologies.
- Extensively used regression, classification, and hypothesis testing techniques for revenue forecasting. Trusted advisor to client management on solution architecture, project management, and time and budget estimates.
- Developed geocoding and spatial analysis solutions. Used SAS for mortgage data analysis and data extraction to support predictive data modeling and reporting at FDIC.
- Designed and implemented business KPIs and dashboards, integrating with cloud-based solutions (Salesforce.com).
- Consulted on MDM tool selection for client requirements, data analysis, data quality, data profiling, data governance and MDM implementation.
- Worked extensively on master data modeling, master data de-duplication, name and address data standardization, taxonomies, ontology, MDM solution integration, and data syndication. Implemented data governance with security access control rules, along with ETL and MDM best practices.
- Implemented MDM-based enterprise business analytics using SAS and R.
- Hands-on architect role in implementations of IBM MDM/WCC, DataStage, and Information Server products as per project needs.
- Developed SAP MDM to IBM MDM migration solution architecture and strategy for a top automobile customer.
Sr. ETL & MDM Architect/Developer Consultant
- Implemented SAS and ETL applications for yield management for Avis & Budget rentals using optimizations and pricing models based on demand factors, customer retention rules, and market conditions. Extensively used regression, classification, hypothesis testing, and inference techniques for revenue forecasting. Extensively used SAS, DataStage, SAS JMP, shell scripting, DB2, Oracle, Teradata, and flat files.
- Coded extensively using parallel computing algorithms, UNIX shell and AWK scripting, and Perl scripting on SMP and MPP systems, clusters, and grids.
- Hands-on experience in data analysis, data profiling, data cleansing, and data harmonization using SQL, SAS, SAS/STAT, SAS/GRAPH, SAS/ACCESS, SAS/SQL, shell scripting, and DataStage tools.
- Extensively used SAS versions 7 and 8.1 (Base SAS, SAS/STAT, SAS/SQL, SAS/ACCESS) and CICS as a hands-on developer.