- Passionate data scientist with over 8 years of overall experience, including 3 years in statistical solutions, mathematical modelling, and machine learning.
- Experienced at analyzing, cleaning, and wrangling big data, along with implementing, testing, and maintaining engineering pipelines that handle huge data sets.
- Proficient in Statistical Modeling and Machine Learning techniques in Forecasting / Predictive Analytics, Segmentation methodologies, Regression based models, Hypothesis testing.
- Involved in entire data science project life cycle, including Data Acquisition, Data Cleansing, Data Manipulation, Feature Engineering, Modelling, Evaluation, Optimization, Testing and Deployment
- Experienced in creating dashboards that handle huge datasets using visualization tools like matplotlib, ggplot2, and d3.js.
- Proficient with Python, including NumPy, scikit-learn, Pandas, Matplotlib, and Seaborn.
- Hands-on with data analytics, OLAP reporting, and machine learning models such as linear and logistic regression, decision trees, random forest, SVM, k-nearest neighbors, clustering (k-means and hierarchical), and Bayesian methods.
- Strong experience in wrangling very large data sets to understand and identify patterns.
- Proficient in SQL, with hands-on experience in Spark SQL, HiveQL, Pig, and PySpark.
- Experienced in Amazon Web Services (AWS) such as EC2, EMR, S3, RDS, and Redshift, and Confidential Azure Services such as Web and Mobile Apps, Azure Functions, Storage, Cognitive Services, and Data Lake.
- Proficient in requirement gathering, writing, analysis, estimation, use case review, scenario preparation, test planning and strategy decision making.
- Have worked in an agile fashion for 7+ years.
- Strong business sense and the ability to communicate data insights to both technical and non-technical stakeholders.
Programming: Python, Java, Scala, .Net, Spark, Spark SQL
Business Intelligence Tools: Power BI, MS Excel - Analytical Solver
Database & Skills: SQL Server, MongoDB, PostgreSQL, DB2, Oracle 11g
Machine Learning: Linear, Lasso, Ridge Regression, Logistic Regression, Random Forest, Support Vector Machine, Neural Networks, Decision Tree, Time Series Analysis, PCA & Filtering, Analysis, Clustering, Text Mining, Collaborative Filtering, Naïve Bayes
Confidential, Seattle, WA
Data Scientist II
- Responsible for data aggregation, data pre-processing, missing value imputation, data enrichment, and end-user data quality.
- Developed efficient and intelligent data pipelines using Spark for Bing.com.
- Architected the data pipeline for ingesting and processing multi-million records for Bing.
- Used machine learning techniques to conflate data from various providers and implement multi-level rankers to serve search queries.
- Developed several dashboards that handle huge data sets for business reports and quality control of Bing local data.
- Implemented text-mining from user feedback to automatically enrich data.
- Designed and deployed a classifier for identifying junk data and businesses closed in the real world, with accuracy of > 90%.
- Raised the quality of top entities across 4 markets from ~80% to 98% in just 14 months, which directly contributed to an increase in customer satisfaction.
- Produced high quality datasets that are used to train many new models.
- Hands-on experience deploying cloud services and REST API endpoints on Azure.
- Mentored new hires and managed a team of vendors.
Confidential, Boston, MA
Data Scientist / Sr. Software Engineer
- Architected cloud integration and intelligent decision-making capabilities of Pega.
- Designed and developed various core functional modules of the rules engine and exposed APIs for numerous features used by many Pega products.
- Implemented ‘next best action’ prediction capabilities for CRM offering.
- Replaced legacy object pools with self-learning, usage-aware alternatives that fixed a critical, long-standing performance issue in the product.
- Tracked and fixed performance bottlenecks in the core engine.
- Responsible for the API layer of the core engine, which is exposed to all outer layers of the product and to integrations with external applications.
- Built OCR capabilities to integrate and onboard legacy systems.
- Added support for integrating external systems through modern protocols like OAuth.
- Participated in functional design discussions to produce the most efficient algorithms.
- Provided engineering support and solutions to real-time problems faced by the production systems of the world’s leading corporations.
- Spearheaded the creation of a framework for a comprehensive live test suite that runs customer test suites against the latest updates of the product. This doubled customer adoption of new versions of the software.
- Led a team of engineers and enabled them to achieve their goals.
- Presented the platform capabilities to business partners and customers at tech conferences.
- Successfully completed the ‘Career Accelerator Program’, a year-long commitment of rapid learning and delivery.
- Received the ‘Star Performer’ award multiple times, including once for three consecutive quarters.
Confidential, Plano, TX
- Developed various functional modules of the digital platform using ATG framework.
- Responsible for the “My Account” section of the website, which handles millions of user accounts and their orders.
- Developed a store locator with integrated Google Maps for hundreds of the retailer’s stores.
- Built a comprehensive test suite that targets maximum code coverage and enables continuous deployment.
- Participated in technical walkthroughs and code reviews of other team members’ components, test plans, and results, and helped them address gaps.
- Conducted and participated in knowledge sharing sessions with the team.
Environment: Java, ATG, J2EE, jQuery, AJAX, JAXB, WebSphere Application Server, SoapUI, SQL Developer
- Designed the data model and created DB access objects.
- Developed the front end and business middle tier of the application.
- Integrated with third-party data sources, and on-boarded projects from legacy systems.
- Developed the application from scratch and deployed it to production within 6 months; it is used by critical projects company-wide.
Environment: J2EE, JSP, EJB, jQuery, AJAX, and MS-SQL