- Data Scientist with 7+ years of progressive experience in data analytics, statistical modeling, visualization and machine learning across the healthcare, banking, retail and insurance industries.
- Excellent capability in collaboration, quick learning and adaptation.
- Experience in Data mining with large datasets of Structured and Unstructured data, Data Acquisition, Data Validation, Predictive modeling, Data Visualization.
- Experience in data integration, profiling, validation, cleansing, transformation and visualization using R and Python.
- Theoretical foundations and practical hands-on projects related to (i) supervised learning (linear and logistic regression, boosted decision trees, Support Vector Machines, neural networks, NLP), (ii) unsupervised learning (clustering, dimensionality reduction, recommender systems), (iii) probability & statistics, experiment analysis, confidence intervals, A/B testing.
- Experience in data migration from heterogeneous sources, including Oracle, to MS SQL Server.
- Hands-on experience in design, management and visualization of databases using Oracle, MySQL and SQL Server.
- Completed an intensive hands-on Data Analytics boot camp spanning statistics to programming, including data engineering, data visualization, machine learning and programming in R and SQL.
- Experience in data analytics and predictive analysis such as classification, regression and recommender systems.
- Experience in Descriptive Analysis Problems like Frequent Pattern Mining, Clustering, and Outlier Detection.
- Good exposure to SAS analytics.
- Good exposure to deep learning with TensorFlow in Python.
- Good knowledge of Natural Language Processing (NLP) and of time series analysis and forecasting using ARIMA models in Python and R.
- Good knowledge of Tableau and Power BI for interactive data visualizations.
- Good exposure to creating pivot tables and charts in Excel.
- Experience in developing custom reports and different types of tabular reports, matrix reports, ad hoc reports and distributed reports in multiple formats using SQL Server Reporting Services (SSRS).
- Excellent database administration (DBA) skills, including user authorization and creation of databases, tables, indexes and backups.
- Involved in various projects related to Data Modeling, System/Data Analysis, Design and Development for both OLTP and Data warehousing environments.
- Facilitated data requirement meetings with business and technical stakeholders and resolved conflicts to drive decisions.
- Experienced in Requirement Analysis, Test Design, Test Preparation, Test Execution, Defect Management, and Management Reporting.
- Strong understanding of data modeling in data warehouse environments, such as star schema and snowflake schema.
- Hands-on experience developing PL/SQL, stored procedures and UNIX scripts.
- Experienced in BI Reporting tools such as Business Objects, COGNOS and Oracle Discoverer.
Databases: MS SQL 2016, 2012, 2008 R2/2008, 2005, 2000, MySQL, Oracle SQL/11g, Teradata, MS-Access, Hive
Programming Languages: R, Python, SAS Enterprise Guide 5.1, SAS Enterprise Miner 12.1, SAS Sentiment Analysis, SAS Forecast Studio, SAS/Base
Data Analysis Tools: SAS, MS-Excel, MicroStrategy, Tableau, Hadoop
Reporting Tools: MS Office (Word/Excel/PowerPoint/Visio), Tableau, Crystal Reports XI, Business Intelligence, SSRS, Business Objects 5.x/6.x, Cognos 7.0/6.0
Operating System: Windows 8/XP/NT/2000/2003/2007, UNIX and Linux
BI Tools: Tableau, Tableau Server, Tableau Reader, SAP Business Objects, OBIEE, QlikView, SAP Business Intelligence, Amazon Redshift, Azure Data Warehouse
Confidential, Danville, PA
Sr. Data Analyst/ Scientist
- Performed data profiling to learn about behavior across features such as traffic pattern, location, and date and time.
- Extracted data from Hive tables by writing efficient Hive queries.
- Performed preliminary data analysis using descriptive statistics and handled anomalies such as removing duplicates and imputing missing values.
- Involved in complete Software Development Life Cycle (SDLC) process by analyzing business requirements and understanding the functional work flow of information from source systems to destination systems.
- Completed a highly immersive Data Science program covering data manipulation and visualization, web scraping, machine learning, Python programming, SQL, Unix commands, NoSQL and Hadoop.
- Designed and developed data models, reports and dashboards.
- Supported ETL mapping activities as per established business requirements.
- Implemented validation of source data systems along with data mappings.
- Managed assigned project deliverables and prepared dimensional data marts.
- Generated reports to business users with Business Objects, SAS and other applications.
- Analyzed and reviewed reporting data, data warehouses and ETL solutions.
- Formulated specifications and requirements for reporting and analytical purposes.
- Prepared and implemented test cases, data architecture and file formats.
- Led a group effort to document pre-business requirements for current AS-IS and future TO-BE scenarios, helping orient new Project Managers to processes and methodologies.
- Extracted data from Oracle, SQL Server and DB2 using Informatica and loaded it into a single data warehouse repository.
- Created entity relationship diagrams and multidimensional data models, reports and diagrams for marketing.
- Developed and implemented measurements for various marketing campaigns.
- Created a net loss model which analyzes the impact of loan defaults and recoveries.
- Created a relational model and dimensional model for online services such as online banking and automated bill pay.
- Applied data cleansing/data scrubbing techniques to ensure consistency amongst data sets.
- Developed logical data models and physical data models using ER-Studio.
- Developed and deployed an in-house MS Access database, building data tables, queries and reports for project and program reviews, element updates, and alignment of FDR and Unisys reports, capturing common artifacts for the TO-BE scenario.
- Participated in the entire project life cycle, including initial situational gap analysis, change management, operation design procedures, overall project scoping and QA testing procedures.
- Facilitated user interviews, workshops and task analysis to gather requirements.
Environment: Python 2.x, R, HDFS, Hadoop 2.3, Hive, Linux, Spark, IBM SPSS, Tableau Desktop, SQL Server 2012, Microsoft Excel, Matlab, Spark SQL, PySpark.
Confidential - Manhattan, NY
Sr. Data Scientist/ Analyst
- Conducted a hybrid of Hierarchical and K-means Cluster Analysis using IBM SPSS and identified meaningful segments of customers through a discovery approach.
- Categorized comments from different social networking sites into positive and negative clusters using sentiment analysis and text analytics.
- Analyzed traffic patterns by calculating autocorrelation at different time lags.
- Ensured that models had a low false positive rate in text classification and sentiment analysis of unstructured and semi-structured data.
- Used Principal Component Analysis in feature engineering to analyze high-dimensional data.
- Created and designed reports that use gathered metrics to draw logical conclusions about past and future behavior.
- Applied multinomial logistic regression, random forest, decision tree and SVM models to classify whether a package would be delivered on time on a new route.
- Implemented models such as Logistic Regression, Random Forest and Gradient-Boosted Trees to predict whether a given die would pass or fail the test.
- Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from the Oracle database, and used ETL for data transformation.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Developed and updated charts and graphics and prepared SQL queries.
- Created and updated Business Requirement documents and Functional Requirement documents.
- Conducted walk-throughs and requirements gathering meetings and worked with stakeholders to ensure the right solution.
- Evaluated information gathered from multiple sources, reconciled conflicts and decomposed high-level requirements.
- Communicated with different stakeholders, business groups and user groups to elicit and analyze business requirements.
- Wrote Business Specification Documents, Business Rules, Activity Diagrams, State Diagrams and Sequence Diagrams.
- Executed test cases in Rational Test Manager to validate the business requirements.
- Reported defects together with the QA team and followed up with the development lead to resolve outstanding defects.
- Created Process Flow diagrams using Microsoft Visio.
- Managed Project Milestones through the entirety of the SDLC process.
- Assisted the Project Manager and QA Team, coordinating between team members according to the business requirements.
- Involved in project planning and coordination.
- Analyzed User Requirement Document, Business Requirement Document (BRD), Technical Requirement Specification and Functional Requirement Specification (FRS).
Environment: MS Excel, Agile Scrum, SQL, IBM Cognos, IBM SPSS Modeler, Tableau, Python 2.x, R, HDFS, Hadoop 2.3, Hive, Linux, Spark, IBM SPSS, Tableau Desktop, SQL Server 2012, Matlab, Spark SQL
Confidential, Minneapolis, MN
Sr. Data Analyst
- Performed data exploration to analyze patterns and select features using Python SciPy.
- Built Factor Analysis and Cluster Analysis models using Python SciPy to classify customers into different target groups.
- Built predictive models including Support Vector Machine, Random Forests and Naïve Bayes Classifier using Python Scikit-Learn to predict the personalized product choice for each client.
- Designed and implemented cross-validation and statistical tests, including hypothesis testing, ANOVA and autocorrelation, to verify the models’ significance.
- Designed an A/B experiment for testing the business performance of the new recommendation system.
- Worked on loading the data from MySQL to HBase where necessary using Sqoop.
- Developed Hive queries for Analysis across different banners.
- Developed Hive queries for analysis, and exported the result set from Hive to MySQL using Sqoop after processing the data.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Created HBase tables to store data in various formats coming from different portfolios.
- Worked on improving performance of existing Pig and Hive Queries.
- Used SQL, Excel and several marketing/web analytics tools (Google Analytics, AdWords) to complete business and marketing analysis and assessment.
- Used Git 2.x for version control with the data engineering team and data scientist colleagues.
- Used agile methodology and the SCRUM process for project development.
- Worked on multiple projects with different business units, including insurance actuaries.
- Gathered reporting requirements from the business analysts.
- Gathered sales analysis report prototypes from business analysts in different business units and participated in JAD sessions to discuss reporting needs.
- Partnered with the QA team to create test scenarios, test plans and test cases based on user requirements and functional specifications.
- Performed manual front-end testing to check the functionality of the different modules.
- Interacted extensively with end users on requirement gathering, analysis and documentation.
- Worked with key departments to analyze areas and discuss the primary model requirements for the project.
- Documented methodology, data reports and model results and shared them with the project team and manager.
- Performed sorting and merging techniques on the input data sets for Data Preparation.
- Acted as a liaison between business units, QA and technology teams.
- Used SAS, including SAS procedures, macros and other SAS applications, to analyze large datasets on projects involving data mining, predictive modeling and segmentation techniques.
Environment: HDFS, Hive, Sqoop, Pig, Oozie, Amazon Web Services (AWS), Python 3.x, Tableau 9.x, SQL, SAS Base 9, MS Excel, MS Project, SVM, A/B experiment, Agile/SCRUM.
Confidential, Louisville, KY
Sr. Data/ Business Analyst
- Created Business process models for the Confidential for various Applications using MS Visio.
- Supported the CareRadius application, including on-call duties and interaction with business users.
- Worked on CareRadius risk analytics and data report creation.
- Worked on the CareRadius platform for regulatory compliance.
- Coordinated the upgrade of Transaction Set 834 to HIPAA compliance.
- Analyzed the design of existing transaction sets and modified them to ensure HIPAA compliance.
- Entered budgets and allocation details for projects in Clarity.
- Created detailed guide describing data file content and structure for business partners; created quick reference guide for HIPAA EDI ASC X12 834 transaction set for HCR enrollment.
- Studied the business and its functions through communication with Business Analysts.
- Analyzed the existing database for performance and suggested methods to redesign the model for improving the performance of the system.
- Supported ad-hoc, standard reporting and production projects.
- Designed and implemented many standard processes that are maintained and run on a scheduled basis.
- Created reports using MS Access and Excel, applying filters to retrieve the best results.
- Developed stored procedures, SQL joins and SQL queries to retrieve data for analysis, and exported the data to CSV and Excel files.
- Developed Data mapping specifications to create and execute detailed system test plans. The data mapping specifies what data will be extracted from an internal data warehouse, transformed and sent to an external entity.
- Analyzed business requirements, system requirements and data mapping requirement specifications, and communicated them effectively to developers.
- Documented functional requirements and supplementary requirements in Quality Center.
- Set up test environments and defined the range of functionality to be tested per technical specifications.
- Tested complex ETL mappings and sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Wrote and executed unit, system, integration and UAT scripts in data warehouse projects.
- Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, data warehouse and data mart reporting systems in accordance with requirements.
- Troubleshot test scripts, SQL queries, ETL jobs and data warehouse/data mart/data store models.
- Responsible for different Data mapping activities from Source systems to Teradata.
- Developed SQL scripts, stored procedures and views for data processing, maintenance and other database operations.
- Performed SQL tuning, optimized the database and created technical documents.
- Imported Excel sheets, CSV and delimited data, and ODBC-compliant data sources into the Oracle database, using advanced Excel features, for data extraction, data processing and business needs.
- Designed and optimized SQL queries, pass-through queries, make-table queries and joins in MS Access 2003 and exported the data to the Oracle database server.
- Developed views, functions, procedures and packages using PL/SQL and SQL to transform data from the source staging area to the target staging area.
- Wrote SQL queries to perform Data Validation and Data Integrity testing.
Environment: SAS Enterprise Guide 4.0, OLAP Cube Studio, Stored Processes, SAS Management Console, 8.1, MS Project, Teradata SQL Assistant, Enterprise Miner, MS Access, MS Excel, SQL, SPSS, PL/SQL, Oracle 10g.
Data/ Business Analyst
- Performed Data Analysis and Data validation by writing complex SQL queries using TOAD against the ORACLE database.
- Facilitated Joint Application Development (JAD) sessions to resolve differences between business requirements and technical design.
- Created Business Requirements and converted them into detailed Use Cases, Report Specifications and Non-Functional Requirements.
- Developed business flow diagrams, Activity/State diagrams and Sequence diagrams using MS Visio so that developers and other stakeholders can understand the business process.
- Created and executed SQL queries (using Rapid SQL and MS Access) and scripts to validate data movement and generate expected results for UAT.
- Worked with the developers on resolving the reported bugs and various technical issues.
- Helped prepare a simple yet detailed user manual for the application, aimed at novice users.
- Prepared use case documents and used MS Visio to create UML diagrams, including use case, activity and class diagrams, to capture business process flows and workflows, helping the development and quality assurance teams understand the requirements.
- Reviewed and approved project specifications and technical inputs on the project, analysis and design documents prepared by the design and development teams.
- Reviewed test plans for unit test, system test and acceptance test.
- Monitored development work and test results and coordinated effectively for timely completion of projects.
- Ensured timely and defect-free delivery of assigned deliverables in adherence to quality procedures and standards.
- Carried out training, knowledge capture, knowledge sharing and knowledge transfer.
- Coordinated effectively with the off-shore team and worked successfully in an on-site/off-shore model.
- Assisted in developing test scenarios, test scripts and test data to support unit and system integration testing.
- Chaired and participated in JAD sessions, providing constructive feedback on process flows, business and technical requirements and aligning existing reports to concurrent AS-IS reports being generated live.
- Mentored and trained newly onboarded Data Analysts and Requirements Analysts on processes, documentation and corporate culture, and facilitated project- and program-specific exercises.
- Optimized the performance of the application by analyzing SQL queries, tables and views.
Environment: SAP BO, SQL, Oracle 11g, MS Excel, MS Outlook, MS Visio, ETL, T-SQL, Agile/SCRUM