- Highly analytical, process-oriented data analyst with 8+ years of experience in data analysis and data management and a proven ability to work efficiently both independently and in teams.
- Experienced in Requirement Analysis, Test Design, Test Preparation, Test Execution, Defect Management, and Management Reporting.
- An excellent understanding of both traditional statistical modeling and machine learning techniques and algorithms such as regression, clustering, ensembling (random forest, gradient boosting), and deep learning (neural networks).
- Experience in SAS, SQL, SSRS, Tableau, Python, MS Excel (VLOOKUP, Pivot charts, Macros).
- Expertise in data manipulation using TABULATE, UNIVARIATE, Append, Array, DO loops, Macros and Merge procedures.
- Knowledge of statistical modeling and machine learning techniques (Linear, Logistic, K-Nearest Neighbours, Bayesian) in forecasting/predictive analytics.
- Experience in presenting the solutions to business challenges for executive management through insightful findings.
- Skilled in performing data parsing, data manipulation and data preparation with methods including describe data contents, compute descriptive statistics of data, regex, split and combine, Remap, merge, subset, reindex, melt and reshape.
- Experience in using various packages in R and Python, such as ggplot2, caret, dplyr, RWeka, gmodels, twitteR, NLP, reshape2, rjson, pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, Beautiful Soup.
- Expertise in using Linear and Logistic Regression, Classification Modeling, Decision Trees, Principal Component Analysis (PCA), and Cluster and Segmentation analyses; authored and co-authored several scholarly articles applying these techniques.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning, Data mining solutions to various business problems and generating data visualizations using R, Python, and Tableau.
- Excellent communication skills and understanding of business operations and analytical tools for effective analysis of data.
- Expert in Python libraries such as NumPy and SciPy for mathematical calculations, Pandas for data pre-processing/wrangling, Matplotlib and Seaborn for data visualization, Sklearn for machine learning, Theano, TensorFlow and Keras for deep learning, and NLTK for NLP.
- Experience and technical proficiency in designing and data modeling online applications, and as Solution Lead for architecting Data Warehouse / Business Intelligence applications.
- Performed data analysis and data validation by writing complex SQL queries against the SQL Server database.
- Power-user of advanced T-SQL queries with multi-table joins, window functions, sub-queries, set operations, triggers, stored procedures, variables, and cursors.
- Expert in MS Excel including date functions, text calculations, lookups, Pivot tables, advanced summations.
- Knowledge of various reporting objects like Facts, Attributes, Hierarchies, Transformations, Filters, Prompts, Sets and groups in Tableau.
- Experience with Data Extraction, Transforming and Loading (ETL) using various tools such as Data Transformation Service (DTS), SSIS.
- Used Tableau to analyze and obtain insights into large datasets and to create visually compelling, actionable interactive reports and dashboards.
- Experienced in generating and documenting Metadata while designing the OLTP and OLAP systems environment.
- Expertise in Data Flows, Data Architecture, Data Profiling, Data Quality, Data Cleansing, Data Processing and the ability to create Data Dictionaries.
Application/Web Servers: JBoss, Glassfish 2.1, WebLogic, WebSphere, Apache Tomcat Server.
MS Office Package: Microsoft Office (Windows, Word, Excel, PowerPoint, Visio, Project).
Programming Skills: SQL, SSRS, Advanced SAS, SAS Macros, Advanced Excel Skills (VLOOKUP, Charts, Pivot tables, etc.), Python and R.
Databases: Microsoft SQL Server 2014/2012/2008 R2/2008, MySQL, Oracle, DB2, Teradata, MS Access.
Exploratory Data Analysis: Univariate/Multivariate Outlier detection, Missing value imputation, Histograms/Density estimation, EDA in Tableau.
Supervised Learning: Linear/Logistic Regression, Decision Trees, Ensemble Methods.
Unsupervised Learning: Association Rules, Hierarchical Clustering, Market Basket Analysis.
R and Python Packages: Dplyr, Ggplot2, Caret, TensorFlow, Pandas, Matplotlib, Seaborn, Scikit-learn, NumPy, NLTK.
Business Intelligence/Reporting Tools: Microsoft PowerBI, Tableau.
Databases: Teradata R12 R13 R14.10, MS SQL Server, DB2, Netezza
- Analyzed the Business Requirements Specification Documents and Source to Target Mapping Documents and identified the test requirements.
- Developed SQL queries to pull data from different databases for the specified Diagnosis (ICD-10) and Procedural (CPT) codes.
- Participated in all phases of data preparation: data collection, data cleaning, validation, and visualization, and performed Gap Analysis.
- Interacted with Business Analysts, SMEs, and other executive stakeholders to understand business needs and functionality for various project solutions.
- Implemented end-to-end systems for Data Analytics, Data Automation and integrated with custom visualization tools using Tableau.
- Gathering all the data that is required from multiple data sources and creating datasets that will be used in the analysis.
- Analyzed specialty pharmacy referral and dispense data to explore the key trends and, most importantly, identify the gaps in the analysis to inform the future scope of research.
- Identified the major roadblocks for shipment of specialty product and resolved them, increasing the shipment rate from 74% to 87% and reducing the average number of shipment days from 23 to 18.
- Created ad-hoc reports and visualizations on different therapies like Immunology based on their sales, market share, patient demographics, out-of-pocket expenses, product share by total prescriptions, etc.
- Developed Visualizations to analyze the performance of territories for Specialty Pharmacy by various metrics that would help case managers identify gaps in the patient journey and define actions for their resolution
- Developed and implemented predictive models of user behavior data on websites, URL categorical, social network analysis, social mining and search content based on large-scale machine learning.
- Developed predictive models on large-scale datasets to address various business problems through leveraging advanced statistical modeling, machine learning, and deep learning.
- Updated Python scripts to match data with our database stored in AWS Cloud Search, so that we would be able to assign each document a response label for further classification.
- Co-ordinated with business users to understand functional requirements. This included creating ETL Specification Document, participating in review meetings.
- Tested the format of the reports against the specifications provided and compared the data in the reports with the backend Datamart through SQL, using Excel for data comparison.
- Created XML schema definitions (XSDs) with the XMLSpy tool and converted them into Informatica metadata.
- Manipulated and prepared data and extracted data from the database for a business analyst using SAS.
- Involved in end to end testing of the entire process flow starting from the source database to the target Datamart to the reports by considering all possible scenarios.
- Created DataStage jobs to extract, transform and load data into data warehouses from various sources like relational databases, application systems, temp tables, flat files, etc.
- Expertise in the development of High-level design, Conceptual design, Logical and Physical design for Database, Data warehousing and many Distributed IT systems.
- Prepared Business Requirement Documents (BRD's) after the collection of Functional Requirements from System Users that provided the appropriate scope of work for the technical team to develop a prototype and overall system.
Environment: Informatica PowerCenter, ERWIN, PL/SQL, ETL Data Validation, Cube Testing, Report testing, OLTP, SDLC, Pandas, NumPy, Seaborn, Data Marts, MicroStrategy, UAT, Python.
Confidential, Woonsocket, Rhode Island
- Perform root cause analysis on smaller self-contained data analysis tasks that are related to assigned data processes.
- Worked to ensure high levels of data consistency between diverse source systems including flat files, XML and SQL Database.
- Created the framework required for CMS Medicare STARS quality program optimization, which helps in tracking Compliant and Non-compliant members with respect to their health care gaps.
- Analyzed workflow and targets from quality team members and prepared the weekly report for upper management using SAS and Power BI; created dashboards/scorecards to provide analytical support to the QA department as needed.
- Forecasted the summary statistics for both Part C and Part D CAHPS (Consumer Assessment of Healthcare Providers and Systems) surveys and identified opportunities for improving them.
- Automated Scorecards for the STARS initiative and provided analysis reporting in a presentable data format to providers.
- Ran bi-weekly production reporting to the vendor through SAS and performed data validation; updated and maintained quality outreach spreadsheets using updated production run data.
- Assisted the call planning team in reaching out to the list of beneficiaries who are non-compliant and planning their appointments with physicians.
- Analyzing the individual performance of call planning team against their pre-defined targets and reporting the same to managerial staff
- Auditing the HEDIS data with third party vendor to differentiate between Compliant and Non-compliant beneficiaries for Part C
- Cleaning Data and removing duplicates for Part D beneficiaries in response to their adherence to the drugs.
- Created reports and dashboards using D3.js and Tableau 9.x to explain and communicate data insights, significant features, model scores and performance of new recommendation systems to both technical and business teams.
- Utilize SQL, Excel and several Marketing/Web Analytics tools (Google Analytics, Bing Ads, AdWords, AdSense, Criteo, Smartly, SurveyMonkey, and Mailchimp) to complete business & marketing analysis and assessment.
- Updated Python scripts to match training data with our database stored in AWS Cloud Search, so that we would be able to assign each document a response label for further classification.
- Wrote SQL Stored Procedures and Views, and coordinated and performed in-depth testing of new and existing systems.
- Provided support to Data Architect and Data Modeler in Designing and Implementing Databases for MDM using ERWIN Data Modeler Tool and MS Access.
- Designed and developed complex Active reports and Dashboards with different data visualizations using Tableau Desktop on customer data. Proficient in importing/exporting large amounts of data from files to Teradata and vice versa.
- Worked with data compliance and data governance teams to maintain data models, Metadata, and Data Dictionaries, and to define source fields and their definitions.
Environment: Informatica, SAS/BASE, SAS/Access, SAS/Connect, XML, Erwin, Pivot tables, Snowflake schema, Star schema, VLOOKUP, Teradata, Python, SSRS, SSIS, UNIX, SQL, Oracle, Tableau, JIRA.
Data Modeler/Data Analyst
Confidential, Akron, OH
- Created conceptual, logical and physical models based on requirements gathered through interviews with the business users.
- Updated existing models to integrate new functionality into an existing application. Conducted one-on-one sessions with business users to gather warehouse requirements.
- Analyzed database requirements in detail with the project stakeholders by conducting joint Requirement Development sessions.
- Developed normalized Logical and Physical database models to design the OLTP system.
- Created a dimensional model for the reporting system by identifying the required dimensions and facts using Erwin.
- Used forward engineering to create a Physical Data Model with DDL that best suits the requirements from the Logical Data Model.
- Maintained and implemented Data Models for the Enterprise Data Warehouse using ERWIN.
- Created and maintained Metadata, including table and column definitions.
- Worked on PL/SQL programming of Stored Procedures, Functions, Packages, and Triggers.
- Used Model Mart of Erwin for effective model management of sharing, dividing and reusing model information and design for productivity improvement.
- Eliminated errors in Erwin models through the implementation of Model Mart (a companion tool to Erwin that controls the versioning of models).
- Used Erwin for reverse engineering to connect to the existing database and ODS, create a graphical representation in the form of Entity Relationships, and elicit more information.
- Identified the most appropriate data sources based on an understanding of corporate data thus providing a higher level of consistency in reports being used by various levels of management.
Environment: Erwin r8, Windows XP/NT/2000, SQL Server 2008, Teradata, Oracle 11g, DB2, Informix, MS Excel, Mainframes, MS Visio, Rational Rose, Requisite Pro.
Data Modeler/Data Analyst
Confidential, Grand Island, NE
- Involved in the entire data migration process, from analyzing the existing data through cleansing, validating, translating tables, and converting, to the subsequent upload into the new platform.
- Worked on performance tuning of the database, including indexes and optimizing SQL statements.
- Implemented Forward engineering to create tables, views and SQL scripts, and mapping documents.
- Prepared data dictionaries and Source-Target Mapping documents to ease the ETL process and user's understanding of the data warehouse objects
- Created documentation and test cases, worked with users for new module enhancements and testing.
- Worked with business analysts to design weekly reports using a combination of Crystal Reports.
- Reviewed the existing data model and documented suspected design issues affecting the performance of the system.
- Extracted data from databases like Oracle, SQL Server and DB2 using Informatica to load it into a single repository for data analysis.
- Involved in the development and implementation of SSIS, SSRS and SSAS application solutions for various business units across the organization.
Environment: Oracle SQL Developer, Oracle Data Modeler, Teradata, SSIS, Business Objects, Oracle 10g, SQL Server 2012, SQL Assistant, DataStage 8.1, DB2, Informatica PowerCenter.
- Maintained models to predict which customers are willing to opt for a new healthcare plan by building a logistic regression model.
- Worked on Claims, EMR records, Billing and Membership modules in Facets, including creating group, subscriber, billing, claims, and pharmacy data, and acquired knowledge of Medicaid and Medicare business. Used SQL programming to drive analyses of large data sets and provide solutions based on ICD-9 and HCPCS codes, and used statistical software packages to extract, clean, transform and analyze data from large datasets.
- Creating visualizations, dashboards and user stories in Tableau depending upon the business requirements
- Collaborated with clients to find key business performance indicators and suggest measurable improvements
- Entrusted with training and managing a team of 10 Level 1 employees to identify trends and rectify critical issues.
Jr. Data Analyst
- Prepared unstructured data from multiple data files from customer databases and handled ad hoc reporting requests for customers as needed, along with business review analysis of the program data.
- Communicated with the Source code provider in case of any discrepancies.
- Extracted, Transformed and Loaded (ETL) data to map it from disparate sources to the required target database.
- Worked on production support activities to monitor all Daily, Weekly, Monthly, and Quarterly jobs in Scheduler, fixing failed workflows and communicating with the different teams to get issues resolved.
- Monitored all Daily, Weekly, Monthly, and Quarterly jobs and tracked the run statistics.
- Delved into data to discover discrepancies and patterns.
- Collaborated with management and internal teams to implement and evaluate improvements.