Data Scientist / Software Developer Resume
Plano, Texas
SUMMARY
- Goal-oriented professional with 6+ years of experience in Data Science, Data Analytics, and Software Development across the Healthcare and Banking domains, with experience in data scrubbing and mining, data acquisition, data engineering to extract features using statistical techniques, exploratory data analysis, and the design of compelling visualizations that support business profitability.
- Working experience with various IT systems and applications built on open-source technologies, covering analysis, design, coding, testing, implementation, and training.
- Highly skilled in data extraction, data cleaning, data loading, statistical data analysis, exploratory data analysis, data wrangling, and predictive modeling using R and Python, with data visualization in Kibana (ELK stack), Grafana, SSRS, and Power BI.
- Excellent skills in state-of-the-art client/server computing, with a good understanding of Big Data technologies.
- Strong experience in data analysis, data cleaning, data validation, data verification, and identification of data mismatches.
- Expert in Python libraries such as NumPy and SciPy for mathematical computation, Pandas for data preprocessing and wrangling, Matplotlib and Seaborn for data visualization, scikit-learn for machine learning, Theano, TensorFlow, and Keras for deep learning, NLTK, Gensim, and re for NLP, and statsmodels for time-series forecasting.
- Developed search-activity dashboards that provide an overview of clinical message activity and per-facility message-segment counts, in real time and over time, to better understand flow on the Mirth Connect interface.
- Experience writing complex SQL queries, stored procedures, and triggers to obtain filtered data from RDBMSs such as SQL Server, MySQL, PostgreSQL, Teradata, and Oracle, and using NoSQL databases such as MongoDB, HBase, and Cassandra to handle unstructured data.
- Expert in using visualization tools such as Kibana and Grafana (bar charts, trees, histograms, sets, calculated fields, combining dimensions, publishing workbooks, dashboards, logs, alerts), ggplot2 (detailed graphs for machine learning models), and Power BI (visual-level filters, page-level filters, reporting, dashboards) to create dashboards and deliver clear insights.
- Proficient in data analysis, data validation, data cleansing, data verification and entire data science project life cycle.
- Experience in implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.x, R 3.0 (ggplot2, caret, dplyr), and Excel.
- Extensive knowledge of enterprise data warehousing, including data modeling, data architecture, data integration (ETL/ELT), and business intelligence.
- Highly proficient and experienced in software development and data modeling projects using JavaScript, Python, MSSQL, PostgreSQL, Elasticsearch, Logstash, Kibana, Grafana, Power BI, SAS, HL7, Mirth Connect, and PowerShell on Windows and Unix/Linux machines.
- Actively used ServiceNow, BMC Remedyforce, and other internal issue trackers for project development, along with the SDLC, Git, and Agile/Scrum processes.
- Good experience in production support: identifying root causes, troubleshooting, submitting change controls, and deploying changes.
- Experienced in handling domain and technical interactions with application users, analyzing client business processes, and documenting business requirements.
- Strong analytical and problem-solving skills and a fast learner; a committed team player capable of working to tight project delivery schedules and deadlines.
- Proficient in handling complex processes using SAS/BASE, SAS/SQL, SAS/STAT, SAS/GRAPH, and SAS/ODS, including MERGE, JOIN, and SET statements.
TECHNICAL SKILLS
Programming Languages: Python 2.x/3.x (NumPy, Pandas, scikit-learn, Matplotlib, Seaborn), R 3.x, MS SQL, JavaScript, C, XML, HTML, REST APIs, HL7, VBA, PowerShell, SAS 9.x
Querying Languages: MS SQL, PostgreSQL, MongoDB, MySQL, NoSQL
Data Visualization: Kibana, Grafana, Python (Matplotlib, Seaborn), R (ggplot2), Power BI, SSRS
Tools and Technologies: Mirth Connect, Mirth Match, Mirth Results, GitHub, Jupyter
SDLC Methodologies: Agile, Scrum, Waterfall
Platforms: Windows, Unix/Linux
PROFESSIONAL EXPERIENCE
Confidential - Plano, Texas
Data Scientist / Software Developer
Technical Environment: Mirth (Match 1.5, Result 2.1, Connect 3.3/3.4/3.5), HL7, MSSQL Server 2008/2012, PostgreSQL 9.x, MongoDB, SSRS, Power BI, ELK 6.3, JavaScript, PowerShell, REST API, SOAP, HTML, XML, TortoiseGit, Python, R, ServiceNow, BMC Remedy, Microsoft Office, Excel ODBC, Microsoft Azure, Windows.
Responsibilities:
- Served as the technical point of contact for upper management, business analysts, project management, and other groups on the proactive monitoring project.
- Developed and modified existing interfaces/channels to implement required changes.
- Debugged interfaces and unit-tested them for the required output.
- Created and maintained probabilistic matching rules in the Mirth Match EMPI, using Match's match-quality reporting to rate matching effectiveness and identify areas for improvement.
- Created tooling needed to perform data-steward activities against the Mirth Results clinical data repository (stored in PostgreSQL), such as detecting missing values from contributing systems and suggesting ways to resolve and prevent the issue.
- Proficient in SQL databases like MySQL, MS SQL, and PostgreSQL.
- Worked with UNIX/Linux, including command-line tools and shell scripting.
- Proficient in Linux, including Python scripting and deployment of applications in an enterprise environment.
- Worked closely with healthcare interoperability and messaging standards such as HL7 2.x, HL7 3.x, FHIR, C-CDA, and X12 HIPAA, across Blood Bank, Microbiology, Transcription, Laboratory, Pathology, Radiology, Patient Access, and other domains.
- Strong development experience with SQL, REST APIs, web services, and message queues.
- Responsible for SQL Server Reporting Services planning, architecture, training, support, and administration in development, test, and production environments.
- Analyzed Mirth Results and Mirth Match data to find duplicate patient (demographic) records, identifying duplicates from the Mirth Match threshold score.
- Prepared and exported ad-hoc reports from different databases for testing scenarios, discovering areas with potential for data standardization.
- Enabled analysis by producing information products and contributed to research and development efforts, combining traditional programming (SAS, SQL, R, and PostgreSQL) with business intelligence tooling (e.g., ELK) to create dashboards.
- Installed and developed on the ELK stack on production and test servers.
- Used MongoDB for local Mirth Connect storage.
- Extracted and analyzed raw data from Match's PostgreSQL database and served as SME for IHE PIX/PDQ feeds with other exchange participants.
- Determined which feed and Mirth Results data were useful for metrics and developed a Mirth Connect channel to automate metric-shipping feeds to Elasticsearch.
- Created individual message-count visualizations in Kibana to show message statistics over a selected period and used Kibana for reporting.
- Used Python against Kibana's REST API to create visualizations and dashboards programmatically (see the first sketch after this list).
- Worked with monitoring and logging tools such as the ELK stack (Elasticsearch, Logstash, Kibana) alongside Mirth Connect.
- Integrated Clinical messages with Elasticsearch and Kibana for real-time log analytics (performance monitoring & alerting).
- Indexed and queried a substantial number of documents in Elasticsearch, created a Kibana dashboard for sanity-checking the data, and worked with a Kibana dashboard showing overall build status with drill-down features.
- Created data-discovery views, visualizations, and dashboards in Kibana for quick analysis of healthcare data.
- Created a consolidated SSRS report for audit data and plotted graphs of calculated threshold values, building reports with SQL Server Reporting Services (SSRS).
- Developed dynamic Mirth Connect channels with JavaScript, XML, and SQL Server to import live data from SQL Server and PostgreSQL databases into Elasticsearch, and built visualizations on top.
- Created Kibana dashboards for metrics that need trending for operational and business activities (message volume, ordered tests, etc.).
- Expertise in reporting tools such as Grafana and Kibana (ELK) for setting up charts and graphs to better visualize test results.
- Integrated with Grafana and determined how best to visualize operational metrics for live feeds from Mirth Connect.
- Developed Grafana dashboards from clinical messages using MSSQL, MySQL, and PostgreSQL data.
- Built an ETL process for continuously bulk-importing clinical data from SQL Server and PostgreSQL databases into Elasticsearch and into spreadsheets (see the ETL sketch after this list).
- Developed Python scripts and Grafana dashboards to automate monitoring tasks and fire alerts when data anomalies are detected (see the anomaly-check sketch after this list).
- Created an ODBC connection for Excel to pull data from PostgreSQL and SQL Server.
- Experienced in using ServiceNow and BMC Remedy to create tickets such as Requests, Incidents, and Change Management records.
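A minimal sketch of the Python-driven Kibana automation described above, not the actual project code: the Kibana URL and the saved-object attributes are placeholders, and it assumes a Kibana version that exposes the saved-objects HTTP API.

```python
import json
import requests

KIBANA_URL = "http://localhost:5601"  # placeholder Kibana endpoint

# visState must be a serialized JSON string; this metric visualization is illustrative only
vis_state = {
    "title": "Clinical message count",
    "type": "metric",
    "params": {},
    "aggs": [{"id": "1", "type": "count", "schema": "metric", "params": {}}],
}

resp = requests.post(
    f"{KIBANA_URL}/api/saved_objects/visualization",
    headers={"kbn-xsrf": "true", "Content-Type": "application/json"},
    json={"attributes": {"title": "Clinical message count",
                         "visState": json.dumps(vis_state)}},
)
resp.raise_for_status()
print("Created visualization:", resp.json()["id"])
```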
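And a sketch of the bulk-import ETL into Elasticsearch, assuming psycopg2 and the official elasticsearch Python client; the connection string, table, and index names are hypothetical.

```python
import psycopg2
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["http://localhost:9200"])            # placeholder host
conn = psycopg2.connect("dbname=mirthresults user=etl")  # hypothetical DSN

def actions():
    """Stream rows from PostgreSQL as Elasticsearch bulk-index actions."""
    with conn.cursor() as cur:
        cur.execute("SELECT message_id, facility, received_at "
                    "FROM clinical_messages")            # hypothetical table
        for message_id, facility, received_at in cur:
            yield {
                "_index": "clinical-messages",           # hypothetical index
                "_id": message_id,
                "_source": {"facility": facility,
                            "received_at": received_at.isoformat()},
            }

ok, errors = bulk(es, actions(), raise_on_error=False)
print(f"Indexed {ok} documents; {len(errors)} errors")
```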
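Finally, a sketch of the kind of anomaly check behind the alerting scripts; the z-score threshold and sample counts are illustrative, and in production the alert would go to Grafana or a paging webhook rather than stdout.

```python
import statistics

def is_anomalous(history, latest, z_threshold=3.0):
    """Flag `latest` when it deviates more than z_threshold standard
    deviations from the recent history of the metric."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return False
    return abs(latest - mean) / stdev > z_threshold

# hourly message counts (illustrative data)
recent_counts = [1020, 998, 1015, 1003, 987, 1011]
latest_count = 412
if is_anomalous(recent_counts, latest_count):
    print("ALERT: message volume anomaly detected")
```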
Confidential - Ontario, CA
Data Scientist
Technical Environment: BASE SAS 9.2, SAS EG 7.1, SAS/Connect, SAS/STAT, SAS/Graph, SAS/ETL, SAS Macros, SAS SQL, SAS ODS, VB, Python, R, MS Excel, MS Access, UNIX/Windows.
Responsibilities:
- Created and extracted clinical data tables from SQL, text, CSV, and Excel files into SAS using SAS tools such as SAS/ACCESS, INFILE, and the LIBNAME engine.
- Scripted SAS programs using Base SAS and SAS/MACRO for ETL purposes, transferring and converting finance data from one Excel file to another for further analysis, and created global and local variables.
- Queried and retrieved data from SQL Server database to get the sample dataset.
- Worked as a Data Modeler/Analyst to generate Data Models using Erwin and developed a relational database system.
- Analyzed the business requirements of the project by studying the Business Requirement Specification document.
- Participated in the installation of SAS on Linux platform
- Used data-quality validation techniques to validate critical data elements (CDEs) and identified many anomalies; worked extensively with statistical analysis tools and wrote code in advanced Excel, R, and Python.
- Made extensive use of Python's Matplotlib package and Tableau to visualize and graphically analyze the data; performed data preprocessing and split the identified dataset into training and test sets using other Python libraries (see the sketch after this list).
- Derived data from relational databases, performed complex data manipulations, and conducted extensive data checks to ensure data quality.
- Created SQL tables with referential integrity and developed queries using SQL, SQL*Plus, and PL/SQL.
- Formulated sophisticated visualizations of analysis output for business users; published results and addressed constraints or limitations with business partners.
- Created customized figures for clinical study reports as well as ad-hoc requests using the SAS procedures GPLOT and GCHART.
- Designed an ETL process using SAS to extract data from flat files, Excel, and CSV.
- Used Python's NumPy, SciPy, Pandas, Matplotlib, and statistics packages to perform dataset manipulation, data mapping, data cleansing, and feature engineering; built and analyzed datasets using R and Python.
- Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for the clients.
- Worked on data cleaning and ensured data quality, consistency, integrity using Pandas, NumPy.
- Performed trend analysis and checked interaction p-values on separate and combined datasets using regression analysis results, and made the programs reusable with SAS MACRO.
- Followed the waterfall Software Development Life Cycle (SDLC) methodology for the design, development, implementation, and testing of various SAS modules.
- Worked with financial data, such as salary sheets and inpatient and outpatient billing.
- Analyzed clinical trial data and generated tables, listings, and graphs using MACRO, GRAPH, and SQL.
- Generated summary tables, listings, and graphs to support clinical study reports using Base SAS, SAS/STAT, SAS/MACRO, SAS EG, and SAS/GRAPH.
- Implemented CDISC SDTM and ADaM standards as well as generating tables, figures, and listings to support the statistical analysis of clinical trials data.
- Summarized results of complex data and statistical analyses and presented reports to the laboratory to support decisions on future operations and manufacturing orientation.
- Built macros and created macro variables using %MACRO and %MEND, and used DATA _NULL_ steps to help generate analysis datasets with the specified structure.
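A minimal sketch of the Python preprocessing and train/test split workflow referenced above, assuming pandas and scikit-learn; the file name and the column names ("outcome", "age") are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("study_data.csv")                 # hypothetical input file
df = df.drop_duplicates()
df = df.dropna(subset=["outcome"])                 # drop rows missing the label
df["age"] = df["age"].fillna(df["age"].median())   # simple numeric imputation

X = df.drop(columns=["outcome"])
y = df["outcome"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(f"{len(X_train)} training rows, {len(X_test)} test rows")
```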
Confidential
Senior Data Engineer
Technical Environment: Oracle, MSSQL, MS Visio, Excel VLOOKUP, pivot tables, Excel VBA, FTP (File Transfer Protocol), MS Office, Windows/Linux
Responsibilities:
- Provided forms item analysis and summary statistics reports in Excel, reading data from sources such as CSV, Excel, and tab-delimited files.
- Gathered requirements and handled design and development for the internal CVWS tool.
- Performed independent research and analysis activities and coordinated with business partners, gathering data on industry, company, and customer trends.
- Used Agile methodology, worked with developers, and was responsible for assigning tasks.
- Constructed and deconstructed SQL queries per business requirements; created predefined queries for department performance reports.
- Researched emerging technologies for data storage and reporting analysis, implementing and promoting best practices.
- Maintained and configured Microsoft SQL Server database instances with SQL Server Integration Services (SSIS) and Reporting Services (SSRS).
- Extracted data from Oracle using SQL Pass-through facility and generated reports.
- Imported and exported reports from and to the CVWS tool for employment verification.
- Prepared invoices for different clients based on contracted prices and supported clients with ad-hoc reports.
- Updated databases daily for multiple clients and maintained records of incoming and outgoing cases.
- Exported and imported reports via FTP.
- Created advanced formulas, VLOOKUPs, pivot tables, charts, graphs, and custom reports.
Confidential
Data Engineer
Technical Environment: SAS/Base, SAS/STAT, SAS/GRAPH, SAS/MACROS, SAS/ACCESS, SAS/CONNECT, SAS/ODS, SAS/SQL, MS-Office, Mainframe/UNIX.
Responsibilities:
- Understood and updated existing code and created new code using Base SAS, SAS/MACRO, and SAS/SQL to implement new requirements.
- Applied statistical modeling and analysis to study data, utilizing appropriate statistical methods.
- Extracted data from Oracle using SQL Pass-through facility and generated reports.
- Modified existing SAS programs and created new programs using SAS macro variables to improve ease and speed of modification as well as the consistency of results.
- Generated reports in user-required formats using ODS and PROC REPORT.
- Validated SAS datasets using procedures such as PROC MEANS, PROC FREQ, and PROC UNIVARIATE.
- Analyzed and implemented the code and table changes to improve performance and enhance data quality.
- Read data from different sources such as CSV, Excel, and tab-delimited files.
- Compared source data with historical data for statistical analysis.
- Performed transformations such as MERGE, SORT, and UPDATE to get the data into the required format.
- Created an understandable Excel template to start data mapping for each line of business; extensively utilized SAS/BASE, SAS/SQL, and SAS/MACRO.
- Participated in project review meetings with the respective business SMEs.