
Data Scientist/Data Engineer Resume

SUMMARY

  • Working experience with various IT systems and applications built on open source technologies, covering analysis, design, coding, testing, implementation, and training.
  • Strong skills in data extraction, data cleaning, data loading, statistical data analysis, exploratory data analysis, data wrangling, and predictive modeling using R, Netezza, AWS, Azure, and Python, and in data visualization using Kibana (ELK stack), Databricks, Grafana, SSRS, and Power BI.
  • Excellent skills in state-of-the-art client-server computing technology, with a good understanding of Big Data technologies.
  • Strong experience in data analysis, data cleaning, data validation, data verification, and identification of data mismatches.
  • Expert in Python libraries such as NumPy and SciPy for mathematical calculations; Pandas for data preprocessing/wrangling; Matplotlib and Seaborn for data visualization; Theano, TensorFlow, and Keras for deep learning; NLTK and re for NLP; and statsmodels for time series forecasting.
  • Developed search activity dashboards that provide an overview of clinical message activity and the count of message segments per facility, to better understand flow on the Mirth Connect interface in real time and over time.
  • Experience writing complex SQL queries, procedures, and triggers (including PL/SQL and TSQL) to obtain filtered data from RDBMSs such as SQL Server, MySQL, PostgreSQL, Teradata, and Oracle, and working with NoSQL databases such as MongoDB, HBase, and Cassandra to handle unstructured data.
  • Expert in using visualization tools such as Kibana and Grafana (bar charts, trees, histograms, sets, calculated fields, combining dimensions, publishing workbooks, dashboards, logs, alerts), ggplot2 (detailed graphs for machine learning models), and Power BI (visual-level filters, page-level filters, reporting, dashboards) to create dashboards and provide clear insights.
  • Developed complex database objects such as stored procedures, functions, packages, and triggers using SQL and PL/SQL.
  • Implemented systems that are highly available, scalable, and self-healing on the AWS platform.
  • Deployed, managed, and operated scalable, highly available, and fault-tolerant systems on AWS.
  • Proficient in data analysis, data validation, data cleansing, data verification, and the entire data science project life cycle.
  • Experience with Oracle-supplied packages, Dynamic SQL, records, and PL/SQL tables.
  • Experience working with Reporting Services (SSRS) and Power BI on MS SQL Server and Netezza, and supporting Mirth Connect for analysis services.
  • Implemented Power BI to create various analytical dashboards that depict critical KPIs.
  • Set up Power BI gateways to different sources to match analysis criteria, with further modeling for Power Pivot and Power View.
  • Experience implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.x, R 3.0 (ggplot2, caret), and Excel.
  • Expertise in and extensive knowledge of Enterprise Data Warehousing, including data modeling, data architecture, data integration (ETL/ELT), and business intelligence.
  • Highly proficient and experienced in software development and data modeling projects using JavaScript, Python, MSSQL, PostgreSQL, Elasticsearch, Logstash, Kibana, Grafana, Power BI, SAS, HL7, Mirth Connect, and PowerShell on Windows and Unix/Linux machines.
  • Regularly used ServiceNow, Remedy Force, and other internal issue trackers for project development, working with SDLC, Git, Agile methodology, and the Scrum process.
  • Good experience in production support: identifying root causes, troubleshooting, submitting change controls, and deploying changes.
  • Experienced in handling all domain and technical interaction with application users, analyzing client business processes, and documenting business requirements.
  • Strong analytical and problem-solving skills and a quick learner. Committed team player capable of working to tight project delivery schedules and deadlines.
  • Proficient in handling complex processes using Base SAS, SAS/SQL, SAS/STAT, SAS/GRAPH, SAS/ODS, and MERGE, JOIN, and SET statements.

TECHNICAL SKILLS

Programming Languages: Python 2.x/3.x (NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn), R 3.x, MS SQL, JavaScript, C, XML, HTML, REST API, HL7, VBA, PowerShell, SAS 9.x

Querying Languages: MS SQL, PostgreSQL, MongoDB, MySQL, NoSQL, PL/SQL, TSQL

Data Visualization: Kibana, Grafana, Python (Matplotlib, Seaborn), R (ggplot2), Power BI, SSRS

Tools and Technologies: Mirth Connect, Mirth Match, Mirth Result, GitHub, Jupyter Notebooks, VS Code

SDLC Methodologies: Agile, Scrum, Waterfall

Platforms: Windows, Unix/Linux

PROFESSIONAL EXPERIENCE

Confidential

Data Scientist/Data Engineer

Technical Environment: Mirth (Match 1.5, Result 2.1, Connect 3.3/3.4/3.5), HL7, MSSQL Server 2008/2012, PostgreSQL 9.x, Azure SQL, MongoDB, SSRS, Power BI, ELK 6.3, Machine Learning, JavaScript, HL7v6, PowerShell, REST API, SOUP, HTML, XML, TortoiseGit, Python, R, Jupyter Notebooks, VS Code, ServiceNow, BMC Remedy, Microsoft Office, Excel ODBC, Microsoft Azure, Windows.

Responsibilities:

  • Served as the technical point of contact for upper management, business analysts, project management, and other groups on the proactive monitoring project.
  • Responsible for SQL Server Reporting Services planning, architecture, training, support, and administration in development, test, and production environments.
  • Developed, modified, and fixed bugs in Mirth interfaces/channels for required changes.
  • Created and maintained probabilistic matching rules in Mirth Match EMPI and used Match's match-quality reporting to rate matching effectiveness and find areas for improvement.
  • Created tooling to perform data-steward activities against the Mirth Results clinical data repository (stored in PostgreSQL), such as detecting missing values from contributing systems and suggesting ways to solve and prevent the issue.
  • Proficient in SQL databases such as MySQL, MS SQL, and PostgreSQL. Worked with UNIX/Linux, including commands and shell scripting.
  • Generated server-side PL/SQL and TSQL scripts for data manipulation and validation, and materialized views for remote instances.
  • Created PL/SQL stored procedures, functions, and packages for moving data from the staging area to the data mart.
  • Experience with machine learning and data platforms such as TensorFlow, Jupyter Notebook, scikit-learn, Apache Spark, Hive, and Kafka.
  • Strong knowledge of clinical trials and other types of clinical and translational research
  • Evaluated pre-clinical and translational work to generate early clinical development plans and Investigational New Drug applications.
  • Performed SQL and PL/SQL tuning and Application tuning using various tools like EXPLAIN PLAN, SQL*TRACE, TKPROF and AUTOTRACE.
  • Proficient in Linux, including Python scripting and deployment of applications in an enterprise environment.
  • Independently developed Mirth Connect interfaces for new requirements and fixed bugs in existing channels at the client’s request, using JavaScript, JSON, XML, database connections/queries, and REST APIs.
  • Good experience in production support, maintaining systems and resolving issues during on-call rotation.
  • Developed Git pre-commit/post-commit hooks to automate handling of Mirth Connect configuration keys and values.
  • Worked closely with healthcare interoperability and messaging standards such as HL7 2.x, HL7 3.x, FHIR, CCDA, and X12 HIPAA, across areas including Blood Bank, Microbiology, Transcription, Laboratory, Pathology, Radiology, and Patient Access.
  • Experience with Azure SQL and cloud data solutions such as HDInsight, Databricks, SQL DW, and Data Factory.
  • Worked on large datasets of more than a million records, using Python scripts to build visualizations and dashboards in Kibana and Grafana.
  • Experience using various R and Python packages such as ggplot2, caret, dplyr, RWeka, gmodels, twitteR, NLP, reshape2, rjson, plyr, pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, and Beautiful Soup, along with Jupyter Notebooks and VS Code.
  • Developed integrations with Jupyter Notebook.
  • Used Python for model development and Dash by Plotly for visualizations.
  • Experience in Python, Jupyter, and the scientific computing stack (NumPy, SciPy, pandas, and Matplotlib).
  • Strong development experience with SQL, REST APIs, web services, and message queues.
  • Analyzed Mirth Results and Mirth Match data to find duplicate patient (demographic) records, identifying duplicates from the Mirth Match threshold score.
  • Developed Power BI reports and dashboards from multiple data sources using data blending.
  • Explored data in a variety of ways and across multiple visualizations using Power BI. Strategic expertise in design of experiments, data collection, analysis, and visualization.
  • Created dashboards and visualizations from HIPAA auditing data by analyzing it against threshold values.
  • Created machine learning jobs with the Kibana X-Pack ML component for anomaly detection, and Watcher alerts on those jobs, for root-cause analysis and prediction of KPI metrics and errors.
  • Created machine learning modules with ELK and Python for presentation at the client site.
  • Designed and implemented database solutions in Azure SQL Data Warehouse, Azure SQL, and Databricks.
  • Created reports from the HIPAA audit database, including the number of times patient data was accessed.
  • Responsible for enabling analysis by producing information products, and involved in research and development efforts. Applied traditional programming (SAS, SQL, R, and PostgreSQL) and business intelligence (e.g., ELK) experience to create dashboards.
  • Generated summary tables, listings, and graphs to support clinical study reports using Base SAS, SAS/STAT, SAS/MACRO, SAS EG, and SAS/GRAPH.
  • Imported data into SAS from Excel spreadsheets and various delimited files using PROC IMPORT to generate reports as directed by senior management.
  • Experience installing and developing on ELK on production and test servers.
  • Used MongoDB for local Mirth Connect storage.
  • Extracted and analyzed raw data from Match's PostgreSQL database and served as SME for IHE PIX/PDQ feeds with other exchange participants.
  • Determined which feed and Mirth Results data were useful for metrics, and developed a Mirth Connect channel to automate metric-shipping feeds to Elasticsearch.
  • Created individual message-count visualizations in Kibana to show message statistics over a selected period, and used Kibana reporting.
  • Developed a REST API in Python for Kibana to create visualizations and dashboards.
  • Used monitoring and logging tools such as the ELK Stack (Mirth Connect, Elasticsearch, and Kibana).
  • Integrated clinical messages with Elasticsearch and Kibana for real-time log analytics (performance monitoring and alerting).
  • Indexed and queried a substantial number of documents in Elasticsearch, created a Kibana dashboard for sanity-checking the data, and worked with a Kibana dashboard showing overall build status with drill-down features.
  • Created data discovery views, visualizations, and dashboards in Kibana for quick analysis of healthcare data.
  • Created a consolidated SQL Server Reporting Services (SSRS) report for audit data, plotted graphs of calculated threshold values, and built further SSRS reports.
  • Developed a dynamic Mirth Connect channel using JavaScript, XML, and SQL Server to import live data from SQL Server and PostgreSQL databases into Elasticsearch for visualization.
  • Created Kibana dashboards for metrics that need to be tracked as trends for operational and business activities (message volume, ordered tests, etc.).
  • Expertise in reporting tools like Grafana, Kibana (ELK) for setting up charts and graphs for better visual representation of test results.
  • Integrated with Grafana and determined how best to visualize operational metrics for live feeds from Mirth Connect.
  • Demonstrated experience developing Grafana dashboards from clinical messages using MSSQL, MySQL, and PostgreSQL data.
  • Built an ETL process for continuous bulk import of clinical data from SQL Server and PostgreSQL databases into Elasticsearch and spreadsheets.
  • Developed Python scripts and Grafana dashboards to automate monitoring tasks and fire alerts when data anomalies are detected.
  • Created an ODBC connection for Excel to pull the data from PostgreSQL and SQL Server.
  • Experienced in using ServiceNow and BMC Remedy to create tickets such as Request, Incident, and Change Management.

Confidential

Data Scientist

Technical Environment: Base SAS 9.2, SAS EG 7.1, SAS/Connect, SAS/STAT, SAS/Graph, SAS/ETL, SAS Macros, SAS SQL, SAS ODS, VB, Python, R, Jupyter Notebooks, VS Code, MS Excel, MS Access, UNIX/Windows.

Responsibilities:

  • Created and extracted clinical data tables from SQL, text, CSV, and Excel files into SAS using SAS tools such as SAS/ACCESS, INFILE, and the LIBNAME engine.
  • Wrote SAS programs using Base SAS and SAS macros for ETL, including transferring and converting finance data from Excel files into other files for further analysis, and created global and local variables.
  • Queried and retrieved data from SQL Server database to get the sample dataset.
  • Worked as a Data Modeler/Analyst to generate Data Models using Erwin and developed a relational database system.
  • Wrote samples and guides using Jupyter Notebook.
  • Good experience in working with various Python Integrated Development Environments like PyCharm, Spyder, Jupyter Notebook, Anaconda
  • Analyzed the business requirements of the project by studying the Business Requirement Specification document.
  • Participated in the installation of SAS on a Linux platform.
  • Used data quality validation techniques to validate Critical Data Elements (CDEs) and identified many anomalies. Worked extensively with statistical analysis tools and adept at writing code in Advanced Excel, R, and Python.
  • Primarily used the Python Matplotlib package and Tableau to visualize and graphically analyze data. Performed data pre-processing and split the identified data set into training and test sets using other Python libraries.
  • Derived data from relational databases, performed complex data manipulations, and conducted extensive data checks to ensure data quality.
  • Created SQL tables with referential integrity and developed queries using SQL, SQL*PLUS, and PL/SQL.
  • Produced sophisticated visualizations of analysis output for business users. Published results and addressed constraints or limitations with business partners.
  • Created customized figures for clinical study reports as well as ad-hoc requests using the SAS procedures GPLOT and GCHART.
  • Designed an ETL process using SAS to extract data from flat files, Excel, and CSV files.
  • Used the Python NumPy, SciPy, Pandas, Matplotlib, and statistics packages to perform dataset manipulation, data mapping, data cleansing, and feature engineering. Built and analyzed datasets using R and Python.
  • Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for the clients.
  • Worked on data cleaning and ensured data quality, consistency, integrity using Pandas, NumPy.
  • Performed trend analysis and checked interaction p-values on separate and combined datasets using the results of regression analysis, and made the programs reusable with SAS macros.
  • Followed the waterfall Software Development Life Cycle (SDLC) methodology for the design, development, implementation, and testing of various SAS modules.
  • Worked with financial data, including analysis of salary sheets and inpatient and outpatient billing.
  • Analyzed clinical trial data and generated tables, listings, and graphs using MACRO, GRAPH, and SQL.
  • Generated summary tables, listings, and graphs to support clinical study reports using Base SAS, SAS/STAT, SAS/MACRO, SAS EG, and SAS/GRAPH.
  • Implemented CDISC SDTM and ADaM standards and generated tables, figures, and listings to support the statistical analysis of clinical trial data.
  • Summarized results of complex data and statistical analyses and presented reports to the laboratory to support decisions on future operations and manufacturing direction.
  • Built macros and created macro variables using %MACRO and %MEND, and used DATA _NULL_ steps to help generate analysis data sets with the specified structure.
