
Overview

DataCheQC is an interactive R Shiny application for performing quality control (QC) checks on pharmacometrics datasets, thereby supporting the implementation of model-informed drug development.

Features include visual inspection of variables and data entries for errors and/or anomalies, and ensuring structural integrity through comparison with a dataset specification file.

The open-source app, which requires no programming knowledge to operate, also allows the user to collect all findings into a summary QC Report which can be downloaded directly from the app.

For more information and background on the creation of this project, check out the associated paper published in CPT: Pharmacometrics & Systems Pharmacology.

The QC Workflow



Features


Dataset and Specification Format

  • A Generalized Format for both dataset and specification file

  • Allows app outputs to be software-agnostic

  • Enables working with multiple file types (.csv, .sas7bdat, .doc, .xlsx)

Data Handling

  • Automatic detection of doses, observations, and covariates
  • Impute missing continuous covariate values
  • Run data cleaning functions on the dataset
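To illustrate what imputing a missing continuous covariate can look like, here is a minimal base R sketch. It assumes median imputation and uses a made-up data frame with a hypothetical `WT` column; the imputation method and column names that DataCheQC actually uses are not stated here and may differ.

```r
# Hypothetical example data: one row per subject with a continuous covariate.
covs <- data.frame(
  ID = 1:5,
  WT = c(70, NA, 82, 64, NA)  # body weight in kg, two values missing
)

# Assumed method: replace missing values with the median of the observed values.
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Keep the original column and store the imputed values next to it,
# so the imputation stays visible during QC.
covs$WT_IMP <- impute_median(covs$WT)
print(covs)
```

Storing the imputed values in a separate column (here `WT_IMP`) rather than overwriting the original keeps the missing entries traceable.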


Dataset Quality Checks

  • Compare the dataset to the specification file’s general table
  • Compare dataset observations with the specification file’s event table
  • Filter out optional variables
  • Check for missing or incorrectly defined values
  • Check that corresponding columns are aligned (e.g., Time and Nominal Time, Cohort and Dose, Value and Unit)
  • Collect all findings into an editable QC report
  • Download the final report as a .docx document
  • Continue previous work by loading a pre-existing interim report
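The comparison between a dataset and its specification can be pictured with a small base R sketch. Everything in it is an assumption for illustration: the `spec` layout (a `Variable` and a `Type` column), the variable names, and the checks themselves are hypothetical and do not reproduce the app's generalized specification format or its internal logic.

```r
# Hypothetical specification: expected variable names and types.
spec <- data.frame(
  Variable = c("ID", "TIME", "NTIME", "DV", "WT"),
  Type     = c("numeric", "numeric", "numeric", "numeric", "numeric"),
  stringsAsFactors = FALSE
)

# Hypothetical dataset with two missing variables, one extra variable,
# and a DV column that was read as text because of a "BLQ" entry.
dataset <- data.frame(
  ID   = c(1, 1, 2),
  TIME = c(0, 1.1, 0),
  DV   = c("0", "5.2", "BLQ"),
  SEX  = c("M", "M", "F"),
  stringsAsFactors = FALSE
)

# Variables required by the specification but absent from the dataset.
missing_vars <- setdiff(spec$Variable, names(dataset))

# Variables present in the dataset but not described in the specification.
extra_vars <- setdiff(names(dataset), spec$Variable)

# For variables present in both, compare the observed and expected types.
shared <- intersect(spec$Variable, names(dataset))
actual_type <- vapply(
  dataset[shared],
  function(col) if (is.numeric(col)) "numeric" else "character",
  character(1)
)
expected_type <- spec$Type[match(shared, spec$Variable)]
type_mismatch <- shared[actual_type != expected_type]

print(list(missing = missing_vars, extra = extra_vars, mismatch = type_mismatch))
```

In this made-up case, `NTIME` and `WT` are reported as missing, `SEX` as undocumented, and `DV` as having an unexpected type.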


Data Visualization and Summary Tables

  • For Each Observation:

    • Spaghetti Plots
    • Median Range Plots
    • Individual Plots
  • For Continuous/Categorical Covariates:

    • Covariate Distribution
    • Covariate Correlations
  • For Timings:

    • Plots comparing Nominal vs Actual Time
    • Dosing Schedule plots
    • Sampling Schedule Plots
  • Summary Tables:

    • Summary of Continuous & Categorical Covariates
    • Summary of Duplicated Time-Value Pairs
    • Summary of Observations
  • All of the above can be downloaded separately or all together as a .zip archive, which is built in the background

  • Hover over the info icon, which appears next to most features, to show an informative tooltip about that feature and what to look for during QC.
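As an illustration of the kind of information a summary of duplicated time-value pairs might contain, the base R sketch below flags records that share the same subject, time, and value. This interpretation of "duplicated", the column names `ID`, `TIME`, and `DV`, and the example data are all assumptions; the app's own definition and output may differ.

```r
# Hypothetical observation records; subject 1 has the same value twice at TIME = 1.
obs <- data.frame(
  ID   = c(1, 1, 1, 2, 2),
  TIME = c(0, 1, 1, 0, 2),
  DV   = c(0.0, 5.2, 5.2, 0.0, 3.1)
)

# Flag every record that belongs to a duplicated ID/TIME/DV combination,
# including the first occurrence (hence the second call with fromLast = TRUE).
key <- obs[c("ID", "TIME", "DV")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
dup_rows <- obs[is_dup, ]

# Count how many records share each duplicated combination.
dup_summary <- aggregate(
  list(N = rep(1L, nrow(dup_rows))),
  by = dup_rows[c("ID", "TIME", "DV")],
  FUN = sum
)
print(dup_summary)
```

Whether such records are true errors or legitimate replicate measurements is a judgment call for the reviewer, which is why a summary table is more useful than automatic removal.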


Getting Started


The app is freely available online on shinyapps.io, and its source code is hosted in a public repository on GitHub.


Local Installation

If you wish to run a local version of the app, run the following line in your R IDE:

shiny::runGitHub("DotanOr/DataCheQC")

Alternatively, you can download the code as a .zip archive and extract the files, then run the app using:

shiny::runApp("path/to/directory")

Note:
As the app has several dependencies, running the code locally may not work right away depending on your R environment.

If you have issues with conflicting package versions, run the following to try and resolve those discrepancies:

source("https://github.com/DotanOr/DataCheQC/blob/main/www/load_libs.R?raw=TRUE")