Dataset and Specification Format

The Generalized Dataset Format

To make full use of the app's functionality and to process many types of data from different sources, a standardized dataset structure is assumed. Schmidt et al. (2014) proposed the “Generalized Dataset” as a solution for a pharmacometrics dataset format which is compound- and indication-independent, not specific to a particular type of pharmacometrics analysis, and not tied to a specific nonlinear mixed-effect (NLME) software.

The format can handle various types of data, such as demographics, pharmacokinetics (PK), and pharmacodynamics (PD), and requires minimal manipulation of Clinical Data Interchange Standards Consortium (CDISC) data to compile a resulting dataset. The generalized dataset can be seamlessly converted to an NLME dataset suitable for analysis in NONMEM and Monolix using available tools.

This format was chosen because of the high success rate and efficiency achieved when working with the generalized dataset across different drug development programs and organizations. Requirements for the datasets to be used with the DataCheQC app are detailed in the table below:

Structure of the Generalized Dataset Format
Name	Description	Type
USUBJID	Unique subject identifier	String
COMPOUND	Name of the investigational compound	String
TRTNAME	Name of actual treatment given to subject	String
TIMEUNIT	Unit of all numeric time in the dataset	String
NT	Nominal time of event relative to the first dose administration	Numeric
TIME	Actual time of event relative to the first dose administration	Numeric
TYPENAME	Unique type of event (e.g., dose, PK, PD, continuous covariate, categorical covariate, adverse event, concomitant medication)	String
NAME	Unique short name of event	String
VALUE	Value of event defined by NAME	Numeric
VALUETXT	Text version of value (if applicable)	String
UNIT	Unit of the value reported in the VALUE column	String
ROUTE	Route of administration	String
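
The column requirements in the table above can be sketched as a small check. This is a minimal illustration, not the DataCheQC implementation: the function name check_dataset, the choice of CSV as the input format, and the rule that blank cells count as missing values are all assumptions made for the example.

```python
import csv
import io

# Column names and types copied from the table above.
GENERAL_COLUMNS = {
    "USUBJID": str, "COMPOUND": str, "TRTNAME": str, "TIMEUNIT": str,
    "NT": float, "TIME": float, "TYPENAME": str, "NAME": str,
    "VALUE": float, "VALUETXT": str, "UNIT": str, "ROUTE": str,
}


def check_dataset(csv_text):
    """Return a list of human-readable problems found in CSV text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in GENERAL_COLUMNS if c not in header]
    numeric = [c for c, t in GENERAL_COLUMNS.items() if t is float and c in header]
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in numeric:
            cell = (row[column] or "").strip()
            if cell == "":  # assumption: a blank cell is a missing value, not an error
                continue
            try:
                float(cell)
            except ValueError:
                problems.append(f"row {row_number}: {column}={cell!r} is not numeric")
    return problems


# Invented sample data, for illustration only.
sample = (
    "USUBJID,COMPOUND,TRTNAME,TIMEUNIT,NT,TIME,TYPENAME,NAME,VALUE,VALUETXT,UNIT,ROUTE\n"
    "S1-001,DrugX,DrugX 10 mg,hours,0,0,Dose,DOSE,10,,mg,oral\n"
    "S1-001,DrugX,DrugX 10 mg,hours,1,1.1,PK,CONC,abc,,ng/mL,\n"
)
print(check_dataset(sample))
```

With the sample above, the only problem reported is the non-numeric VALUE in the second data row.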

Click here to download an example dataset in this format

The Specification File Format

The Specification File should be compiled in a Word or Excel document. The file must contain two distinct tables, either in separate Word pages or Excel sheets: the General Table and the Event Table.

The General Table

The General Table lists the expected variables that are not study-specific (i.e., those listed in the table above), along with their label, data type (e.g., numeric, string, date, time), and comments on their derivation. It can also contain optional columns specifying whether the variable is required in the dataset and whether it needs input from the pharmacometrician after each data update.

Structure of the Specification File General Table
Name	Label	Type	Comments	Required (Optional)	Pharmacometrician Input (Optional)
The name of the variable (e.g., USUBJID, STUDY, TIME)	Description of the variable (e.g., Subject ID, Study name, Actual time of assessment)	The type of variable (e.g., numeric, string, date-time)	Comments regarding the variable and its derivation (e.g., the unique subject ID should be composed of the study name plus a serial number, separated by dashes)	Whether this variable’s inclusion in the dataset is required or optional	Whether a pharmacometrician’s input and review is required for the variable after each data update
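
One way to picture the General Table is as a list of records, one per variable. The sketch below is illustrative only: the class VariableSpec, its field names, the defaults for the two optional columns, and the example entries (including marking ROUTE as optional) are assumptions, not part of the specification format itself.

```python
from dataclasses import dataclass


@dataclass
class VariableSpec:
    """One row of the General Table (hypothetical in-memory representation)."""
    name: str
    label: str
    data_type: str
    comments: str = ""
    required: bool = True          # optional column; this default is an assumption
    needs_pmx_input: bool = False  # optional column; this default is an assumption


# Invented example entries, for illustration only.
general_table = [
    VariableSpec("USUBJID", "Unique subject identifier", "string",
                 "Study name plus a serial number, separated by dashes"),
    VariableSpec("TIME", "Actual time of event relative to the first dose",
                 "numeric", needs_pmx_input=True),
    VariableSpec("ROUTE", "Route of administration", "string", required=False),
]


def missing_required(specs, dataset_columns):
    """Names of required variables that are absent from the dataset columns."""
    present = set(dataset_columns)
    return [s.name for s in specs if s.required and s.name not in present]


def review_list(specs):
    """Names of variables flagged for pharmacometrician input after a data update."""
    return [s.name for s in specs if s.needs_pmx_input]


print(missing_required(general_table, ["USUBJID", "NAME"]))
print(review_list(general_table))
```

Keeping the two optional columns as booleans with defaults means a specification that omits them still yields a usable record.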

The Event Table

“Event” refers to a row in the two-dimensional dataset, distinguished by its NAME and VALUE/VALUETXT, which can reflect the dosing, PK concentrations, PD observations, efficacy or safety readouts, baseline or time-dependent covariates, adverse events, co-medications, or any other relevant observation. The Event Table consequently describes the various events in the dataset, accompanied by their description, values, units, and, when applicable, limits of quantification.

Structure of the Specification File Event Table
NAME	VALUE	VALUETXT	UNIT	TYPENAME	LLOQ	ULOQ
Name of the event	Indicates whether the observed event is numeric (i.e., ‘[Num]’), or otherwise defines the numeric mapping of the event’s text values	If the VALUE is not numeric, the categories for the event	Unit of measurement of the event (if applicable)	Type of the event (Dose, PK, PD, covariate, adverse event, etc.)	Lower limit of quantification of the event (if applicable)	Upper limit of quantification of the event (if applicable)
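
The Event Table can be used to compare each dataset row against what the specification expects for that event. The following is a minimal sketch under stated assumptions: the names EventSpec and check_event are hypothetical, only numeric events are handled, and the findings are flags for review rather than errors (a value below the LLOQ may be perfectly valid data).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventSpec:
    """One row of the Event Table, numeric events only (hypothetical representation)."""
    name: str
    typename: str
    unit: str = ""
    lloq: Optional[float] = None  # None when no lower limit applies
    uloq: Optional[float] = None  # None when no upper limit applies


def check_event(row, specs):
    """Compare one dataset row (a dict) with its Event Table entry.

    Assumes row["VALUE"] can be converted to a number.
    """
    spec = specs.get(row["NAME"])
    if spec is None:
        return [f"{row['NAME']}: event not described in the Event Table"]
    findings = []
    if row["UNIT"] != spec.unit:
        findings.append(f"{spec.name}: unit {row['UNIT']!r}, expected {spec.unit!r}")
    value = float(row["VALUE"])
    if spec.lloq is not None and value < spec.lloq:
        findings.append(f"{spec.name}: value {value} is below LLOQ {spec.lloq}")
    if spec.uloq is not None and value > spec.uloq:
        findings.append(f"{spec.name}: value {value} is above ULOQ {spec.uloq}")
    return findings


# Invented example entry and row, for illustration only.
event_table = {"CONC": EventSpec("CONC", "PK", unit="ng/mL", lloq=0.5, uloq=1000.0)}
row = {"NAME": "CONC", "VALUE": "0.2", "UNIT": "ug/mL"}
for finding in check_event(row, event_table):
    print(finding)
```

Returning a list of findings instead of raising on the first one lets a reviewer see every discrepancy for a row at once.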

Click here to download an example of a full specification file containing both tables

Or Dotan
