Cross-item level metadata contains descriptions and expectations about the joint use of two or more data elements for data quality assessments. A distinct table is necessary as there is a 1:n relationship of potential checks to any single data element. This means that several checks are possible for each data element.
VARIABLE_LIST defines the list of variables to be assessed in a given check. The list must be a single string in which the variable names are separated by a pipe character (|).
CHECK_LABEL specifies a sentence that explains in plain language what the check does, which improves the readability of the report. For example, the label for a check may be as follows.
| | VARIABLE_LIST | CHECK_LABEL |
|---|---|---|
| 12 | v00004 \| v00005 | Blood pressure checks |
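As a minimal sketch of how such a pipe-separated string can be split into individual variable names, consider the following. The helper name `parse_variable_list` is hypothetical and not part of dataquieR.

```python
from typing import List


def parse_variable_list(variable_list: str) -> List[str]:
    """Split a pipe-separated VARIABLE_LIST string into variable names."""
    # Strip whitespace around each name and drop empty fragments.
    return [name.strip() for name in variable_list.split("|") if name.strip()]


# The value from the table row above.
print(parse_variable_list("v00004 | v00005"))
```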
CONTRADICTION_TERM sets the term for the contradiction checks. The input must be readable logic in REDCap format. See the dataquieR contradictions function for an explanation of how contradictions are defined.
For instance, we may define contradiction checks for the age, sex and smoking variables as below.
| | CHECK_LABEL | CONTRADICTION_TERM |
|---|---|---|
| 1 | Age follow-up | [AGE_1] < [AGE_0] |
| 2 | Sex follow-up | [SEX_1] <> [SEX_0] |
| 8 | Smokers inconsistency | ([SMOKING_0] = "yes") and ([SMOKE_SHOP_0] = "never") |
CONTRADICTION_TYPE establishes whether the contradiction is logical or empirical.
For example, for the contradictions defined above, we may specify the following contradiction types.
| | CHECK_LABEL | CONTRADICTION_TERM | CONTRADICTION_TYPE |
|---|---|---|---|
| 1 | Age follow-up | [AGE_1] < [AGE_0] | LOGICAL |
| 2 | Sex follow-up | [SEX_1] <> [SEX_0] | LOGICAL |
| 8 | Smokers inconsistency | ([SMOKING_0] = "yes") and ([SMOKE_SHOP_0] = "never") | EMPIRICAL |
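To illustrate what these three rules express, the sketch below hand-translates each REDCap-style term into a Python predicate and applies it to a single record. This is only an illustration: dataquieR parses the REDCap syntax itself, and the names `RULES` and `find_contradictions` as well as the sample record are made up for this example.

```python
# Each entry mirrors one row of the table above: label, type, and the
# term hand-translated into a Python predicate (an assumption for
# illustration, not how dataquieR evaluates the term).
RULES = [
    ("Age follow-up", "LOGICAL", lambda r: r["AGE_1"] < r["AGE_0"]),
    ("Sex follow-up", "LOGICAL", lambda r: r["SEX_1"] != r["SEX_0"]),
    (
        "Smokers inconsistency",
        "EMPIRICAL",
        lambda r: r["SMOKING_0"] == "yes" and r["SMOKE_SHOP_0"] == "never",
    ),
]


def find_contradictions(record):
    """Return (label, type) for every rule whose term is true for the record."""
    return [(label, kind) for label, kind, term in RULES if term(record)]


# Hypothetical record: the follow-up age is lower than the baseline age.
record = {
    "AGE_0": 52, "AGE_1": 50,
    "SEX_0": "f", "SEX_1": "f",
    "SMOKING_0": "no", "SMOKE_SHOP_0": "never",
}
print(find_contradictions(record))
```

A term that evaluates to true marks a contradiction, so a record without contradictions yields an empty list.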
MULTIVARIATE_OUTLIER_CHECKTYPE sets the type of check for multivariate assessments of outliers.
| | VARIABLE_LIST | CHECK_LABEL | MULTIVARIATE_OUTLIER_CHECKTYPE |
|---|---|---|---|
| 12 | v00004 \| v00005 | Blood pressure checks | Hubert |
N_RULES specifies the number of rules that must be violated for an observation to be flagged as an outlier. It applies to all potential assessment rules for multivariate outliers.
| | CHECK_LABEL | MULTIVARIATE_OUTLIER_CHECKTYPE | N_RULES |
|---|---|---|---|
| 12 | Blood pressure checks | Hubert | 1 |
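The counting logic behind N_RULES can be sketched as follows, under the assumption that "must be violated" means "at least N_RULES rules are violated". The function name and the rule name `other_rule` are hypothetical; only `Hubert` appears in the metadata example.

```python
def is_multivariate_outlier(rule_flags, n_rules):
    """Flag an observation when at least n_rules rules are violated.

    rule_flags maps a rule name to True if that rule marks the
    observation as outlying.
    """
    violated = sum(1 for flagged in rule_flags.values() if flagged)
    return violated >= n_rules


# One observation, assessed with two rules; only "Hubert" flags it.
flags = {"Hubert": True, "other_rule": False}
print(is_multivariate_outlier(flags, n_rules=1))
print(is_multivariate_outlier(flags, n_rules=2))
```

With a higher N_RULES, an observation is flagged only when several rules agree, which makes the check more conservative.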
ASSOCIATION_RANGE specifies the allowable range of an association. The inclusion of the endpoints follows standard mathematical notation, using round brackets for open intervals and square brackets for closed intervals. Values must be separated by a semicolon.
The metadata excerpt below shows an example of a possible interval for an association.
| | CHECK_LABEL | ASSOCIATION_RANGE |
|---|---|---|
| 12 | Blood pressure checks | (0.7;) |
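The interval notation can be made concrete with a small parser. The sketch below assumes that an omitted endpoint, as in `(0.7;)`, means the interval is unbounded on that side; the example suggests this, but the text does not state it. The function names `parse_range` and `in_range` are hypothetical.

```python
import math
import re

# Opening bracket, lower value, semicolon, upper value, closing bracket.
_INTERVAL = re.compile(r"\s*([\[(])\s*([^;]*?)\s*;\s*([^;]*?)\s*([\])])\s*")


def parse_range(text):
    """Parse an interval string into (low, high, low_closed, high_closed)."""
    match = _INTERVAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid interval: {text!r}")
    left, low, high, right = match.groups()
    # Assumption: a missing endpoint means "unbounded on that side".
    low_value = float(low) if low else -math.inf
    high_value = float(high) if high else math.inf
    return low_value, high_value, left == "[", right == "]"


def in_range(value, text):
    """Check whether value lies inside the interval described by text."""
    low, high, low_closed, high_closed = parse_range(text)
    above = value >= low if low_closed else value > low
    below = value <= high if high_closed else value < high
    return above and below


print(in_range(0.85, "(0.7;)"))  # inside the open interval
print(in_range(0.7, "(0.7;)"))   # the open endpoint itself is excluded
```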
ASSOCIATION_METRIC sets the metric underlying the association in ASSOCIATION_RANGE. The input is a string that specifies the analysis algorithm to be used.
For instance, in the example below, Pearson association is specified.
| | CHECK_LABEL | ASSOCIATION_RANGE | ASSOCIATION_METRIC |
|---|---|---|---|
| 12 | Blood pressure checks | (0.7;) | Pearson |
ASSOCIATION_DIRECTION sets the allowable direction of an association. The input is a string that can be either “positive” or “negative”.
In the following example, a positive association is expected for the blood pressure variables.
| | CHECK_LABEL | ASSOCIATION_METRIC | ASSOCIATION_DIRECTION |
|---|---|---|---|
| 12 | Blood pressure checks | Pearson | positive |
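A rough sketch of how a Pearson association and its expected direction might be checked is given below. The function names and the measurement values are invented for illustration, and treating a coefficient of exactly zero as matching neither direction is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def direction_ok(coefficient, expected):
    """Compare the sign of the coefficient with the expected direction."""
    if expected == "positive":
        return coefficient > 0
    if expected == "negative":
        return coefficient < 0
    raise ValueError(f"unknown direction: {expected!r}")


# Hypothetical readings for the two blood pressure variables.
v00004 = [118.0, 125.0, 131.0, 140.0, 152.0]
v00005 = [120.0, 123.0, 134.0, 138.0, 155.0]
r = pearson(v00004, v00005)
print(direction_ok(r, "positive"))
```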
ASSOCIATION_FORM sets the allowable form of an association. The string specifies the form based on a selected list.
In the metadata excerpt below, a linear association form is expected for the blood pressure accuracy tests.
| | CHECK_LABEL | ASSOCIATION_METRIC | ASSOCIATION_FORM |
|---|---|---|---|
| 12 | Blood pressure checks | Pearson | linear |
REL_VAL specifies the type of reliability or validity analysis. The string specifies the analysis algorithm to be used, and can be either “inter-class” or “intra-class”.
In the following example, an inter-class reliability is defined for the blood pressure variables.
| | CHECK_LABEL | REL_VAL |
|---|---|---|
| 12 | Blood pressure checks | inter_class |
Defines the measurement variable to be used as a known gold standard. Only one variable can be defined as the gold standard.
Defines the pre-processing steps that can be applied before checking the contradiction rules. The following possible options can be specified:
- LABEL: the value levels will be replaced by the value labels;
- MISSING_NA: missing codes will be replaced by NAs;
- MISSING_LABEL: missing codes will be replaced by their labels;
- MISSING_INTERPRET: missing codes will be replaced by the corresponding AAPOR codes;
- LIMITS: hard limits violations will be replaced by NAs.
More than one option can be specified using the pipe symbol (|), e.g. LABEL | LIMITS. However, the three MISSING_ options are mutually exclusive.
If the column is not present in the metadata or it is empty, the default is to use the three options: LABEL (if VALUE_LABELS are specified in the item level metadata), MISSING_NA, and LIMITS.
The user can disable all options by writing only the pipe symbol | in this column.
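The rules for this column (pipe-separated options, mutually exclusive MISSING_ options, defaults for an absent or empty cell, and a lone pipe to disable everything) can be summarised in a short sketch. The function `parse_preparation` and its `has_value_labels` parameter are hypothetical and only mirror the description above.

```python
MISSING_OPTIONS = {"MISSING_NA", "MISSING_LABEL", "MISSING_INTERPRET"}
ALLOWED_OPTIONS = {"LABEL", "LIMITS"} | MISSING_OPTIONS


def parse_preparation(cell, has_value_labels=True):
    """Resolve the content of the column into the list of steps to apply.

    cell is None when the column is absent from the metadata.
    """
    if cell is None or cell.strip() == "":
        # Absent or empty: use the defaults; LABEL only applies when
        # VALUE_LABELS are specified in the item level metadata.
        defaults = ["MISSING_NA", "LIMITS"]
        return (["LABEL"] if has_value_labels else []) + defaults
    if cell.strip() == "|":
        # A lone pipe symbol disables all options.
        return []
    options = [part.strip() for part in cell.split("|") if part.strip()]
    unknown = [option for option in options if option not in ALLOWED_OPTIONS]
    if unknown:
        raise ValueError(f"unknown option(s): {unknown}")
    if len(MISSING_OPTIONS.intersection(options)) > 1:
        raise ValueError("the MISSING_ options are mutually exclusive")
    return options


print(parse_preparation("LABEL | LIMITS"))
print(parse_preparation("|"))
print(parse_preparation(None))
```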