Definition

Information in a data collection that is missing due to a specified reason.

Explanation

This indicator provides an overview of the frequency of coded categories for missing data, such as missing by design, refusal, technical error, or met exclusion criteria. Depending on how missing values are coded, implementations may provide insight into why data is missing.

Example

In total, 1000 subjects participated in a health study consisting of several examinations. One of them is a magnetic resonance imaging (MRI) substudy. Participation and nonparticipation in the MRI substudy are coded as follows:

Code	Status	n	Percent
1	examination conducted	640	64%
11	refusal	200	20%
12	met exclusion criterion	100	10%
13	no show	20	2%
14	not examined due to technical reason	10	1%
15	no examination data possible	30	3%

The fourth column gives each category's share of all 1000 subjects. Codes 11 to 15 are the categories of missing due to a specified reason; together they account for 36% of subjects.
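
As a minimal sketch of how such a table could be derived, the counts from the example can be tabulated in Python. The names `counts`, `NON_MISSING_CODES` and `missing_reason_table` are hypothetical, and treating code 1 as the only non-missing code is an assumption made for this illustration; it is not a reference implementation of the indicator.

```python
# Counts per status code, taken from the example table above.
counts = {1: 640, 11: 200, 12: 100, 13: 20, 14: 10, 15: 30}

# Assumption: code 1 ("examination conducted") is the only non-missing code.
NON_MISSING_CODES = {1}


def missing_reason_table(counts, non_missing=NON_MISSING_CODES):
    """Return {code: percent of all subjects} for the missing codes only."""
    total = sum(counts.values())  # denominator: all subjects, eligible or not
    return {
        code: 100 * n / total
        for code, n in counts.items()
        if code not in non_missing
    }


table = missing_reason_table(counts)
for code, pct in table.items():
    print(f"code {code}: {pct:.0f}%")
```

The denominator is deliberately the full sample, since this indicator does not adjust for eligibility.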

Note that eligibility is not taken into account. Computing the DQ2003 refusal rate would give a different result because ineligible individuals do not enter its denominator. The refusal rate would therefore be 22% (200 refusals among the 900 subjects who did not meet an exclusion criterion), while the corresponding percentage with this indicator is 20%.
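
The difference between the two denominators can be checked with a short calculation. This is a sketch only: the variable names are hypothetical, and it assumes that the ineligible individuals are exactly the 100 subjects coded 12 (met exclusion criterion), which is consistent with the 22% stated above but is not a full definition of the refusal rate.

```python
# Figures from the example table.
n_total = 1000    # all subjects in the study
n_refusal = 200   # code 11, refusal
n_excluded = 100  # code 12, met exclusion criterion (assumed ineligible)

# This indicator: refusals as a share of all subjects.
indicator_pct = 100 * n_refusal / n_total

# Refusal rate: ineligible subjects are removed from the denominator.
refusal_rate_pct = 100 * n_refusal / (n_total - n_excluded)

print(f"missing due to refusal: {indicator_pct:.0f}%")
print(f"refusal rate: {refusal_rate_pct:.0f}%")
```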

Guidance

Obtaining an overview of the counts and percentages of missing observations provides important insight into the potential reasons underlying missing values. Context knowledge about the missing categories is particularly relevant for making inferences about potential missingness mechanisms (Schafer & Graham 2002).

Interpretation

The higher the number or percentage of occurrences, the lower the data quality.

Implementations

Literature

Stausberg J, Nasseh D, Nonnemacher M. Measuring data quality: A review of the literature between 2005 and 2013. Stud Health Technol Inform. 2015;210:712-716.
Schafer JL, Graham JW. Missing data: our view of the state of the art. Psychol Methods. 2002;7(2):147-177.

Indicator “Missing due to specified reason”