The functions com_qualified_item_missingness and com_qualified_segment_missingness compute the Non-response rate and the Refusal rate. Both are Qualified missingness indicators in the Completeness dimension.

Missingness rates can be calculated for either items or segments of the data, provided that missing codes and their interpretations are given in the metadata. In this example, the column MISSING_LIST_TABLE in the item-level metadata contains the name of another table that lists the missing codes for the item or the segment:

VAR_NAMES MISSING_LIST_TABLE
id NA
exdate NA
sex NA
age NA
obs_bp missing_table
dev_bp missing_table
sbp1 NA
sbp2 NA
dbp1 NA
dbp2 NA
obs_soma missing_table
height missing_table
dev_length missing_table
weight missing_table
dev_weight missing_table
waist missing_table
obs_int missing_table
school missing_table
family missing_table
smoking missing_table
stroke missing_table
myocard missing_table
diab_known missing_table
diab_age missing_table
contraception missing_table
income missing_table
hdl missing_table
ldl missing_table
cholesterol missing_table
seg_part_intro segment_missing_table
seg_part_somatometry segment_missing_table
seg_part_interview segment_missing_table
seg_part_laboratory segment_missing_table
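
To see at a glance which items share a missing-code table, the assignment can be grouped by table name. The following is a minimal sketch in base R on a hypothetical five-row excerpt of the metadata above (item_meta is an illustrative name, not a dataquieR object), assuming that the NA entries are true missing values rather than the string "NA":

```r
# Hypothetical excerpt of the item-level metadata (rows copied from the table above)
item_meta <- data.frame(
  VAR_NAMES          = c("id", "sbp1", "obs_bp", "height", "seg_part_intro"),
  MISSING_LIST_TABLE = c(NA, NA, "missing_table", "missing_table",
                         "segment_missing_table"),
  stringsAsFactors   = FALSE
)

# Items without an entry have no missing-code table assigned
has_table <- !is.na(item_meta$MISSING_LIST_TABLE)

# Group the remaining items by the table that holds their missing codes
items_by_table <- split(item_meta$VAR_NAMES[has_table],
                        item_meta$MISSING_LIST_TABLE[has_table])
print(items_by_table)
```

The same grouping should work on the full item-level metadata once it has been loaded, as long as the column names match those shown above.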


The missing tables contain the following information:

CODE_VALUE  CODE_LABEL                          CODE_INTERPRET  CODE_CLASS
99800       JUMP - other reason                 NE              JUMP
99801       JUMP - not applicable               NE              JUMP
99802       JUMP - design change                O               JUMP
99900       Missing - other reason              O               MISSING
99901       Missing - refusal                   R               MISSING
99902       Missing - not assessable            NC              MISSING
99903       Missing - technical problem         O               MISSING
99904       Missing - not available (material)  O               MISSING
99905       Missing - not usable (material)     O               MISSING
99906       Missing - reason unknown            UO              MISSING
99907       Missing - optional value            NE              MISSING
99908       Deleted - other reason              O               MISSING
99909       Deleted - contradiction             O               MISSING
99910       Deleted - value outside limits      O               MISSING
99912       Value above detection limit         O               MISSING
99913       Value below detection limit         O               MISSING
99914       Data management ongoing             O               MISSING
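
The role of CODE_INTERPRET can be illustrated with a small lookup. The sketch below uses base R only and does not reproduce the internal computation of dataquieR. The vector income is invented for illustration, and missing_codes is a hypothetical excerpt of the table above:

```r
# Hypothetical excerpt of the missing table (codes and interpretations copied from above)
missing_codes <- data.frame(
  CODE_VALUE       = c(99800, 99900, 99901, 99902, 99906),
  CODE_INTERPRET   = c("NE", "O", "R", "NC", "UO"),
  stringsAsFactors = FALSE
)

# Invented values for one item; the five-digit entries are missing codes
income <- c(2500, 99901, 3100, 99900, 99901, 1800, 99906, 99800)

# Look up the interpretation of each value; real measurements yield NA
interpretation <- missing_codes$CODE_INTERPRET[match(income, missing_codes$CODE_VALUE)]

# Count how often each interpretation occurs
code_counts <- table(interpretation, useNA = "ifany")
print(code_counts)

# Number of values coded as refusal (interpretation "R")
n_refusals <- sum(interpretation == "R", na.rm = TRUE)
print(n_refusals)
```

Counts of this kind, one per interpretation class, are the raw material for rates such as the Refusal rate. How the indicator functions combine them into a denominator is not shown here.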


Item level

We can use com_qualified_item_missingness to calculate missingness rates per item as follows:

# Load dataquieR
library(dataquieR)

# Load data
sd1 <- prep_get_data_frame("ship")

# Load metadata
file_name <- system.file("extdata", "ship_meta_v2.xlsx", package = "dataquieR")
prep_load_workbook_like_file(file_name)
meta_data_item <- prep_get_data_frame("item_level") # item_level is a sheet in ship_meta_v2.xlsx

# Apply indicator function
qual_miss <- com_qualified_item_missingness(
  study_data = sd1, 
  meta_data  = meta_data_item, 
  label_col  = "LABEL"
)

An overview is given in the output:

qual_miss$SummaryData
Variables Non-response rate (Percentage (0 to 100)) Refusal rate (Percentage (0 to 100))
OBS_SOMA_0 0.09% 0%
BODY_HEIGHT_0 0% 0%
DEV_HEIGHT_0 0.09% 0%
BODY_WEIGHT_0 0% 0%
DEV_WEIGHT_0 0.09% 0%
WAIST_CIRC_0 0% 0%
OBS_INT_0 0.05% 0%
SCHOOL_GRAD_0 5.25% 0%
RELATION_STATUS_0 3.11% 0%
SMOKING_STATUS_0 3.16% 0%
STROKE_YN_0 3.25% 0.14%
MYOCARD_YN_0 3.44% 0.32%
DIABETES_KNOWN_0 0.32% 0%
DIAB_AGE_ONSET_0 3.89% 0%
CONTRACEPTIVA_EVER_0 0.45% 0%
HOUSE_INCOME_MONTH_0 5.39% 4.64%
CHOLES_HDL_0 0% 0%
CHOLES_LDL_0 0% 0%
CHOLES_ALL_0 0% 0%
SEG_PART_INTRO 0% 0%
SEG_PART_SOMATOMETRY 0.19% 0.05%
SEG_PART_INTERVIEW 100% 0.32%
SEG_PART_LABORATORY 0.79% 0.05%
OBS_BP_0 0.14% 0%
DEV_BP_0 0.14% 0%
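
A common follow-up is to list only the items with a non-zero refusal rate. The sketch below works on a hypothetical excerpt of the overview (four rows copied from above) rather than on qual_miss$SummaryData itself. The short column names and the storage of rates as percent strings are assumptions made for this illustration and may differ from the actual output:

```r
# A few rows copied from the overview above; column names are shortened and
# storing the rates as percent strings is an assumption for this illustration
summary_excerpt <- data.frame(
  Variables    = c("STROKE_YN_0", "MYOCARD_YN_0", "DIABETES_KNOWN_0",
                   "HOUSE_INCOME_MONTH_0"),
  non_response = c("3.25%", "3.44%", "0.32%", "5.39%"),
  refusal      = c("0.14%", "0.32%", "0%", "4.64%"),
  stringsAsFactors = FALSE
)

# Convert percent strings such as "4.64%" to numeric values
to_number <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))
summary_excerpt$refusal_num <- to_number(summary_excerpt$refusal)

# Keep items with a non-zero refusal rate, highest rate first
with_refusals <- summary_excerpt[summary_excerpt$refusal_num > 0, ]
with_refusals <- with_refusals[order(-with_refusals$refusal_num), ]
print(with_refusals[, c("Variables", "refusal")])
```

In this excerpt, HOUSE_INCOME_MONTH_0 stands out: most of its non-response (4.64 of 5.39 percentage points) is coded as refusal.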


Segment level

To calculate missingness rates per segment, we use com_qualified_segment_missingness:

meta_data_segment <- prep_get_data_frame("segment_level") # segment_level is a sheet in ship_meta_v2.xlsx

# Call the indicator function
seg_miss_qual <- com_qualified_segment_missingness(
    study_data = sd1, 
    meta_data = meta_data_item, 
    meta_data_segment = meta_data_segment, 
    label_col = "LONG_LABEL"
)

The output contains the elements SegmentTable and SegmentData. SegmentTable is the abbreviated output for reporting, while SegmentData shows the missingness rates:

seg_miss_qual$SegmentData
Segment Non-response rate (Percentage (0 to 100)) Refusal rate (Percentage (0 to 100))
INTRO 0% 0%
SOMATOMETRY 0.19% 0.05%
INTERVIEW 0.51% 0.32%
LABORATORY 0.79% 0.05%

