The functions com_qualified_item_missingness and com_qualified_segment_missingness compute the Non-response rate and the Refusal rate. Both are Qualified missingness indicators in the Completeness dimension.
Missingness rates can be calculated for either items or segments of the data, provided that missing codes and their interpretations are given in the metadata. In this example, the column MISSING_LIST_TABLE in the item-level metadata contains the name of another table that lists the missing codes, either for the item or for the segment:
VAR_NAMES | MISSING_LIST_TABLE |
---|---|
id | NA |
exdate | NA |
sex | NA |
age | NA |
obs_bp | missing_table |
dev_bp | missing_table |
sbp1 | NA |
sbp2 | NA |
dbp1 | NA |
dbp2 | NA |
obs_soma | missing_table |
height | missing_table |
dev_length | missing_table |
weight | missing_table |
dev_weight | missing_table |
waist | missing_table |
obs_int | missing_table |
school | missing_table |
family | missing_table |
smoking | missing_table |
stroke | missing_table |
myocard | missing_table |
diab_known | missing_table |
diab_age | missing_table |
contraception | missing_table |
income | missing_table |
hdl | missing_table |
ldl | missing_table |
cholesterol | missing_table |
seg_part_intro | segment_missing_table |
seg_part_somatometry | segment_missing_table |
seg_part_interview | segment_missing_table |
seg_part_laboratory | segment_missing_table |
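To see at a glance which items share a missing-code table, the MISSING_LIST_TABLE column can be tabulated. The following is a minimal base R sketch on a hand-built excerpt of the table above. The data frame item_meta_excerpt is a hypothetical stand-in for the item-level metadata, not an object provided by dataquieR, and it assumes that the NA entries are real missing values rather than the string "NA".

```r
# Hypothetical excerpt of the item-level metadata shown above
item_meta_excerpt <- data.frame(
  VAR_NAMES = c("id", "sbp1", "obs_bp", "height", "seg_part_intro"),
  MISSING_LIST_TABLE = c(NA, NA, "missing_table", "missing_table",
                         "segment_missing_table"),
  stringsAsFactors = FALSE
)

# Count items per referenced missing-code table; NA means no table assigned
tab_counts <- table(item_meta_excerpt$MISSING_LIST_TABLE, useNA = "ifany")
print(tab_counts)

# Names of the items that have a missing-code table assigned
items_with_codes <- item_meta_excerpt$VAR_NAMES[
  !is.na(item_meta_excerpt$MISSING_LIST_TABLE)
]
print(items_with_codes)
```

Items without an assigned table, such as id or sbp1 in this excerpt, do not appear in the summary output of the item-level indicator function further down.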
The missing tables contain the following information:
CODE_VALUE | CODE_LABEL | CODE_INTERPRET | CODE_CLASS |
---|---|---|---|
99800 | JUMP - other reason | NE | JUMP |
99801 | JUMP - not applicable | NE | JUMP |
99802 | JUMP - design change | O | JUMP |
99900 | Missing - other reason | O | MISSING |
99901 | Missing - refusal | R | MISSING |
99902 | Missing - not assessable | NC | MISSING |
99903 | Missing - technical problem | O | MISSING |
99904 | Missing - not available (material) | O | MISSING |
99905 | Missing - not usable (material) | O | MISSING |
99906 | Missing - reason unknown | UO | MISSING |
99907 | Missing - optional value | NE | MISSING |
99908 | Deleted - other reason | O | MISSING |
99909 | Deleted - contradiction | O | MISSING |
99910 | Deleted - value outside limits | O | MISSING |
99912 | Value above detection limit | O | MISSING |
99913 | Value below detection limit | O | MISSING |
99914 | Data management ongoing | O | MISSING |
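To illustrate how the CODE_INTERPRET column can feed into a rate, here is a minimal base R sketch on made-up data. It rests on two assumptions that are not taken from the dataquieR documentation: observations interpreted as NE are excluded from the denominator, and the non-response rate counts all codes interpreted as R, NC, O, or UO. The vector x and the label "I" for measured values are hypothetical; the exact definitions used by the package may differ.

```r
# Interpretation per missing code, copied from the table above (excerpt)
code_interpret <- c(
  "99801" = "NE",  # JUMP - not applicable
  "99900" = "O",   # Missing - other reason
  "99901" = "R",   # Missing - refusal
  "99902" = "NC",  # Missing - not assessable
  "99906" = "UO"   # Missing - reason unknown
)

# Hypothetical item with 20 observations: 12 measurements, 8 missing codes
x <- c(rep(170L, 12), 99901L, 99901L, 99902L, 99900L, rep(99801L, 4))

# Look up the interpretation of each value; measured values have no code
interpret <- unname(code_interpret[as.character(x)])
interpret[is.na(interpret)] <- "I"  # "I" is an assumed label for valid values

# ASSUMPTION: NE observations are dropped from the denominator
eligible <- interpret != "NE"

# ASSUMPTION: non-response = R, NC, O or UO among eligible observations
non_response_rate <-
  100 * sum(interpret %in% c("R", "NC", "O", "UO")) / sum(eligible)
refusal_rate <- 100 * sum(interpret == "R") / sum(eligible)

print(non_response_rate)  # 25
print(refusal_rate)       # 12.5
```

Under these assumptions the refusal rate can never exceed the non-response rate, because refusals are one of the categories counted as non-response.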
We can use com_qualified_item_missingness to calculate missingness rates per item as follows:
# Load dataquieR
library(dataquieR)
# Load data
sd1 <- prep_get_data_frame("ship")
# Load metadata
file_name <- system.file("extdata", "ship_meta_v2.xlsx", package = "dataquieR")
prep_load_workbook_like_file(file_name)
meta_data_item <- prep_get_data_frame("item_level") # item_level is a sheet in ship_meta_v2.xlsx
# Apply indicator function
qual_miss <- com_qualified_item_missingness(
  study_data = sd1,
  meta_data = meta_data_item,
  label_col = "LABEL"
)
An overview is given in the output:
qual_miss$SummaryData
Variables | Non-response rate (Percentage (0 to 100)) | Refusal rate (Percentage (0 to 100)) |
---|---|---|
OBS_SOMA_0 | 0.09% | 0% |
BODY_HEIGHT_0 | 0% | 0% |
DEV_HEIGHT_0 | 0.09% | 0% |
BODY_WEIGHT_0 | 0% | 0% |
DEV_WEIGHT_0 | 0.09% | 0% |
WAIST_CIRC_0 | 0% | 0% |
OBS_INT_0 | 0.05% | 0% |
SCHOOL_GRAD_0 | 5.25% | 0% |
RELATION_STATUS_0 | 3.11% | 0% |
SMOKING_STATUS_0 | 3.16% | 0% |
STROKE_YN_0 | 3.25% | 0.14% |
MYOCARD_YN_0 | 3.44% | 0.32% |
DIABETES_KNOWN_0 | 0.32% | 0% |
DIAB_AGE_ONSET_0 | 3.89% | 0% |
CONTRACEPTIVA_EVER_0 | 0.45% | 0% |
HOUSE_INCOME_MONTH_0 | 5.39% | 4.64% |
CHOLES_HDL_0 | 0% | 0% |
CHOLES_LDL_0 | 0% | 0% |
CHOLES_ALL_0 | 0% | 0% |
SEG_PART_INTRO | 0% | 0% |
SEG_PART_SOMATOMETRY | 0.19% | 0.05% |
SEG_PART_INTERVIEW | 100% | 0.32% |
SEG_PART_LABORATORY | 0.79% | 0.05% |
OBS_BP_0 | 0.14% | 0% |
DEV_BP_0 | 0.14% | 0% |
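For follow-up work it can help to convert the printed percentages to numbers and rank the items. The sketch below uses base R on a hand-copied excerpt of the summary above. The data frame summary_excerpt, its shortened column names, and the helper pct_to_num are hypothetical, and the sketch assumes the rates are stored as strings such as "5.25%", as they appear in the printed output.

```r
# Excerpt of the summary above, rebuilt by hand; rates are percentage strings
summary_excerpt <- data.frame(
  Variables = c("SCHOOL_GRAD_0", "STROKE_YN_0", "MYOCARD_YN_0",
                "HOUSE_INCOME_MONTH_0", "CHOLES_HDL_0"),
  non_response = c("5.25%", "3.25%", "3.44%", "5.39%", "0%"),
  refusal = c("0%", "0.14%", "0.32%", "4.64%", "0%"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn "5.25%" into the number 5.25
pct_to_num <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))

summary_excerpt$non_response <- pct_to_num(summary_excerpt$non_response)
summary_excerpt$refusal <- pct_to_num(summary_excerpt$refusal)

# Share of the non-response rate that is due to refusals (NA if no non-response)
summary_excerpt$refusal_share <- with(
  summary_excerpt,
  ifelse(non_response > 0, refusal / non_response, NA_real_)
)

# Items sorted by non-response rate, highest first
ranked <- summary_excerpt[order(-summary_excerpt$non_response), ]
print(ranked)
```

In this excerpt, HOUSE_INCOME_MONTH_0 ranks first, and most of its non-response is due to refusals, whereas SCHOOL_GRAD_0 has a similar non-response rate without any refusals.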
To calculate missingness rates per segment, we use com_qualified_segment_missingness:
meta_data_segment <- prep_get_data_frame("segment_level") # load the segment_level sheet from ship_meta_v2.xlsx
# Call the indicator function
seg_miss_qual <- com_qualified_segment_missingness(
  study_data = sd1,
  meta_data = meta_data_item,
  meta_data_segment = meta_data_segment,
  label_col = "LONG_LABEL"
)
The output contains the elements SegmentTable and SegmentData. SegmentTable is the abbreviated output for reporting, while SegmentData shows the missingness rates:
seg_miss_qual$SegmentData
Segment | Non-response rate (Percentage (0 to 100)) | Refusal rate (Percentage (0 to 100)) |
---|---|---|
INTRO | 0% | 0% |
SOMATOMETRY | 0.19% | 0.05% |
INTERVIEW | 0.51% | 0.32% |
LABORATORY | 0.79% | 0.05% |