Description

The function com_qualified_segment_missingness computes the non-response rate and the refusal rate for each segment.

com_qualified_segment_missingness provides indicators for the non-response rate and the refusal rate. Both indicators belong to the qualified missingness domain in the Completeness dimension.

For more details, see the user’s manual and source code.

Usage and arguments

com_qualified_segment_missingness(
  study_data = "study_data",
  meta_data_v2 = "meta_data_v2"
)

The function has the following arguments:

  • study_data: mandatory, the data frame containing the study measurements
  • meta_data_v2: mandatory, the data frame containing the metadata
  • label_col: optional, the column in the metadata data frame containing the labels of all the variables in the study data.
  • item_level: optional, the data frame that contains metadata attributes of study data
  • meta_data: alias for item_level
  • expected_observations: a character vector indicating which observations are expected. There are three options, based on the old PART_VAR concept: ALL (all observations are expected and included); SEGMENT (the column PART_VAR is expected to point to a variable with the values 0 and 1, indicating whether the variable was expected to be observed and is therefore included in the check); or HIERARCHY (a recursive check: if a variable points to such a participation variable in PART_VAR, and that participation variable itself has a PART_VAR entry pointing to another variable, the observation of the initial variable is expected only if both segment variables are 1)
  • meta_data_segment: the data frame containing the metadata for the segment level
  • segment_level: alias for meta_data_segment
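The recursive idea behind the HIERARCHY option can be illustrated with a minimal base R sketch. This is not dataquieR's implementation: the variable names (sbp, PART_EXAM, PART_STUDY) and the helper is_expected are hypothetical and serve only to show how a chain of PART_VAR entries could be followed.

```r
# Toy metadata: each variable may name a participation variable in PART_VAR.
# All names are hypothetical and only illustrate the HIERARCHY option.
part_var <- c(sbp = "PART_EXAM", PART_EXAM = "PART_STUDY", PART_STUDY = NA)

# One observation (row) of toy study data.
row <- c(PART_STUDY = 1, PART_EXAM = 0, sbp = NA)

# Hypothetical helper: follow the PART_VAR chain upwards. The value is
# expected only if every participation variable on the way equals 1.
is_expected <- function(variable, row, part_var) {
  parent <- part_var[[variable]]
  if (is.na(parent)) return(TRUE)                  # no participation variable
  if (!isTRUE(row[[parent]] == 1)) return(FALSE)   # chain broken at this level
  is_expected(parent, row, part_var)               # check the next level up
}

is_expected("sbp", row, part_var)
```

In this toy row the participant took part in the study (PART_STUDY is 1) but not in the examination (PART_EXAM is 0), so the sketch reports that a value for sbp is not expected.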

Example output

To illustrate the output, we use a subset of the SHIP example data and metadata that are bundled with the dataquieR package. See the introductory tutorial for instructions on importing these files into R, as well as details on their structure and contents.

qual_seg <- com_qualified_segment_missingness(
  study_data = "ship",
  meta_data_v2 = "ship_meta_v2"
)

The function generates two outputs: SegmentTable and SegmentData.

Output 1: SegmentTable

This data frame contains, for each segment, information on the missing values based on user-defined value codes. In the following example, the codes are based on the AAPOR definitions (American Association for Public Opinion Research 2016).

The SegmentTable is accessed with qual_seg$SegmentTable.

Segment      NE  O   R  NC  UO  P     I     RR1        NRR1       PCT_com_qum_nonresp  RR2        NRR2       REF1       PCT_com_qum_refusal  N     N2
INTRO        0   0   0  0   0   0     2154  1.0000000  0.0000000  0.0000000            1.0000000  0.0000000  0.0000000  0.0000000            2154  2154
SOMATOMETRY  0   2   1  0   1   0     2150  0.9981430  0.0018570  0.1857010            0.9981430  0.0018570  0.0004643  0.0464253            2154  2154
INTERVIEW    0   4   7  0   0   2143  0     0.9948932  0.0051068  0.5106778            0.9948932  0.0051068  0.0032498  0.3249768            2154  2154
LABORATORY   0   16  1  0   0   0     2137  0.9921077  0.0078923  0.7892293            0.9921077  0.0078923  0.0004643  0.0464253            2154  2154

Output 2: SegmentData

This data frame summarizes the information in SegmentTable and provides only the percentages of the two indicators. It is accessed with qual_seg$SegmentData.

Segment      Non-response rate (Percentage (0 to 100))  Refusal rate (Percentage (0 to 100))
INTRO        0%                                         0%
SOMATOMETRY  0.19%                                      0.05%
INTERVIEW    0.51%                                      0.32%
LABORATORY   0.79%                                      0.05%
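The SegmentData percentages agree with the columns PCT_com_qum_nonresp and PCT_com_qum_refusal of the SegmentTable, rounded to two decimals. The following base R sketch reproduces that relationship from values copied by hand from the SegmentTable. The helper as_percent and the column names nonresponse and refusal are hypothetical, and dataquieR's own formatting code may differ.

```r
# Percentage columns copied by hand from the SegmentTable shown above.
segment_table <- data.frame(
  Segment = c("INTRO", "SOMATOMETRY", "INTERVIEW", "LABORATORY"),
  PCT_com_qum_nonresp = c(0.0000000, 0.1857010, 0.5106778, 0.7892293),
  PCT_com_qum_refusal = c(0.0000000, 0.0464253, 0.3249768, 0.0464253),
  stringsAsFactors = FALSE
)

# Hypothetical formatter: round to two decimals and append a percent sign.
as_percent <- function(x) paste0(round(x, 2), "%")

segment_data <- data.frame(
  Segment = segment_table$Segment,
  nonresponse = as_percent(segment_table$PCT_com_qum_nonresp),
  refusal = as_percent(segment_table$PCT_com_qum_refusal),
  stringsAsFactors = FALSE
)

print(segment_data)
```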

Interpretation

The higher the non-response rate and the refusal rate of a segment, the lower the data quality.

Algorithm of the implementation

  1. The lists of missing codes and labels are selected from the metadata
  2. The number of occurrences of each missing code (e.g., I for complete participation) in each segment is calculated
  3. The non-response rate and the refusal rate are calculated as percentages using the following formulas. The non-response rate is 1 - RR1, where RR1 is the response rate based on participation only, i.e., (I+P)/((I+P+PL) + (R+BO+NC+O) + (UH+UO)). The refusal rate (REF1) includes all who refused at any stage, i.e., (R+BO)/((I+P+PL) + (R+BO+NC+O) + (UH+UO))
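As a worked example of step 3, the following base R sketch applies the two formulas to the counts of the SOMATOMETRY segment from the example SegmentTable. The helper qualified_rates is hypothetical and not part of dataquieR. The categories PL, BO, and UH have no columns in that table, so they are assumed to be 0 here.

```r
# Counts for the SOMATOMETRY segment, copied from the example SegmentTable.
# PL, BO and UH are not shown in that table and are assumed to be 0.
counts <- c(I = 2150, P = 0, PL = 0, R = 1, BO = 0, NC = 0, O = 2, UH = 0, UO = 1)

# Hypothetical helper (not part of dataquieR): applies the formulas of step 3.
qualified_rates <- function(n) {
  denom <- sum(n[c("I", "P", "PL", "R", "BO", "NC", "O", "UH", "UO")])
  rr1 <- (n[["I"]] + n[["P"]]) / denom      # response rate RR1
  ref1 <- (n[["R"]] + n[["BO"]]) / denom    # refusal rate REF1
  c(RR1 = rr1, NRR1 = 1 - rr1, REF1 = ref1)
}

rates <- qualified_rates(counts)

# Express the two indicators as percentages, rounded to two decimals.
round(100 * rates[c("NRR1", "REF1")], 2)
```

With these counts the denominator is 2154, so the non-response rate is 4/2154 (about 0.19%) and the refusal rate is 1/2154 (about 0.05%), matching the SOMATOMETRY row of the example output.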

Concept relations

The American Association for Public Opinion Research (2016). Standard definitions: Final dispositions of case codes and outcome rates for surveys (AAPOR).