The function acc_distributions examines Unexpected location and Unexpected proportion using histograms and displays empirical cumulative distribution functions (ECDF) if a grouping variable is provided.
The following example examines measurements in which a possible influence of the examiners is considered:
# Load dataquieR
library(dataquieR)
# Load data
sd1 <- prep_get_data_frame("ship")
# Load metadata
file_name <- system.file("extdata", "ship_meta_v2.xlsx", package = "dataquieR")
prep_load_workbook_like_file(file_name)
meta_data_item <- prep_get_data_frame("item_level") # item_level is a sheet in ship_meta_v2.xlsx
# Apply indicator function
ECDFSoma <- acc_distributions(
  study_data = sd1,
  meta_data = meta_data_item,
  resp_vars = c("WAIST_CIRC_0", "BODY_HEIGHT_0", "BODY_WEIGHT_0"),
  group_vars = "OBS_SOMA_0",
  label_col = "LABEL"
)
The respective list of plots may be displayed using ECDFSoma$SummaryPlotList (only the first 2 plots are displayed below):
The function acc_margins is also related to these indicators. In addition, it provides descriptive outputs, such as violin and box plots for continuous variables, count plots for categorical data, and density plots for both. The main application of acc_margins is to draw inferences about effects related to process variables, such as examiners, devices, or study centers. The function determines whether measurements are continuous or discrete. Alternatively, this information may be specified in the metadata.
In the example, acc_margins is applied to the variable waist circumference (WAIST_CIRC_0). In this case, dependencies related to the examiners (OBS_SOMA_0) are assessed, while the raw measurements are adjusted for the variables age and sex (AGE_0, SEX_0):
marginal_dists <- acc_margins(
  study_data = sd1,
  meta_data = meta_data_item,
  resp_vars = "WAIST_CIRC_0",
  co_vars = c("AGE_0", "SEX_0"),
  group_vars = "OBS_SOMA_0",
  label_col = "LABEL"
)
A plot is provided to view the results:
marginal_dists$SummaryPlot
According to the statistical test, no examiner's mean waist circumference differed significantly (p < 0.05) from the overall mean.
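The idea of testing for examiner differences after adjusting for age and sex can be illustrated outside dataquieR with a minimal base R sketch. It uses simulated data (all names and values below are hypothetical, not SHIP data) and a nested-model F test. This is a coarser check than comparing each examiner with the overall mean, and it is not necessarily what acc_margins does internally:

```r
# Simulated data only: variable names and effect sizes are hypothetical.
set.seed(1)
n <- 300
sim <- data.frame(
  age      = round(runif(n, 20, 80)),
  sex      = factor(sample(c("f", "m"), n, replace = TRUE)),
  examiner = factor(sample(1:5, n, replace = TRUE))
)
# Waist depends on age and sex but, by construction, not on the examiner
sim$waist <- 60 + 0.3 * sim$age + 8 * (sim$sex == "m") + rnorm(n, sd = 10)

# Models without and with the examiner factor, both adjusted for age and sex
fit_base <- lm(waist ~ age + sex, data = sim)
fit_full <- lm(waist ~ age + sex + examiner, data = sim)

# F test: does adding the examiner improve the fit beyond age and sex?
cmp <- anova(fit_base, fit_full)
print(cmp)
```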
However, an individual examiner's mean can differ from the overall mean. This can be observed in the following example, where the waist circumference measurements of examiner “3” have been increased by 20.
# Increase the measurements of examiner 3 by 20
sd1_example <- dplyr::mutate(
  sd1,
  waist = ifelse(obs_soma == "3", as.numeric(waist) + 20, waist)
)
marginal_dists <- acc_margins(
  study_data = sd1_example,
  meta_data = meta_data_item,
  resp_vars = "WAIST_CIRC_0",
  co_vars = c("AGE_0", "SEX_0"),
  group_vars = "OBS_SOMA_0",
  label_col = "LABEL"
)
marginal_dists$SummaryPlot
The result shows an elevated mean for examiner 3.
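A manipulation like the one above is easy to verify before plotting. The following base R sketch uses a small toy data frame (the values are made up; only the column names waist and obs_soma are borrowed from the example) and tabulates the mean shift per examiner:

```r
# Toy data: values are invented for illustration
toy <- data.frame(
  obs_soma = c("1", "3", "3", "2", "1"),
  waist    = c(80, 90, 100, 85, 95)
)

# Same rule as in the example: add 20 where the examiner is "3"
toy_shifted <- transform(
  toy,
  waist = ifelse(obs_soma == "3", as.numeric(waist) + 20, waist)
)

# Mean difference between shifted and original values per examiner
shift_by_examiner <- tapply(toy_shifted$waist - toy$waist, toy$obs_soma, mean)
print(shift_by_examiner)
```

Applied to the original and the manipulated study data, the same comparison should show a shift for examiner 3 only.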
The study of effects across groups and times is particularly complex. The function acc_loess provides a descriptor related to the indicator Unexpected location. acc_loess may also be used to obtain information related to other indicators in the domain of unexpected distributions.
An example call using waist circumference as the target variable is:
timetrends <- acc_loess(
  study_data = sd1,
  meta_data = meta_data_item,
  resp_vars = "WAIST_CIRC_0",
  co_vars = c("AGE_0", "SEX_0"),
  group_vars = "OBS_SOMA_0",
  time_vars = "EXAM_DT_0",
  label_col = "LABEL"
)
invisible(lapply(timetrends$SummaryPlotList, print))
The graph for this variable indicates no major discrepancies between the examiners over the examination period.
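To make the notion of a smoothed time trend per examiner more concrete, the following minimal sketch fits LOESS curves with base R's loess() on simulated data. Everything in it (variable names, values, the span) is an assumption for illustration; it is not the implementation of acc_loess and omits the covariate adjustment:

```r
# Simulated data only: no real examiner or time effect is built in
set.seed(42)
n <- 400
sim <- data.frame(
  day      = sample(0:364, n, replace = TRUE),
  examiner = factor(sample(1:2, n, replace = TRUE))
)
sim$waist <- 90 + rnorm(n, sd = 8)

# One LOESS fit per examiner
fits <- lapply(split(sim, sim$examiner), function(d) {
  loess(waist ~ day, data = d, span = 0.75)
})

# Evaluate both smoothed curves on a common grid of days
grid <- data.frame(day = seq(30, 330, by = 30))
trend <- sapply(fits, predict, newdata = grid)
print(round(cbind(grid, trend), 1))
```

Curves that stay close to each other over the whole grid correspond to the situation described above, in which the examiners do not drift apart over the examination period.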