The acc_loess function uses local regression (LOESS) to examine how so-called process variables affect the measurements over time (Cleveland et al. 1988). In this way, acc_loess is a descriptor for Unexpected location in the Unexpected distributions domain of the Accuracy dimension. It is also a descriptor for Unexpected association strength, Unexpected association direction, and Unexpected association form in the Unexpected associations domain of the Accuracy dimension.
For more details, see the user’s manual and source code.
    acc_loess(
      resp_vars = NULL,
      group_vars = NULL,
      time_vars = NULL,
      co_vars = NULL,
      min_obs_in_subgroup,
      label_col = NULL,
      study_data = sd1,
      meta_data = md1,
      resolution = 180,
      se_line = list(color = "red", linetype = 2),
      plot_data_time = NULL,
      plot_format = "AUTO"
    )
The function has the following arguments:
NULL for output without grouping.
Levels of group_vars with fewer measurements than defined in min_obs_in_subgroup are excluded.
This implementation makes no use of a threshold. See Interpretation for guidance on the use of this function.
To illustrate the output, we use the example synthetic data and metadata that are bundled with the dataquieR package. See the introductory tutorial for instructions on importing these files into R, as well as details on their structure and contents.
Similar to the approach of the acc_margins function,
we assume that at least one examiner does not adhere to the SOP and may
influence the measurement process.
| v00000 | v00001 | v00002 | v00003 | v00004 | v00005 | v01003 | v01002 | v00103 | v00006 |
|---|---|---|---|---|---|---|---|---|---|
| 3 | LEIIX715 | 0 | 49 | 127 | 77 | 49 | 0 | 40-49 | 3.8 |
| 1 | QHNKM456 | 0 | 47 | 114 | 76 | 47 | 0 | 40-49 | 1.9 |
| 1 | HTAOB589 | 0 | 50 | 114 | 71 | 50 | 0 | 50-59 | 0.8 |
| 5 | HNHFV585 | 0 | 48 | 120 | 65 | 48 | 0 | 40-49 | 3.8 |
| 1 | UTDLS949 | 0 | 56 | 119 | 78 | 56 | 0 | 50-59 | 4.1 |
| 5 | YQFGE692 | 1 | 47 | 133 | 81 | 47 | 1 | 40-49 | 9.5 |
| 1 | AVAEH932 | 0 | 53 | 114 | 78 | 53 | 0 | 50-59 | 5.0 |
| 3 | QDOPT378 | 1 | 48 | 116 | 86 | 48 | 1 | 40-49 | 9.6 |
| 3 | BMOAK786 | 0 | 44 | 115 | 71 | 44 | 0 | 40-49 | 2.0 |
| 5 | ZDKNF462 | 0 | 50 | 116 | 74 | 50 | 0 | 50-59 | 2.4 |
For the acc_loess function, the columns
DATA_TYPE, MISSING_LIST and
HARD_LIMITS in the metadata are relevant.
| | VAR_NAMES | LABEL | MISSING_LIST | DATA_TYPE | HARD_LIMITS |
|---|---|---|---|---|---|
| 9 | v00004 | SBP_0 | 99980 \| 99981 \| 99982 \| 99983 \| 99984 \| 99985 \| 99986 \| 99987 \| 99988 \| 99989 \| 99990 \| 99991 \| 99992 \| 99993 \| 99994 \| 99995 | float | [80;180] |
| 10 | v00005 | DBP_0 | 99980 \| 99981 \| 99982 \| 99983 \| 99984 \| 99985 \| 99986 \| 99987 \| 99988 \| 99989 \| 99990 \| 99991 \| 99992 \| 99993 \| 99994 \| 99995 | float | [50;Inf) |
| 11 | v00006 | GLOBAL_HEALTH_VAS_0 | 99980 \| 99983 \| 99987 \| 99988 \| 99989 \| 99990 \| 99991 \| 99992 \| 99993 \| 99994 \| 99995 | float | [0;10] |
| 14 | v00009 | ARM_CIRC_0 | 99980 \| 99981 \| 99982 \| 99983 \| 99984 \| 99985 \| 99986 \| 99987 \| 99988 \| 99989 \| 99990 \| 99991 \| 99992 \| 99993 \| 99994 \| 99995 | float | [0;Inf) |
| 21 | v00014 | CRP_0 | 99980 \| 99981 \| 99982 \| 99983 \| 99984 \| 99985 \| 99986 \| 99988 \| 99989 \| 99990 \| 99991 \| 99992 \| 99994 \| 99995 | float | [0;Inf) |
| 22 | v00015 | BSG_0 | 99980 \| 99981 \| 99982 \| 99983 \| 99984 \| 99985 \| 99986 \| 99988 \| 99989 \| 99990 \| 99991 \| 99992 \| 99994 \| 99995 | float | [0;100] |
The call of the function is illustrated here:
    loess_1 <- acc_loess(
      resp_vars = "SBP_0",
      group_vars = "USR_BP_0",
      time_vars = "EXAM_DT_0",
      co_vars = c("AGE_0", "SEX_0"),
      min_obs_in_subgroup = 30,
      label_col = LABEL,
      study_data = sd1,
      meta_data = md1,
      plot_format = "BOTH"
    )
The first plot is obtained by calling
loess_1$SummaryPlotList[[1]] and provides panels for each
subject/object. The plot contains LOESS-smoothed curves for each level
of the group_vars. The red dashed lines represent the
confidence interval of a LOESS curve for the whole data.
Output 1:

Output 2:
The second plot combines all levels of group_vars:

The following aspects should be considered when investigating the plots:
Random fluctuation
If the changes in all levels of group_vars appear random, systematic
trends over time are unlikely.
Seasonal trends
If seasonal trends such as sigmoidal curves appear in one or
a few levels of group_vars, they indicate intermittent location
shifts.
Persistent trends
As shown in the example above for “USR_482”, persistent trends in one
or a few levels of group_vars imply a systematic
change in measurements over time. If a fitted curve leaves the
confidence band of the overall LOESS curve (dashed red lines), the shift
is severe.
Discrete processes
If the LOESS curve for one level of group_vars is completely separated
from the curves of all other levels, systematic
differences in measurements that are independent of time are
likely.
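The "persistent trends" check can be sketched numerically with base R. This is only an illustration under stated assumptions, not the way acc_loess computes its plots: the data are simulated, the examiner IDs (`USR_A` to `USR_E`) are hypothetical, and the confidence band is a pointwise 95% band derived from `predict(..., se = TRUE)` on a `stats::loess()` fit to all data.

```r
# Simulated data: five hypothetical examiners, one of whom (USR_E) drifts upward.
set.seed(1)
n_per_group <- 100
groups <- c("USR_A", "USR_B", "USR_C", "USR_D", "USR_E")
dat <- do.call(rbind, lapply(groups, function(g) {
  day <- sort(runif(n_per_group, 1, 365))
  drift <- if (g == "USR_E") 0.05 * day else 0
  data.frame(group = g, day = day,
             value = 120 + drift + rnorm(n_per_group, sd = 5))
}))

# Common time grid on which all curves are evaluated.
grid <- data.frame(day = seq(20, 345, by = 5))

# LOESS curve for the whole data with a pointwise 95% confidence band.
overall <- loess(value ~ day, data = dat)
pred <- predict(overall, newdata = grid, se = TRUE)
crit <- qt(0.975, pred$df)
lower <- pred$fit - crit * pred$se.fit
upper <- pred$fit + crit * pred$se.fit

# For each group: the largest distance by which its curve leaves the band.
excess <- sapply(groups, function(g) {
  fit_g <- loess(value ~ day, data = dat[dat$group == g, ])
  curve_g <- predict(fit_g, newdata = grid)
  max(pmax(lower - curve_g, curve_g - upper, 0), na.rm = TRUE)
})
round(excess, 1)
```

Note that the drifting group also pulls the overall curve towards itself, so the other groups can leave the band as well, just by a smaller margin. The plots remain the primary tool; a number like `excess` only helps to rank levels for closer inspection.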
resp_vars (if defined in the metadata)
resp_vars using co_vars for adjustment.
group_vars statement along with date-values of a time_vars statement.
group_vars.

Applying LOESS usually requires model fitting, i.e., the
smoothness of the model depends on a smoothing parameter (span).
Particularly in the presence of interval-based missing data (USR_181) or
high variability of measurements combined with a low number of
observations in one level of group_vars, the fit to the
data may be distorted. Because our approach handles data without knowledge
of such underlying characteristics, finding the best fit is difficult
when computational cost must be kept low. By default, LOESS in R uses
a span of 0.75, which provides reasonable fits in most cases. The function
above automatically fits the data more closely if the minimum number of
observations in one level of group_vars is higher than
30.
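The effect of the span can be illustrated with base R's `stats::loess()`. The sketch below uses simulated data and an arbitrarily chosen alternative span of 0.3; it does not reproduce the span selection inside acc_loess, which is not described here.

```r
# Simulated series: a smooth periodic signal plus noise (illustrative data only).
set.seed(42)
n <- 200
dat <- data.frame(day = seq_len(n))
dat$value <- 120 + 5 * sin(2 * pi * dat$day / 100) + rnorm(n, sd = 4)

# stats::loess() uses span = 0.75 unless told otherwise.
fit_default <- loess(value ~ day, data = dat)
# A smaller span uses fewer neighbours per local fit and follows the data more closely.
fit_tight <- loess(value ~ day, data = dat, span = 0.3)

# Compare the span, the equivalent number of parameters (model flexibility),
# and the residual standard error of both fits.
data.frame(
  fit  = c("default", "tight"),
  span = c(fit_default$pars$span, fit_tight$pars$span),
  enp  = c(fit_default$enp, fit_tight$enp),
  s    = c(fit_default$s, fit_tight$s)
)
```

A smaller span yields a more flexible curve (larger `enp`), which can track real short-term changes but also chases noise when a level of group_vars has few observations.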