
DQI-3006

Definition

Observed numerical values are uncertain or improbable because they are outside the expected ranges.

Explanation

The identification of uncertain numerical values is methodologically identical to that of inadmissible numerical values, yet the meaning differs. The latter indicator identifies inadmissible values and therefore a definite data quality issue. The former identifies uncertain values and therefore only a potential data quality problem, because a value may be improbable but still plausible.

Identified uncertain values merit further inspection to decide whether a problem exists.

In other concepts, the applied range limits are also described as “soft limits”. In electronic case report forms (eCRFs), such values commonly trigger a warning at data entry.

Example

In a somatometry examination, the interval of admissible values for body weight in adults was defined as [35 kg; 250 kg]. In addition, limits for the expected range were defined as [40 kg; 160 kg]; weights outside this range are rare and merit further inspection.

Applying these limits revealed no inadmissible numerical values but five uncertain numerical values: 165 kg, 187 kg, 177 kg, 203 kg, and 222 kg. Rechecking with the examiners revealed that the three values below 200 kg were correct. However, the two values above 200 kg should have been 103 kg and 122 kg. The values were corrected accordingly.
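The check in this example can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `classify`, the labels it returns, and the three in-range weights mixed into the sample are assumptions made for demonstration. Only the limits and the five flagged values come from the example.

```python
# Limits from the example above, in kg. The names are illustrative.
HARD_LIMITS = (35.0, 250.0)  # interval of admissible values
SOFT_LIMITS = (40.0, 160.0)  # expected range ("soft limits")


def classify(value, hard=HARD_LIMITS, soft=SOFT_LIMITS):
    """Label a single measurement against hard and soft limits."""
    if not hard[0] <= value <= hard[1]:
        return "inadmissible"  # definite data quality issue
    if not soft[0] <= value <= soft[1]:
        return "uncertain"     # potential issue, needs inspection
    return "ok"


# The five flagged weights from the example, mixed with three
# made-up in-range weights (72.5, 95, 58) for illustration.
weights = [72.5, 165, 187, 95, 177, 203, 58, 222]

uncertain = [w for w in weights if classify(w) == "uncertain"]
inadmissible = [w for w in weights if classify(w) == "inadmissible"]

print("uncertain:", uncertain)
print("inadmissible:", inadmissible)
```

Testing the hard limits first means that a value outside the admissible interval is never reported as merely uncertain, which keeps the two indicators separate.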

Guidance

The challenge in identifying uncertain values lies in finding adequate limits for the checks on numerical values. Cut-off values need to take into account empirical knowledge about what can be expected within the specific context in which the study takes place. The limits must further avoid too many “false positive” alarms.

Any violation primarily triggers a process to decide whether an uncertain value is correct or not. Should no such check be possible, an elevated count of uncertain numerical values can be interpreted as a sign of potentially lower data quality.

Interpretation

The higher the number or percentage of uncertain numerical values, the higher the probability of lower data quality.
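The count and percentage referred to here could be computed as in the following sketch. The function name `uncertain_summary`, the sample data, and the choice of denominator (only non-missing observations) are assumptions. A study may define the denominator differently.

```python
def uncertain_summary(values, soft_low, soft_high):
    """Return (count, percentage) of values outside the soft limits.

    Missing values (None) are excluded from the denominator;
    this is an assumption, not a rule stated by the indicator.
    """
    observed = [v for v in values if v is not None]
    n_uncertain = sum(1 for v in observed if not soft_low <= v <= soft_high)
    pct = 100.0 * n_uncertain / len(observed) if observed else 0.0
    return n_uncertain, pct


# Hypothetical body weights in kg, including one missing value.
sample = [72.5, 165, None, 95, 203, 58, 81, 64, 110, 77]
count, pct = uncertain_summary(sample, soft_low=40, soft_high=160)
print(f"{count} uncertain values ({pct:.1f} % of observed values)")
```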

Implementations

Literature