The observed data type does not match the expected data type.
This implementation targets any deviation between the observed and the expected data type of a variable, where data types include numerical, categorical, string, and date-time. The expected data type is provided in a reference metadata file.
The study data file on an eye-background examination contains, among others, the following two variables:
ebg_av “Number of arterial vessels”
ebg_iq “Image quality”
According to the related metadata file, the corresponding data types should be:
ebg_av Count
ebg_iq Categorical
The related check shows that both variables are stored as strings. Either an automated attempt to convert the variables is made, or manual inspection is necessary to identify the underlying reason for the wrong data type.
A wrong data type is of particular concern for the appropriate assessment of correctness-related issues, because the scope of statistically meaningful assessments depends on the data type.
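The automated conversion attempt mentioned above could look roughly like the following sketch. It is not the implementation behind this indicator. The data values, the helper `try_convert`, and the rule that both counts and categorical codes are stored as non-negative integers are assumptions made for illustration only.

```python
import re

# Assumption: counts and categorical codes are both non-negative integers.
INTEGER_PATTERN = re.compile(r"[0-9]+")

# Expected data types as given by the metadata in the example.
EXPECTED_TYPES = {"ebg_av": "count", "ebg_iq": "categorical"}

# Hypothetical study data: every value was read in as a string.
study_data = {
    "ebg_av": ["12", "9", "n/a", "15"],
    "ebg_iq": ["1", "2", "2", "poor"],
}


def try_convert(values, expected_type):
    """Try to convert string values to integers.

    Returns the converted values (None where conversion failed) and the
    positions of the values that need manual inspection.
    """
    if expected_type not in ("count", "categorical"):
        raise ValueError(f"unsupported expected type: {expected_type}")
    converted, failures = [], []
    for position, raw in enumerate(values):
        text = raw.strip()
        if INTEGER_PATTERN.fullmatch(text):
            converted.append(int(text))
        else:
            converted.append(None)
            failures.append(position)
    return converted, failures


results = {
    name: try_convert(study_data[name], expected)
    for name, expected in EXPECTED_TYPES.items()
}

for name, (converted, failures) in results.items():
    print(name, converted, "manual inspection needed at positions", failures)
```

Values that fail the conversion are kept as `None` rather than dropped, so their positions can be traced back to the original records during manual inspection.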
In case of findings related to this indicator, issues should be remedied to ensure an appropriate assessment of subsequent indicators.
The higher the number or percentage of occurrences, the lower the data quality.
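As a minimal sketch of how the number and percentage of occurrences might be computed at the variable level, consider the snippet below. The function `mismatch_summary` and the third variable `ebg_date` are hypothetical and serve only to illustrate the arithmetic.

```python
def mismatch_summary(observed_types, expected_types):
    """Count variables whose observed data type differs from the expected one."""
    mismatched = [
        name
        for name, expected in expected_types.items()
        if observed_types.get(name) != expected
    ]
    total = len(expected_types)
    percentage = 100.0 * len(mismatched) / total if total else 0.0
    return {
        "mismatched": mismatched,
        "count": len(mismatched),
        "percentage": percentage,
    }


# Expected types from the metadata; "ebg_date" is a hypothetical addition.
expected = {"ebg_av": "count", "ebg_iq": "categorical", "ebg_date": "datetime"}

# Observed types: the two example variables were found to be strings.
observed = {"ebg_av": "string", "ebg_iq": "string", "ebg_date": "datetime"}

summary = mismatch_summary(observed, expected)
print(summary["count"], "of", len(expected), "variables mismatch")
print(round(summary["percentage"], 1), "percent")
```

In this illustration two of three variables deviate from the expected data type, so both the count and the percentage point to issues that should be remedied before further indicators are assessed.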