Data quality report
2024-04-02
Study data summary
Number of ... | |
---|---|
observations in study data set | 2154 |
variables in study data set | 33 |
variables with item-level metadata | 33 |
Metadata summary
Number of ... | |
---|---|
variables in item-level metadata | 33 |
segments | 4 |
Scope of the data quality assessment
Dimension | No. DQ indicators |
---|---|
Integrity | 1 |
Completeness | 3 |
Consistency | 3 |
Accuracy | 0 |
Data Quality Summary
Technical information
The table below summarizes technical information about this report, the R session and the operating system.
Rendered using dataquieR 2.1.0 at 2024-04-02 15:02:31.914901
Related literature
Kasbohm E, Marino J, Richter A, Schmidt CO, Struckmann S (2024). dataquieR: Data Quality in Epidemiological Research. https://dataquality.qihs.uni-greifswald.de/.
Richter A, Schmidt CO, Krüger M, Struckmann S (2021). “dataquieR: assessment of data quality in epidemiological research.” Journal of Open Source Software. doi:10.21105/joss.03093.
Schmidt CO, Struckmann S, Enzenbach C, Reineke A, Stausberg J, Damerow S, Huebner M, Schmidt B, Sauerbrei W, Richter A (2021). “Facilitating harmonized data quality assessments. A data quality framework for observational health research data collections with software implementations in R.” BMC Med Res Methodol. doi:10.1186/s12874-021-01252-7.
Overview of all performed quality checks
This overview consists of a pie chart and a matrix. The pie chart shows the number of variables assigned to each of the specified data quality categories. In the matrix, the rows contain the variables of the study data, and the columns refer to the targeted data quality indicator functions. A final column called Total contains the worst classification obtained for the respective variable.
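The aggregation behind the Total column and the pie chart can be pictured with a minimal Python sketch. This is not dataquieR code. The category labels, their ordering, the variable names and the cell values are all hypothetical, and the sketch assumes the pie chart counts variables by their Total classification.

```python
from collections import Counter

# Hypothetical category labels, ordered from best to worst.
# This is an assumption for illustration, not dataquieR's grading scheme.
SEVERITY = ["good", "moderate", "critical"]


def worst(categories):
    """Return the worst category of a row, ignoring unclassified cells (None)."""
    classified = [c for c in categories if c is not None]
    if not classified:
        return None
    return max(classified, key=SEVERITY.index)


# Rows are variables, columns are indicator functions (made-up values).
matrix = {
    "age": {"completeness": "good", "consistency": "moderate"},
    "sbp": {"completeness": "critical", "consistency": "good"},
    "sex": {"completeness": "good", "consistency": None},
}

# Total column: worst classification per variable.
total = {var: worst(row.values()) for var, row in matrix.items()}

# Pie chart input: number of variables per category.
pie_counts = Counter(total.values())
```

Here `total` maps each variable to its worst category across all indicator columns, and `pie_counts` holds the per-category tallies that a pie chart would display.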
The colors have the same meaning in the pie chart and in the matrix and correspond to the data quality categories. The matrix uses 3 additional colors: a cell is white if an assessment was not reasonably possible or was not selected by the user; light grey if there was an assessment but it has not been classified; and dark grey if the assessment failed but could usually be made possible by amending the metadata. The text in each matrix cell shows the metrics behind the classification. If more than one metric is available, the one with the worst quality rating is shown.
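The cell coloring rules described above can be sketched as a small decision function. The status names, category labels and color values below are hypothetical placeholders chosen for illustration; they are not taken from dataquieR.

```python
# Hypothetical category labels (best to worst) and colors; assumptions only.
SEVERITY = ["good", "moderate", "critical"]
CATEGORY_COLORS = {"good": "green", "moderate": "yellow", "critical": "red"}


def cell_color(status, category=None):
    """Pick the background color of one matrix cell.

    status: "not_assessed" (not possible or not selected),
            "failed" (assessment failed, e.g. deficient metadata),
            or "assessed".
    category: the data quality category, or None if unclassified.
    """
    if status == "not_assessed":
        return "white"
    if status == "failed":
        return "darkgrey"
    if category is None:
        return "lightgrey"
    return CATEGORY_COLORS[category]


def displayed_metric(metrics):
    """From (name, value, category) tuples, return the worst-rated metric."""
    return max(metrics, key=lambda m: SEVERITY.index(m[2]))
```

For example, `cell_color("assessed", "moderate")` returns the color of the "moderate" category, while `cell_color("assessed")` falls back to light grey because no classification is available.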
Classification is based on the grading rulesets, and the colors are based on the grading formats. Both can be provided as separate metadata sheets; dataquieR supplies default sheets for both.
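As a rough illustration of how a ruleset and a format sheet divide the work, the following Python sketch maps a metric value to a category via thresholds and then maps the category to a color. The thresholds, labels and colors are invented for this example and do not reflect dataquieR's default sheets or their actual layout.

```python
# Hypothetical grading ruleset: (upper bound in percent, category),
# checked in order. Values are assumptions, not dataquieR defaults.
RULESET = [
    (5.0, "good"),
    (20.0, "moderate"),
    (float("inf"), "critical"),
]

# Hypothetical grading formats: category -> display color.
FORMATS = {"good": "green", "moderate": "yellow", "critical": "red"}


def classify(value, ruleset=RULESET):
    """Return the first category whose upper bound is not exceeded."""
    for upper, category in ruleset:
        if value <= upper:
            return category
    raise ValueError("ruleset does not cover value")


def color_for(value):
    """Combine ruleset (value -> category) and formats (category -> color)."""
    return FORMATS[classify(value)]
```

Keeping the two mappings separate mirrors the split between rulesets and formats: thresholds can change without touching the colors, and vice versa.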
Summary Matrix
The summary matrix displays all potential data quality issues in the data, together with problems caused by missing or deficient metadata and other possible problems.
Other information
Moving the mouse pointer over the table shows the function, the variable of interest, and the corresponding data quality category or, if not available, a "Not classified"/"No results available" text. These are followed by the metrics behind the classification and the warning/error messages produced when calling the function.
In case of more than one metric, all are reported. Only the metric with the worst data quality classification is displayed in bold face.
Moving the mouse pointer over a variable displays the actual variable name instead of the label.