
Data quality report

2024-04-02

Study data summary

Number of ...
observations in study data set: 2154
variables in study data set: 33
variables with item-level metadata: 33

Metadata summary

Number of ...
variables in item-level metadata: 33
segments: 4

Scope of the data quality assessment

Dimension       No. DQ indicators
Integrity       1
Completeness    3
Consistency     3
Accuracy        0

Data Quality Summary


Technical information

The table below summarizes technical information about this report, the R session and the operating system.

Rendered using dataquieR 2.1.0 at 2024-04-02 15:02:31.914901


Related literature

Kasbohm E, Marino J, Richter A, Schmidt CO, Struckmann S (2024). dataquieR: Data Quality in Epidemiological Research. https://dataquality.qihs.uni-greifswald.de/.

Richter A, Schmidt CO, Krüger M, Struckmann S (2021). “dataquieR: assessment of data quality in epidemiological research.” Journal of Open Source Software. doi:10.21105/joss.03093.

Schmidt CO, Struckmann S, Enzenbach C, Reineke A, Stausberg J, Damerow S, Huebner M, Schmidt B, Sauerbrei W, Richter A (2021). “Facilitating harmonized data quality assessments. A data quality framework for observational health research data collections with software implementations in R.” BMC Med Res Methodol. doi:10.1186/s12874-021-01252-7.

Overview of all performed quality checks

This overview consists of a pie chart and a matrix. The pie chart shows how many variables were assessed as falling into each of the specified data quality categories. In the matrix, the rows contain the variables of the study data, and the columns refer to the targeted data quality indicator functions. A final column called Total contains the worst classification obtained by the variable in that row.
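
The relationship between the matrix, the Total column and the pie chart can be illustrated with a small sketch. This is not dataquieR code (the package is written in R); it is a minimal Python illustration under stated assumptions: categories are encoded as integers where a larger number means worse quality, unclassified cells are `None`, and the pie chart counts variables by their Total category. The variable names, indicator names and the functions `total_per_variable` and `pie_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical matrix: variable -> {indicator function -> category}.
# Assumption: integer categories, larger = worse; None = not classified.
matrix = {
    "AGE": {"completeness": 1, "consistency": 3},
    "SEX": {"completeness": 1, "consistency": 1},
    "SBP": {"completeness": 2, "consistency": None},
}


def total_per_variable(matrix):
    """Return the worst (largest) classified category for each variable."""
    totals = {}
    for variable, cells in matrix.items():
        classified = [c for c in cells.values() if c is not None]
        totals[variable] = max(classified) if classified else None
    return totals


def pie_counts(totals):
    """Count variables per Total category (assumed basis of the pie chart)."""
    return Counter(t for t in totals.values() if t is not None)


totals = total_per_variable(matrix)
counts = pie_counts(totals)
```

In this sketch a single poor result is enough to move a variable into a worse Total category, which is why the Total column is a conservative summary of each row.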

The colors have the same meaning in the pie chart and in the matrix and correspond to the data quality categories. The matrix uses three additional colors: a cell is white if an assessment was not reasonably possible or was not selected by the user; light grey if an assessment was made but has not been classified; and dark grey if the assessment failed, in which case amending the metadata will usually make an assessment possible. The text in each cell shows the metric behind the classification. If more than one metric is available, the one with the worst quality rating is shown.
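
The cell rules described above can be sketched as a small decision function. This is an illustrative Python sketch, not the dataquieR implementation: the `Metric` and `Cell` types, the function `cell_display`, the integer category encoding (larger means worse) and the order in which the states are checked are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metric:
    name: str
    value: float
    category: Optional[int]  # assumption: larger = worse; None = unclassified


@dataclass
class Cell:
    assessed: bool = True   # False: not possible or not selected by the user
    failed: bool = False    # True: the assessment raised an error
    metrics: list = field(default_factory=list)


def cell_display(cell):
    """Return a (color key, text) pair for one matrix cell."""
    if not cell.assessed:
        return ("white", "")
    if cell.failed:
        return ("dark grey", "")
    classified = [m for m in cell.metrics if m.category is not None]
    if not classified:
        # Assessed, but no metric could be classified.
        return ("light grey", "")
    # Show only the metric with the worst quality rating.
    worst = max(classified, key=lambda m: m.category)
    return (f"category {worst.category}", f"{worst.name} = {worst.value}")


example = Cell(metrics=[Metric("missing_pct", 12.5, 3), Metric("n_missing", 4, 1)])
shown = cell_display(example)
```

The text returned for white, light grey and dark grey cells is left empty here because the report does not state what such cells display.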

The classification is based on the grading rulesets, and the colors are based on the grading formats. Both can be provided as separate metadata sheets; dataquieR supplies default sheets for both.



Summary Matrix

The summary matrix displays all potential data quality issues in the data, together with problems that originate from missing or deficient metadata or from other sources.



Other information

Moving the mouse pointer over a cell of the table shows the function, the variable of interest, and the corresponding data quality category or, if none is available, the text "Not classified" or "No results available". These are followed by the metrics behind the classification and any warning or error messages produced when the function was called. If there is more than one metric, all are reported, and only the metric with the worst data quality classification is displayed in bold face.
Moving the mouse pointer over a variable displays the actual variable name instead of the label.


