Tools |
Struckmann et al. dataquieR 2:
An updated R package for FAIR data quality assessments in observational
studies and electronic health record data, 2024, Journal of Open
Source Software, 9(98), 6581, https://doi.org/10.21105/joss.06581 |
Data quality, Observational health studies, Data
quality indicators, Data quality monitoring, R |
Methods |
Saleem et al. A review and
empirical comparison of univariate outlier detection methods, 2021,
Pakistan Journal of Statistics, 37(4) |
outliers |
Methods |
Aguinis et al. Best-practice
recommendations for defining, identifying, and handling outliers, 2013,
Organizational Research Methods, 16(2),
270–301 |
quantitative research, ethics in research,
outliers |
Standards |
American Association for Public Opinion
Research Standard definitions: Final dispositions of case codes and
outcome rates for surveys, 2011 |
Standard definitions |
Methods |
Altman & Bland Assessing
agreement between methods of measurement, 2017, Clin Chem, https://doi.org/10.1373/clinchem.2016.268870 |
|
Tools |
Assenov et al. Comprehensive
analysis of DNA methylation data with RnBeads, 2014, Nature
Methods, 11(11), 1138 |
DNA methylation analysis, computational epigenetics,
whole genome bisulfite sequencing, reduced representation bisulfite
sequencing, epigenotyping microarrays, Illumina Infinium
HumanMethylation450 assay, bioinformatics software, epigenome-wide
association studies, medical epigenomics |
Thresholds |
Bach The Freiburg visual acuity
test–automatic measurement of visual acuity, 1996, Optom Vis
Sci, 73(1), 49–53, https://www.ncbi.nlm.nih.gov/pubmed/8867682 |
visual acuity, computer test, psychometric threshold
estimation |
Cohort Studies |
Bamberg et al. Whole-body MR
imaging in the German National Cohort: Rationale, design, and technical
background, 2015, Radiology, 277(1),
206–220 |
|
Software |
Boehmke Data wrangling with
R, 2016 |
R |
Methods |
Bakar et al. A comparative study
for outlier detection techniques in data mining, 2006, 2006 IEEE
Conference on Cybernetics and Intelligent Systems, 1–6 |
data mining, clustering, outlier |
Dictionary |
Bangia Dictionary of
information technology, 2010 |
|
Documentation |
Bargaje Good documentation
practice in clinical research, 2011, Perspectives in Clinical
Research, 2(2), 59 |
ALCOA, documentation, source, training |
Methods |
Barnett & Lewis Outliers
in statistical data, 1994 |
|
Standards |
Begley & Ellis Drug
development: Raise standards for preclinical cancer research, 2012,
Nature, 483(7391), 531–533 |
|
Methods |
Bennett How can I deal with
missing data in my study?, 2001, Australian and New Zealand Journal
of Public Health, 25(5), 464–469 |
|
Metadata |
Bretherton Reference model for
metadata: A strawman, 1994, Whitepaper, University of Wisconsin,
https://pdfs.semanticscholar.org/f941/4454ef0e25ef102831ed8c7a4b6e9c094b00.pdf |
|
Methods |
Brown & Forsythe Robust
tests for the equality of variances, 1974, Journal of the American
Statistical Association, 69(346), 364–367 |
|
Review |
Callahan et al. A comparison of
data quality assessment checks in six data sharing networks, 2017,
eGEMs (Generating Evidence & Methods to Improve Patient
Outcomes), 5(1) |
|
Review |
Chalmers & Glasziou
Avoidable waste in the production and reporting of research evidence,
2009, Obstetrics & Gynecology, 114(6),
1341–1345 |
|
Review |
Chen et al. A review of data
quality assessment methods for public health information systems, 2014,
International Journal of Environmental Research and Public
Health, 11(5), 5170–5207 |
data quality, information quality, data use, data
collection process, evaluation, assessment, public health, population
health, information systems |
Software |
Chang et al. Shiny: Web
application framework for R, 2015, 2018, R Package Version,
1(0), 14 |
|
Methods |
Callegaro et al. Web survey
methodology, 2015 |
|
Methods |
Cleveland et al. Regression by
local fitting: Methods, properties, and computational algorithms, 1988,
Journal of Econometrics, 37(1), 87–114 |
|
Methods |
Cleveland & Devlin Locally
weighted regression: An approach to regression analysis by local
fitting, 1988, Journal of the American Statistical Association,
83(403), 596–610 |
|
Concept |
Couchoud et al. Renal
replacement therapy registries—time for a structured data quality
evaluation programme, 2013, Nephrology Dialysis
Transplantation, 28(9), 2215–2220 |
completeness, data quality, quality assessment, RRT
registry, timeliness, validity |
Methods |
Das et al. A new method to
evaluate the completeness of case ascertainment by a cancer registry,
2008, Cancer Causes & Control, 19(5),
515–525 |
Data quality, Cancer, Population registers, Estimation,
techniques |
Methods |
Dasu & Johnson
Exploratory data mining and data cleaning, 2003 |
|
Methods |
Dong & Peng Principled
missing data methods for researchers, 2013, SpringerPlus,
2(1), 222 |
Missing data, Listwise deletion, MI, FIML, EM, MAR, MCAR,
MNAR |
Methods |
Drion et al. Some
distribution-free tests for the difference between two empirical
cumulative distribution functions, 1952, The Annals of Mathematical
Statistics, 23(4), 563–574 |
|
Methods |
Durrleman & Simon Flexible
regression models with cubic splines, 1989, Statistics in
Medicine, 8(5), 551–561 |
Smoothing splines, Non‐parametric regression, Piecewise
polynomials |
Epidemiology |
Ebrahim & Davey Smith
Commentary: Should we always deliberately be non-representative?, 2013,
International Journal of Epidemiology, 42(4),
1022–1026 |
|
Concept |
Edwards et al. Science friction:
Data, metadata, and collaboration, 2011, Social Studies of
Science, 41(5), 667–690 |
collaboration, communication, data, metadata |
Methods |
Fasano & Franceschini A
multidimensional version of the Kolmogorov–Smirnov test, 1987,
Monthly Notices of the Royal Astronomical Society,
225(1), 155–170 |
|
Methods |
Feinstein & Cicchetti High
agreement but low kappa: I. The problems of two paradoxes, 1990,
Journal of Clinical Epidemiology, 43(6),
543–549 |
Kappa, Concordance, Agreement, Paradox |
Methods |
Filzmoser A multivariate
outlier detection method, 2004 |
|
Standards |
Finnie et al. EpiJSON: A unified
data-format for epidemiology, 2016, Epidemics, 15,
20–26 |
Outbreaks, Epidemics, Software, Databases,
Communications standards |
Epidemiology |
Fletcher et al. Clinical
epidemiology: The essentials, 2012 |
|
Methods |
Freedman & Diaconis On the
histogram as a density estimator: L2 theory, 1981, Probability
Theory and Related Fields, 57(4), 453–476 |
|
Methods |
Golub & Van Loan Matrix
computations, 1996, Johns Hopkins University Press, Baltimore and
London |
|
Methods |
Gonzalez-Chica et al. Test of
association: Which one is the most appropriate for my study?, 2015,
Anais Brasileiros de Dermatologia, 90(4),
523–528 |
Data analysis; Association; Epidemiology and
biostatistics; Hypothesis testing; Statistical methods and
procedures |
Methods |
Grant Data visualization:
Charts, maps, and interactive graphics, 2018 |
|
Software |
Hahsler et al. Introduction
to arules-a computational environment for mining association rules and
frequent item sets, 2010, 2018 |
|
Methods |
Hallgren Computing inter-rater
reliability for observational data: An overview and tutorial, 2012,
Tutorials in Quantitative Methods for Psychology,
8(1), 23 |
behavioral observation, coding, inter-rater agreement,
intra-class correlation, kappa, reliability, tutorial |
Methods |
Hansen et al. Enabling
longitudinal data comparison using DDI, 2011 |
Data Documentation in Social Sciences; DDI Metadata
Standard |
Methods |
Harrell Jr Regression
modeling strategies: With applications to linear models, logistic and
ordinal regression, and survival analysis, 2015 |
|
Software |
Harris et al. Research
electronic data capture (REDCap)—a metadata-driven methodology and
workflow process for providing translational research informatics
support, 2009, Journal of Biomedical Informatics,
42(2), 377–381 |
Medical informatics, Electronic data capture, Clinical
research, Translational research |
Dictionary |
Hartge A dictionary of
epidemiology, sixth edition, 2015, Am J Epidemiol, https://doi.org/10.1093/aje/kwv031 |
|
Methods |
Hawkins Introduction, 1980, In
Identification of outliers (pp. 1–12), https://doi.org/10.1007/978-94-015-3994-4_1 |
|
Methods |
Hayat et al. Statistical methods
used in the public health literature and implications for training of
public health professionals, 2017, PloS One, 12(6),
e0179032 |
|
Software |
Horton & Kleinman Using
R and RStudio for data management, statistical analysis, and
graphics, 2015 |
|
Metadata |
Hoyle et al. Metadata for the
longitudinal data life cycle: The role and benefit of metadata
management and reuse, 2010, DDI Working Paper Series: Longitudinal
Data Best Practices, https://doi.org/10.3886/DDILongitudinal03 |
|
Methods |
Hubert & Vandervieren An
adjusted boxplot for skewed distributions, 2008, Computational
Statistics & Data Analysis, 52(12),
5186–5201 |
|
Methods |
Hu & Sung Detecting
pattern-based outliers, 2003, Pattern Recognition Letters,
24(16), 3059–3068 |
Outlier detection, Complete spatial
randomness, Clustering, Regular spacing |
Methods |
Huebner et al. A contemporary
conceptual framework for initial data analysis, 2018, Observational
Studies, 4, 71–192, https://obsstudies.org/wp-content/uploads/2018/04/idarev2.pdf |
Initial data analysis, data cleaning, data screening,
reporting, metadata, research plan, STRATOS Initiative |
Methods |
Huser et al. Methods for
examining data quality in healthcare integrated data repositories,
2017 |
Data Quality, Evaluation Methods, Visualization,
Observational Research |
Review |
Ioannidis Why most published
research findings are false, 2005, PLoS Medicine,
2(8), e124 |
|
Epidemiology |
Ioannidis Discussion: Why an
estimate of the science-wise false discovery rate and application to the
top medical literature is false, 2013, Biostatistics,
15(1), 28–36 |
|
Epidemiology |
Ioannidis et al. Increasing
value and reducing waste in research design, conduct, and analysis,
2014, The Lancet, 383(9912), 166–175 |
|
Epidemiology |
Jager & Leek An estimate of
the science-wise false discovery rate and application to the top medical
literature, 2013, Biostatistics, 15(1),
1–12 |
|
Epidemiology |
Jager & Leek Rejoinder: An
estimate of the science-wise false discovery rate and application to the
top medical literature, 2013, Biostatistics, 15(1),
39–45 |
|
Methods |
Joshi et al. Likert scale:
Explored and explained, 2015, British Journal of Applied Science
& Technology, 7(4), 396 |
Psychometrics, Likert scale, points on scale, analysis,
education |
Methods |
Jinyuan et al. Correlation and
agreement: Overview and clarification of competing concepts and
measures, 2016, Shanghai Archives of Psychiatry,
28(2), 115 |
concordance correlation, intraclass correlation,
Kendall’s tau, non-linear association, Pearson’s correlation, Spearman’s
rho |
Concept |
Kahn et al. A harmonized data
quality assessment terminology and framework for the secondary use of
electronic health record data, 2016, eGEMs,
4(1) |
electronic health records, data use & quality, data
completeness |
Methods |
Kalton The treatment of missing
survey data, 1986, Survey Methodology, 12,
1–16 |
|
Methods |
Kao & Green Analysis of
variance: Is there a difference in means and what does it mean?, 2008,
Journal of Surgical Research, 144(1),
158–170 |
research/statistics and numerical data, data
interpretation/statistical, models/statistical, review |
Methods |
Kahn et al. Quantifying clinical
data quality using relative gold standards, 2010, AMIA Annual
Symposium Proceedings, 2010, 356 |
|
Concept |
Karr et al. Data quality: A
statistical perspective, 2006, Statistical Methodology,
3(2), 137–173 |
|
Methods |
Kalton & Kasprzyk The
treatment of missing survey data, 1986, Survey Methodology,
12(1), 1–16 |
|
Concept |
Keller et al. The evolution
of data quality: Understanding the transdisciplinary origins of data
quality concepts and approaches, 2017 |
designed data, administrative data, opportunity data,
reproducibility, total survey error, decision theoretic framework |
Methods |
Kleiber & Zeileis
Visualizing count data regressions using rootograms, 2016, The
American Statistician, 70(3), 296–303 |
Finite mixture, Goodness of fit, Hurdle model, Negative
binomial regression, Poisson regression |
Methods |
Koo & Li A guideline of
selecting and reporting intraclass correlation coefficients for
reliability research, 2016, Journal of Chiropractic Medicine,
15(2), 155–163 |
Reliability and validity, Research, Statistics |
Methods |
Kullback & Leibler On
information and sufficiency, 1951, The Annals of Mathematical
Statistics, 22(1), 79–86 |
|
Methods |
Kullback Information theory
and statistics, 1997 |
|
Methods |
Levene Robust tests for equality
of variances, 1961, Contributions to Probability and Statistics.
Essays in Honor of Harold Hotelling, 279–292 |
|
Concept |
De Lusignan et al. Key concepts
to assess the readiness of data for international research: Data
quality, lineage and provenance, extraction and processing errors,
traceability, and curation, 2011, Yearb Med Inform,
6(1), 112–120 |
Medical records systems, computerized; research design;
registry; records as topic; databases, genetic |
Methods |
Lang & Little Principled
missing data treatments, 2016, Prevention Science, https://doi.org/10.1007/s11121-016-0644-5 |
Missing data, Multiple imputation, Full information
maximum likelihood, Auxiliary variables, Intent-to-treat, Statistical
inference |
Cohort Studies |
Langeheine et al. Consequences
of an extended recruitment on participation in the follow‐up of a child
study: Results from the German IDEFICS cohort, 2017, Paediatric and
Perinatal Epidemiology, 31(1), 76–86 |
loss to follow‐up, late respondents, IDEFICS,
paradata |
Concept |
Lee et al. A framework for data
quality assessment in clinical research datasets, 2017, AMIA Annual
Symposium Proceedings, 2017, 1080 |
|
Methods |
Lehmann & Casella Theory
of point estimation, 2006 |
|
Methods |
Lenth et al. Least-squares
means: The R package lsmeans, 2016, Journal of Statistical
Software, 69(1), 1–33 |
least-squares means, linear models, experimental
design |
Concept |
Liaw et al. Towards an ontology
for data quality in integrated chronic disease management: A realist
review of the literature, 2013, International Journal of Medical
Informatics, 82(1), 10–24 |
Realist, Research design, Chronic disease, Information
system, Data quality, Ontology |
Methods |
Lindsey Comparison of
probability distributions, 1974, Journal of the Royal Statistical
Society. Series B (Methodological), 38–47 |
likelihood inference, grouping data, goodness of fit,
comparing models |
Methods |
Lindsey & Mersch Fitting and
comparing probability distributions with log linear models, 1992,
Computational Statistics & Data Analysis, 13(4),
373–384 |
Comparison of models, Generalized linear models, Goodness
of fit, Likelihood inference, Log linear models, Probability
distributions, Truncated distributions |
Methods |
Little & Rubin
Statistical analysis with missing data, 2014 |
|
Methods |
Mayr et al. A permutation test
to analyse systematic bias and random measurement errors of medical
devices via boosting location and scale models, 2017, Statistical
Methods in Medical Research, 26(3), 1443–1460 |
Measurement errors, systematic bias, random error,
statistical models, permutation test, gradient boosting, regression |
Methods |
Mahalanobis On the
generalized distance in statistics, 1936 |
|
Methods |
Seo A review and comparison
of methods for detecting outliers in univariate data sets,
2006 |
boxplot; lognormal; outlier; skewed distribution |
Concept |
McMahon & Denaxas A novel
framework for assessing metadata quality in epidemiological and public
health research settings, 2016, AMIA Summits on Translational
Science Proceedings, 2016, 199 |
|
Review |
Meyer et al. Efficient data
management in a large-scale epidemiology research project, 2012,
Computer Methods and Programs in Biomedicine, 107(3),
425–435 |
Central Data Management, Electronic Data Capture,
Electronic Case Report Forms, Individualized medicine, Personalized
Medicine |
Software |
Mitchell et al. Data
management using Stata: A practical handbook, 2010 |
|
Methods |
Morgenthaler A survey of robust
statistics, 2007, Statistical Methods and Applications,
15(3), 271–293 |
|
Methods |
Müller & Büttner A critical
discussion of intraclass correlation coefficients, 1994, Statistics
in Medicine, 13(23-24), 2465–2476 |
|
Metadata |
Nadkarni Metadata-driven
software systems in biomedicine: Designing systems that can adapt to
changing knowledge, 2011, https://doi.org/10.1007/978-0-85729-510-1 |
biomedicine, metadata |
Cohort Studies |
German National Cohort (GNC) Consortium The German National
Cohort: Aims, study design and organization, 2014, European Journal
of Epidemiology, 29, 371–382 |
Population-based cohort, Non-communicable diseases,
Chronic infections, Life-style and socio-economic factors, Magnetic
resonance imaging, Pre-clinical disease, Functional impairments |
Methods |
Newsom Longitudinal
structural equation modeling: A comprehensive introduction,
2015 |
|
Epidemiology |
Nohr & Olsen Commentary:
Epidemiologists have debated representativeness for more than 40
years—has the time come to move on?, 2013, International Journal of
Epidemiology, 42(4), 1016–1017 |
pregnancy, conflict of interest, epidemiology, adult,
biometry, child, follow-up, garbage, internet, logic, shoes, sociology,
time factors, infections, epidemiologic causality, statutes and laws,
prenatal care, conception, epidemics, child health, birth, inference,
killing, national institute of child health and human development,
imputation |
Concept |
Nonnemacher et al.
Datenqualität in der medizinischen forschung, 2014 |
|
Software |
Potter et al. Web application
teaching tools for statistics using R and Shiny, 2016, Technology
Innovations in Statistics Education, 9(1) |
|
Software |
Plantier et al. Biomedical
engineering systems and technologies: 7th international joint
conference, BIOSTEC 2014, Angers, France, 3-6, 2014, revised selected
papers, 2016 |
|
Dictionary |
Porta A dictionary of
epidemiology, 2014 |
|
Methods |
Press & Teukolsky
Kolmogorov-Smirnov test for two-dimensional data: How to tell whether a
set of (x, y) data paints are consistent with a particular probability
distribution, or with another data set, 1988, Computers in
Physics, 2(4), 74–77 |
|
Epidemiology |
Prinz et al. Believe it or not:
How much can we rely on published data on potential drug targets?, 2011,
Nature Reviews Drug Discovery, 10(9), 712 |
Drug discovery |
Methods |
Priyadarshana & Sofronov
Multiple break-points detection in array CGH data via the cross-entropy
method, 2014, IEEE/ACM Transactions on Computational Biology and
Bioinformatics, 12(2), 487–498 |
Break-point modelling, aCGH microarray data,
stochastic optimization, CNVs, DNA copy number, Cross-Entropy |
Methods |
Ranganathan et al. Common
pitfalls in statistical analysis: Measures of agreement, 2017,
Perspectives in Clinical Research, 8(4),
187 |
Agreement, biostatistics, concordance |
Documentation |
Rasmussen & Blank The data
documentation initiative: A preservation standard for research, 2007,
Archival Science, 7(1), 55–71 |
|
Software |
Rossini et al. Simple parallel
statistical computing in R, 2007, Journal of Computational and
Graphical Statistics, 16(2), 399–420 |
Bootstrap, Cross-validation, Grid computing, Kriging,
LAM-MPI, MPI, Message passing, Profile likelihood, PVM |
Software |
Reineke et al. Modys–ein
modulares steuerungs-und dokumentationssystem für epidemiologische
studien, 2006, Medizinische Dokumentation–Wichtig Oder
Nichtig |
|
Metadata |
A. Richter et al. Data quality
monitoring in clinical and observational epidemiologic studies: The role
of metadata and process information, 2019, GMS Med Inform Biom
Epidemiol, 15(1), https://doi.org/10.3205/mibe000202 |
data quality, metadata, process variables, data
monitoring, health research, cohort studies |
Methods |
R. Rigby et al. Distributions
for modelling location, scale, and shape: Using GAMLSS in R, 2017,
URL www.gamlss.org (last accessed 5 March 2018) |
|
Methods |
R. A. Rigby & Stasinopoulos
Generalized additive models for location, scale and shape, 2005,
Journal of the Royal Statistical Society: Series C (Applied
Statistics), 54(3), 507–554 |
Beta–binomial distribution; Box–Cox transformation;
Centile estimation; Cubic smoothing splines; Generalized linear mixed
model; LMS method; Negative binomial distribution; Non-normality;
Nonparametric models; Overdispersion; Penalized likelihood; Random
effects; Skewness and kurtosis |
Omics |
Risch Searching for genetic
determinants in the new millennium, 2000, Nature,
405(6788), 847 |
|
Epidemiology |
Rothman et al. Why
representativeness should be avoided, 2013, International Journal of
Epidemiology, 42(4), 1012–1014 |
ethnic, group habits, statutes and laws, public health
medicine, inference, social survey |
Epidemiology |
Rothman et al. Modern
epidemiology, 2008 |
|
Epidemiology |
Rothwell External validity of
randomised controlled trials: “To whom do the results of this
trial apply?”, 2005, The Lancet, 365(9453),
82–93 |
|
Software |
R Core Team R: A language
and environment for statistical computing, 2020, https://www.R-project.org/ |
|
Documentation |
Ryssevik The data documentation
initiative (DDI) metadata specification, 2001, Ann Arbor, MI: Data
Documentation Alliance. Retrieved from
http://www.ddialliance.org/sites/default/files/ryssevik_0.pdf |
|
Methods |
Schafer & Graham Missing
data: Our view of the state of the art, 2002, Psychol Methods,
7(2), 147–177, https://www.ncbi.nlm.nih.gov/pubmed/12090408 |
|
Tools |
C. Schmidt et al. Square2-a web
application for data monitoring in epidemiological and clinical studies,
2017, Studies in Health Technology and Informatics,
235, 549–553 |
|
Concept |
C. O. Schmidt et al. Assessment
of a data quality guideline by representatives of German epidemiologic
cohort studies, 2019, MIBE, 15(1), https://doi.org/10.3205/mibe000203 |
data quality, cohort studies, data quality indicators,
data monitoring |
Software |
Schmidberger et al.
State-of-the-art in parallel computing with R, 2009, Journal of
Statistical Software, 47(1) |
R, high performance computing, parallel computing,
computer cluster, multi-core systems, grid computing, benchmark |
Software |
Signorell et al. DescTools:
Tools for descriptive statistics. R package version 0.99.18, 2016,
R Foundation for Statistical Computing, Vienna,
Austria |
|
Methods |
Sison & Glaz Simultaneous
confidence intervals and sample size determination for multinomial
proportions, 1995, Journal of the American Statistical
Association, 90(429), 366–369 |
Coverage probabilities; Multinomial distribution;
Probability approximations; Simultaneous inference |
Methods |
Snijders & Bosker
Multilevel analysis: An introduction to basic and advanced
multilevel modeling, 1999 |
|
Epidemiology |
Stang & Jöckel Avoidance of
representativeness in presence of effect modification, 2014,
International Journal of Epidemiology, 43(2),
630–631 |
|
Metadata |
Stausberg et al. Indicators of
data quality: Review and requirements from the perspective of networked
medical research, 2019, GMS Med
Inform Biom Epidemiol, 15(1), https://doi.org/10.3205/mibe000199 |
medical research, data quality, healthcare, guidelines,
analytics, informatics |
Methods |
Sterne & Smith Sifting the
evidence—what’s wrong with significance tests?, 2001, Physical
Therapy, 81(8), 1464–1469 |
|
Methods |
Sturges The choice of a class
interval, 1926, Journal of the American Statistical
Association, 21(153), 65–66 |
|
Cohort Studies |
Teppo et al. Data quality and
quality control of a population-based cancer registry: Experience in
Finland, 1994, Acta Oncologica, 33(4),
365–369 |
|
Epidemiology |
Thygesen & Ersbøll When the
entire population is the sample: Strengths and limitations in
register-based epidemiology, 2014, European Journal of
Epidemiology, 29(8), 551–558 |
Registers, Database management systems, Epidemiology,
Bias, Nordic countries |
Methods |
Tukey Exploratory data
analysis, 1977 |
|
Software |
Van der Loo The stringdist
package for approximate string matching, 2014, The R Journal,
6(1), 111–122 |
|
Concept |
Vardaki et al. A statistical
metadata model for clinical trials’ data management, 2009, Computer
Methods and Programs in Biomedicine, 95(2),
129–145 |
Metadata, Clinical trials, Medical research,
Statistical metadata modeling, Transformations, Clinical Study Data
Management, Systems, Harmonization, Quality |
Documentation |
Vardigan et al. Data
documentation initiative: Toward a standard for the social sciences,
2008, International Journal of Digital Curation, 3(1),
107–113 |
|
Cohort Studies |
Völzke et al. Cohort profile:
The Study of Health in Pomerania, 2010, International Journal of
Epidemiology, 40(2), 294–307 |
ultrasonography, follow-up, Germany, ships |
Methods |
Wager et al. Model selection for
penalized spline smoothing using Akaike information criteria, 2007,
Australian & New Zealand Journal of Statistics,
49(2), 173–190 |
Penalized Spline; Model Selection; Conditional versus
Marginal Inference; Variance Component Selection |
Concept |
Wang & Strong Beyond
accuracy: What data quality means to data consumers, 1996, Journal
of Management Information Systems, 12(4), 5–33 |
data administration, data quality, database system |
Concept |
Watts et al. Data quality
assessment in context: A cognitive perspective, 2009, Decision
Support Systems, 48(1), 202–211 |
Dual-Process Theory, Cognition, Quality Metadata,
Information Quality Management, Information Quality Dimensions, Decision
Support |
Concept |
Nicole G. Weiskopf et al. A data
quality assessment guideline for electronic health record data reuse,
2017, eGEMs (Generating Evidence & Methods to Improve Patient
Outcomes), 5(1) |
|
Concept |
Nicole G. Weiskopf et al.
Defining and measuring completeness of electronic health records for
secondary use, 2013, Journal of Biomedical Informatics,
46(5), 830–836 |
Data quality, Electronic health records, Secondary use,
Completeness |
Concept |
Nicole Gray Weiskopf & Weng
Methods and dimensions of electronic health record data quality
assessment: Enabling reuse for clinical research, 2013, Journal of
the American Medical Informatics Association, 20(1),
144–151 |
Clinical research, clinical research informatics, data
quality, electronic health records, knowledge acquisition, knowledge
acquisition and knowledge management, knowledge bases, knowledge
representations, methods for integration of information from disparate
sources, secondary use |
Standards |
World Health Organization International
statistical classification of diseases and related health problems,
2004 |
|
Metadata |
Wilson Toward releasing the
metadata bottleneck, 2011, Library Resources & Technical
Services, 51(1), 16–28 |
|
Software |
Wickham Advanced R,
2014 |
|
Software |
Wickham R packages:
Organize, test, document, and share your code, 2015 |
|
Methods |
De Leeuw et al. Prevention and
treatment of item nonresponse, 2003, Journal of Official
Statistics, 19, 153–176 |
causes of missingness, data collection mode,
ignorability, imputation, item nonresponse, questionnaire development,
follow-up survey |
Concept |
Carsten Oliver Schmidt et al.
Facilitating harmonized data quality assessments. A data quality
framework for observational health research data collections with
software implementations in R, 2021, BMC Medical Research
Methodology, 21(1), 1–15, https://doi.org/10.1186/s12874-021-01252-7 |
Data quality, Observational health studies, Data
quality indicators, Data quality monitoring, Initial data analysis,
R |
Tools |
Adrian Richter et al. dataquieR:
Assessment of data quality in epidemiological research, 2021,
Journal of Open Source Software, 6(61), 3093, https://doi.org/10.21105/joss.03093 |
Data quality, Observational health studies, Data
quality indicators, Data quality monitoring, R |
Concept |
American Association for Public Opinion
Research Standard definitions: Final dispositions of case codes and
outcome rates for surveys, 2016 |
|
Concept |
ISO ISO 8000-1:2022 data
quality part 1: overview, 2022, https://www.iso.org/obp/ui/#iso:std:iso:8000:-1:ed-1:v1:en |
Data quality |