Table 4 Rating of the quality of the statistical outcomes used to determine measurement properties

From: A systematic review investigating measurement properties of physiological tests in rugby

Measurement property

Definition

(Rating) Quality criteria (a, b)

Reliability

 Internal consistency

The extent to which items in a (sub)scale are intercorrelated, thus measuring the same construct

(+) Factor analyses performed on adequate sample size (7 * # items and >100) AND Cronbach’s alpha(s) calculated per dimension AND Cronbach’s alpha(s) between 0.70 and 0.95;

(?) No factor analysis OR doubtful design or method

(−) Cronbach’s alpha(s) <0.70 or >0.95, despite adequate design and method.

(0) No information found on internal consistency.

Reproducibility

 Agreement

The extent to which the scores on repeated measures are close to each other (absolute measurement error)

(+) SDC < MIC OR MIC outside the LOA OR convincing arguments that agreement is acceptable.

(?) Doubtful design or method OR (MIC not defined AND no convincing arguments that agreement is acceptable)

(−) MIC ≤ SDC OR MIC equals or inside LOA, despite adequate design and method;

(0) No information found on agreement.

 Reliability

The extent to which patients can be distinguished from each other, despite measurement errors (relative measurement error)

(+) ICC > 0.70 OR weighted Kappa > 0.70

(?) Doubtful design or method (e.g., time interval not mentioned)

(−) ICC or weighted Kappa ≤0.70, despite adequate design and method

(0) No information on reliability found

Validity

 Content Validity

The extent to which the domain of interest is comprehensively sampled by the items in the questionnaire

(+) A clear description is provided of the measurement aim, the target population, the concepts that are being measured, and the item selection AND target population and (investigators OR experts) were involved in item selection;

(?) A clear description of above-mentioned aspects is lacking OR only target population involved OR doubtful design or method;

(−) No target population involvement;

(0) No information found on target population involvement.

 Construct validity

The extent to which scores on a particular questionnaire relate to other measures in a manner that is consistent with theoretically derived hypotheses concerning the concepts that are being measured

(+) Specific hypotheses were formulated AND at least 75% of the results are in accordance with these hypotheses;

(?) Doubtful design or method (e.g., no hypotheses);

(−) Less than 75% of hypotheses were confirmed, despite adequate design and methods;

(0) No information found on construct validity.

 Criterion validity (predictive or concurrent)

The extent to which scores on a particular questionnaire relate to a gold standard

(+) Correlation with standard ≥0.70 OR no statistically significant differences between the two tests found OR sensitivity and specificity ≥0.70 OR convincing arguments that gold standard is “gold” AND correlation with gold standard >0.70;

(?) No convincing arguments that gold standard is “gold” OR doubtful design or method;

(−) Correlation with standard <0.70 or AUC < 0.70 OR statistically significant differences between outcome measures and gold standard OR sensitivity or specificity <0.70

Responsiveness

The ability of a questionnaire to detect clinically important changes over time

(+) SDC < MIC OR MIC outside the LOA OR RR > 1.96 OR AUC > 0.70;

(?) Doubtful design or method;

(−) SDC > MIC OR MIC equals or inside LOA OR RR < 1.96 OR AUC < 0.70, despite adequate design and methods.

(0) No information found on responsiveness.

Floor and ceiling effects

The number of respondents who achieved the lowest or highest possible score

(+) ≤ 15% of the respondents achieved the highest or lowest possible score

(?) Doubtful design or method

(−) > 15% of the respondents achieved the highest or lowest possible score, despite adequate design and methods

(0) No information found on floor and ceiling effects

Interpretability

The degree to which one can assign qualitative meaning to quantitative scores

(+) Mean and SD scores presented of at least four relevant subgroups of patients and MIC defined;

(?) Doubtful design or method OR less than four subgroups OR no MIC defined;

(0) No information found on interpretation.

  1. MIC minimal important change, SDC smallest detectable change, LOA limits of agreement, ICC intraclass correlation coefficient, SD standard deviation
  2. a: (+) positive rating; (?) indeterminate rating; (−) negative rating; (0) no information available
  3. b: Doubtful design or method = lack of a clear description of the design or methods of the study, or any important methodological weakness in the design or execution of the study
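
As an illustration only, two of the simpler rules in the table (reliability, and floor and ceiling effects) can be sketched as small Python functions. The function names, the `adequate_design` flag, and the use of `None` for "no information found" are assumptions made for this sketch and are not part of the review. The thresholds are taken from the table as printed, and the negative rating is returned as an ASCII hyphen.

```python
from typing import Optional


def rate_reliability(coefficient: Optional[float],
                     adequate_design: bool = True) -> str:
    """Rate reliability from an ICC or weighted Kappa value (hypothetical helper)."""
    if coefficient is None:
        return "0"  # no information on reliability found
    if not adequate_design:
        return "?"  # doubtful design or method
    # Positive only when the coefficient exceeds 0.70; 0.70 itself is negative.
    return "+" if coefficient > 0.70 else "-"


def rate_floor_ceiling(percent_at_extreme: Optional[float],
                       adequate_design: bool = True) -> str:
    """Rate floor and ceiling effects (hypothetical helper).

    percent_at_extreme is the percentage of respondents who achieved the
    highest or lowest possible score.
    """
    if percent_at_extreme is None:
        return "0"  # no information found
    if not adequate_design:
        return "?"  # doubtful design or method
    # Positive when at most 15% of respondents sit at an extreme score.
    return "+" if percent_at_extreme <= 15.0 else "-"


if __name__ == "__main__":
    print(rate_reliability(0.85))
    print(rate_floor_ceiling(22.5))
```

The remaining properties combine several conditions (for example, sample size and factor analysis for internal consistency), so a faithful encoding would need more inputs than this sketch takes.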