This DokuWiki is maintained by Jonas Ranstam.
Sample and population
Please clarify whether the presented study is descriptive or inferential, i.e. if the aim is to present a case series without generalizing the findings to other patients, or if the aim is to generalize the findings beyond the studied patients. The latter approach requires presentation of inferential uncertainty, usually p-values or confidence intervals. If such measures are presented in descriptive studies, it should be explained what they represent.
Table 1 presents baseline characteristics with p-values. Why? P-values are measures of inferential uncertainty. Their usefulness for descriptive purposes needs to be explained. How should the reader interpret these p-values?
It is unclear whether the performed comparisons are part of a finite population approach, or if the authors wish to generalize their findings beyond the studied patients. In the former case, the results are relevant only for the patients included in the study. No generalization to other patients is made. There is no place for statistical inference.
If it is the authors' purpose to describe a finite population, the statistical calculations may need to include a finite population correction (FPC).
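As a sketch of the point above: when sampling without replacement from a finite population, the standard error of a sample mean shrinks by the FPC factor. The function names below are illustrative, not from any particular package.

```python
import math

def fpc(N: int, n: int) -> float:
    """Finite population correction factor, sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def se_mean_finite(s: float, n: int, N: int) -> float:
    """Standard error of a sample mean for a sample of n drawn
    without replacement from a finite population of size N."""
    return (s / math.sqrt(n)) * fpc(N, n)

# Example: sampling 200 of 500 patients (s = 10)
se_infinite = 10 / math.sqrt(200)           # uncorrected SE, approx. 0.707
se_corrected = se_mean_finite(10, 200, 500)  # corrected SE, approx. 0.548
```

When n approaches N the correction factor approaches zero, reflecting that a census of the finite population leaves no sampling uncertainty at all.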
Statistical vs clinical significance
The results presentation is systematically ambiguous with regard to the word "significant". Please clarify when referring to statistical significance (inferential uncertainty) and when to clinical significance (practical importance). Both these issues are important, and the inferential uncertainty must be accounted for when interpreting a finding's practical importance. Just focusing on p-values is inadequate, see Wasserstein RL, Lazar NA. The ASA's statement on p-values: context, process, and purpose. The American Statistician 2016 doi: 10.1080/00031305.2016.1154108.
Please distinguish between inferential uncertainty (statistical significance) and practical relevance (clinical significance). Statistically significant differences are not necessarily practically relevant. As a general rule, clinical significance must be shown by other means than p-values, and estimated effects and differences should have their estimation uncertainty presented. Confidence intervals are useful for this. See also Wasserstein RL, Lazar NA. The ASA’s statement on p-values: context, process, and purpose. The American Statistician 2016 doi: 10.1080/00031305.2016.1154108.
It is stated in the results section that the studied risk factor "no longer had a statistical association with the outcome". I assume that the aim of this analysis was to estimate the effect of the risk factor on the outcome. Such risk estimates should be presented together with their estimation uncertainty, preferably using confidence intervals. The interesting question is then whether or not the estimated (relative) risk, with due consideration to its estimation uncertainty, is clinically relevant. Just focusing on "statistical association" is simplistic, as questions regarding validity and relevance are ignored and replaced by a p-value, which says nothing about bias or relevance.
Significant and n.s.
The analysis strategy is based on dichotomising findings as either statistically significant or not statistically significant. This is a strategy that leads to considerable distortion of the scientific process, see Wasserstein RL, Lazar NA. The ASA’s statement on p-values: context, process, and purpose. The American Statistician 2016 doi: 10.1080/00031305.2016.1154108. Statistically significant findings are not necessarily scientifically relevant and statistical nonsignificance is just an indication of uncertainty, not of equivalence.
Statistical nonsignificance is not evidence of equivalence. Claims of "no effect" or "no difference" need to include considerations regarding the estimation uncertainty of the studied effect or difference. Please indicate whether any clinically relevant effects are included in the relevant 95% confidence interval.
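A minimal sketch of this check, using a normal approximation for the 95% CI of a difference in means (a t quantile would be preferable for small samples); the numbers and the minimal clinically important difference (MCID) are hypothetical:

```python
import math

def diff_ci_95(m1, s1, n1, m2, s2, n2):
    """Approximate 95% CI for the difference in means of two
    independent groups (normal approximation, unequal variances)."""
    diff = m1 - m2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - 1.96 * se, diff + 1.96 * se

# Hypothetical outcome scores: group A 52.0 (SD 12, n=30),
# group B 49.0 (SD 11, n=30); suppose the MCID is 5 points.
lo, hi = diff_ci_95(52.0, 12.0, 30, 49.0, 11.0, 30)
mcid = 5.0
# The CI includes zero ("nonsignificant"), yet it also includes
# differences larger than the MCID -- the data are compatible with
# a clinically relevant effect, so "no difference" is not supported.
compatible_with_relevant_effect = hi >= mcid
```

This is exactly the situation the comment describes: the interval contains both zero and clinically relevant values, so the finding is inconclusive rather than evidence of equivalence.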
It is stated in the results section that the studied outcome score "was not different between groups A and B". Does this refer to (a) what has actually been observed, i.e. identical outcome scores, (b) observed minor differences that the authors consider clinically irrelevant, or (c) statistically nonsignificant differences? Again, remember that the inferential uncertainty needs to be considered.
A primary outcome is presented, overall survival. A structuring of endpoints as primary and secondary is usually a part of a strategy for addressing multiplicity issues in confirmatory randomised trials. What is the purpose in this observational study?
In order to avoid confusion between statistical and clinical significance, please clarify to the reader whether the presented odds ratios can be interpreted as relative risks, or explain how these measures should be interpreted with respect to clinical significance. The basic problem is that two similar odds ratios may have different clinical interpretations (and two different odds ratios the same) depending on baseline risk: RR = OR/(1 - R + OR*R), where R = baseline risk, RR = relative risk, and OR = odds ratio, see Grant R. Converting an odds ratio to a range of plausible relative risks for better communication of research findings. BMJ 2014;348:f7450.
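The conversion formula above is easily applied in practice; a short sketch (function name illustrative):

```python
def or_to_rr(odds_ratio: float, baseline_risk: float) -> float:
    """Convert an odds ratio to a relative risk given the baseline
    (unexposed) risk R, per Grant (BMJ 2014;348:f7450):
    RR = OR / (1 - R + OR * R)."""
    return odds_ratio / (1 - baseline_risk + odds_ratio * baseline_risk)

# The same OR of 3.0 corresponds to very different relative risks:
rr_rare = or_to_rr(3.0, 0.01)    # rare outcome: approx. 2.94, OR close to RR
rr_common = or_to_rr(3.0, 0.50)  # common outcome: 1.50, OR overstates RR
```

This illustrates why an odds ratio cannot be judged clinically relevant without knowing the baseline risk: with a rare outcome the OR approximates the RR, but with a 50% baseline risk the same OR corresponds to only a 1.5-fold risk increase.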
The statistical model building is based on logistic regression. This method provides effect estimates in terms of odds ratios. Can these be interpreted as relative risks? Or would such an interpretation be misleading (see Davies HTO. When can odds ratios mislead? BMJ 1998;316:989)? In the latter case, I recommend using an alternative statistical method that provides a direct estimate of the relative risk (see McNutt L-A, Wu C, Xue X, Hafner JP. Estimating the Relative Risk in Cohort Studies and Clinical Trials of Common Outcomes. Am J Epidemiol 2003;157:940-943), because the evaluation of the estimated risk factors' clinical significance requires clinically interpretable risk estimates.
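The divergence Davies describes is easy to demonstrate from a 2x2 table with a common outcome (a log-binomial or Poisson model with robust standard errors, as in McNutt et al., would be the regression-based alternative; the sketch below just computes both measures directly from hypothetical counts):

```python
def two_by_two_measures(a, b, c, d):
    """Relative risk and odds ratio from a 2x2 table:
    exposed: a events, b non-events; unexposed: c events, d non-events."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio

# Common outcome: 60/100 events among exposed, 40/100 among unexposed.
rr, odds_ratio = two_by_two_measures(60, 40, 40, 60)
# rr = 1.5 but odds_ratio = 2.25: reading the OR as an RR would
# overstate the effect by half.
```

With rare outcomes the two measures nearly coincide, which is why the distinction matters specifically in cohort studies and trials of common outcomes.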
A results presentation with bar charts is not transparent, as bar charts hide both the number of observations and their distribution. I recommend presenting the results using dot plots or box plots instead.