Statistical Mistakes

The statistical reviewer's handbook

Prediction modelling

Stepwise regression

The prediction model was fitted using logistic regression with stepwise backward elimination. Why? Did excluding variables from the model improve its predictive accuracy? Furthermore, the elimination is based on the outcome of hypothesis tests that are not directly relevant to predictive accuracy. Wouldn't LASSO regression, least angle regression (LAR) or the Elastic Net lead to a model with even better predictive accuracy?
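The question of whether dropping variables helps prediction can be checked directly on data not used for fitting. The following is a minimal sketch, not a recommended implementation: it assumes simulated data (two informative and four noise predictors, all hypothetical), a single train/validation split instead of proper cross-validation, and a hand-written L1-penalized (LASSO) logistic regression fitted by proximal gradient descent. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, rng):
    """Hypothetical data: 2 informative predictors and 4 pure-noise predictors."""
    true_beta = [1.5, -1.0, 0.0, 0.0, 0.0, 0.0]
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in true_beta]
        p = sigmoid(sum(b * x for b, x in zip(true_beta, row)))
        X.append(row)
        y.append(1 if rng.random() < p else 0)
    return X, y


def soft_threshold(v, t):
    """Shrink v towards zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_lasso_logistic(X, y, lam, steps=200, lr=0.5):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept b0 is not penalized; lam=0 gives an unpenalized fit.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(steps):
        g0, g = 0.0, [0.0] * p
        for row, yi in zip(X, y):
            r = sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
            g0 += r
            for j in range(p):
                g[j] += r * row[j]
        b0 -= lr * g0 / n
        beta = [soft_threshold(b - lr * gj / n, lr * lam)
                for b, gj in zip(beta, g)]
    return b0, beta


def log_loss(X, y, b0, beta):
    """Mean negative log-likelihood; lower means better probabilistic predictions."""
    total = 0.0
    for row, yi in zip(X, y):
        p = sigmoid(b0 + sum(b * x for b, x in zip(beta, row)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= math.log(p) if yi else math.log(1.0 - p)
    return total / len(y)


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
X_train, y_train = simulate(200, rng)
X_val, y_val = simulate(200, rng)

# Compare penalty strengths on held-out data rather than by p-values.
for lam in [0.0, 0.01, 0.05, 1.0]:
    b0, beta = fit_lasso_logistic(X_train, y_train, lam)
    kept = sum(1 for b in beta if b != 0.0)
    loss = log_loss(X_val, y_val, b0, beta)
    print(f"lambda={lam:<5} predictors kept={kept} validation log loss={loss:.3f}")
```

A model obtained by backward elimination can be scored with the same held-out criterion, which makes the comparison asked for above explicit. In a real analysis, cross-validation and an established implementation of penalized regression would be preferable to this hand-written fit.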

Interpretation of regression coefficients

The presented regression coefficients of the prediction model can be misinterpreted as estimates of relative risk. As the model has not been developed with respect to cause-effect relations between the variables, these estimates are likely to reflect confounding bias and perhaps also collider-stratification bias and overadjustment bias; see e.g. Westreich D, Greenland S. The table 2 fallacy: presenting and interpreting confounder and modifier coefficients. Am J Epidemiol 2013;177:292-298. I recommend removing the tables presenting these coefficients from the manuscript.
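A small simulation can illustrate why a coefficient from a mutually adjusted model need not estimate the effect a reader expects. The sketch below is an assumption-laden toy example, not part of the cited paper: it uses a linear model rather than logistic regression for simplicity, and a hypothetical structure in which X affects Y only through a mediator M. All names and effect sizes are invented for illustration.

```python
import random


def simulate_mediation(n, seed=0):
    """Hypothetical structure X -> M -> Y; X has no direct effect on Y."""
    rng = random.Random(seed)
    xs, ms, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        m = x + rng.gauss(0.0, 1.0)  # X raises M by 1 per unit
        y = m + rng.gauss(0.0, 1.0)  # Y depends on M only
        xs.append(x)
        ms.append(m)
        ys.append(y)
    return xs, ms, ys


def cov(a, b):
    """Sample covariance (divisor n); cov(a, a) is the variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def slope_simple(x, y):
    """Least-squares slope of y on x alone."""
    return cov(x, y) / cov(x, x)


def slopes_adjusted(x, m, y):
    """Least-squares slopes of y on x and m jointly (2x2 normal equations)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    bx = (smm * sxy - sxm * smy) / det
    bm = (sxx * smy - sxm * sxy) / det
    return bx, bm


x, m, y = simulate_mediation(5000)
total = slope_simple(x, y)
bx, bm = slopes_adjusted(x, m, y)
print(f"Y ~ X     : coefficient of X = {total:.2f} (total effect, truly 1)")
print(f"Y ~ X + M : coefficient of X = {bx:.2f} (direct effect, truly 0)")
print(f"Y ~ X + M : coefficient of M = {bm:.2f}")
```

In this toy setting both models are fitted correctly, yet the coefficient of X in the adjusted model describes only the direct effect. A reader who takes it as the overall effect of X would wrongly conclude that X does not matter, which is one form of the overadjustment problem mentioned above.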