Chapter 11 Conclusion

Statistical analysis should start with the questions that we are trying to answer using the data. These questions of interest should guide not only the statistical analysis, but also the data collection and, finally, the interpretation of the analysis and modelling results. In order to answer these questions with data, we always represent the Large World with Small World models. There is no entirely objective way to do this; rather, a pluralism of approaches (133,134) should be applied. The value of these models and Small World representations should be judged by the qualities suggested by Gelman and Hennig (56): transparency, consensus, impartiality, correspondence to observable reality, awareness of multiple perspectives, awareness of context-dependence, and investigation of stability. Finally, we need to accept that we must act based on cumulative knowledge rather than rely solely on single studies or even single lines of research (8).

For example, let’s take a question that a sport practitioner might ask: “In how many athletes can I expect to see positive improvements after this intervention? Will Johnny improve?” This is a predictive question, which is common in practice. Assume that I provide this coach with an estimate of the average causal effect and the accompanying magnitude-based inference (MBI), computed using the SESOI he provided (e.g. 20% harmful, 30% equivalent, and 50% beneficial), or a frequentist p-value (e.g. p<0.05). Will that answer the practitioner’s question?

The accompanying MBIs might even confuse him by appearing to answer the “proportion” question he asked: “So, 50% of athletes will show a beneficial response to the treatment?” Unfortunately, no. MBIs (or METs) answer a different question, one about the estimator (be it the mean, Cohen’s d, or some other), not about the proportions of individual responses.
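The distinction between a statement about the estimator and a statement about individuals can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the change scores are made-up numbers, the SESOI band of ±2 is hypothetical, and the normal approximation to the sampling distribution of the mean is a simplified stand-in, not the actual MBI procedure nor anything taken from bmbstats.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical change scores (post - pre) for 12 athletes; made-up numbers.
change = [4.0, -1.0, 6.0, 2.0, 9.0, 0.5, 3.0, -2.5, 7.0, 1.5, 5.0, 2.5]

# Hypothetical SESOI band: changes inside it are treated as equivalent.
SESOI_LOWER, SESOI_UPPER = -2.0, 2.0


def mean_effect_magnitudes(x, lower, upper):
    """Probabilities that the MEAN change is harmful, equivalent, beneficial.

    Simplifying assumption: the sampling distribution of the mean is
    approximated by a normal distribution with SD equal to the standard error.
    """
    sampling = NormalDist(mean(x), stdev(x) / sqrt(len(x)))
    p_harmful = sampling.cdf(lower)
    p_beneficial = 1 - sampling.cdf(upper)
    return p_harmful, 1 - p_harmful - p_beneficial, p_beneficial


def response_proportions(x, lower, upper):
    """Observed proportions of INDIVIDUAL changes below, inside, above the band."""
    n = len(x)
    harmful = sum(v < lower for v in x) / n
    beneficial = sum(v > upper for v in x) / n
    return harmful, 1 - harmful - beneficial, beneficial


mean_based = mean_effect_magnitudes(change, SESOI_LOWER, SESOI_UPPER)
individual = response_proportions(change, SESOI_LOWER, SESOI_UPPER)

print("About the mean (harmful, equivalent, beneficial):", mean_based)
print("About individuals (harmful, equivalent, beneficial):", individual)
```

In this made-up sample, 7 of the 12 athletes show a change above the SESOI, while the probability attached to the mean effect being beneficial is considerably higher. The two sets of numbers answer different questions, so one cannot be read as the other.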

The second part of his question also demands predictive modeling, which calls for taking Johnny’s known data into account to get the best estimate of his response. If there is some historical data about Johnny, we might get better predictions (e.g. through individual modeling or a hierarchical model); if not, the average-based estimators might be our best guess of the most likely response that Johnny might manifest. Reporting the proportions of responses on top of the average effect estimate might help answer questions about the uncertainty regarding individual responses.

I am not saying that estimates of the average effect and the accompanying inferences (p-values, MBIs, or METs) are not important. I am only saying that they should not be automatically selected as the answer to every question. CIs can provide us with an uncertainty interval around proportions (e.g. “The model gives us 90% confidence that the proportion of beneficial responses will be 40-60%”), or even around predictive performance metrics. However, we need to start with the question asked, and to tailor our analysis and conclusions so that practitioners can understand them and, finally, act on them.
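One way to put an uncertainty interval around a proportion of beneficial responses is a percentile bootstrap. The Python sketch below is only an illustration under stated assumptions: the change scores are made-up numbers, the SESOI threshold of 2 is hypothetical, the function name `bootstrap_proportion_ci` is introduced here for the example, and the simple percentile interval is one of several possible choices rather than the method used by bmbstats.

```python
import random

# Hypothetical change scores (post - pre) for 12 athletes; made-up numbers.
change = [4.0, -1.0, 6.0, 2.0, 9.0, 0.5, 3.0, -2.5, 7.0, 1.5, 5.0, 2.5]

# Hypothetical upper SESOI threshold: changes above it count as beneficial.
SESOI_UPPER = 2.0


def proportion_beneficial(x, threshold):
    """Proportion of individual changes above the threshold."""
    return sum(v > threshold for v in x) / len(x)


def bootstrap_proportion_ci(x, threshold, n_boot=2000, level=0.90, seed=1):
    """Percentile bootstrap interval for the proportion of beneficial changes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(x, k=len(x))  # sample athletes with replacement
        estimates.append(proportion_beneficial(resample, threshold))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - tail) * n_boot))]
    return lower, upper


observed = proportion_beneficial(change, SESOI_UPPER)
ci_lower, ci_upper = bootstrap_proportion_ci(change, SESOI_UPPER)

print(f"Observed proportion of beneficial responses: {observed:.2f}")
print(f"90% bootstrap interval: {ci_lower:.2f} to {ci_upper:.2f}")
```

With only 12 athletes the interval is wide, which is exactly the kind of uncertainty a practitioner should see alongside the point estimate of the proportion.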

In the following part of this book, I will provide solutions to the most common sport science problems and questions using the material covered in this part, as well as the bmbstats package written by the author.

References

8. Amrhein, V, Trafimow, D, and Greenland, S. Inferential Statistics as Descriptive Statistics: There Is No Replication Crisis if We Don’t Expect Replication. The American Statistician 73: 262–270, 2019.

56. Gelman, A and Hennig, C. Beyond subjective and objective in statistics. Journal of the Royal Statistical Society: Series A (Statistics in Society) 180: 967–1033, 2017.

133. Mitchell, S. Unsimple truths: Science, complexity, and policy. paperback ed. Chicago, Ill.: The Univ. of Chicago Press, 2012.

134. Mitchell, SD. Integrative Pluralism. Biology & Philosophy 17: 55–70, 2002.