bmbstats: bootstrap magnitude-based statistics for sports scientists
2020-09-22
Welcome
The aim of this book is to provide an overview of the three classes of tasks in statistical modeling: description, prediction, and causal inference (76). Statistical inference is often required for all three tasks. Short introductions to frequentist null-hypothesis testing, Bayesian estimation, and the bootstrap are provided. Special attention is given to practical significance, with the introduction of magnitude-based estimators and statistical inference built on the concept of the smallest effect size of interest (SESOI). Measurement error is discussed with the particular aim of interpreting individual change scores. In the second part of this book, common sports science problems are introduced and analyzed with the bmbstats package.
This book, as well as the bmbstats package, is in active open-source development. Please feel free to contribute a pull request on GitHub when you spot an issue or have an idea for an improvement. I hope that both this book and the bmbstats package will become collaborative tools that help up-and-coming as well as experienced researchers and sports scientists.
R and R packages
This book is fully reproducible and was written in R (154) with the R packages automatic (117), bayestestR (128), bmbstats (97), bookdown (205), boot (39), carData (52), caret (109), cowplot (201), directlabels (80), dorem (96), dplyr (194), effects (49–51), forcats (192), ggplot2 (190), ggridges (202), ggstance (70), hardhat (184), kableExtra (208), knitr (204), lattice (173), markdown (3), Metrics (64), minerva (1), mlr (17,19,116), mlr3 (116), mlrmbo (19), multilabel (152), nlme (151), openml (32), ParamHelpers (18), pdp (62), psych (156), purrr (69), readr (196), rpart (182), shorts (95), stringr (191), tibble (141), tidyr (195), tidyverse (193), vip (61), visreg (26), and vjsim (98).
License
This work, as a whole, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The code contained in this book is simultaneously available under the MIT license; this means that you are free to use it in your own packages, as long as you cite the source.
References
1. Albanese, D, Filosi, M, Visintainer, R, Riccadonna, S, Jurman, G, and Furlanello, C. minerva and minepy: A C engine for the MINE suite and its R, Python and MATLAB wrappers. Bioinformatics bts707, 2012.
3. Allaire, J, Horner, J, Xie, Y, Marti, V, and Porte, N. markdown: Render Markdown with the C library 'sundown'. 2019. Available from: https://CRAN.R-project.org/package=markdown
17. Bischl, B, Lang, M, Kotthoff, L, Schiffner, J, Richter, J, Studerus, E, et al. mlr: Machine learning in R. Journal of Machine Learning Research 17: 1–5, 2016. Available from: http://jmlr.org/papers/v17/15-066.html
18. Bischl, B, Lang, M, Richter, J, Bossek, J, Horn, D, and Kerschke, P. ParamHelpers: Helpers for parameters in black-box optimization, tuning and machine learning. 2020. Available from: https://CRAN.R-project.org/package=ParamHelpers
19. Bischl, B, Richter, J, Bossek, J, Horn, D, Thomas, J, and Lang, M. mlrMBO: A modular framework for model-based optimization of expensive black-box functions. arXiv preprint arXiv:1703.03373, 2017.
26. Breheny, P and Burchett, W. Visualization of regression models using visreg. The R Journal 9: 56–71, 2017.
32. Casalicchio, G, Bossek, J, Lang, M, Kirchhoff, D, Kerschke, P, Hofner, B, et al. OpenML: An R package to connect to the machine learning platform OpenML. Computational Statistics 1–15, 2017.
39. Davison, AC and Hinkley, DV. Bootstrap methods and their applications. Cambridge: Cambridge University Press, 1997. Available from: http://statwww.epfl.ch/davison/BMA/
49. Fox, J. Effect displays in R for generalised linear models. Journal of Statistical Software 8: 1–27, 2003. Available from: http://www.jstatsoft.org/v08/i15/
51. Fox, J and Weisberg, S. Visualizing fit and lack of fit in complex regression models with predictor effect plots and partial residuals. Journal of Statistical Software 87: 1–27, 2018. Available from: https://www.jstatsoft.org/v087/i09
52. Fox, J, Weisberg, S, and Price, B. carData: Companion to applied regression data sets. 2019. Available from: https://CRAN.R-project.org/package=carData
61. Greenwell, B, Boehmke, B, and Gray, B. vip: Variable importance plots. 2020. Available from: https://CRAN.R-project.org/package=vip
62. Greenwell, BM. pdp: An R package for constructing partial dependence plots. The R Journal 9: 421–436, 2017. Available from: https://journal.r-project.org/archive/2017/RJ-2017-016/index.html
64. Hamner, B and Frasco, M. Metrics: Evaluation metrics for machine learning. 2018. Available from: https://CRAN.R-project.org/package=Metrics
69. Henry, L and Wickham, H. purrr: Functional programming tools. 2020. Available from: https://CRAN.R-project.org/package=purrr
70. Henry, L, Wickham, H, and Chang, W. ggstance: Horizontal 'ggplot2' components. 2020. Available from: https://CRAN.R-project.org/package=ggstance
76. Hernán, MA, Hsu, J, and Healy, B. A Second Chance to Get Causal Inference Right: A Classification of Data Science Tasks. CHANCE 32: 42–49, 2019.
80. Hocking, TD. directlabels: Direct labels for multicolor plots. 2020. Available from: https://CRAN.R-project.org/package=directlabels
95. Jovanovic, M. shorts: Short sprints. 2020. Available from: https://mladenjovanovic.github.io/shorts/
96. Jovanovic, M and Hemingway, BS. dorem: Dose response modeling. 2020. Available from: https://dorem.net
97. Jovanović, M. bmbstats: Bootstrap magnitude-based statistics. Belgrade, Serbia, 2020. Available from: https://github.com/mladenjovanovic/bmbstats
98. Jovanović, M. vjsim: Vertical jump simulator. 2020. Available from: https://mladenjovanovic.github.io/vjsim/
109. Kuhn, M. caret: Classification and regression training. 2020. Available from: https://CRAN.R-project.org/package=caret
116. Lang, M, Binder, M, Richter, J, Schratz, P, Pfisterer, F, Coors, S, et al. mlr3: A modern object-oriented machine learning framework in R. Journal of Open Source Software, 2019. Available from: https://joss.theoj.org/papers/10.21105/joss.01903
117. Lang, M, Kotthaus, H, Marwedel, P, Weihs, C, Rahnenfuehrer, J, and Bischl, B. Automatic model selection for high-dimensional survival analysis. Journal of Statistical Computation and Simulation 85: 62–76, 2014.
128. Makowski, D, Ben-Shachar, MS, and Lüdecke, D. bayestestR: Describing effects and their uncertainty, existence and significance within the Bayesian framework. Journal of Open Source Software 4: 1541, 2019. Available from: https://joss.theoj.org/papers/10.21105/joss.01541
141. Müller, K and Wickham, H. tibble: Simple data frames. 2020. Available from: https://CRAN.R-project.org/package=tibble
151. Pinheiro, J, Bates, D, DebRoy, S, Sarkar, D, and R Core Team. nlme: Linear and nonlinear mixed effects models. 2020. Available from: https://CRAN.R-project.org/package=nlme
152. Probst, P, Au, Q, Casalicchio, G, Stachl, C, and Bischl, B. Multilabel classification with R package mlr. arXiv preprint arXiv:1703.08991, 2017.
154. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2020. Available from: https://www.R-project.org/
156. Revelle, W. psych: Procedures for psychological, psychometric, and personality research. Evanston, Illinois: Northwestern University, 2019. Available from: https://CRAN.R-project.org/package=psych
173. Sarkar, D. Lattice: Multivariate data visualization with R. New York: Springer, 2008. Available from: http://lmdvr.r-forge.r-project.org
182. Therneau, T and Atkinson, B. rpart: Recursive partitioning and regression trees. 2019. Available from: https://CRAN.R-project.org/package=rpart
184. Vaughan, D and Kuhn, M. hardhat: Construct modeling packages. 2020. Available from: https://CRAN.R-project.org/package=hardhat
190. Wickham, H. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York, 2016. Available from: https://ggplot2.tidyverse.org
191. Wickham, H. stringr: Simple, consistent wrappers for common string operations. 2019. Available from: https://CRAN.R-project.org/package=stringr
192. Wickham, H. forcats: Tools for working with categorical variables (factors). 2020. Available from: https://CRAN.R-project.org/package=forcats
193. Wickham, H, Averick, M, Bryan, J, Chang, W, McGowan, LD, François, R, et al. Welcome to the tidyverse. Journal of Open Source Software 4: 1686, 2019.
194. Wickham, H, François, R, Henry, L, and Müller, K. dplyr: A grammar of data manipulation. 2020. Available from: https://CRAN.R-project.org/package=dplyr
195. Wickham, H and Henry, L. tidyr: Tidy messy data. 2020. Available from: https://CRAN.R-project.org/package=tidyr
196. Wickham, H, Hester, J, and Francois, R. readr: Read rectangular text data. 2018. Available from: https://CRAN.R-project.org/package=readr
201. Wilke, CO. cowplot: Streamlined plot theme and plot annotations for 'ggplot2'. 2019. Available from: https://CRAN.R-project.org/package=cowplot
202. Wilke, CO. ggridges: Ridgeline plots in 'ggplot2'. 2020. Available from: https://CRAN.R-project.org/package=ggridges
204. Xie, Y. Dynamic documents with R and knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC, 2015. Available from: https://yihui.org/knitr/
205. Xie, Y. bookdown: Authoring books and technical documents with R Markdown. Boca Raton, Florida: Chapman and Hall/CRC, 2016. Available from: https://github.com/rstudio/bookdown
208. Zhu, H. kableExtra: Construct complex table with 'kable' and pipe syntax. 2019. Available from: https://CRAN.R-project.org/package=kableExtra