Improper Application and Interpretation of Sports Science Statistics

By Team | September 14, 2015

Juking the Stats: Why not all “research” is valid

The latest craze in competitive sport appears to be the use of data to aid understanding of, and improvement in, sporting performance. This has resulted in a glut of material, each item claiming to have established some new result which may have useful implications for the development and performance of human athletes.

There are often studies conducted with non-athletes as well, and the line between what could be considered medical research as opposed to what is known as sports science is not clearly defined.

I should stress that my knowledge of the field of sports science is limited. The purpose of this article is to question the structure and findings of some typical articles.

A typical paper in this field might take the following form:

1) Design a study with some hypothesis of interest
2) Collect data from subjects (fitness testing)
3) Analyse the data to check for consistency with the hypothesis
4) Draw conclusions.

A good statistician should be able to perform multiple roles.

In my opinion, some of the most important are:

1) To decide on the real questions at the heart of a problem of interest, not just to churn out results for the sake of it.
2) To decide if a hypothesis is necessary, and if so to construct one which is of real interest. Sometimes it is best to approach a problem with an open mind, in the knowledge that there are likely to be interesting results, but unsure of what they will be.
3) To employ appropriate methods (typically statistical models) to analyse the collected data (we don’t need to get too technical here).
4) To explain the underlying reasons behind any results – studies in which results are simply quoted as gospel are of limited interest to me.
5) To critically review the work, pointing out potential shortcomings and areas for future research.

The final point is perhaps the most interesting. It is often the role of a statistician to dampen (or in some cases pour cold water on) enthusiasm about some exciting results.

Sports Science Statistics must be taken in context.

Conclusions drawn from a study of, say, a weightlifter’s improved performance due to a certain type of training programme should not be used as an automatic basis for a different strength-based sport, such as rowing.

I work in the field of weather forecasting. A modern-day weather forecast involves running a computer model forward in time to produce a single forecast of the atmosphere. Statistics of this forecast (such as the average forecast error) can be calculated at different locations. It is well-known that such statistics vary by location – it is more difficult to predict the weather in Shetland than in the Sahara Desert. We could not, therefore, use statistics derived from one location to predict the average forecast error in another.

In short, statistics is about describing what might have happened in a given context, but didn’t. We can use these findings to issue probabilities of what might happen in the future, on the basis that the context is consistent.

Forget the weather: what about sport science?

The few articles I have read in the sports science field (in all honesty I couldn’t face reading too many!) seem to fall short on many of the above points.

For example, Owen et al. (J Strength Cond Res, 2011) conduct a study of heart rate responses of soccer players when playing in three-sided and nine-sided games. They conclude that the HR of players in three-sided games is consistently higher than for nine-sided games. They also note that three-sided games provide more shooting chances, and encourage players to run more with the ball, whilst the nine-sided games produced more tackles, passes and interceptions.

They draw the conclusions that three-sided games are preferable for fitness training, and suggest that strikers should participate in three-sided games whilst defenders should concentrate on nine-sided games.

I have two main problems with this work from both a scientific and practical viewpoint.

1) The statistics quoted should themselves be treated with caution, given the small sample of fifteen players who participated in only a few games of each type. Without conducting a formal test I cannot be more precise, but these measurements are undoubtedly subject to substantial variation.
2) What insight does the study really offer us? Aren’t the findings, on which the entire article is based, merely confirmation of the obvious? It is useful here to consider the so-called “pyramid of outcomes”.

This study gives only surrogate measures (the base of the pyramid), but assumes in the conclusion that such surrogates automatically extend into true performance measures (essentially whether they can be used to increase the probability of winning football matches).

This assumption seems completely without foundation when one considers the practical implications of the study. For example, suppose that on the basis of the study, strikers train in three-sided games whilst defenders train in nine-sided games, in order to provide more shooting opportunities for strikers and more defending opportunities for defenders. Is there really any point in this? Wouldn’t three-sided games just result in strikers shooting from anywhere, and playing (by definition) against less able defensive opposition? Surely the way to improve as a striker is to learn how to play against good defenders?

Frankly, this work smacks of conducting a study for the sake of it, and drawing conclusions based on a few surrogate measurements without paying any attention to the sport of interest.
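To put a rough number on the first objection above (the small sample), here is a minimal Python sketch. It simulates many repetitions of a hypothetical study and measures how much the estimated average heart-rate difference moves around from one repetition to the next. The function name, the 5 bpm true difference and the 10 bpm spread between players are all assumptions chosen purely for illustration; none of them are taken from Owen et al.

```python
import random
import statistics


def spread_of_study_means(n_players, true_diff=5.0, player_sd=10.0,
                          n_studies=2000, seed=1):
    """Standard deviation of the estimated mean difference across repeated studies.

    All default values are hypothetical, not figures from any published paper.
    """
    rng = random.Random(seed)
    study_means = []
    for _ in range(n_studies):
        # One simulated study: each player contributes one heart-rate
        # difference (small-sided game minus large-sided game), in bpm.
        diffs = [rng.gauss(true_diff, player_sd) for _ in range(n_players)]
        study_means.append(statistics.fmean(diffs))
    return statistics.stdev(study_means)


small_study = spread_of_study_means(15)    # sample size quoted in the article
large_study = spread_of_study_means(150)   # a hypothetical larger study

print(f"spread of the estimate with  15 players: {small_study:.2f} bpm")
print(f"spread of the estimate with 150 players: {large_study:.2f} bpm")
```

Under these assumed numbers the estimate from fifteen players wanders far more between repetitions than the estimate from the larger study, which is the sense in which the quoted statistics should be treated with caution.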

How to conduct a more informative study.

1) Collect a larger sample of players from a variety of clubs, preferably from different countries.
2) Train different groups of players in different environments, as suggested by the study.
3) Collect surrogate measurements from the different training sessions.
4) Examine if the surrogates had an effect on actual game results (i.e. construct a proper statistical model rather than merely reporting surrogate values).
5) Examine whether a return to previous training routines results in a reversion to previous performance.

A statistical model is essentially the use of surrogate measurements to aid in predicting the value of, and assessing the uncertainty in, measurements at the top of the pyramid. The article mentioned here simply assumes that larger surrogate values immediately imply improved results, an assertion which is without foundation.
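As a sketch of what such a model might look like, the Python below fits a simple logistic regression of match outcome (won or not) on a single surrogate measurement. Logistic regression is my choice for illustration only, and it is far simpler than anything a real study would need. The data are simulated, and the function `fit_logistic`, the variable names and the assumed true slope of 0.3 are all hypothetical.

```python
import math
import random


def fit_logistic(xs, ys, n_iter=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(a + b * x))) by Newton-Raphson.

    Returns the intercept a, the slope b and an approximate standard
    error for b taken from the inverse of the information matrix.
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p            # gradient of the log-likelihood
            g_b += (y - p) * x
            h_aa += w               # information matrix entries
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system (information) * delta = gradient.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    # Uses the information matrix from the final iteration, which has
    # effectively converged by this point.
    se_b = math.sqrt(h_aa / det)
    return a, b, se_b


rng = random.Random(42)
TRUE_SLOPE = 0.3  # assumed effect of the surrogate, for illustration only

# Simulated, standardised surrogate values (e.g. training heart rate).
surrogate = [rng.gauss(0.0, 1.0) for _ in range(300)]
# Simulated match outcomes: 1 = won, 0 = did not win.
won = [1 if rng.random() < 1.0 / (1.0 + math.exp(-TRUE_SLOPE * x)) else 0
       for x in surrogate]

intercept, slope, slope_se = fit_logistic(surrogate, won)
low, high = slope - 1.96 * slope_se, slope + 1.96 * slope_se
print(f"estimated slope: {slope:.2f}, approximate 95% interval: ({low:.2f}, {high:.2f})")
```

The point of the interval is that the model reports its own uncertainty about whether the surrogate relates to results at all, rather than assuming that a larger surrogate value must mean better performance.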

Such a study would admittedly be hard to carry out both practically and from a theoretical statistical viewpoint. However, we are dealing with complicated situations – we are essentially trying to model outcomes from the human body, an immensely complicated organism.

This is my overriding point: studies which simply churn out results for the sake of publishing papers are of little practical use. I would go further and suggest that they are actually dangerous in the wrong hands – a statistical model is no good in the hands of an incapable operator.

Conclusion

From my brief consultation of the literature, I have seen many examples of a misuse of statistics which would not be permitted in a statistics journal.

The typical methods used are likely to underestimate the complexity of the situation at hand. I suspect, therefore, that the true values of statistics such as the p-value are somewhat larger than reported.
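One way in which a reported p-value can end up too small (an illustrative assumption, not a claim about any particular paper) is when repeated measurements on the same few players are analysed as if they were independent. The Python sketch below simulates a situation in which there is no average effect at all and counts how often each analysis nevertheless declares a "significant" result at the 5% level. Every number in it is hypothetical.

```python
import math
import random
import statistics


def false_positive_rates(n_players=15, n_repeats=10, player_sd=1.0,
                         noise_sd=1.0, n_sims=2000, seed=7):
    """Return (naive rate, player-level rate) of false positives at the 5% level.

    The critical value 2.145 below is only valid for n_players = 15.
    """
    rng = random.Random(seed)
    naive_hits = 0
    player_hits = 0
    for _ in range(n_sims):
        per_player = []
        for _ in range(n_players):
            # Each player responds differently to the change of format, but
            # the average response over all players is zero (no real effect).
            own_effect = rng.gauss(0.0, player_sd)
            per_player.append(
                [own_effect + rng.gauss(0.0, noise_sd) for _ in range(n_repeats)]
            )

        # Naive analysis: pool every measurement as if it were independent
        # (normal approximation, two-sided 5% critical value 1.96).
        pooled = [d for player in per_player for d in player]
        z = statistics.fmean(pooled) / (statistics.stdev(pooled) / math.sqrt(len(pooled)))
        naive_hits += abs(z) > 1.96

        # Player-level analysis: one mean per player, then a one-sample
        # t-test on those means (critical value 2.145 for 14 degrees of freedom).
        means = [statistics.fmean(player) for player in per_player]
        t = statistics.fmean(means) / (statistics.stdev(means) / math.sqrt(n_players))
        player_hits += abs(t) > 2.145
    return naive_hits / n_sims, player_hits / n_sims


naive_rate, player_rate = false_positive_rates()
print(f"false positives, measurements pooled: {naive_rate:.1%}")
print(f"false positives, one value per player: {player_rate:.1%}")
```

In this made-up setting the pooled analysis rejects far more often than the nominal 5%, so a p-value computed that way would understate the true uncertainty.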

I feel confident in asserting that the conclusions of the articles I have read are based on extremely shaky ground in a theoretical sense, let alone their practical shortcomings.

Robin Williams, Statistics PhD Student (University of Exeter), England Blind Footballer, 2012 Olympian

10 Comments

James Marshall on September 13, 2011 at 7:38 pm

Thanks a lot Robin: brilliant analysis. Sorry you had to read the journals first: welcome to my world!
What is worrying is that people who stick their heads in journals and quote “research” based on dodgy stats ignore what is in front of them.
Anonymous on September 14, 2011 at 11:43 am

If you would like to read more on this subject and how poor research and suspect reporting affects the information that reaches us then try the book “Bad Science” by Ben Goldacre. Informative, enlightening and easy to read.
Anonymous on September 14, 2011 at 12:48 pm

Excellent piece even to the supremely statistically uninclined like me.
Topsy on June 5, 2017 at 12:27 pm

Well put and informative James, thank you.
It can be a bit of a minefield these days, with so much information being pushed out there, it's nice to see some clarity on it.