The global COVID-19 pandemic has generated health problems worldwide [1]. In recent years, misuse of statistical analyses, leading to flawed or inaccurate conclusions, has been observed around the world [2]. At the International Statistical Congress held in Malaysia in 2019, it was stated that only 40% of accepted medical articles are statistically correct [3]. It appears that this also applies to articles about COVID-19.
Methods
The analysis included 2600 articles on the medical aspects of COVID-19 (overall quality of life, medical/pharmacological approach) published between the beginning of 2020 and June 2021. International databases such as PubMed and Scopus were searched for articles on the medical aspects of COVID-19. Articles published in journals with an impact factor of up to 15 were considered, of which over 95% had an impact factor of at most 10. The statistical correctness of each article was assessed with respect to all aspects of the analysis performed. This included the selection of appropriate statistical tests (including checking their assumptions), as well as the correct interpretation and reporting of the results obtained (including the use of appropriate descriptive statistics). Additional factors were also considered during the evaluation, such as the description of the analysed variables, the statistical software used, the adequacy of the sample size, and the handling of possible missing data. An article was classified as one in which the statistical analysis was conducted incorrectly if the results were analysed incorrectly and then relied on in the discussion. This applies, for example, to the incorrect selection of statistical tests, or to evident violations of the assumptions of statistical tests that the authors did not describe.
Results
Firstly, of the 2600 analysed publications on the medical aspects of COVID-19, only 39% (n = 1014; unpublished) were statistically correct. This can lead to ambiguous results regarding various aspects of COVID-19, which is not what any of us expects. The most common mistakes include the use of inadequate statistical tests (including parametric tests applied despite unfulfilled assumptions), as well as incorrect estimation or underestimation of the required sample size. The more advanced statistical analyses used by authors in a manuscript should be reviewed by experienced people, for example statistical reviewers [4, 5].
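As an illustration of the sample-size issue mentioned above, the following is a minimal sketch of an a priori sample-size estimate for comparing two group means. It uses the standard normal approximation rather than any method from the reviewed articles, and the function name, the default significance level (0.05) and the default power (0.80) are assumptions made for this example only. An exact calculation based on the t distribution would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate number of participants needed in EACH of two groups.

    Normal-approximation formula for a two-sided comparison of two means:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    where d is the standardized effect size (Cohen's d).
    Hypothetical helper written for illustration only.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller expected effects require considerably larger samples.
    for d in (0.8, 0.5, 0.2):
        print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

Under these assumptions, a study expecting a small effect needs several times more participants than one expecting a large effect, which is why a sample size chosen without such a calculation is easily too small.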
The table below lists only the basic statistical errors made by authors when publishing research results related to various aspects of COVID-19 (Table I).
Table I
Secondly, the retraction of articles on COVID-19 due to incoherent data also affects prestigious international periodicals. One result of this situation is information chaos, which undermines trust in reliable sources of information and affects attitudes to vaccination against COVID-19. In a study of 3480 non-medical students, 75% of them (n = 2610; unpublished) stated that, due to such situations, their confidence in the results of studies on COVID-19 published in prestigious journals had significantly decreased.
Thirdly, as has been seen for many years, young researchers who want to have their articles published in ranked journals frequently commit statistical fraud. For example, in a study of 14,000 people working in various fields of medicine (including physicians, graduate students, PhD students, PhDs, and professors), as many as 76% of respondents stated that they did not know what cumulative type I error is. Forty-six percent admitted that they often performed several or over a dozen t-tests instead of conducting an analysis of variance. While 10% of them did so due to a lack of knowledge, the others wanted to increase the chance of obtaining a statistically significant result [3].
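The effect of replacing an analysis of variance with many t-tests can be sketched numerically. The example below is an illustration only, not part of the cited study: it assumes that every test is run at a significance level of 0.05 and that the tests are independent, which pairwise t-tests on the same groups are not, so the figures should be read as an approximation. All function names are hypothetical.

```python
def pairwise_comparisons(num_groups: int) -> int:
    """Number of distinct pairs of groups, i.e. t-tests needed to compare all pairs."""
    return num_groups * (num_groups - 1) // 2


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across num_tests tests.

    Assumes independent tests, each at level alpha: 1 - (1 - alpha) ** num_tests.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for groups in (3, 4, 6):
        tests = pairwise_comparisons(groups)
        rate = familywise_error_rate(tests)
        corrected = bonferroni_alpha(tests)
        print(f"{groups} groups: {tests} t-tests, "
              f"cumulative type I error about {rate:.2f}, "
              f"Bonferroni-corrected alpha {corrected:.4f}")
```

With four groups, six pairwise t-tests are needed, and under the stated assumptions the chance of at least one false positive rises from 5% to roughly 26%. This is the cumulative type I error, and it explains why uncorrected multiple t-tests increase the chance of a "significant" result.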
The same, unfortunately, applies to the analysis of results obtained for COVID-19. Researchers take up this issue because it is topical, and they assume that even if they carry out their analysis incorrectly, they will still have a chance to obtain points for the published article. Instead of improving the quality of life of people suffering from COVID-19, they may unfortunately work to these patients' detriment. In an example study that I carried out in 2020 on a group of 550 research-active psychiatrists, 48% of them (n = 264; unpublished) began research on COVID-19, explaining that it was a topical issue and that they would therefore be able to publish the results faster. As we know, speed of publication plays a significant role in the functioning of research units. This speed may unfortunately result in the publication of scientific articles in which the statistical analysis is carried out incorrectly. It may be related to the factors that motivate employees to conduct scientific research, such as competitive pressures, institutional, regional, and national recognition, and financial remuneration [6, 7].
Discussion
Optimizing protection against infection in wealthy nations (e.g. the EU, the USA) while not helping low-income countries with vaccinations will lead to a prolonged pandemic. The editors of medical journals should pay more attention to validating authors' statistical analyses. Having statistical reviewers or statistical editors is essential for biomedical journals. The authors of an article published in PLoS One point out that there is a great need to improve statistical education [8]. The substantive assessment of submitted articles alone is clearly insufficient. We will have to face this pandemic for a long time, not only because researchers fail to observe statistical rigour, but also because a large proportion of the general population fails to observe specific prevention measures.
In conclusion, results on various aspects of COVID-19 submitted to journals should be subjected to thorough statistical review.