Introduction

Breast cancer (BC) is the most prevalent cancer among women, accounting for 30% of all cancer cases and 15% of cancer deaths in 2022 [1, 2]. Recent reports identify BC as the leading cause of cancer-related death among women worldwide. Furthermore, its incidence and mortality rates have risen steadily since the beginning of the twenty-first century [3]. To lower this growing burden [4], researchers have been actively investigating risk factors for BC with the aim of reducing its incidence [5, 6].

Physical activity (PA) and exercise are considered to have a positive effect on the prevention and treatment of various chronic diseases [7, 8]. Individuals reporting higher levels of PA tend to exhibit better overall health [9, 10]. However, engagement in PA differs greatly from person to person and is shaped by environmental factors such as culture and the economy. Multiple studies have also provided evidence that the human predisposition to exercise is associated with genetic factors [11, 12]. Several epidemiological studies have investigated the relationship between PA and BC, with conflicting findings [13–15]. While some prospective cohort studies have suggested a link between PA and reduced BC risk [16, 17], other studies have found no correlation between PA and BC risk [18, 19]. Observational studies are subject to random and systematic errors that affect the validity of these findings, including selection bias, cohort design bias, limited sample size, loss to follow-up, and reverse causality between exposure and outcome. Randomized controlled trials, in turn, are restricted by ethical issues, cost, and long follow-up periods. It therefore remains unknown whether PA plays a causal role in BC. It is also difficult to distinguish the effects of the duration, intensity, and type of exercise, and harder still to determine which types of exercise are best for patients with BC. Thus, a two-sample Mendelian randomization (MR) method was used in this study to identify a potential causal link between different types of PA and BC.

In MR analysis, single nucleotide polymorphisms (SNPs) that are strongly associated with the exposure (for example, heavy PA) serve as instrumental variables (IVs) to estimate the causal effect on the outcome (i.e., BC). MR resembles a ‘natural’ randomized controlled trial (RCT) because it makes use of the random allocation of genetic variants that influence the exposure [20]. SNPs that are strongly linked with confounders are eliminated before the MR analysis in order to remove the effect of confounding factors, and reverse MR analysis can help to rule out a reverse causal effect of the outcome on the exposure. In this study, five types of PA of various intensities were analyzed to investigate their association with BC, and bidirectional MR was performed using datasets from genome-wide association studies (GWAS) to examine the causal link between PA and BC.

Material and methods

Study design

A two-sample MR study was designed to estimate the potential causal link between PA and BC. The SNPs were selected as IVs following three essential premises [21]: (1) SNPs should be strongly linked to PA as exposure; (2) SNPs should not be linked to confounding factors; and (3) SNPs should not be linked directly to BC as the outcome (Figure 1).

Figure 1

Study design to assess the correlation between physical activities and risk of breast cancer based on the assumptions of bidirectional Mendelian randomization

https://www.archivesofmedicalscience.com/f/fulltexts/195271/AMS-20-6-195271-g001_min.jpg

GWAS summary statistics

The summary statistics for the various types of PA were extracted from a public online database (IEU Open GWAS Project, https://gwas.mrcieu.ac.uk/). Five types of PA corresponding to different intensities were selected from the database to investigate the causal association with BC: heavy DIY, light DIY, strenuous sports, walking for pleasure, and other exercises. The data were collected through touchscreen questionnaires: participants were presented with the options above and asked to choose the one they had been most involved in during the last month. The survey included 497,174 European participants of both sexes. The PA assessment was recorded across four assessment instances, covering 497,235 participants and thus providing a sample size large enough for reliable results: instance 0 was the initial assessment visit (2006–2010), instance 1 the first repeat assessment visit (2012–13), instance 2 the imaging visit (2014), and instance 3 the first repeat imaging visit (2019).

The summary data for BC, which include 122,977 cases and 105,974 controls, were extracted from the Breast Cancer Association Consortium. Based on estrogen receptor status, the cases were further classified into two categories: estrogen receptor-positive breast cancer (ER+ BC) and estrogen receptor-negative breast cancer (ER– BC). Table I presents details of the exposures and outcomes.

Table I

Detailed information on exposure and outcomes

| Exposure/outcome | N cases | N controls | Sample size | Ancestry | MRC-IEU ID |
|---|---|---|---|---|---|
| Heavy DIY (e.g. weeding, lawn mowing, carpentry, digging) | 197,006 | 263,370 | 460,376 | European | ukb-b-13184 |
| Light DIY (e.g. pruning, watering the lawn) | 236,244 | 224,132 | 460,376 | European | ukb-b-11495 |
| Strenuous sports | 47,468 | 412,908 | 460,376 | European | ukb-b-7663 |
| Walking for pleasure (not as a means of transport) | 329,755 | 130,621 | 460,376 | European | ukb-b-7337 |
| Other exercises (e.g. swimming, cycling, keep fit, bowling) | 222,470 | 237,906 | 460,376 | European | ukb-b-8764 |
| Breast cancer | 122,977 | 105,974 | 228,951 | European | ieu-a-1126 |
| ER+ BC | 69,501 | 105,974 | 175,475 | European | ieu-a-1127 |
| ER– BC | 21,468 | 105,974 | 127,442 | European | ieu-a-1128 |

[i] ER+ BC – estrogen receptor-positive breast cancer, ER– BC – estrogen receptor-negative breast cancer.

Ethical approval

All summary-level datasets in our study were obtained from de-identified public data. Ethical approval and informed consent had already been obtained for the original studies from the relevant ethics committees. Thus, the requirement for ethical approval was waived for this study.

SNP selection

Firstly, we screened for SNPs associated with the exposure at the genome-wide significance level ($p < 5 \times 10^{-8}$). Secondly, we applied a clumping criterion ($r^2 < 0.001$ within a 10,000 kb window) to retain SNPs that were independent of one another with respect to linkage disequilibrium (LD). Thirdly, we excluded SNPs that were not present in the BC dataset, as well as palindromic SNPs, which have the potential to introduce bias. All candidate instrumental SNPs were then submitted to PhenoScanner to identify and exclude SNPs associated with confounders or directly with BC. Subsequently, we harmonized the exposure and outcome data, confirming that the effect of each SNP on the exposure and its effect on the outcome referred to the same allele. Following this, we assessed the possibility of weak instrument bias by calculating F-statistics with the formula $F = \beta^2/\mathrm{SE}^2$, where $\beta$ is the SNP-exposure effect estimate and SE is its standard error, and excluded SNPs with F-statistics below 10. Finally, we employed the MR-PRESSO method to identify outlier SNPs. After the outliers had been removed, the remaining SNPs were used for the subsequent MR analysis. A flowchart illustrating the selection process is provided in Figure 1.
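As an illustration only (the study itself used R packages, and this is not the authors' code), the significance, palindrome and F-statistic filters described above can be sketched in a few lines of Python. The `Snp` record, the `rs_demo_*` identifiers and all numeric values are hypothetical; LD clumping, PhenoScanner lookup and MR-PRESSO outlier removal are deliberately left out because they need external reference data.

```python
from dataclasses import dataclass

GENOME_WIDE_P = 5e-8   # genome-wide significance threshold stated in the text
MIN_F = 10.0           # weak-instrument cutoff stated in the text

# Allele pairs that read the same on both DNA strands (palindromic SNPs)
PALINDROMIC = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


@dataclass
class Snp:
    """Hypothetical record holding one SNP-exposure association."""
    rsid: str
    effect_allele: str
    other_allele: str
    beta: float   # SNP-exposure effect estimate
    se: float     # standard error of beta
    pval: float   # SNP-exposure association p-value


def f_statistic(snp: Snp) -> float:
    """Per-SNP F statistic, F = beta^2 / se^2."""
    return (snp.beta / snp.se) ** 2


def is_palindromic(snp: Snp) -> bool:
    pair = (snp.effect_allele.upper(), snp.other_allele.upper())
    return pair in PALINDROMIC


def select_instruments(snps):
    """Keep SNPs that pass the p-value, palindrome and F-statistic filters."""
    return [
        s for s in snps
        if s.pval < GENOME_WIDE_P
        and not is_palindromic(s)
        and f_statistic(s) >= MIN_F
    ]


# Made-up candidate SNPs, for demonstration only
candidates = [
    Snp("rs_demo_1", "A", "G", 0.030, 0.005, 2e-9),  # passes every filter
    Snp("rs_demo_2", "A", "T", 0.030, 0.005, 2e-9),  # palindromic, dropped
    Snp("rs_demo_3", "C", "T", 0.010, 0.004, 1e-2),  # not genome-wide significant
]
kept = select_instruments(candidates)
print([s.rsid for s in kept])
```

When the p-value is derived from the same effect estimate and standard error, F defined this way is the squared z-score, so a SNP passing the genome-wide threshold will typically already have F close to 30 or higher; the F filter then acts mainly as a safeguard.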

Two-sample Mendelian randomization analysis

Three widely used MR methods were employed to assess causal effects: inverse variance weighted (IVW), weighted median, and MR-Egger [22, 23]. IVW, which is reliable and efficient in the absence of horizontal pleiotropy [24], combines the Wald ratio estimates of the individual SNPs to derive an overall estimate of the effect of PA on BC risk, and is therefore widely used as the primary approach for assessing causality. Odds ratios (ORs) were used to express the effects of PA on BC risk. A significant IVW result (p < 0.05) was considered positive even if the other methods yielded nonsignificant results, provided that their ORs pointed in the same direction and there was no evidence of heterogeneity or pleiotropy. Both fixed-effect and random-effects IVW models were employed to account for possible heterogeneity. Cochran’s Q test was used to assess heterogeneity in the IVW and MR-Egger analyses, with a p-value < 0.05 considered statistically significant [25]. Unlike IVW, the MR-Egger method includes an intercept term that tests for directional horizontal pleiotropy: a non-zero intercept indicates that not all genetic variants are valid instruments and that the IVW estimate may be biased. When the Instrument Strength Independent of Direct Effect (InSIDE) assumption is met, the MR-Egger slope provides an estimate of the causal effect that is adjusted for horizontal pleiotropy [26]. The weighted median method gives a robust effect estimate even in the presence of unbalanced horizontal pleiotropy, provided that at least 50% of the weight comes from valid instruments. Finally, the MR-PRESSO method comprises three functions [27, 28]: detection of horizontal pleiotropy, correction for horizontal pleiotropy through outlier removal, and testing for differences in the causal estimates before and after correction.
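To make the IVW step concrete, the following minimal Python sketch combines per-SNP Wald ratios into a fixed-effect IVW estimate and computes Cochran's Q. It assumes first-order standard errors for the Wald ratios (outcome standard error divided by the absolute SNP-exposure effect) and a multiplicative random-effects adjustment; the function names and the toy summary statistics are hypothetical, and the analyses reported here were run with R packages rather than this code.

```python
import math


def wald_ratios(bx, by, se_y):
    """Per-SNP Wald ratios (by / bx) with first-order standard errors."""
    ratios = [y / x for x, y in zip(bx, by)]
    ses = [s / abs(x) for x, s in zip(bx, se_y)]
    return ratios, ses


def ivw(bx, by, se_y):
    """Inverse variance weighted estimate of the log odds ratio.

    Returns the pooled estimate, its fixed-effect standard error,
    a multiplicative random-effects standard error, and Cochran's Q.
    """
    ratios, ses = wald_ratios(bx, by, se_y)
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se_fixed = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations of each ratio from the pooled value
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    # Multiplicative random effects: inflate the SE only when Q exceeds its df
    df = len(ratios) - 1
    se_random = se_fixed * math.sqrt(max(1.0, q / df))
    return beta, se_fixed, se_random, q


# Toy summary statistics (hypothetical): SNP-exposure effects, SNP-outcome
# effects on the log-odds scale, and SNP-outcome standard errors
bx = [0.10, 0.20, 0.25]
by = [-0.05, -0.10, -0.125]
se_y = [0.02, 0.02, 0.05]

beta, se_fixed, se_random, q = ivw(bx, by, se_y)
odds_ratio = math.exp(beta)  # OR per unit increase in the exposure
print(f"log OR = {beta:.3f}, OR = {odds_ratio:.3f}, SE = {se_fixed:.4f}, Q = {q:.3f}")
```

In this toy example every SNP implies the same ratio, so Q is essentially zero and the fixed-effect and random-effects standard errors coincide; heterogeneous ratios would raise Q and widen the random-effects interval.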

Statistical analysis

Heterogeneity was assessed with Cochran’s Q test [29], where a p-value > 0.05 indicated the absence of heterogeneity. The MR-Egger intercept test was used to identify horizontal pleiotropy, where an intercept not significantly different from zero (p > 0.05) suggested the absence of pleiotropy.
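The intercept test can be illustrated with a short Python sketch of MR-Egger as a weighted regression of SNP-outcome effects on SNP-exposure effects. This is a simplified illustration under stated assumptions, not the implementation used in the study: weights are taken as the inverse variance of the SNP-outcome effects, the residual variance is floored at 1, and the p-value uses a normal approximation (statistical packages may use a t-distribution instead). The function names and data are hypothetical.

```python
import math


def orient(bx, by):
    """Flip signs so that every SNP-exposure effect is positive."""
    pairs = [(-x, -y) if x < 0 else (x, y) for x, y in zip(bx, by)]
    return [x for x, _ in pairs], [y for _, y in pairs]


def mr_egger(bx, by, se_y):
    """Weighted least squares of by on bx with an intercept term."""
    bx, by = orient(bx, by)
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, bx))
    swy = sum(wi * y for wi, y in zip(w, by))
    swxx = sum(wi * x * x for wi, x in zip(w, bx))
    swxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sw * swxx - swx ** 2

    slope = (sw * swxy - swx * swy) / den          # causal estimate
    intercept = (swxx * swy - swx * swxy) / den    # average pleiotropic effect

    # Residual variance of the weighted fit, floored at 1 (assumption)
    n = len(bx)
    resid_var = sum(
        wi * (y - intercept - slope * x) ** 2 for wi, x, y in zip(w, bx, by)
    ) / (n - 2)
    scale = max(1.0, resid_var)
    se_intercept = math.sqrt(scale * swxx / den)
    se_slope = math.sqrt(scale * sw / den)

    # Two-sided p-value for the intercept, normal approximation
    z = intercept / se_intercept
    p_intercept = math.erfc(abs(z) / math.sqrt(2))
    return {
        "slope": slope,
        "se_slope": se_slope,
        "intercept": intercept,
        "se_intercept": se_intercept,
        "p_intercept": p_intercept,
    }


# Toy data (hypothetical) generated as by = 0.02 - 0.5 * bx after orientation
bx = [0.10, -0.20, 0.30, 0.40]
by = [-0.03, 0.08, -0.13, -0.18]
se_y = [0.02, 0.02, 0.02, 0.02]

result = mr_egger(bx, by, se_y)
print(result)
```

With only four SNPs the intercept is recovered exactly but is not statistically distinguishable from zero, which illustrates why a nonsignificant intercept test based on few instruments is weak evidence against pleiotropy.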

Reverse MR analysis

To explore the potential causal relationship between BC and PA, a reverse MR analysis was carried out, wherein BC served as the exposure and PA as the outcome, employing SNPs associated with BC as IVs.

All statistical analyses were conducted using R software (version 4.2.3) with the TwoSampleMR (version 0.5.6), MR-PRESSO (version 1.0), and MendelianRandomization (version 0.7.0) packages.

Results

MR analysis results

The results of the three MR methods for the effects of PA on BC, ER+ BC and ER– BC are presented in Table II. The IVW estimate suggested that walking for pleasure conferred a protective effect against ER+ BC (odds ratio (OR) = 0.302, 95% CI: 0.105–0.872, p = 0.027). No causal effect was observed for the other four types of PA on BC, ER+ BC or ER– BC. Scatter plots depicting the MR analysis of the causal effect of PA on BC, ER+ BC, and ER– BC are presented in Figures 2–4, respectively. Cochran’s Q and MR-Egger regression analyses revealed no evidence of heterogeneity or horizontal pleiotropy affecting the stability of the results. Hence, on the basis of the IVW findings (p < 0.05), we infer a causal relationship between walking for pleasure and ER+ BC.
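The relationship between an odds ratio, its standard error and its confidence interval can be checked with a few lines of Python. The sketch below assumes that the SE column of Table II is the standard error of the log odds ratio and that a normal approximation is used; under those assumptions the Table II values for walking for pleasure and ER+ BC (OR 0.302, SE 0.541) give an interval and p-value that agree with the reported 0.105–0.872 and 0.027. The function name is hypothetical.

```python
import math


def or_summary(odds_ratio, se_log_or, z_crit=1.96):
    """Confidence interval and two-sided p-value for an odds ratio.

    Assumes se_log_or is the standard error on the log-odds scale.
    """
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z_crit * se_log_or)
    upper = math.exp(log_or + z_crit * se_log_or)
    z = log_or / se_log_or
    p_value = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return lower, upper, p_value


# IVW result for walking for pleasure and ER+ BC, taken from Table II
lower, upper, p_value = or_summary(0.302, 0.541)
print(f"95% CI: {lower:.3f}-{upper:.3f}, p = {p_value:.3f}")
```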

Table II

MR analysis outcomes from several methods investigating the causal impact of distinct physical activities on breast cancer

| Exposure | No. of SNPs | SE | IVW OR/β (95% CI) | IVW P-value | MR-Egger OR/β (95% CI) | MR-Egger P-value | WMM OR/β (95% CI) | WMM P-value |
|---|---|---|---|---|---|---|---|---|
| Breast cancer as outcome | | | | | | | | |
| Heavy DIY | 18 | 0.421 | 0.739 (0.324–1.689) | 0.474 | 0.139 (0.001–254.7) | 0.614 | 0.880 (0.394–1.963) | 0.755 |
| Light DIY | 13 | 0.563 | 1.006 (0.334–3.031) | 0.992 | 6.827 (0.088–529.8) | 0.991 | 0.634 (0.257–1.567) | 0.323 |
| Strenuous sports | 6 | 1.113 | 0.292 (0.032–2.592) | 0.269 | 0.003 (0.001–122.0) | 0.344 | 0.097 (0.017–0.541) | 0.008 |
| Walking for pleasure | 20 | 0.560 | 0.410 (0.137–1.230) | 0.111 | 1.949 (0.001–158978.8) | 0.909 | 0.638 (0.256–1.591) | 0.909 |
| Other exercises | 14 | 0.596 | 0.361 (0.112–0.112) | 0.087 | 32.95 (0.005–215785.3) | 0.450 | 0.580 (0.255–1.318) | 0.193 |
| ER+ breast cancer as outcome | | | | | | | | |
| Heavy DIY | 18 | 0.420 | 0.529 (0.232–1.206) | 0.130 | 0.081 (0.001–149.4) | 0.522 | 0.431 (0.182–1.022) | 0.056 |
| Light DIY | 13 | 0.486 | 1.176 (0.453–0.453) | 0.738 | 2.109 (0.044–99.82) | 0.711 | 1.143 (0.399–3.276) | 0.802 |
| Strenuous sports | 6 | 1.177 | 0.314 (0.031–3.163) | 1.177 | 0.001 (0.001–36.80) | 0.264 | 0.190 (0.023–0.023) | 0.121 |
| Walking for pleasure | 20 | 0.541 | 0.302 (0.105–0.872) | 0.027 | 2.602 (0.001–142188.6) | 0.866 | 0.458 (0.179–1.169) | 0.103 |
| Other exercises | 14 | 0.605 | 0.510 (0.156–1.672) | 0.267 | 24.27 (0.002–199691.2) | 0.267 | 0.705 (0.284–1.749) | 0.451 |
| ER– breast cancer as outcome | | | | | | | | |
| Heavy DIY | 18 | 0.602 | 1.410 (0.433–4.590) | 0.567 | 0.789 (0.001–38910.5) | 0.966 | 0.689 (0.166–2.850) | 0.607 |
| Light DIY | 13 | 1.092 | 0.571 (0.067–4.858) | 0.608 | 61.63 (0.014–261520.3) | 0.354 | 0.576 (0.126–2.634) | 0.477 |
| Strenuous sports | 6 | 1.963 | 0.146 (0.003–6.900) | 0.328 | 0.013 (0.001–7032887) | 0.695 | 0.605 (0.024–15.04) | 0.759 |
| Walking for pleasure | 20 | 0.585 | 0.477 (0.151–1.504) | 0.207 | 87.05 (0.001–9758134) | 0.462 | 0.541 (0.129–2.266) | 0.400 |
| Other exercises | 14 | 0.789 | 0.342 (0.072–1.610) | 0.175 | 16.86 (0.001–2573806) | 0.175 | 0.377 (0.083–1.695) | 0.203 |

[i] IVW – inverse variance weighted, WMM – weighted median method, SE – standard error, Breast cancer – overall breast cancer risk, ER+ – estrogen receptor-positive breast cancer risk, ER– – estrogen receptor-negative breast cancer risk. Exposures are the types of physical activity reported for the last 4 weeks: Heavy DIY (e.g., weeding, lawn mowing, carpentry, digging), Light DIY (e.g., pruning, watering the lawn), Strenuous sports, Walking for pleasure (not as a means of transport), Other exercises (e.g., swimming, cycling, keep fit, bowling).

Figure 2

Scatter plots depicting the MR analysis of the causal impact of physical activities on BC. A – Heavy DIY. B – Light DIY. C – Strenuous sports. D – Walking for pleasure. E – Other exercises

https://www.archivesofmedicalscience.com/f/fulltexts/195271/AMS-20-6-195271-g002_min.jpg
Figure 3

Scatter plots depicting the MR analysis of the causal impact of physical activities on ER+ BC. A – Heavy DIY. B – Light DIY. C – Strenuous sports. D – Walking for pleasure. E – Other exercises

https://www.archivesofmedicalscience.com/f/fulltexts/195271/AMS-20-6-195271-g003_min.jpg
Figure 4

Scatter plots depicting the MR analysis of the causal impact of physical activities on ER– BC. A – Heavy DIY. B – Light DIY. C – Strenuous sports. D – Walking for pleasure. E – Other exercises

https://www.archivesofmedicalscience.com/f/fulltexts/195271/AMS-20-6-195271-g004_min.jpg

Reverse MR analysis

BC, ER+ BC and ER– BC were utilized as exposures, while physical activities were employed as outcomes for conducting the reverse analysis. According to the estimates derived from the reverse MR analysis, no reverse causal association was observed between physical activities and BC.

Discussion

With economic development and technological progress, there has been a gradual reduction in occupational, transportation, and daily physical activities, leading to a global issue of insufficient PA and increased sedentary behavior. This phenomenon has emerged as one of the most significant public health concerns of the 21st century. Research indicates that reduced PA is a crucial risk factor for cancer in women, including malignant tumors such as BC, endometrial cancer, ovarian cancer, cervical cancer, and fallopian tube tumors.

Clinical studies have investigated the correlation between PA and BC. Prior research examining the potential impact of domain-specific PA on BC risk exhibits considerable heterogeneity [30]. On the one hand, BC demonstrates substantial cellular, genetic, and molecular heterogeneity [31]. This diversity forms the basis for the current clinical classification, which relies on estrogen and progesterone receptor (ER and PR) expression as well as human epidermal growth factor receptor 2 (HER2/ERBB2) and facilitates targeted therapeutic approaches [32]. On the other hand, PA encompasses a multitude of daily movement patterns varying in timing, setting, and intensity, and its effects may vary with type, intensity, and duration [33]. Previous meta-analyses of prospective studies have suggested reduced BC risk associated with elevated PA levels. For instance, a case-control study in Addis Ababa showed significantly lower odds of BC among women engaging in moderate PA [34]. Similarly, the MCC-Spain study reported that elevated levels of moderate-to-high-intensity household PA (HPA) and recreational PA (RPA) correlated with decreased BC risk, with heterogeneity by molecular type, while sitting time consistently emerged as an independent risk factor for BC. The positive correlation observed between occupational PA (OPA) and ER+/PR+ BC warrants further exploration. Nevertheless, this body of evidence has several limitations, including heterogeneity in the prescription of PA interventions (modality, frequency, duration, intensity, and timing), in preclinical cancer models, and in the characteristics of human participants. Moreover, a substantial proportion of the clinical studies analyzed are exploratory in nature and have small sample sizes, which hinders definitive conclusions regarding the potential impact of PA on BC immune outcomes.

Several biological mechanisms have been postulated to explain the potential beneficial impact of PA on BC progression. PA has been shown to decrease the concentrations of circulating insulin and insulin-like growth factor, which stimulate cellular proliferation within breast tissue, and may thereby inhibit cancer development in this tissue. Moreover, higher levels of PA result in reduced circulating estradiol levels and elevated sex hormone-binding globulin levels; high estradiol and low sex hormone-binding globulin levels are both recognized risk factors for BC. Notably, the significant associations observed pertain to estrogen receptor (ER)-positive cancers rather than ER– cancers alone, indicating that non-hormonal mechanisms may contribute to the protective effects of PA. This is the rationale behind our decision to stratify BC by ER+ and ER– status and to investigate the causal relationship between PA and these subtypes.

The biological mechanisms suggested to explain the relationship between PA and BC risk [35–38] can be summarized under four headings: endogenous estrogen exposure, obesity, insulin-like growth factor I (IGF-I), and immune function [39].

1) Endogenous estrogen exposure. Regular PA can reduce the occurrence of BC in women by reducing the accumulation and circulation of endogenous estrogen through later menarche, earlier menopause, reduced frequency of menstruation, decreased estrogen levels in the follicular phase and progesterone levels in the luteal phase, as well as through the steroid hormone pathway. There is abundant epidemiological and clinical evidence that estrogen significantly promotes the late-stage growth of estrogen-sensitive tumors by activating estrogen receptors and thereby promoting the proliferation of BC cells. Menopausal women with higher levels of daily PA have lower levels of estrogen precursors and higher levels of sex hormone-binding globulin [40, 41].

2) Obesity. PA can reduce energy intake, leading to reduced postmenopausal weight, controlled weight gain, and decreased abdominal fat. Plasma levels of free estrogen are increased in obese women, and after menopause most of the estrogen in the blood comes from adipose tissue. Premenopausal obese women, by contrast, often experience anovulation, resulting in lower levels of circulating estrogen and progesterone. Therefore, PA reduces the risk of BC more markedly in postmenopausal than in premenopausal women.

3) Insulin-like growth factor I (IGF-I). There is substantial evidence suggesting that androgens may increase the risk of ovarian cancer and BC, while progesterone has a protective effect [42–44]. IGF-I is a peptide hormone similar to insulin in structure and function that stimulates growth processes. Increased circulating concentrations of IGF-I can increase the risk of many cancers, including BC [45]. Insulin indirectly increases the levels of bioavailable estrogen and androgens by down-regulating sex hormone-binding globulin and up-regulating ovarian estrogen production, thereby increasing the risk of BC. A possible mechanism is that PA lowers serum levels of IGF-I by adjusting energy balance [46].

4) Immune function. Many diseases are related to the body’s immune function, and moderate PA can increase the numbers of natural killer cells, lymphocytes, macrophages, and monocytes, thereby enhancing immune function. However, excessive PA may actually decrease immune function [47, 48].

The limitations of our MR study necessitate careful consideration. Our analysis was exclusively reliant on GWAS summary statistics derived from European populations, which inevitably confines the generalizability of our findings to other ethnic groups, notably Asians. Furthermore, the inability to compute sample overlap within this study is another constraint; however, the utilization of robust instrumental variables served as a partial mitigation for this potential bias. The exclusion of potential confounding factors, encompassing environmental determinants, occupational impacts, and BC treatments, posed a considerable challenge. Despite conducting an extensive array of sensitivity tests tailored to detect horizontal pleiotropy, the complete elimination of pleiotropic mechanisms remains impractical in the absence of comprehensive functional validations of these genetic loci. This limitation is primarily attributed to our limited understanding of the biological activities associated with these SNPs.

Vertical pleiotropy, in which a genetic variant influences the outcome through the exposure and its downstream intermediaries along the same causal chain, does not invalidate MR inference. Horizontal pleiotropy, in which a variant affects the outcome through pathways independent of the exposure, poses a formidable obstacle to MR inference. Addressing this complexity necessitates further advances in our biological understanding of these SNPs and the development of more sophisticated analytical methods.

Our study used two-sample Mendelian randomization to assess the causal link between PA and BC and found evidence that walking for pleasure lowers ER+ BC risk. These findings may offer valuable insights for clinical decision-making, suggesting that walking for pleasure may help to mitigate BC risk. If walking for pleasure indeed reduces the incidence of ER+ BC, then promoting physical exercise would be beneficial not only for the general population, where it can bring public health benefits by enhancing productivity and reducing healthcare costs, but also for those at risk of developing ER+ BC.

In conclusion, the MR analysis in this study indicates a causal link between walking for pleasure and ER+ BC, implying a protective relationship. Overall, our research lends support to the notion that walking for pleasure may serve as an effective preventive measure against ER+ BC.