Introduction

Nonalcoholic fatty liver disease (NAFLD), which is one of the most common forms of chronic liver disorder, has an estimated prevalence of 20–35% in the general population [1]. It is defined as a diffuse accumulation of triglycerides (TGs) in hepatocytes after excluding other causes of hepatic steatosis (including other causes of liver disorder, excessive alcohol intake, and other conditions that could contribute to hepatic steatosis) [2]. Almost 20% of patients progress over time to nonalcoholic steatohepatitis (NASH), liver necrosis, cirrhosis, and ultimately hepatocellular carcinoma [3, 4]. Indeed, NAFLD is considered as a hepatic manifestation of metabolic syndrome, and substantial evidence suggests that NAFLD has side effects that extend well beyond the liver [5]. Multiple metabolic abnormalities and extrahepatic complications are associated with NAFLD, such as insulin resistance, cardiac metabolic illness, and chronic kidney disease, through enhanced low-grade systemic inflammatory mechanisms, which also play a causal role in the occurrence and development of NAFLD [612].

Lung function, as evaluated by forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC), has usually been used to forecast morbidity, mortality, and all-cause mortality related to different airway illnesses [1317]. Recent evidence has suggested that lung function impairment is associated with increased risk of metabolic syndrome, cardiovascular disorders, diabetes, as well as systemic inflammation [18, 19]. Thus, impaired lung function may be linked to an increased risk of NAFLD, which has been reported in several observational studies [2022]. Clarifying the nature and direction of the association between pulmonary function and NAFLD can help guide primary prevention. However, due to the limitations of study design, traditional epidemiological studies are usually susceptible to residual confounding, selection bias and other factors, resulting in difficulty in distinguishing causal effects from spurious associations [23]. Hence, it remains unclear whether the observed relationship between lung function and the risk of NAFLD is causal.

Mendelian randomization (MR) is an emerging and potent epidemiological technique, which employs genetic variants that are robustly associated with underlying modifiable risk factors to ascertain their causal contribution to disorder risk [24]. Since these variants are randomly assigned from parents to offspring at conception, MR studies are less vulnerable to reverse causality, confounding, and measurement biases [25]. Given that previous epidemiological studies have not clearly established a causal relationship between lung function and NAFLD, a bidirectional two-sample MR approach may help to clarify the complex causal relationship between the two conditions. Herein, we used published results from genome-wide association studies (GWASs) to investigate whether there is a causal association between lung function and NAFLD, and to assess and analyze the causal effects.

Material and methods

Study design

To evaluate the association of FEV1 (per SD) as well as FVC (per SD) with NAFLD, we conducted univariate MR, multivariate MR, and bidirectional two-sample MR analyses, using publicly available genetic data derived from the worldwide genetic consortium. No additional informed consent was required. Single nucleotide polymorphisms (SNPs) served as instrumental variables (IVs) according to three key assumptions. Firstly, the chosen genetic variants proposed as IVs should be strongly correlated with the exposure. Herein, R2 and F statistics were applied to represent the strength of the association between genetic variants and exposure. The latest and most rigorous methods of calculation were used. F = R2 (n – k – 1)/k (1 – R2). In this formula, R2 is equal to the cumulative explanatory variance of the selected SNPs on exposure, where k refers to the total number of SNPs selected and n is the sample size. An F value greater than 10 indicates that the correlation is robust enough to avoid bias due to weak IVs. Secondly, the genetic variants need to be independent of potential confounding factors. Thirdly, the genetic variants should affect the risk of outcome merely via the exposure, not through other alternative pathways, which would require no horizontal pleiotropy effect between IVs and outcome.

SNP-FEV1 and FVC association estimates

In this study, genetic IVs for FEV1 (SD, GWAS ID: ukb-b-19657) and FVC (SD, GWAS ID: ukb-b-7953) were derived from the summary GWAS results in the UK Biobank (https://gwas.mrcieu.ac.uk/), with 421,986 subjects of European ancestry. The UK Biobank is one of the most accessible, largest, and in-depth cohort studies in the world, collecting data from over half a million people (aged 40–69) at recruitment in the UK between 2006 and 2010. Trained healthcare staff performed pre-bronchiectasis pulmonary function tests using a Vitalograph Pneumtrac 6800 spirometer (Maids Moreton, UK). Genotyping was performed using the Affymetrix UK BiLEVE Axiom array (~50,000 participants) and the Affymetrix UK Biobank Axiom array (~450,000 participants). Statistical summaries were produced using linear mixed models to account for relatability and population stratification, and then adjusted for sex and genotyping array.

SNP-NAFLD association estimates

Summary-level statistical data for NAFLD were sourced from the FinnGen consortium. We limited the NAFLD data to samples of European ancestry to refrain from potential bias caused by population stratification. The dataset consists of 894 cases and 217,898 controls. The FinnGen project is a public-private partnership which integrates disorder endpoint genetic data from the Finnish biobanks and the Finnish health registries. NAFLD cases were defined using the International Classification of Diseases (ICD) 10 code K76.0 and determined by electronic medical records. Detailed documentation can be found at the FinnGen website (https://finngen.gitbook.io/documentation/v/r5/).

Instrumental SNP selection

Strict inclusion criteria were developed to screen qualified IVs. We selected the SNPs for exposure with a genome-wide significant association (for lung function: p < 5E-08; For NAFLD: p < 1E-05), and without linkage disequilibrium (LD) (r2 < 0.001, clumping distance = 10,000 kb). We searched for the exposure-related SNPs in the outcome dataset. When some genetic instruments were not provided in the outcome dataset, we chose proxy SNPs that were significantly correlated with these SNPs (r2 > 0.8). To ensure that the impact of SNPs on the studied exposure corresponds to the same allele as the effect on the studied outcome, palindromic SNPs with intermediate allele frequencies were eliminated from the IVs. SNPs which had the minor allele frequency (MAF) less than 0.01 were also not included. Eligible SNPs were finally identified as IVs for exposure to conduct the MR analysis.

Statistical analysis

The inverse-variance weighted (IVW) method was recognized as the main statistical model to explore the potential causal relationship between lung function and NAFLD [26]. The IVW method is a weighted regression of the effect of SNP exposure on SNP outcome, with a fixed y-intercept of zero [27]. Given that the presence of horizontal pleiotropy in IVW estimates can affect the results, three complementary methods were adopted to test the robustness of the conclusions, including the MR-Egger regression method, the weighted median method, and the weighted mode method. The MR-Egger regression method relies on the assumption that the polymorphism of the genetic instrument is independent of the instrument strength and can offer an effective estimate of the causal effect [28]. The weighted median method can permit high pleiotropy and generate robust estimates under the circumstance that at least 50% of the information contributing to the analysis comes from valid SNPs [29]. The weighted mode method can produce consistent outcomes, even though most IVs are noneffective [30]. These three complementary methods provided relatively robust outcomes at the expense of statistical power.

Sensitivity analysis

To guarantee the reliability of our conclusion, six sensitivity analyses were performed to check and adjust the presence of pleiotropy in the causal estimates. Firstly, the Cochran’s Q statistic was used to quantify the heterogeneity among the chosen genetic instruments. When heterogeneity was suspected (p < 0.05), a random-effects model of IVW was performed to calculate the effect size; otherwise, a fixed-effects model was applied. Secondly, previous studies have demonstrated that height is an established cause of pulmonary function and might bias the Mendelian randomization estimates for FEV1 and FVC [31], so multivariable MR (MVMR) was conducted to additionally adjust for height. Thirdly, the MR-Egger regression was used to test and correct the potential horizontal pleiotropy among the selected IVs by evaluating the intercept. Fourthly, the MR pleiotropy residual sum and outlier (MR-PRESSO) test was used to identify any outliers, and the MR-PRESSO global test of heterogeneity was used to detect potential horizontal pleiotropy. Fifthly, the asymmetry of the funnel plot was also used as an index to judge horizontal pleiotropy. Finally, leave-one-out sensitivity analysis was conducted by sequentially deleting each SNP to determine whether MR estimates were driven by certain strong SNPs.

Data analyses in the current study were completed using the “TwoSampleMR” and “MRPRESSO” packages of the R 4.0.3 software (www.r-project.org). Multiple comparisons were corrected by the Bonferroni approach, with a p-value of less than 0.025 (0.05/2) indicating convincing evidence of causality.

Results

Instrumental variables selection

For FEV1, 260 SNPs (p < 5E-08, r2 < 0.001) in total were identified, explaining up to 2.24% of the FEV1 variance. The F statistic of these instruments ranged 30–271 with the average value of 57, all exceeding the threshold value of 10. For FVC, 320 SNPs (all p < 5E-08, r2 < 0.001) were identified, explaining up to 2.76% of the FVC variance. The F statistic of these instruments ranged from 30 to 457 with the average value of 63, indicating low evidence for weak instrument bias. Supplementary Tables SI and SII describe the detailed information of the lung function instruments used in the current study.

MR analysis and sensitivity analysis

Since heterogeneity was suspected in the analysis of FEV1 and FVC with NAFLD (p for Cochran’s Q = 0.026 and 0.016, respectively) (Table I), the IVW approach with a multiplicative random effects model was applied. The primary results of IVW analysis revealed that each genetically predicted SD increase in FEV1 and FVC was associated with lower risk of NAFLD (FEV1: odds ratio (OR) = 0.434, 95% confidence interval (CI): 0.251–0.749, p = 0.003; FVC: OR = 0.363, 95% CI: 0.224–0.588, p < 0.001) (Table I, Figure 1). Similar relations were obtained in the MR-Egger regression (FEV1: OR = 0.145, 95% CI: 0.027–0.782, p = 0.026; FVC: OR = 0.260, 95% CI: 0.068–0.995, p = 0.050), weighted median (FEV1: OR = 0.408, 95% CI: 0.193–0.866, p = 0.020; FVC: OR = 0.448, 95% CI: 0.233–0.863, p = 0.016), and weighted mode methods (FEV1: OR = 0.535, 95% CI: 0.090–3.177, p = 0.492; FVC: OR = 0.670, 95% CI: 0.107–4.205, p = 0.669), at a cost of lower statistical power and broader confidence levels (Table I, Figures 1 and 2).

Figure 1

Mendelian randomization estimates of lung function with the risk of nonalcoholic fatty liver disease

https://www.archivesofmedicalscience.com/f/fulltexts/168475/AMS-21-1-168475-g001_min.jpg
Table I

MR estimates from each method of assessing the causal effects of lung function on NAFLD risk

ExposureOutcomeNSNPMethodsBetaSEOR (95% CI)P-valueHorizontal pleiotropyHeterogeneity
Egger interceptSEP-valueCochran’s QP-value
FEV1NAFLD218IVW–0.8350.2790.434 (0.251–0.749)0.0030.0180.0130.180259.3830.026
 MR Egger–1.9280.8580.145 (0.027–0.782)0.026
Weighted median–0.8950.3830.408 (0.193–0.866)0.020
Weighted mode–0.6260.9090.535 (0.090–3.177)0.492
MVMR–0.4340.1910.648 (0.446–0.941)0.023
FVCNAFLD277IVW–1.0130.2460.363 (0.224–0.588)< 0.0010.0050.0100.602328.6790.016
MR egger–1.3460.6840.260 (0.068–0.995)0.050
Weighted median–0.8020.3340.448 (0.233–0.863)0.016
Weighted mode–0.4010.9370.670 (0.107–4.205)0.669
MVMR–0.2560.2220.774 (0.501–1.195)0.248

[i] CI – confidence interval, FEV1 – forced expiratory volume in 1 s, FVC – forced vital capacity, IVW – inverse variance weighted, MR – Mendelian randomization, MVMR – multivariable Mendelian randomization, NAFLD – non-alcoholic fatty liver disease, OR – odds ratio, SNP – single nucleotide polymorphism, SE – standard error, NSNP – number of SNP, p < 0.025 was considered statistically significant.

Figure 2

Scatterplot of the effect size for each SNP on lung function and the risk of nonalcoholic fatty liver disease

https://www.archivesofmedicalscience.com/f/fulltexts/168475/AMS-21-1-168475-g002_min.jpg

For FEV1, results from sensitivity analysis demonstrated that the causal estimates were unlikely to be biased by outlier SNPs and pleiotropic effects (p for MR-Egger intercept = 0.180, p for MR-PRESSO global test = 0.060). The funnel plot showed a symmetric result, implying that the directional and horizontal pleiotropy did not reach significant levels (Supplementary Figure S1). The leave-one-out and forest plots yielded consistent results, and did not reveal noticeable effects of any single SNP that could drive the results (Supplementary Figures S2 and S3). For FVC, the MR-Egger regression analyses suggested that there was relatively weak evidence of horizontal pleiotropy, and one outlier SNP (rs11967850) was identified by MR-PRESSO approaches (p for MR-Egger intercept = 0.602, p for MR-PRESSO global test = 0.043). After excluding this outlier, FVC still showed a converse causal effect on the risk of NAFLD (OR = 0.376, 95% CI: 0.234–0.604, p < 0.001). The funnel plot was visually symmetric (Supplementary Figure S1). The leave-one-out and forest plots further verified the robustness of the MR results (Supplementary Figures S2 and S3). Furthermore, in the MVMR analysis adjusted for height, the converse causal associations of FEV1 and FVC with NAFLD were consistent with the main analyses, but the estimate for FVC was attenuated (FEV1: OR = 0.648, 95% CI: 0.446–0.941, p = 0.023; FVC: OR = 0.774, 95% CI: 0.501–1.195, p = 0.248) (Table I, Figure 1).

Further analyses

To deeply elucidate the causal effect of NAFLD on lung function, we further performed a two-sample MR analysis using NAFLD as exposure and lung function as outcome. Nineteen significant and independent SNPs (p < 1E-05, r2 < 0.001) were identified, explaining 69.33% of the variance for NAFLD. The F statistics of these SNPs ranged from 20 to 102 with the average F value of 28 for NAFLD, showing the absence of weak IV bias (Supplementary Table SIII).

According to the results of IVW, there was no evidence that genetic responsibility for NAFLD had a causal effect on FEV1 (OR = 1.000, 95% CI: 0.996–1.004, p = 0.987) and FVC (OR = 1.003, 95% CI: 0.999–1.006, p = 0.118). MR-Egger, weighted median, and weighted mode approaches also obtained similar resluts (Table II, Figures 3 and 4). Cochran’s Q test showed no significant heterogeneity across the estimates of incorporated SNPs (p for FEV1 = 0.332; p for FVC = 0.363, Table II). Moreover, no significant horizontal pleiotropy was detected by MR-Egger regression (p for FEV1 = 0.054; p for FVC = 0.769, Table II). The MR-PRESSO global test identified no outlier SNPs or a horizontal pleiotropic effect of NAFLD on FEV1 and FVC (p for FEV1 = 0.182; p for FVC = 0.511). The funnel plot also displayed a symmetrical distribution (Supplementary Figure S4). The leave-one-out as well as the forest plots further verified the reliability of the MR results (Supplementary Figures S5 and S6).

Figure 3

Mendelian randomization estimates of effect of nonalcoholic fatty liver disease on lung function

https://www.archivesofmedicalscience.com/f/fulltexts/168475/AMS-21-1-168475-g003_min.jpg
Figure 4

Scatterplot of the effect size for each SNP on nonalcoholic fatty liver disease and lung function

https://www.archivesofmedicalscience.com/f/fulltexts/168475/AMS-21-1-168475-g004_min.jpg
Table II

MR estimates from each method of assessing the causal effects of NAFLD on lung function

ExposureOutcomeNSNPMethodsBetaSEOR (95%CI)P-valueHorizontal pleiotropyHeterogeneity
Egger interceptSEP-valueCochran’s QP-value
NAFLDFEV115IVW0.0000.0021.000 (0.996–1.004)0.987–0.0040.0020.05414.6160.332
MR Egger0.0070.0041.007 (1.000–1.015)0.080
Weighted median0.0010.0031.001 (0.996–1.006)0.578
Weighted mode0.0020.0031.002 (0.996–1.008)0.489
NAFLDFVC14IVW0.0030.0021.003 (0.999–1.006)0.118–0.0010.0020.76913.0880.363
MR egger0.0040.0041.004 (0.995–1.012)0.396
Weighted median0.0020.0021.002 (0.997–1.006)0.485
Weighted mode0.0020.0031.002 (0.997–1.008)0.495

[i] CI – confidence interval, FEV1 – forced expiratory volume in 1 s, FVC – forced vital capacity, IVW – inverse variance weighted, MR – Mendelian randomization, NAFLD – non-alcoholic fatty liver disease, OR – odds ratio, SNP – single nucleotide polymorphism, SE – standard error, NSNP – number of NSP, p < 0.025 was considered statistically significant.

Discussion

To the best of our knowledge, the current MR study is the first to investigate the bidirectional causal relationship between pulmonary function and NAFLD. Our analyses revealed that each genetically predicted SD increase in FEV1 and FVC was associated with a lower risk of NAFLD. Multivariate MR analysis indicated that after adjusting for height, FEV1 and FVC maintained a reverse causal association with NAFLD, but only FEV1 reached a significant level. FEV1 is considered an indicator of the severity of airway obstruction, while FVC is a measure of overall lung capacity [32]. They reflect different aspects of pulmonary function, which might account for the differences in their effects on NAFLD. Furthermore, in the reverse MR analysis, there was no causal effect of NAFLD on FEV1 and FVC.

The association between lung function and NAFLD has been explored in several epidemiological studies. Qin et al. found that damaged lung function as measured by FEV1 and FVC was significantly negatively associated with the prevalence of NAFLD in middle-aged and elderly people without chronic lung disorders [20]. Jung et al. found that NAFLD was independently related to decreased lung function, and the severity of NAFLD was negatively correlated with pulmonary function [21]. Based on the data of a cross-sectional study, Peng et al. found that individuals with higher levels of hepatic steatosis were at higher risk for poor lung function, and damaged pulmonary function may be an extrahepatic complication of NAFLD [22]. However, these cross-sectional studies failed to elucidate the sequence of the occurrence of impaired lung function and NAFLD, making it unclear whether there was a causal association between them. In a large cohort of middle-aged South Koreans with more than half a million person-years of follow-up, Song et al. reported that the risk of NAFLD increased with the decrease of FEV1 and FVC in a dose-response manner, and they concluded that decreased pulmonary function was a risk factor for NAFLD [33]. Our findings were consistent with those of Song et al. indicating that lung function had a reverse causal association with NAFLD risk. Additionally, we also found no causal effect of NAFLD on lung function.

The exact mechanisms by which reduced lung function contributes to the increased risk of NAFLD remains unknown. However, low-grade chronic inflammation may play a critical role as it is strongly linked with both damaged pulmonary function and NAFLD. The lungs and the liver are known to be both highly vascular organs with a dual blood supply [34]. Environmental exposures, for example, occupational dusts, industrial chemicals, as well as particulate air pollution, are well-known adverse factors for the decline in lung function [35]. Inhalation of these harmful particulates from environmental pollution may promote airway inflammation and the production of pro-inflammatory cytokines (such as interleukin-6) from alveolar macrophages, finally leading to airway damage and impaired lung function [36]. Additionally, these inflammatory cytokines can also enter the circulation and induce low-grade inflammation throughout the body, then accelerate the occurrence and development of NAFLD [37].

The current study has a relatively large sample size, and is known to be the first MR study to investigate the bidirectional causal relationship between lung function and NAFLD. The usage of the MR approach freed this study from potential bias and reverse causality. Moreover, the findings of this study suggest that it is necessary to improve lung function in people at high risk of NAFLD. However, several limitations should also be noted. First, since all subjects included in this study were of European descent, one should be cautious in extrapolating the results to other ethnicities. Second, the variables were binary, so it is difficult to assess possible dose-response relationships. Third, the dataset in this study was derived from large-scale GWAS studies, so individual-level detailed information was lacking, which made it impossible to perform stratified analysis.

In a word, using genetic data, we provide strong evidence that reduced lung function, especially FEV1, is causally associated with NAFLD. Although the exact mechanism remains unclear, FEV1 could be considered when assessing NAFLD risk and as a potential target for NAFLD prevention.