Introduction

Tuberculous meningitis (TBM) has historically been a serious threat to public health. The World Health Organization (WHO) estimated that in 2016, 10.4 million people fell ill with tuberculosis (TB) worldwide and 1.3 million died of the disease [1]. TBM, the most severe form of extrapulmonary tuberculosis, accounts for 1–2% of all cases of TB and kills or severely disables approximately one half of infected people [2]. Diagnosis of TBM is challenging because of its non-specific clinical presentation. As a result, appropriate treatment is often delayed, which is frequently associated with high mortality and long-term disability [3]. Early diagnosis is therefore vital for successful disease management, and a rapid diagnostic approach with better accuracy than existing methods is required.

Conventional methods, namely detection of acid-fast bacilli (AFB) on smear and culture of cerebrospinal fluid (CSF), are the definitive criteria for diagnosing TBM. These methods are low-cost and widely used in resource-poor settings [2, 4]. Nevertheless, the sensitivity of smear microscopy is impaired by the low number of bacilli present in CSF. Culture is more sensitive, but it may take up to 42 days, which often delays clinical decision-making [5, 6]. Another method, the tuberculin skin test (TST), is widely used in developing countries, but its result can be affected by previous Bacillus Calmette-Guérin (BCG) vaccination and by non-tuberculous mycobacteria (NTM) such as Mycobacterium marinum, M. kansasii and M. abscessus [7]. Molecular techniques are increasingly employed in low-income countries because they give rapid results. However, the sensitivity of these assays is variable, ranging from 0.32 to 1.00, which limits their usefulness in clinical practice and offers little advantage over smear microscopy [8].

An immunodiagnostic method, the interferon-γ release assay (IGRA), which measures the release of interferon (IFN) after stimulation by Mycobacterium tuberculosis (M. Tb) antigens, has provided an alternative for TBM diagnosis. Two IGRAs, TSPOT.TB and the QuantiFERON-TB Gold In-Tube test (QFT-GIT), are now commercially available. Although these tests can cost up to ten times more than conventional methods, their short turn-around time and accessibility have led to a substantial increase in their use over the past decade. Many studies on the application of IGRAs have been published [9–13]. Furthermore, some studies have shown IGRAs to be a useful diagnostic tool, with specificity of up to 0.96 when performed on the CSF of patients with TBM [14, 15].

To date, whether IGRAs can be used to diagnose TBM is still controversial. A systematic review of 8 publications up to August 2015 concluded that the current evidence does not support the use of IGRAs to diagnose TBM [16]. Since then, a growing number of studies have evaluated the performance of IGRAs in TBM diagnosis, but no updated meta-analysis has been performed. Additionally, that review had a language restriction, which may have led to the loss of relevant publications, as TB is more prevalent in non-English speaking countries. We therefore conducted a meta-analysis with a broader search strategy to comprehensively assess the overall accuracy of IGRAs for TBM diagnosis.

Material and methods

Search strategy and study selection

Published studies were systematically collected by searching PubMed, Embase, Cochrane Library, Chinese Biomedical Literature Database (CBM), China National Knowledge Infrastructure (CNKI), China Science and Technology Journal (VIP) and Wanfang databases. The combination of key words was as follows: (tuberculous meningitis, or tuberculosis, meningeal, or extrapulmonary tuberculosis) and (QuantiFERON, or T-SPOT.TB, or TSPOT, or ELISPOT, or Interferon-γ assays, or Interferon-γ release assays, or IGRA, or T cell assays, or T cell response). The search was conducted without restriction on language, as TB is more prevalent in non-English speaking countries. Reference lists were searched manually to identify additional eligible literature.

Studies meeting the following criteria were included: (1) assessed the performance of IGRAs in TBM; (2) adopted predefined, specific diagnostic criteria for TBM, including microbiological and clinical criteria based on clinical presentation, CSF analysis, radiology and responsiveness to anti-tuberculosis chemotherapy; (3) reached a final diagnosis of TBM independently of the IGRA result. Studies were excluded if they: (1) were case reports, comments, animal experiments, literature reviews or meta-analyses; (2) were not diagnostic test studies; (3) lacked an appropriate study design (e.g., took healthy people as the control group); (4) had fewer than 10 TBM patients; (5) reported insufficient data to calculate the sensitivity, specificity, positive likelihood ratio (PLR), and negative likelihood ratio (NLR).

Data extraction and quality assessment

Two reviewers independently checked all potentially relevant studies. Discrepancies were resolved by discussion with other investigators until a consensus was achieved. The following data were collected from each study: first author, year of publication, country, study design, number of cases, IGRA method, sample, and the numbers of true-positive (TP), false-positive (FP), false-negative (FN) and true-negative (TN) results. For papers that evaluated more than one commercial IGRA or used two types of specimens simultaneously, each assay or specimen type was treated as an independent study in this meta-analysis, and the data were extracted separately.

The methodological quality of the studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool by two independent researchers. All disagreements were resolved by consensus.

Comparison of diagnostic performance among ADA, new techniques and IGRAs for TBM

The adenosine deaminase assay (ADA) is a traditional test for TBM diagnosis. New techniques such as nested real-time polymerase chain reaction (nRT-PCR), one-tube nested PCR-lateral flow strip test (OTNPCR-LFST) and loop-mediated isothermal amplification (LAMP) have been designed for TBM diagnosis since 2016. The diagnostic accuracy of these techniques was reviewed to obtain a better understanding of the IGRA performance.

Statistical analysis

Standard methods recommended for meta-analyses of diagnostic test evaluations [17] were used. The data were analyzed using Meta-DiSc software (version 1.4) and Stata 12.0. The following measures of test accuracy were computed for each study: sensitivity, specificity, PLR, NLR and diagnostic odds ratio (DOR). In most circumstances, a PLR greater than 10 and an NLR less than 0.1 provide strong diagnostic evidence to rule in or rule out diagnoses, respectively [18]. The DOR describes the odds of a positive test result in patients with TBM compared with the odds of a positive result in those without the disease. It is calculated as DOR = PLR/NLR [19]. Heterogeneity was assessed by the chi-square (χ²) test and the I² statistic. P < 0.05 was considered statistically significant for heterogeneity. For the I² statistic, heterogeneity was defined as low, moderate, and high when I² was more than 25%, 50%, and 75%, respectively [20]. In this study, a random-effects model was used to pool estimates. The threshold effect was assessed by the Spearman rank correlation test and considered significant if p < 0.05. To explore the sources of heterogeneity, subgroup analysis and meta-regression were conducted based on IGRA method, TB prevalence, blinding method, sample size and reference standard. A covariate with p < 0.05 was taken to contribute to heterogeneity. The potential presence of publication bias was tested using funnel plots and Egger’s test, with a p-value < 0.05 considered statistically significant [21].
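The per-study accuracy measures named above follow directly from the 2 × 2 counts. The following is a minimal Python sketch, not the Meta-DiSc or Stata implementation: the function names `accuracy_from_counts` and `i_squared` are hypothetical, and adding 0.5 to every cell when a zero cell occurs is an assumed continuity correction, since the text does not state how zero cells were handled. The I² helper uses the common definition based on Cochran's Q and its degrees of freedom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accuracy:
    sensitivity: float
    specificity: float
    plr: float  # positive likelihood ratio
    nlr: float  # negative likelihood ratio
    dor: float  # diagnostic odds ratio


def accuracy_from_counts(tp, fp, fn, tn, correction=0.5):
    """Compute accuracy measures from one study's 2 x 2 table.

    Assumption: when any cell is zero, `correction` is added to all four
    cells so that the ratios stay finite.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + correction for cell in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return Accuracy(sensitivity, specificity, plr, nlr, plr / nlr)


def i_squared(q, df):
    """I² in percent from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Counts for Zhang 2013 (PB) as listed in Table I: TP 23, FP 4, FN 7, TN 26.
    print(accuracy_from_counts(23, 4, 7, 26))
```

For the Zhang 2013 PB counts this gives a sensitivity of 23/30 (about 0.77), a specificity of 26/30 (about 0.87) and a DOR of about 21.4, which equals (TP × TN)/(FP × FN).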

Results

General information

Study identification and selection are outlined in Figure 1, where “not primarily relevant to TBM” refers to publications that focused on extrapulmonary tuberculosis but were irrelevant to TBM or did not provide valid TBM information. In the end, 26 [14, 15, 22–45] out of 656 publications were available for the final analysis. Characteristics of the included publications are presented in Table I.

Table I

Principal characteristics of included studies

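| Author | Year | Country | Study design | TBM patients | IGRA method | Sample | TP | FP | FN | TN |
|---|---|---|---|---|---|---|---|---|---|---|
| Zhang | 2013 | China | Retrospective | 30 | T-SPOT.TB | PB | 23 | 4 | 7 | 26 |
| | | | | | | CSF | 28 | 1 | 2 | 29 |
| Ling | 2015 | China | Prospective | 12 | T-SPOT.TB | PB | 10 | 5 | 2 | 23 |
| | | | | | | CSF | 11 | 2 | 1 | 26 |
| Thomas | 2008 | India | Prospective | 11 | T-SPOT.TB | PB | 9 | 2 | 2 | 6 |
| | | | | | | CSF | 9 | 0 | 1 | 7 |
| Park | 2016 | Korea | Prospective | 49 | T-SPOT.TB | PB | 38 | 66 | 11 | 115 |
| | | | | | | CSF | 28 | 16 | 12 | 117 |
| Kim | 2010 | Korea | Prospective | 31 | T-SPOT.TB | PB | 22 | 20 | 9 | 30 |
| | | | | | | CSF | 13 | 1 | 5 | 25 |
| Lu | 2016 | China | Prospective | 30 | T-SPOT.TB | PB | 21 | 5 | 9 | 34 |
| | | | | | QFT-GIT | CSF | 25 | 5 | 5 | 34 |
| Pan | 2017 | China | Prospective | 53 | T-SPOT.TB | PB | 48 | 9 | 5 | 28 |
| | | | | | | CSF | 32 | 1 | 21 | 36 |
| Pan | 2015 | China | Prospective | 26 | T-SPOT.TB | PB | 26 | 10 | 0 | 7 |
| | | | | | | CSF | 24 | 1 | 2 | 16 |
| Lu | 2016 | China | Prospective | 20 | T-SPOT.TB | PB | 16 | 0 | 4 | 28 |
| | | | | | | CSF | 19 | 1 | 1 | 27 |
| Feng | 2009 | China | Prospective | 15 | T-SPOT.TB | PB | 12 | 0 | 3 | 11 |
| Han | 2008 | China | Prospective | 13 | T-SPOT.TB | PB | 10 | 0 | 3 | 4 |
| Cho | 2011 | Korea | Prospective | 35 | T-SPOT.TB | PB | 26 | 47 | 9 | 40 |
| Zhang | 2017 | China | Prospective | 62 | T-SPOT.TB | PB | 56 | 7 | 6 | 53 |
| Wang | 2017 | China | Prospective | 35 | T-SPOT.TB | PB | 26 | 3 | 9 | 27 |
| Duan | 2017 | China | Prospective | 45 | T-SPOT.TB | PB | 41 | 6 | 4 | 44 |
| Zheng | 2015 | China | Prospective | 12 | T-SPOT.TB | PB | 12 | 2 | 0 | 6 |
| Jiang | 2016 | China | Prospective | 58 | T-SPOT.TB | PB | 50 | 2 | 8 | 15 |
| Wang | 2016 | China | Retrospective | 54 | T-SPOT.TB | PB | 45 | 10 | 9 | 24 |
| Cheng | 2017 | China | Retrospective | 61 | T-SPOT.TB | PB | 49 | 3 | 12 | 29 |
| Wang | 2014 | China | Prospective | 26 | T-SPOT.TB | CSF | 24 | 3 | 2 | 21 |
| Quan | 2008 | China | Retrospective | 25 | T-SPOT.TB | CSF | 21 | 4 | 4 | 23 |
| Patel | 2010 | Africa | Prospective | 38 | T-SPOT.TB | CSF | 32 | 13 | 6 | 35 |
| Chen | 2015 | China | Retrospective | 52 | QFT-GIT | PB | 42 | 5 | 10 | 42 |
| Mu | 2015 | China | Prospective | 32 | QFT-GIT | PB | 28 | 2 | 4 | 28 |
| Qian | 2012 | China | Retrospective | 32 | QFT-GIT | PB | 28 | 5 | 4 | 51 |
| Vidhate | 2011 | India | Retrospective | 36 | QFT-GIT | PB | 16 | 6 | 20 | 10 |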

[i] CSF – cerebrospinal fluid, PB – peripheral blood, TP – true positive, FP – false positive, FN – false negative, TN – true negative, IGRA method – interferon-γ release assay method. Blank cells repeat the value of the row above.

Figure 1

Flow chart for studies identified and included in the present meta-analysis

https://www.archivesofmedicalscience.com/f/fulltexts/94520/AMS-17-5-94520-g001_min.jpg

According to the list of 30 high TB burden countries in the WHO Global Tuberculosis Report 2017 [1], studies conducted in the listed countries were defined as high TB burden settings. Apart from three studies [15, 25, 32] carried out in Korea, the studies were conducted in settings with high TB prevalence, such as India, China and Africa. Laboratory investigators were blinded to the clinical data and clinicians were blinded to the laboratory results in 7 publications [14, 15, 23–25, 27, 32], whereas the other publications did not report on blinding. Microbiological confirmation was the only reference standard used to diagnose TBM in one publication [36], while the remaining publications also accepted clinical standards for TBM diagnosis. A number of publications performed IGRAs on two types of specimens. As a consequence, 35 unique studies were defined from 26 articles. Peripheral blood (PB) IGRAs were used in 23 studies [15, 22–39, 42–45], while CSF IGRAs were used in 12 studies [14, 15, 22–29, 40, 41].

Quality assessment

Figure 2 shows the quality of the studies included in the meta-analysis. Overall, the risk of bias was low across the four QUADAS-2 domains. The index test and reference standard domains were judged to be at unclear risk of bias in the studies that did not report on blinding. The reference standard domain raised high concerns about applicability because of the limitations of traditional pathogen detection and clinical standards.

Figure 2

Methodological quality evaluation results of 26 publications using the QUADAS-2 tool

https://www.archivesofmedicalscience.com/f/fulltexts/94520/AMS-17-5-94520-g002_min.jpg

Overall meta-analysis of IGRAs

The pooled estimates are shown in Figure 3. For PB IGRAs, sensitivity varied from 0.44 to 0.91 and specificity from 0.41 to 1.00. Pooled estimates for PLR, NLR, and DOR were 4.23 (95% CI: 2.95–6.07), 0.24 (95% CI: 0.19–0.32), and 21.06 (95% CI: 11.91–37.24), respectively. There was moderate heterogeneity in sensitivity and high heterogeneity in specificity between studies.

Figure 3

Forest plots of sensitivity and specificity for PB IGRAs (A) and CSF IGRAs (B)

https://www.archivesofmedicalscience.com/f/fulltexts/94520/AMS-17-5-94520-g003_min.jpg

For CSF IGRAs, sensitivity varied from 0.60 to 0.95 and specificity from 0.73 to 1.00. Pooled estimates for PLR, NLR, and DOR were 7.87 (95% CI: 4.98–12.46), 0.19 (95% CI: 0.13–0.29) and 47.74 (95% CI: 25.02–91.12), respectively. The I² values of these parameters indicated moderate heterogeneity among studies. When the overall diagnostic accuracy of PB IGRAs was compared with that of CSF IGRAs, the latter showed significantly higher specificity and PLR (p < 0.05).

Heterogeneity analysis

The Spearman rank correlation test indicated no threshold effect in the PB IGRAs (coefficient = –0.056, p = 0.801) and CSF IGRAs studies (coefficient = 0.036, p = 0.939). The summary of subgroup analyses is shown in Table II.
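The threshold-effect check can be sketched as follows. This is an illustration under stated assumptions rather than the Meta-DiSc routine: it correlates the logit of the true-positive rate with the logit of the false-positive rate across studies, adds 0.5 to every cell (an assumed continuity correction), and uses hypothetical helper names. A strong positive coefficient would suggest a threshold effect. The p-value is not computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def logit(p):
    return math.log(p / (1 - p))


def threshold_coefficient(tables):
    """tables: list of (tp, fp, fn, tn) tuples, one per study."""
    tpr, fpr = [], []
    for tp, fp, fn, tn in tables:
        # Assumption: 0.5 is added to every cell to avoid logit(0) or logit(1).
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        tpr.append(logit(tp / (tp + fn)))
        fpr.append(logit(fp / (fp + tn)))
    return spearman(tpr, fpr)
```

Passing the 2 × 2 counts of all PB or all CSF studies from Table I to `threshold_coefficient` yields a coefficient of this kind. It need not reproduce the reported values exactly, because the software's handling of zero cells may differ.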

Table II

Subgroup analysis for exploration of factors influencing heterogeneity

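| Subgroup | Number of studies | Sensitivity (95% CI) | I² (%) | Specificity (95% CI) | I² (%) | DOR (95% CI) |
|---|---|---|---|---|---|---|
| PB IGRAs: | | | | | | |
| TB burden: | | | | | | |
| High | 20 | 0.82 (0.79–0.85) | 66.7 | 0.85 (0.82–0.88) | 63.1 | 29.18 (17.45–48.97) |
| Low | | | | | | |
| Blind: | | | | | | |
| Yes | 6 | 0.80 (0.74–0.86) | 26.4 | 0.62 (0.57–0.67) | 73 | 7.46 (3.36–16.57) |
| NR | 17 | 0.82 (0.78–0.85) | 70.4 | 0.86 (0.83–0.89) | 66.6 | 31.05 (16.98–56.76) |
| Sample size: | | | | | | |
| < 30 | 7 | 0.87 (0.79–0.93) | 54.8 | 0.82 (0.73–0.89) | 81.0 | 35.40 (13.31–94.17) |
| ≥ 30 | 16 | 0.80 (0.77–0.83) | 66.5 | 0.75 (0.72–0.78) | 85.6 | 18.33 (9.48–35.44) |
| Method: | | | | | | |
| TSPOT.TB | 19 | 0.83 (0.80–0.86) | 47.1 | 0.73 (0.70–0.76) | 84.1 | 20.27 (11.36–36.15) |
| QFT-GIT | 4 | 0.75 (0.67–0.82) | 86.6 | 0.88 (0.82–0.93) | 64.3 | 22.94 (3.07–171.69) |
| Reference standard: | | | | | | |
| Clinical | 22 | 0.81 (0.78–0.84) | 62.5 | 0.76 (0.73–0.78) | 84.8 | 20.56 (11.52–36.67) |
| Microbiological | | | | | | |
| CSF IGRAs: | | | | | | |
| TB burden: | | | | | | |
| High | 10 | 0.83 (0.78–0.88) | 65 | 0.89 (0.85–0.92) | 56.6 | 59.28 (28.14–124.89) |
| Low | | | | | | |
| Blind: | | | | | | |
| Yes | 6 | 0.73 (0.66–0.80) | 55.7 | 0.88 (0.84–0.92) | 70.4 | 24.77 (12.84–47.79) |
| NR | 6 | 0.90 (0.84–0.94) | 0.0 | 0.91 (0.85–0.95) | 0.0 | 80.44 (31.87–203.03) |
| Sample size: | | | | | | |
| < 30 | 6 | 0.91 (0.84–0.95) | 0.0 | 0.92 (0.85–0.96) | 0.0 | 84.33 (34.58–205.62) |
| ≥ 30 | 6 | 0.76 (0.69–0.81) | 68.9 | 0.88 (0.84–0.92) | 71.1 | 31.07 (14.30–67.50) |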

[i] DOR – diagnostic odds ratio, NR – not reported.

In the subgroup analyses of PB IGRAs, the studies reporting a blinding method were associated with lower heterogeneity in sensitivity and specificity. In the subgroup analyses of CSF IGRAs, studies with a sample size < 30 and those in which a blinding method was not reported showed markedly decreased heterogeneity in sensitivity and specificity. Table III shows the potential factors associated with the diagnostic accuracy of IGRAs, as assessed by multivariate meta-regression. None of the factors significantly influenced the relative DOR of IGRAs for the diagnosis of TBM.

Table III

Multivariate meta-regression to evaluate factors associated with interferon-γ release assay accuracy in tuberculous meningitis

| Covariate | Coefficient | P-value | RDOR (95% CI) |
|---|---|---|---|
| PB IGRAs: | | | |
| TB burden | –0.525 | 0.5679 | 0.59 (0.09–3.98) |
| Blind | 1.581 | 0.1371 | 4.86 (0.57–41.34) |
| Sample size | –0.341 | 0.6782 | 0.71 (0.13–3.93) |
| Method | 0.474 | 0.4974 | 1.61 (0.38–6.84) |
| Reference standard | –0.458 | 0.8327 | 0.63 (0.01–58.41) |
| CSF IGRAs: | | | |
| TB burden | 0.666 | 0.3616 | 1.95 (0.39–9.76) |
| Blind | –0.614 | 0.3699 | 0.54 (0.12–2.46) |
| Sample size | –0.868 | 0.2227 | 0.42 (0.09–1.95) |

[i] RDOR – relative diagnostic odds ratio.
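The coefficients and RDOR values in Table III are consistent with the usual reading of a meta-regression on the log DOR scale, in which the relative DOR is the exponential of the coefficient. The short Python check below is only an illustration of that relationship using the tabulated values. The dictionary name and layout are hypothetical, and the confidence limits are not recomputed because the standard errors are not reported.

```python
import math

# (coefficient, reported RDOR) pairs copied from Table III.
TABLE_III = {
    "PB: TB burden": (-0.525, 0.59),
    "PB: Blind": (1.581, 4.86),
    "PB: Sample size": (-0.341, 0.71),
    "PB: Method": (0.474, 1.61),
    "PB: Reference standard": (-0.458, 0.63),
    "CSF: TB burden": (0.666, 1.95),
    "CSF: Blind": (-0.614, 0.54),
    "CSF: Sample size": (-0.868, 0.42),
}


def rdor(coefficient):
    """Relative diagnostic odds ratio implied by a log-scale coefficient."""
    return math.exp(coefficient)


if __name__ == "__main__":
    for covariate, (coefficient, reported) in TABLE_III.items():
        print(f"{covariate}: exp({coefficient}) = {rdor(coefficient):.2f}, reported {reported}")
```

Each computed value rounds to the reported RDOR. An RDOR below 1 means the covariate level coded as 1 is associated with a lower DOR, and an RDOR above 1 with a higher one.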

Publication bias

Egger’s test was performed to assess the publication bias of included studies. There was a risk of publication bias in the meta-analysis of CSF IGRAs (p < 0.05). No evidence of publication bias was found in PB IGRAs (p > 0.05).
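Egger's test can be sketched as a regression of each study's standardized effect on its precision, where an intercept far from zero suggests funnel plot asymmetry. The Python sketch below is an assumption-laden illustration, not the Stata routine used in this analysis: it takes ln(DOR) as the effect size, adds 0.5 to every cell, and uses hypothetical function names. It returns the intercept and its standard error. The p-value would come from a t distribution with n − 2 degrees of freedom and is not computed here.

```python
import math


def ols_intercept(x, y):
    """Fit y = a + b*x by least squares; return (a, b, standard error of a)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(rss / (n - 2) * (1 / n + mx ** 2 / sxx))
    return intercept, slope, se_intercept


def egger_regression(tables):
    """tables: list of (tp, fp, fn, tn) tuples; needs at least three studies."""
    precision, standardized = [], []
    for tp, fp, fn, tn in tables:
        # Assumption: 0.5 is added to every cell so ln(DOR) is always defined.
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        ln_dor = math.log((tp * tn) / (fp * fn))
        se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
        precision.append(1 / se)
        standardized.append(ln_dor / se)
    return ols_intercept(precision, standardized)
```

Dividing the returned intercept by its standard error gives the t statistic on which the test is based.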

Comparison of diagnostic performance among ADA, new techniques and IGRAs for TBM

Studies that analyzed test accuracy in diagnosing TBM were reviewed. The diagnostic performance of ADA was extracted from a meta-analysis [46]. Data on four new techniques applied to TBM diagnosis were also collected. As shown in Table IV, the sensitivity of the IGRAs was slightly lower than that of ADA and the four new techniques, while their specificity was much lower.

Table IV

Diagnostic performance of ADA, nRT-PCR, OTNPCR-LFST, LAMP and IGRAs for TBM

| Test method | Year | Study number | TBM patients | Sample | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|---|---|
| ADA [46] | 2017 | 20 | 741 | CSF | 0.89 (0.84–0.92) | 0.91 (0.87–0.93) |
| nRT-PCR [49] | 2017 | 1 | 14 | CSF | 0.86 (0.60–0.96) | 1.00 (0.95–1.00) |
| OTNPCR-LFST [50] | 2017 | 1 | 91 | CSF | 0.89 (0.82–0.95) | 1.00 (0.95–1.00) |
| IS6110 LAMP [51] | 2016 | 1 | 150 | CSF | 0.83 (0.76–0.88) | 1.00 (0.96–1.00) |
| MPB64 LAMP [51] | 2016 | 1 | 150 | CSF | 0.87 (0.80–0.92) | 1.00 (0.96–1.00) |
| PB IGRAs | Present | 26 | 893 | PB | 0.81 (0.78–0.84) | 0.76 (0.73–0.78) |
| CSF IGRAs | Present | 26 | 893 | CSF | 0.81 (0.76–0.85) | 0.89 (0.86–0.92) |

[i] ADA – adenosine deaminase assay, nRT-PCR – nested real-time polymerase chain reaction, OTNPCR-LFST – one-tube nested PCR-lateral flow strip test, LAMP – loop-mediated isothermal amplification.

Discussion

The results of the present meta-analysis indicated that PB and CSF IGRAs could neither rule in nor rule out TBM. Thus, in agreement with the previous meta-analysis [16], the diagnostic performance of IGRAs for TBM is suboptimal. Several reasons may explain this. First, PB IGRAs reflect the systemic condition of the patient, which means that false-positive results may occur in non-TBM patients with tuberculosis at sites other than the brain. In addition, false positives owing to latent tuberculosis infection (LTBI) in non-TBM patients may also compromise specificity, which is one of the reasons why the World Health Organization recommends against the use of IGRAs as diagnostic markers for active tuberculosis in countries with high background LTBI rates [47]. In the present meta-analysis, IGRAs performed on CSF specimens displayed significantly higher specificity than those performed on PB, which supports the theory that more M. Tb-specific lymphocytes are compartmentalized at the infected sites [48]. However, indeterminate results are common in CSF IGRAs, and the relatively large CSF volumes (5–10 ml) needed to provide sufficient cells for detection cannot always be obtained in clinical practice. These two factors impair the diagnostic value of CSF IGRAs.

As microbiological tests have many limitations, new approaches have been developed for the detection of M. Tb [49–52]. Compared with Ziehl-Neelsen staining and culture, IGRAs break new ground in TBM diagnosis, as they are much more rapid and simple. However, in comparison with CSF ADA, IGRAs did not show remarkable superiority in terms of cost, turn-around time or diagnostic accuracy. Some of the new techniques for TBM diagnosis listed in Table IV showed obvious advantages in detection time, test procedures, accessibility and cost, together with promising diagnostic value for TBM. To date, however, few studies with large sample sizes have investigated the role of these novel techniques in the detection of TBM. More data are needed to validate their value in practical TBM diagnosis.

Currently, there are many guidelines on IGRAs for tuberculosis infection from different countries and supranational organizations, but none for TBM [53]. Based on these guidelines and the results of this meta-analysis, some recommendations on IGRAs for TBM are summarized below. Clearly, IGRAs cannot replace microbiological tests. Given their significantly higher cost and a diagnostic performance that does not justify it, IGRAs are even inferior to some traditional tests, such as CSF ADA, in resource-limited and high TB burden settings. Even so, IGRAs could provide supplementary information in certain clinical situations, e.g. in immunocompromised patients where IGRAs are less affected by immunosuppression; in patients with highly suspected TBM but negative microscopy and culture; or in the differential diagnosis of infection with NTM. Notably, the additive value of IGRAs is greater in high-income, low-incidence countries than in resource-limited, high-incidence settings.

In addition, this meta-analysis revealed no significant difference between the DOR of TSPOT.TB and that of QFT-GIT, which may be due to the small number of QFT-GIT studies for TBM. In clinical practice, both methods have their strengths and weaknesses. The probability of laboratory error is higher for TSPOT.TB because its technical demands and processing steps are more complicated than those of QFT-GIT. On the other hand, TSPOT.TB has been deemed more stable and sensitive for diagnosing tuberculosis. However, few reports have directly compared the performance of QFT-GIT and TSPOT.TB in patients with clinically suspected TBM. Larger, parallel studies are therefore required to compare the two methods for diagnosing TBM.

This study has several strengths. First, the search strategy was broader than that of the previous meta-analysis, as we searched many databases without language restriction to minimize the number of missing publications. Nineteen additional studies were identified for review, including two publications missed by the previous meta-analysis. Second, the included studies met predefined inclusion criteria, which reduced the bias arising from patient selection to some extent. Third, this meta-analysis comprehensively evaluated the diagnostic value of two types of IGRAs (TSPOT.TB and QFT-GIT) as well as two types of samples (PB and CSF). Above all, the large sample size was the major strength of this meta-analysis.

This meta-analysis has some limitations. First, there was considerable heterogeneity among the selected studies, which may have led to overestimation of the pooled estimates. As the diagnostic criteria, cut-off values, disease prevalence and other population characteristics varied among studies, heterogeneity is to be expected. The factors included in the meta-regression analysis failed to explain it. Reference standards were divided into two groups in the meta-regression: clinical and microbiological. As there is no unified criterion for TBM diagnosis, this factor is too mixed to subdivide further, which may be partly responsible for the unexplained heterogeneity. Second, smear microscopy and culture, the definitive and most widely used diagnostic tools for TBM, are negative in a significant proportion of TBM cases, leading researchers to use alternative clinical reference standards. In view of this, we did not restrict diagnostic criteria to microbiological confirmation, which is commonly not achievable in a routine clinical setting. This may have introduced bias. To minimize misdiagnosis of TBM, only studies with predefined and rigorous diagnostic criteria were included in the current analysis. Third, several studies did not report on blinding. Finally, there was potential publication bias in the meta-analysis of CSF IGRAs.

In conclusion, since IGRAs suffer from problems such as high cost, rigorous technical requirements, existence of indeterminate results and suboptimal diagnostic value, these assays are unsuitable for use as biomarkers for standalone TBM diagnosis.