Introduction

Pancreaticobiliary maljunction (PBM) is a rare congenital anomaly in which the junction of the pancreatic and biliary ducts is located outside the duodenal wall, with a long common channel [13]. The resulting two-way reflux of bile and pancreatic juice leads to several complications, including bile retention with cholangitis, pancreatitis, and malignancies [46].

Cyst excision and Roux-en-Y hepaticojejunostomy are the first choice for PBM treatment [79]. However, these procedures are associated with postoperative complications affecting the residual bile ducts, such as cholangitis, hepatolithiasis, and carcinogenesis [8, 1013]. Identification of biomarkers associated with PBM complications of the bile duct could enable the early detection of these complications and support the development of effective diagnostic and therapeutic methods.

Long noncoding RNAs (lncRNAs), which are non-coding RNAs longer than 200 nucleotides, play an important role in regulating the occurrence and progression of many diseases, such as infectious and malignant tumors [1416]. They exert these effects by modulating biological processes such as cellular proliferation, motility, and immune response to diseases [1719]. In addition, some lncRNAs have been identified as biomarkers for diagnosis and prognosis of diseases [2022]. However, the genome-wide expression and functional roles of lncRNAs in PBM are unclear.

To the best of our knowledge, this is the first study focused on lncRNAs in the common bile duct of PBM. Here, we investigated the expression profile of lncRNAs in the common bile duct of patients with PBM, and we identified lncRNAs differentially expressed relative to controls. We explored the potential functions of the differentially expressed lncRNAs and assessed their ability to serve as biomarkers.

Material and methods

Material

The study protocol was approved by the Ethics Committee of our hospital and complied with the Helsinki Declaration. Written informed consent was obtained from the guardian of each subject before surgery. A total of 15 pediatric subjects with PBM and 15 control subjects were included in the study between January 2017 and September 2020. In all patients, diagnosis was confirmed by imaging and surgical pathological examination. PBM was diagnosed based on the following criteria: (1) the union of the pancreatic and biliary duct was located outside the sphincter of Oddi, based on magnetic resonance cholangiopancreatography (MRCP) or intraoperative cholangiography (IOC); (2) the common duct was longer than 5 mm; (3) the biliary amylase level was greater than 1000 U/l [23].

RNA microarray analysis

Microarray experiments to investigate lncRNAs and messenger RNA (mRNAs) differentially expressed between the PBM and control groups were performed by Western SCI Biotech Company (Chongqing, China, www.westernsci.cn). Agilent Human lncRNA Microarray 2018 Version (4*180k, Design ID: 085630), Agilent Technologies, Inc., was used.

The common bile duct tissue from PBM subjects was collected and stored at –80°C before RNA extraction. Total RNA was isolated using TRIZOL (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol and quantified using the NanoDrop ND-2000 (Thermo Scientific). RNA integrity was assessed using the Agilent Bioanalyzer 2100. Sample labeling, microarray hybridization, and washing were performed based on the manufacturer’s standard protocols. Briefly, total RNA was transcribed to double-stranded cDNA, then synthesized into cRNA and labeled with cyanine-3-CTP. The labeled cRNAs were then hybridized onto the microarray. After washing, the arrays were scanned using the Agilent Scanner G2505C (Agilent Technologies).

Array images were analyzed using Feature Extraction software (version 10.7.1.1, Agilent Technologies), and the raw data were further processed using Genespring (version 13.1, Agilent Technologies). First, the raw data were normalized using the quantile algorithm. The probes that had flags in “P” in at least 1 out of 2 conditions were chosen for further analysis. Differentially expressed mRNAs and lncRNAs were defined as those whose expression differed by ≥ 2.0-fold between PBM and control groups (p < 0.05). The potential roles of these differentially expressed mRNAs and lncRNAs were explored based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (see section 2.3). Finally, the differentially expressed genes were organized using hierarchical clustering.

Prediction of lncRNA function

Following the approach in a previous study [24], we examined differentially expressed mRNAs for enrichment of KEGG functional pathways, then we attributed the same functional pathways to lncRNAs that were co-expressed with those mRNAs. Co-expression of mRNAs and lncRNAs was defined as a Pearson correlation p-value < 0.05.

The enrichment of functional terms was analyzed using a hypergeometric cumulative distribution function, and the 200 and 500 most reliable predicted relationships were displayed in order to understand the distribution of functions of differentially expressed lncRNAs.

Fine mapping of co-expression of lncRNAs and adjacent coding genes

For each differentially expressed lncRNA, associated “cis-regulated mRNAs” were identified based on the following criteria: (1) the loci encoding the mRNAs were within 300 kb up- or downstream of the given lncRNA; and (2) the Pearson correlation coefficient for lncRNA-mRNA co-expression was significant (p ≤ 0.05). The co-expressed genes and enrichment significance for differentially expressed genes in each transcription factor (TF) entry were calculated for each differentially expressed lncRNA, using the hypergeometric distribution test. The calculated results return a p-value of enrichment significance, where a small p-value indicates that the gene is enriched.

The intersection between the coding gene set co-expressed by lncRNAs and the target gene set of the TF/chromatin regulatory complex was calculated, and the enrichment degree of the intersection was calculated using a hypergeometric distribution test. This calculation identified the TFs significantly related to differentially expressed lncRNAs, and thereby the TFs and chromatin regulatory factors that may regulate those lncRNAs. The network diagram was visualized based on the results of the hypergeometric distribution analysis.

According to hypergeometric distribution calculations, multiple lncRNA-TF pairs were obtained for each differentially expressed lncRNA. Each lncRNA-TF pair was the result of multiple gene enrichment.

Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) assay

In order to verify the microarray results, we randomly selected six differentially expressed lncRNAs that were associated with carcinogenesis or chronic inflammation in the common bile duct. We measured the lncRNA expression levels of these genes using qRT-PCR and the SYBR Green PCR Kit (Applied BI) according to the manufacturer’s instructions. Expression levels were calculated using the 2ΔΔCt method and normalized to those of β-actin lncRNA.

Statistical analysis

SPSS 20.0 software (IBM, Armonk, NY, USA) was used for statistical analysis. Data were expressed as mean ± standard deviation (SD), and inter-group differences were assessed for significance using the rank sum test. The diagnostic accuracy of certain differentially expressed lncRNAs was assessed in terms of the area under the receiver operating characteristic curve (AUC). Statistical significance was defined as p < 0.05.

Results

General information

Patients’ demographic and clinical data are summarized in Table I. There was no significant difference between the two groups in body weight, age, or sex distribution.

Table I

Clinicodemographic characteristics of study subjects

CharacteristicTodani types I (n = 7)Todani types IV (n = 8)
Abdominal pain64
Jaundice14
Mass00
Fever11
Vomiting53
Sex (male)03
Age [months]9–714.5–50
Pathological findings:
 Cyst wall hyperplasia78
 Gallbladder wall78
 Congestion

Differential gene expression between PBM and control subjects

The microarray analysis identified 2915 upregulated mRNAs, 2121 downregulated mRNAs, 173 upregulated lncRNAs, and 316 downregulated lncRNAs in subjects with PBM compared to healthy controls. Principal component analysis of differentially expressed lncRNAs and a heatmap are presented in Figure 1, while the 20 most up- or downregulated lncRNAs are presented in Table II. The most highly upregulated lncRNA was LOC105378608, which showed a 14.39-fold difference from controls.

Table II

The top 20 differentially expressed lncRNAs in pancreaticobiliary maljunction (PBM) (up and down regulated)

Gene symbolGene IDAccessionDescriptionLengthRegulation (A vs. B)FC (abs) (A vs. B)P-value (A vs. B)Type
LOC105371291105371291XR_933627.3Uncharacterized LOC105371291, transcript variant X31138down12.0562730.01451713ncRNA
LOC105374803105374803XR_940242.2Uncharacterized LOC1053748031305down9.06510.00494713ncRNA
LINC00261140828NR_001558.3Long intergenic non-protein coding RNA 2614912down7.91876940.01998967ncRNA
PSORS1C3100130889NR_152831.1Psoriasis susceptibility 1 candidate 3 (non-protein coding), transcript variant 5624down6.9932590.01583394ncRNA
LOC112267956112267956XR_002956345.1Uncharacterized LOC1122679562005down6.97184320.01816426ncRNA
LOC158434158434NR_132344.1Uncharacterized LOC1584342662down6.03629970.00271874ncRNA
LOC105372653105372653NR_134564.1Uncharacterized LOC105372653573down5.80001072.83E-04ncRNA
LINC01320104355288NR_126404.1Long intergenic non-protein coding RNA 13201131down5.66682670.01277497ncRNA
LOC105371825105371825XR_001752940.1Uncharacterized LOC1053718253133down5.45141940.02630284ncRNA
PP708025845NR_024158.1Uncharacterized LOC258451743down5.40811160.01141873ncRNA
LOC105369381105369381XR_001748496.1Uncharacterized LOC105369381, transcript variant X25254down5.3922520.00472638ncRNA
EMX2OS196047NR_144378.1EMX2 opposite strand/antisense RNA, transcript variant 27201down5.3251510.00589902ncRNA
SNORA25B109623459NR_145801.1Small nucleolar RNA, H/ACA box 25B127down5.3196020.00461494ncRNA
LOC101928875101928875XR_001748392.2Uncharacterized LOC101928875728down5.14039370.006804ncRNA
GBA357733NR_102356.1Glucosylceramidase beta 3 (gene/pseudogene), transcript variant 2, non-coding1243down5.1037030.00271457ncRNA
LOC101927269101927269NR_110075.1Uncharacterized LOC101927269507down4.9786010.03097457ncRNA
LOC105377924105377924NR_134603.1Uncharacterized LOC105377924415down4.938890.0150742ncRNA
LOC105374510105374510XR_001741602.1Uncharacterized LOC105374510, transcript variant X34802down4.90875670.03875371ncRNA
LOC102724834102724834XR_429171.4Uncharacterized LOC1027248343131down4.5886130.0016174ncRNA
LOC107985242107985242XR_001738352.1Uncharacterized LOC107985242, transcript variant X21899down4.52348850.00128315ncRNA
LOC105378608105378608XR_946886.2Uncharacterized LOC1053786083022up14.3879570.0011432ncRNA
LOC105376384105376384XR_930616.3Uncharacterized LOC105376384, transcript variant X31188up6.86610949.25E-04ncRNA
HECW1-IT1100127950NR_135295.1HECW1 intronic transcript 11839up6.842440.03381596ncRNA
LOC105371401105371401XR_002957935.1Uncharacterized LOC1053714013094up6.6938660.0253292ncRNA
CASC23103581031NR_125366.1Cancer susceptibility 23 (non-protein coding)1583up6.35940650.00331883ncRNA
LOC105374641105374641XR_925740.2Uncharacterized LOC105374641294up5.80322840.00589144ncRNA
LOC105379057105379057XR_001742782.1Uncharacterized LOC105379057, transcript variant X38261up5.35601474.11E-04ncRNA
LOC107986570107986570XR_001743976.1Uncharacterized LOC107986570, transcript variant X2770up5.33145760.01600899ncRNA
LINC00578100505566NR_047568.1Long intergenic non-protein coding RNA 5781222up5.3205640.03225355ncRNA
MIR4435-2HG541471NR_136164.1MIR4435-2 host gene, transcript variant 62046up5.30238530.00741963ncRNA
LOC107984270107984270XR_001747591.1Uncharacterized LOC107984270, transcript variant X13786up5.15397550.01410949ncRNA
MIR503HG84848NR_024607.1MIR503 host gene786up5.0490580.00200934ncRNA
ADAM5255926NR_001448.1ADAM metallopeptidase domain 5 (pseudogene)1718up5.0481169.81E-04ncRNA
MIR8061102466251NR_107028.1MicroRNA 806175up4.84438855.84E-04ncRNA
LOC105375855105375855XR_928920.2Uncharacterized LOC1053758557758up4.66851470.01672733ncRNA
SH3RF3-AS1100287216NR_029193.1SH3RF3 antisense RNA 12793up4.1449250.01781914ncRNA
LOC100506688100506688NR_104615.1Uncharacterized LOC100506688, transcript variant 23451up4.0806340.00154016ncRNA
LOC107986185107986185XR_001741392.1Uncharacterized LOC107986185513up3.95325524.36E-05ncRNA
LOC107986821107986821XR_001745275.1Uncharacterized LOC107986821, transcript variant X13257up3.87774350.0174942ncRNA
LOC105373592105373592XR_001739685.1Uncharacterized LOC105373592, transcript variant X2877up3.81241750.00616176ncRNA

[i] A – PBM, B – controls, FC (abs) – absolute value of the fold change.

Figure 1

A – Principal component analysis of differentially expressed lncRNAs. Principal component 1 explained 65.7% of the observed variance; component 2, 8.9%; and component 3, 6.5%. B – Heatmap of differentially expressed lncRNAs

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g001_min.jpg

Functional analysis of differentially expressed genes

The most enriched GO categories associated with the differentially expressed transcripts were extracellular matrix (p = 2.9E-12, up-regulated), extracellular region (p = 0.000427, up-regulated), and kinetochore (p = 2.29E-8, up-regulated) (Figure 2). The most enriched KEGG pathway was protein digestion and absorption, in which 250 differentially expressed genes were involved. Some pathways were associated with cancer, such as pathways in cancer (associated with 11 genes) and the PI3K-Akt signaling pathway (associated with 15 genes, Figure 3).

Figure 2

Functional analysis of differentially expressed genes based on Gene Ontology (GO) enrichment analysis

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g002_min.jpg
Figure 3

Analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment in differentially expressed lncRNAs

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g003_min.jpg

Co-expression analysis and target prediction

The functions of lncRNAs are performed by interacting with their targets. We predicted both cis- and trans-acting mRNAs that may interact with differentially expressed lncRNAs. The cis analysis searched for mRNAs 300 kb up- or downstream of the lncRNAs, and it identified 5 lncRNAs related to those cis target genes (Figure 4). The trans analysis of lncRNAs was performed by constructing co-expression networks of dysregulated mRNAs and lncRNAs based on Pearson expression correlation (Figure 5). Most of the differentially expressed lncRNAs acted in a trans manner. The top 200 predicted relationships with the highest prediction reliability were used to determine which TFs occurred most frequently, and the functions of these TFs were then attributed to the differentially expressed lncRNAs (Figure 6). Among the 964 differentially expressed lncRNAs and the predicted 9358 mRNA targets, more than one mRNAs were predicted to be regulated by one lncRNA and one mRNA corresponded to several lncRNAs.

Figure 4

Association of five differentially expressed lncRNAs with cis-target genes in pancreaticobiliary maljunction (PBM)

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g004_min.jpg
Figure 5

Prediction of trans-target for differentially expressed lncRNAs in PBM

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g005_min.jpg
Figure 6

Functional distribution of differentially expressed lncRNAs based on transcription factor analysis

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g006_min.jpg

Validation by qRT-PCR

Six up-regulated lncRNAs associated with carcinogenesis or chronic inflammation in the common bile duct were verified by qRT-PCR: NR_110876, NR_132344, XR_946886, XR_002956345, NR_135295, and XR_002957935. The expression differences between patients and controls for NR_110876, NR_132344, XR_946886, and XR_002956345 were consistent with the microarray results, and the difference was statistically significant for NR_132344, XR_946886, and XR_002956345. However, the qRT-PCR results for NR_135295 and XR_002957935 were inconsistent with the microarray results (Figure 7).

Figure 7

Validation of microarray results by qPCR

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g007_min.jpg

To assess the potential of NR_110876, NR_132344, XR_946886, and XR_002956345 as PBM biomarkers, we examined their ability to predict PBM in our sample. Only XR_946886 gave a significant AUC (0.837, p < 0.001) (Figure 8).

Figure 8

Receiver operating characteristic curves to assess the ability of NR_110876, NR_132344, XR_946886, and XR_002956345 to predict pancreaticobiliary maljunction

https://www.archivesofmedicalscience.com/f/fulltexts/145482/AMS-20-2-145482-g008_min.jpg

Discussion

Here we identified several lncRNAs that are differentially regulated in PBM, and we explored their potential biological functions.

Congenital dilation of the common bile duct has been shown to involve numerous alterations in the transcriptome, and PBM has been associated with downregulation of several genes in the gallbladder, as well as upregulation of some noncoding RNAs, which may contribute to biliary carcinogenesis [25]. In our previous study [26], we found 876 differentially expressed genes in children with PBM, of which 530 were up-regulated and 346 were down-regulated. Wong et al. reported on patients with congenital dilation of the common bile duct, human developmental disorders, cholangiocarcinoma or hepatocellular carcinoma [27]. However, we are unaware of investigations of lncRNAs in PBM.

In the present study, we identified 489 lncRNAs and 5036 mRNAs in bile duct tissues that were differentially expressed in PBM. These differentially expressed lncRNAs may affect many pathways, including those involved in protein digestion and absorption, interactions between extracellular matrix and surface receptors, focal adhesion, PI3K-Akt signaling, carcinogenesis, and inflammation. These predictions are consistent with previous studies [22, 24, 25]. In this way, our study implicates lncRNAs in PBM-associated chronic inflammation and carcinogenesis in the common bile duct.

We explored the potential functions of differentially expressed lncRNAs based on the functions of their target genes. We found that 5 cis-target genes were related to differentially expressed lncRNAs, and that most lncRNAs acted in a trans manner. For example, we found that one lncRNA (NR_027038.1) was closely associated with E2F4, which is consistent with previous reports linking E2F4 to development and progression of cancers [28, 29].

Of the many lncRNAs that we identified as differentially expressed, validation of a subset using qRT-PCR showed that not all microarray results were accurate. Nevertheless, we were able to validate upregulation of NR_110876, NR_132344, XR_946886, and XR_002956345. Among these lncRNAs, XR_946886 showed significant diagnostic potential. Thus, this lncRNA seems likely to be involved in the development of PBM complications, and it may serve as a novel biomarker for carcinogenesis or chronic inflammation involving the common bile duct [30].

This is the first study focused on lncRNAs in the common bile duct of PBM. Our findings may provide a basis for developing novel diagnostic and therapeutic targets in PBM. There were some limitations to the present study. The sample was relatively small, and only six lncRNAs identified in microarray experiments were validated using qRT-PCR. Future studies should be conducted with larger samples and more detailed validation and analysis of candidate genes. Future studies should also explore the potential mechanism of XR_946886 in PBM and its clinical potential as a biomarker.

In conclusion, the present study provided an overall analysis of lncRNAs in the common bile duct in PBM. By further exploring potential functions and pathways in which differentially expressed lncRNAs are involved, we anticipate that lncRNAs such as XR_946886 will be validated as biomarkers or even potential therapeutic targets for PBM treatment.