In-silico cell sorting revealed granulocyte-specific single-cell-type gene expression from peripheral blood bulk expression data and its application as host response biomarkers to discriminate bacterial and viral infections
Tang, N. L.-s.; Kwan, T.-K.; Huang, J.; Tang, M. L.; Wang, X.; Wu, J.; Lai, C.; Lui, G.; Ma, S.-L.; Leung, K.-S.
Abstract
Peripheral blood transcriptome analysis measures bulk transcript abundance (TA) across all leukocyte populations. However, using bulk expression as a biomarker has two main problems: (1) it yields a long list of differentially expressed genes (DEGs), and (2) the DEGs cannot be attributed to the host response of any specific cell type. TA assays after conventional cell sorting, the gold-standard method, are too tedious for routine use. Recently, we showed that a ratio-based biomarker (RBB), the ratio of two stringently selected genes, makes it feasible to interrogate the gene expression of a single cell type (monocytes and B lymphocytes) directly in peripheral whole blood (WB). Here, we apply this in-silico cell-sorting algorithm (DIRECT LS-TA, Direct Leukocyte Single cell-type Transcript Abundance) to WB samples to identify RBBs specific to granulocytes. The approach, which requires no cell sorting, was applied to public datasets to differentiate bacterial from viral infection. The following RBBs measured in WB correlate with the expression of their target (numerator) genes in purified granulocytes, so cell sorting can be avoided by using them: ARG1/SRGN, ANXA3/SRGN and RSAD2/SRGN. Together with the monocyte DIRECT LS-TA biomarker IFI27/PSAP, direct quantification of 4 genes provided optimal differentiation of viral from bacterial infection. Meta-analysis and unsupervised machine learning classification confirmed the superior performance of the DIRECT LS-TA biomarkers. In-silico cell sorting thus identified gene pairs that can be formulated as RBBs representing granulocyte gene expression within whole-blood cell-mixture samples. These RBBs were useful for triaging febrile patients into two major categories, viral and bacterial infection, with a high degree of sensitivity and specificity.
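
To make the ratio-based biomarker idea concrete, the following is a minimal sketch of how the four RBBs named above might be computed for one whole-blood sample. It is not the authors' pipeline. The function name `log2_rbb`, the choice to report ratios on a log2 scale, and the abundance values are all assumptions made for illustration; only the gene pairs come from the abstract.

```python
import math

# Gene pairs named in the abstract: (numerator / target gene, denominator gene).
RBB_PAIRS = [
    ("ARG1", "SRGN"),   # granulocyte RBB
    ("ANXA3", "SRGN"),  # granulocyte RBB
    ("RSAD2", "SRGN"),  # granulocyte RBB
    ("IFI27", "PSAP"),  # monocyte RBB
]


def log2_rbb(sample, pairs=RBB_PAIRS):
    """Return log2(numerator / denominator) for each gene pair in one sample.

    `sample` maps a gene symbol to its whole-blood transcript abundance on a
    linear scale. Values are assumed to be strictly positive.
    """
    ratios = {}
    for numerator, denominator in pairs:
        ratios[f"{numerator}/{denominator}"] = math.log2(
            sample[numerator] / sample[denominator]
        )
    return ratios


# Hypothetical abundances, invented purely to exercise the function.
example_sample = {
    "ARG1": 80.0,
    "ANXA3": 160.0,
    "RSAD2": 20.0,
    "SRGN": 640.0,
    "IFI27": 5.0,
    "PSAP": 320.0,
}

ratios = log2_rbb(example_sample)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

Because each RBB divides a target gene by a second gene measured in the same sample, the value can be read directly from bulk whole-blood data without a cell-sorting step. How the resulting ratios are thresholded or combined to separate viral from bacterial infection is not specified in the abstract and is left out of this sketch.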