Journal: npj Digital Medicine
This study presents a multimodal machine learning approach that uses metagenomic next-generation sequencing (mNGS) data from bronchoalveolar lavage fluid to differentiate lung cancer from pulmonary infections, including bacterial infections, fungal infections, and tuberculosis.
The integrated model combines four types of features:
- Microbial profiles
- Host gene expression
- Immune cell composition
- Tumor-related genetic alterations
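The summary does not say how the four feature types are combined, so the following is only a minimal sketch of one common option, late fusion, in which each modality produces its own score and the scores are averaged. The function name `fuse_scores`, the modality keys, the example scores, and the equal default weights are all hypothetical and are not taken from the study.

```python
def fuse_scores(scores, weights=None):
    """Combine per-modality scores (each in [0, 1]) into one weighted mean.

    Hypothetical late-fusion sketch; the study's actual integration
    method is not described in this summary.
    """
    if not scores:
        raise ValueError("at least one modality score is required")
    if weights is None:
        # Assumption: every modality contributes equally.
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * scores[name] for name in scores)
    return weighted_sum / total_weight


# Illustrative, made-up scores for a single patient sample.
example = {
    "microbial_profile": 0.2,
    "host_gene_expression": 0.6,
    "immune_cell_composition": 0.7,
    "tumor_alterations": 0.9,
}
fused = fuse_scores(example)
```

An alternative would be early fusion, where the raw features from all modalities are concatenated into one vector before a single classifier is trained.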
The model demonstrated high diagnostic performance, achieving area under the curve (AUC) values of 0.937 in the training cohort and 0.847 in the testing cohort.
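For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up toy values, not data from the study, and treating lung cancer as the positive class is an assumption.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0 (negative).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = lung cancer (assumed positive class), 0 = infection.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

A gap between training and testing AUC, such as the one reported here, is typically read as a sign of some overfitting to the training cohort.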
Applying a rule-in/rule-out strategy improved accuracy further, to over 89% across the different infection types.
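A rule-in/rule-out strategy is usually implemented with two cut-offs: scores above the upper one rule the condition in, scores below the lower one rule it out, and cases in between stay indeterminate. The sketch below illustrates that general idea only. The thresholds 0.2 and 0.8, the function names, and the toy data are hypothetical, and the study's actual cut-offs are not given in this summary.

```python
def rule_in_rule_out(prob, low=0.2, high=0.8):
    """Map a predicted probability to a three-way decision.

    The thresholds are illustrative assumptions, not values from the study.
    """
    if prob >= high:
        return "rule-in"
    if prob <= low:
        return "rule-out"
    return "indeterminate"


def decided_accuracy(labels, probs, low=0.2, high=0.8):
    """Return (accuracy among decided cases, fraction of cases decided)."""
    decided = 0
    correct = 0
    for y, p in zip(labels, probs):
        decision = rule_in_rule_out(p, low, high)
        if decision == "indeterminate":
            continue
        decided += 1
        predicted = 1 if decision == "rule-in" else 0
        if predicted == y:
            correct += 1
    coverage = decided / len(labels) if labels else 0.0
    accuracy = correct / decided if decided else None
    return accuracy, coverage


# Toy data: 1 = lung cancer, 0 = infection (made-up values).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_probs = [0.95, 0.85, 0.10, 0.15, 0.50, 0.60]
accuracy, coverage = decided_accuracy(toy_labels, toy_probs)
```

The trade-off in this design is that accuracy is measured only on the cases the model commits to, so tighter thresholds tend to raise accuracy while leaving more cases indeterminate.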
These results suggest that mNGS-based multimodal analysis is a promising, cost-effective method for early and precise differential diagnosis in lung diseases.