Abstract
Tuberculosis (TB) and pneumonia remain major global public health challenges, necessitating accurate and efficient diagnostic tools. This study proposes a novel deep learning framework, Large Adaptive Filter and Aligning Normalized Network (LAFAN-Net), designed to improve chest X-ray (CXR) diagnosis by integrating visual and textual information. The framework comprises three key components: (1) a report-guided multi-level alignment mechanism that aligns CXR features with radiology reports at the token, sample, and disease levels; (2) a large adaptive filter block for capturing multi-scale visual patterns; and (3) AlignNorm, a new normalization technique that mitigates oversmoothing and enhances feature separation. LAFAN-Net is evaluated on three publicly available CXR datasets, achieving accuracies of 97.14 %, 95.35 %, and 89.39 %, and F1 scores of 90.77 %, 96.32 %, and 88.33 %, respectively. Extensive ablation studies confirm the model's robustness. The results underscore LAFAN-Net's ability to extract clinically meaningful features while maintaining interpretability, supported by singular value distributions and Gradient-weighted Class Activation Mapping visualizations. Future work will explore extending the model to broader disease categories and multi-class classification tasks to enhance clinical utility. In addition, improving computational efficiency and ensuring real-time applicability are essential for deployment in resource-limited settings.
| Field | Value |
|---|---|
| Original language | English |
| Article number | 111575 |
| Journal | Engineering Applications of Artificial Intelligence |
| Volume | 158 |
| DOIs | |
| State | Published - 15 Oct 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Chest X-ray
- Computer-aided diagnosis
- Multi-modal
- Pneumonia
- Tuberculosis
Title
Tuberculosis and pneumonia diagnosis in chest X-rays by large adaptive filter and aligning normalized network with report-guided multi-level alignment