JPM, Vol. 14, Pages 545: Automatic Classification and Visualization of Text Data on Rare Diseases



Journal of Personalized Medicine doi: 10.3390/jpm14050545

Authors: Luis Rei, Joao Pita Costa, Tanja Zdolšek Draksler

More than 7000 rare diseases affect over 400 million people, posing significant challenges for medical research and healthcare. The integration of precision medicine with artificial intelligence offers promising solutions. This work introduces a classifier developed to discern whether research and news articles pertain to rare or non-rare diseases. Our methodology involves extracting 709 rare disease MeSH terms from Mondo and MeSH to improve rare disease categorization. We evaluate our classifier on abstracts from PubMed/MEDLINE and on an expert-annotated news dataset. The news dataset includes articles on four rare neurodevelopmental disorders (NDDs), considered the largest category of rare diseases, selected from a total of 16 analyzed. We achieved F1 scores of 85% for abstracts and 71% for news articles, demonstrating robustness across both datasets and highlighting the potential of integrating artificial intelligence and ontologies to improve disease classification. Although the results are promising, they also indicate the need for further refinement in managing data heterogeneity. Our classifier improves the identification and categorization of medical information, which is essential for advancing research, enhancing information access, influencing policy, and supporting personalized treatments. Future work will focus on expanding disease classification to distinguish between attributes such as infectious and hereditary diseases, addressing data heterogeneity, and incorporating multilingual capabilities.
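For readers unfamiliar with the F1 metric reported above, the following is a minimal sketch of how an F1 score can be computed for a binary rare versus non-rare classifier. It is not the authors' evaluation code: the function name `f1_score`, the label encoding (1 = rare, 0 = non-rare), and the toy labels are all assumptions made for illustration, and the abstract does not state which averaging scheme was used.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, invented for this example: 1 = rare disease, 0 = non-rare.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.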
