Identification of preclinical dementia according to ATN classification for stratified trial recruitment: A machine learning approach

Bibliographic Details
Published in: Alzheimer's & dementia 2022-12, Vol.18 (S7), p.n/a
Main Authors: Koychev, Ivan G; Marinov, Evgeniy; Young, Simon; Lazarova, Sophia; Grigorova, Denitsa; Palejev, Dean
Format: Article
Language: English
Description
Summary:
Background: There is a strong case for de-risking neurodegenerative agent development through highly informative experimental medicine studies early in the disease process. Such studies depend on a research infrastructure that includes volunteer registries holding highly granular phenotypic and genotypic data to allow stratified study selection. Examples of such registries include the Brain Health Registry, Great Minds and PROTECT cohorts, which rely on remote cognitive, self-reported medical history and genetic data. This requires the development of effective algorithms to predict the presence of preclinical dementia pathology. In this study we sought to address this need by building a machine learning (ML) ATN risk prediction algorithm that incorporates data typically collected in such registries.
Methods: To build an ML algorithm that is validated against an existing regression-based model (Calvin et al. 2020), we used the EPAD LCS cohort (V1500.0). We excluded participants with (1) a known diagnosis of dementia or Mild Cognitive Impairment, or a Clinical Dementia Rating scale score ≥ 0.5, and (2) no cerebrospinal fluid biomarkers. Participants were categorised into 5 ATN categories: (i) Normal AD biomarkers: A−T−(N)−; (ii) Alzheimer's pathologic change: A+T−(N)−; (iii) Alzheimer's disease: A+T+(N)±; (iv) Alzheimer's and concomitant non-Alzheimer's pathologic change: A+T−(N)+; (v) Non-AD pathologic change: A−T±(N)+; A−T+(N)−. Using a Weight of Evidence and Information Value method, we identified 13 significant features for testing differences between each of the four neurodegeneration-related groups and controls (A−T−(N)−). Random Forest and XGBoost with 5-fold cross-validation were used to optimise the Area Under the Curve (AUC) metric.
Results: The study dataset included 927 individuals. Our optimal results outperformed the regression models in the Calvin et al. 2020 paper by between 2 and 12%. The optimal feature sets were not consistent across the 4 models, with the A+T−(N)+ vs A−T−(N)− model differing the most from the rest.
Conclusion: Our study demonstrates the gains offered by ML in generating ATN risk prediction over logistic regression models among pre-dementia individuals. The reliance of the model on variables that can be collected remotely demonstrates its utility for research registers. An openly available version of the ML algorithm for use by research registries is under development.
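The five ATN categories listed in the Methods can be read as a lookup over three binary biomarker flags. The sketch below is illustrative only and is not the authors' code: the function name `atn_category` and the boolean inputs are assumptions, and the cut-offs that turn cerebrospinal fluid measurements into positive or negative A, T and (N) status are not given in the abstract, so they are left out.

```python
from itertools import product


def atn_category(a: bool, t: bool, n: bool) -> str:
    """Map binary A/T/(N) status to the five categories named in the abstract.

    Hypothetical helper: inputs are assumed to be already dichotomised
    (True = biomarker positive); the thresholds are not part of this sketch.
    """
    if a and t:
        # (iii) A+T+(N)±: neurodegeneration status does not change the label
        return "Alzheimer's disease"
    if a and n:
        # (iv) A+T−(N)+
        return "Alzheimer's and concomitant non-Alzheimer's pathologic change"
    if a:
        # (ii) A+T−(N)−
        return "Alzheimer's pathologic change"
    if t or n:
        # (v) A−T±(N)+ or A−T+(N)−
        return "Non-AD pathologic change"
    # (i) A−T−(N)−
    return "Normal AD biomarkers"


if __name__ == "__main__":
    # Enumerate all eight A/T/(N) profiles to show that each one gets a label.
    for a, t, n in product([False, True], repeat=3):
        profile = "A{}T{}(N){}".format(*("+" if flag else "−" for flag in (a, t, n)))
        print(profile, "->", atn_category(a, t, n))
```

Under this reading, every one of the eight possible A/T/(N) profiles falls into exactly one category, and the control group used in the four pairwise models is the A−T−(N)− profile.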
ISSN: 1552-5260, 1552-5279
DOI: 10.1002/alz.062184