Research Highlights

Prof. Seunggeun Lee

Addressing overfitting bias due to sample overlap in polygenic risk scoring

Numerous studies of Alzheimer’s polygenic risk scores (PRS) overlook sample overlap between IGAP and target datasets such as ADNI. To address this, we developed the Overlap-Adjusted Polygenic Risk Score (OA-PRS). We first tested it on simulated data, assessing the bias under different scenarios by varying the training, testing, and overlap proportions, and then applied it to the IGAP and ADNI datasets, validating the results with visual diagnostics. OA-PRS effectively adjusted for sample overlap in all simulation scenarios, as well as for IGAP and ADNI. The original IGAP PRS showed an inflated AUROC (0.915) on overlapping samples. OA-PRS reduced the AUROC to 0.726, closely aligning with the AUROC of non-overlapping samples (0.712). Visual diagnostics further confirmed the effectiveness of the adjustment. With OA-PRS, we were able to adjust the IGAP summary-based PRS for the overlapping ADNI samples, allowing the dataset to be fully utilized without the risk of overfitting.
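
The inflation described above can be reproduced in a toy simulation. The sketch below is only an illustration of the overlap bias, not the OA-PRS adjustment itself, whose details are not given here. All names (`auroc`, `simulate_overlap_bias`), the sample sizes, and the effect-size settings are assumptions chosen for the example; the PRS weights are simple marginal estimates computed from a simulated training set.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U); assumes continuous scores without ties."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def simulate_overlap_bias(n_train=2000, n_test=500, n_snps=1000, n_causal=50, seed=0):
    """Return (AUROC on overlapping samples, AUROC on independent samples)."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test

    # Hypothetical genotypes: allele counts 0/1/2, standardized per SNP.
    G = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
    G = (G - G.mean(axis=0)) / G.std(axis=0)

    # A few causal SNPs plus noise define a liability; split at the median
    # to obtain balanced case/control labels.
    beta = np.zeros(n_snps)
    beta[:n_causal] = rng.normal(0, 0.1, n_causal)
    liability = G @ beta + rng.normal(0, 1, n)
    y = (liability > np.median(liability)).astype(int)

    # "Summary statistics": marginal per-SNP effects from the training set only.
    G_tr, y_tr = G[:n_train], y[:n_train]
    beta_hat = G_tr.T @ (y_tr - y_tr.mean()) / n_train

    # Score target samples that were part of training (overlap)
    # and target samples that were not (independent).
    overlap = slice(0, n_test)
    indep = slice(n_train, n)
    auc_overlap = auroc(G[overlap] @ beta_hat, y[overlap])
    auc_indep = auroc(G[indep] @ beta_hat, y[indep])
    return auc_overlap, auc_indep

if __name__ == "__main__":
    auc_overlap, auc_indep = simulate_overlap_bias()
    print(f"AUROC, overlapping samples: {auc_overlap:.3f}")
    print(f"AUROC, independent samples: {auc_indep:.3f}")
```

Because each overlapping sample contributed its own label to the estimated weights, its score is pulled toward that label, so the AUROC on overlapping samples comes out higher than on independent samples even though both are drawn from the same population.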

more >> https://doi.org/10.1002/alz.70109