Simulating GWAS Data with SimGWAS to Improve Polygenic Risk Score Accuracy in Prostate

Session Number

1

Advisor(s)

Dr. Alex Rodriguez, Dr. Mitchell Conery, Ravi Madduri, Argonne National Laboratory

Location

A113

Discipline

Medical and Health Sciences

Start Date

15-4-2026 10:15 AM

End Date

15-4-2026 11:00 AM

Abstract

Polygenic risk scores (PRS) for prostate cancer estimate an individual’s genetic risk by summing the effects of thousands of genetic variants, such as single-nucleotide polymorphisms (SNPs), across the genome. Most PRS methods are developed using genome-wide association study (GWAS) datasets composed mainly of individuals of European ancestry, which limits the accuracy of PRS in non-European populations, including those of African ancestry. Regulatory and logistical barriers restrict access to diverse GWAS datasets, making it difficult to evaluate approaches that could improve PRS accuracy. The goal of this project is therefore to identify PRS methods that maintain accuracy across ancestries, using simulated GWAS summary statistics from diverse populations rather than restricted real-world datasets. After SimGWAS was completed, pilot simulations on chromosome 22 confirmed that it functions correctly. Manhattan plot comparisons between simulated and real GWAS results showed similar association patterns, indicating that the simulated results are consistent with real data. These results indicate that SimGWAS can generate realistic GWAS data, which can be used to calculate PRS for future comparison with scores derived from real GWAS results to determine which methods are more accurate across populations.
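
To make the "summing the effects" idea concrete, the following is a minimal Python sketch of an additive PRS: each individual's score is the sum, over variants, of the effect-allele count (0, 1, or 2) multiplied by that variant's GWAS effect size. It is an illustration only, not the project's pipeline. The function name `polygenic_risk_score` and the toy dosages and effect sizes are hypothetical values chosen for the example.

```python
import numpy as np


def polygenic_risk_score(dosages, effect_sizes):
    """Return one additive score per individual.

    dosages: (individuals x variants) counts of the effect allele (0, 1, or 2).
    effect_sizes: per-variant effect estimates, e.g. log odds ratios from a GWAS.
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.ndim != 2 or effect_sizes.ndim != 1:
        raise ValueError("expected a 2-D dosage matrix and a 1-D effect vector")
    if dosages.shape[1] != effect_sizes.shape[0]:
        raise ValueError("number of variants differs between dosages and effects")
    # Matrix-vector product: for each individual, sum(dosage_j * beta_j).
    return dosages @ effect_sizes


# Hypothetical toy data: 2 individuals, 3 variants (not real GWAS values).
toy_dosages = np.array([[0, 1, 2],
                        [2, 0, 1]])
toy_betas = np.array([0.10, -0.05, 0.20])

toy_scores = polygenic_risk_score(toy_dosages, toy_betas)
print(toy_scores)
```

In practice, PRS methods differ mainly in how they select variants and adjust the effect sizes before this summation, which is why the choice of GWAS summary statistics matters for accuracy across ancestries.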
