Development and implementation of polygenic risk score in Vietnamese population

Abstract: Recent technological advances and the availability of genetic databases have facilitated the integration of genetic factors into risk prediction models. A Polygenic Risk Score (PRS) combines the effects of many Single Nucleotide Polymorphisms (SNPs) into a single score. This score has recently been shown to have clinically predictive value for various common diseases. This review summarizes clinical interpretations of the PRS for coronary artery disease, breast cancer, prostate cancer, diabetes mellitus, and Alzheimer’s disease. While these findings support the implementation of the PRS in clinical settings, the study populations were mainly of European ancestry. Applying these findings to populations of non-European ancestry (Vietnamese in this context) therefore requires considerable effort and caution. This review aims to articulate the evidence supporting the clinical use of the PRS, the concepts behind its validity, an approach to implementing the PRS in the Vietnamese population, and cautions in selecting methods and thresholds to develop an appropriate PRS.

The higher an individual's relative risk, the more justified
medical intervention becomes.
PRS results can also be presented as a function of age
(Figure 2). The cumulative risk of disease, stratified by
PRS, can guide the decision about the age at which an
individual would benefit most from a screening test [55].
This age-based criterion highlights the balance between the
average risk of breast cancer and the risk of harm from
false-positive results.
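The idea of choosing a screening age from PRS-stratified cumulative risk can be sketched as follows. This is only an illustration under stated assumptions: the baseline hazards, the relative risks per PRS stratum, the 2% threshold, and the function names are all hypothetical, and a proportional-hazards relationship between PRS stratum and baseline hazard is assumed rather than taken from the studies cited here.

```python
import math

def cumulative_risk(annual_hazards, relative_risk):
    """Cumulative risk at each age, assuming proportional hazards:
    risk(t) = 1 - exp(-sum of relative_risk * hazard up to t)."""
    total = 0.0
    risks = []
    for hazard in annual_hazards:
        total += relative_risk * hazard
        risks.append(1.0 - math.exp(-total))
    return risks

def first_age_reaching(ages, risks, threshold):
    """Return the first age at which cumulative risk reaches the threshold,
    or None if it is never reached in the age range."""
    for age, risk in zip(ages, risks):
        if risk >= threshold:
            return age
    return None

# Hypothetical inputs for illustration only (not epidemiological estimates).
ages = list(range(30, 81))
baseline_hazard = [0.0005 + 0.0001 * (age - 30) for age in ages]
prs_strata = {"low": 0.5, "average": 1.0, "high": 2.0}  # assumed relative risks
risk_threshold = 0.02  # assumed cumulative risk that would justify screening

start_age = {
    name: first_age_reaching(
        ages, cumulative_risk(baseline_hazard, rr), risk_threshold
    )
    for name, rr in prs_strata.items()
}

for name, age in start_age.items():
    print(f"{name} PRS stratum: threshold reached at age {age}")
```

Under these assumptions, individuals in the high-PRS stratum cross the threshold at a younger age than those in the average or low strata, which is the pattern that would motivate earlier screening for them.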
4. Validating PRS Performance
A common concern in PRS analysis is whether the most
optimized PRS overfits the training data [56]. If it does,
applying that PRS to the general population can lead to
inflated results and false conclusions. The best strategy
to prevent overfitting of a PRS-based prediction model
is to validate its accuracy on an independent data set.
In the absence of an independent data set, the training
data can be divided into two separate data sets, one for
optimizing the PRS and the other for performing
out-of-sample prediction [57].
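The split-sample strategy can be sketched on simulated data as follows. This is a minimal illustration, not the procedure of any cited tool: the genotypes, effect sizes, and phenotype are simulated, the "number of top SNPs" stands in for whatever tuning parameter is being optimized (for example a p-value threshold), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated cohort: n individuals, m SNPs, the first n_causal SNPs are causal.
n, m, n_causal = 1000, 200, 20
maf = rng.uniform(0.1, 0.5, size=m)
genotypes = rng.binomial(2, maf, size=(n, m)).astype(float)
true_beta = np.zeros(m)
true_beta[:n_causal] = rng.normal(0.0, 0.3, size=n_causal)
liability = genotypes @ true_beta + rng.normal(0.0, 1.0, size=n)
status = (liability > np.median(liability)).astype(int)  # 1 = case, 0 = control

# Noisy per-SNP weights, standing in for externally estimated effect sizes.
weights = true_beta + rng.normal(0.0, 0.1, size=m)

def prs(geno, w, k):
    """Weighted sum of allele counts over the k SNPs with the largest |weight|."""
    top = np.argsort(-np.abs(w))[:k]
    return geno[:, top] @ w[top]

def auc(scores, labels):
    """Probability that a random case scores higher than a random control."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Divide the data into two non-overlapping sets.
order = rng.permutation(n)
opt_idx, test_idx = order[: n // 2], order[n // 2 :]

# Optimize the tuning parameter on the first set only.
candidates = [5, 10, 20, 50, 100, 200]
opt_auc = {k: auc(prs(genotypes[opt_idx], weights, k), status[opt_idx])
           for k in candidates}
best_k = max(opt_auc, key=opt_auc.get)

# Report performance on the held-out set, which played no role in tuning.
test_auc = auc(prs(genotypes[test_idx], weights, best_k), status[test_idx])
print(f"best k = {best_k}, tuning AUC = {opt_auc[best_k]:.3f}, "
      f"held-out AUC = {test_auc:.3f}")
```

The held-out AUC is the figure to report, because the AUC on the tuning set is biased upward by the selection of the best-performing parameter.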
VI. CONCLUSION
Reading DNA is becoming increasingly affordable thanks to
advances in genotyping and sequencing technologies. Together
with developments in data storage, new computing methods,
and an abundance of disease databases, the PRS has improved
the accuracy of existing risk prediction models for common
diseases. Consequently, individual clinical management (e.g.,
disease screening and therapeutic intervention) can be
personalized based on individual genetic information. This
genetic information can be obtained at any point in life with
a minimally invasive procedure (e.g., blood draw or saliva
sample), and a single genotype data set can be analyzed to
provide estimates for many diseases simultaneously. Although
the medical community still has doubts and hesitations
regarding implementation of the PRS, it will continue to
improve and have a larger impact in the near future.
REFERENCES
[1] T. A. Manolio, “Genomewide association studies and as-
sessment of the risk of disease,” New England Journal of
Medicine, vol. 363, no. 2, pp. 166–176, 2010.
[2] T. A. Manolio, F. S. Collins et al., “Finding the missing
heritability of complex diseases,” Nature, vol. 461, no. 7265,
pp. 747–753, 2009.
[3] N. Chatterjee, B. Wheeler, J. Sampson, P. Hartge, S. J.
Chanock, and J.-H. Park, “Projecting the performance of
risk prediction based on polygenic analyses of genome-wide
association studies,” Nature Genetics, vol. 45, no. 4, pp. 400–
405, 2013.
[4] J. N. Cooke Bailey and R. P. Igo Jr, “Genetic Risk Scores,”
Current Protocols in Human Genetics, vol. 91, no. 1, pp.
1.29.1–1.29.9, 2016.
[5] A. C. J. Janssens, “Validity of Polygenic Risk Scores: Are
we measuring what we think we are?” Human Molecular
Genetics, vol. 28, no. R2, pp. R143–R150, 2019.
[6] 1000 Genomes Project Consortium and others, “A global
reference for human genetic variation,” Nature, vol. 526, no.
7571, pp. 68–74, 2015.
[7] J. MacArthur, E. Bowler et al., “The new NHGRI-EBI cata-
log of published genome-wide association studies (GWAS
catalog),” Nucleic Acids Research, vol. 45, no. D1, pp.
D896–D901, 2017.
[8] J. Euesden, C. M. Lewis, and P. F. O’Reilly, “PRSice:
Polygenic risk score software,” Bioinformatics, vol. 31, no. 9,
pp. 1466–1468, 2015.
[9] F. Privé, H. Aschard, and M. G. Blum, “Efficient implemen-
tation of penalized regression for genetic risk prediction,”
Genetics, vol. 212, no. 1, pp. 65–74, 2019.
[10] G. Versmée, L. Versmée, M. Dusenne, N. Jalali, and P. Avil-
lach, “dbgap2x: An R package to explore and extract data
from the database of Genotypes and Phenotypes (dbGaP),”
Bioinformatics, vol. 36, no. 4, pp. 1305–1306, 2020.
[11] C. Bycroft, C. Freeman et al., “The UK biobank resource
with deep phenotyping and genomic data,” Nature, vol. 562,
no. 7726, pp. 203–209, 2018.
[12] A. Torkamani, N. E. Wineinger, and E. J. Topol, “The
personal and clinical utility of polygenic risk scores,” Nature
Reviews Genetics, vol. 19, no. 9, pp. 581–590, 2018.
[13] S. A. Lambert, G. Abraham, and M. Inouye, “Towards
clinical utility of polygenic risk scores,” Human Molecular
Genetics, vol. 28, no. R2, pp. R133–R142, 2019.
[14] P. W. Wilson, R. B. D’Agostino, D. Levy, A. M. Belanger,
H. Silbershatz, and W. B. Kannel, “Prediction of coro-
nary heart disease using risk factor categories,” Circulation,
vol. 97, no. 18, pp. 1837–1847, 1998.
[15] G. Abraham, A. S. Havulinna et al., “Genomic prediction of
coronary heart disease,” European Heart Journal, vol. 37,
no. 43, pp. 3267–3278, 2016.
[16] R. S. Rosenson and C. C. Tangney, “Antiatherothrombotic
properties of statins: Implications for cardiovascular event
reduction,” JAMA, vol. 279, no. 20, pp. 1643–1650, 1998.
[17] P. Natarajan, R. Young et al., “Polygenic risk score identifies
subgroup with higher burden of atherosclerosis and greater
relative benefit from statin therapy in the primary prevention
setting,” Circulation, vol. 135, no. 22, pp. 2091–2101, 2017.
Research and Development on Information and Communication Technology
[18] J. G. Elmore, “Screening for breast cancer: Strategies
and recommendations,” Retrieved from the Up to Date
website, 2019. [Online]. Available: 
com/contents/screening-for-breast-cancer
[19] N. Pashayan, S. Morris, F. J. Gilbert, and P. D. Pharoah,
“Cost-effectiveness and benefit-to-harm ratio of risk-
stratified screening for breast cancer: A life-table model,”
JAMA Oncology, vol. 4, no. 11, pp. 1504–1510, 2018.
[20] P. Maas, M. Barrdahl et al., “Breast cancer risk from mod-
ifiable and nonmodifiable risk factors among white women
in the United States,” JAMA Oncology, vol. 2, no. 10, pp.
1295–1302, 2016.
[21] A. Lee, N. Mavaddat et al., “BOADICEA: A comprehensive
breast cancer risk prediction model incorporating genetic
and non-genetic risk factors,” Genetics in Medicine: Offi-
cial Journal of the American College of Medical Genetics,
vol. 21, no. 8, pp. 1708–1718, 2019.
[22] R. M. Hoffman, “Screening for prostate cancer,”
2019. [Online]. Available: www.uptodate.com/contents/
screening-for-prostate-cancer
[23] T. M. Seibert, C. C. Fan et al., “Polygenic hazard score to
guide screening for aggressive prostate cancer: Development
and validation in large scale cohorts,” BMJ, vol. 360, 2018.
[24] M. J. Redondo, S. Geyer et al., “A type 1 diabetes genetic
risk score predicts progression of islet autoimmunity and
development of type 1 diabetes in individuals at risk,”
Diabetes Care, vol. 41, no. 9, pp. 1887–1894, 2018.
[25] J. M. Sosenko, J. P. Krischer et al., “A risk score for type 1
diabetes derived from autoantibody-positive participants in
the diabetes prevention trial–type 1,” Diabetes Care, vol. 31,
no. 3, pp. 528–533, 2008.
[26] K. Lall, R. Magi, A. Morris, A. Metspalu, and K. Fischer,
“Personalized risk prediction for type 2 diabetes: The poten-
tial of genetic risk scores,” Genetics in Medicine, vol. 19,
no. 3, pp. 322–329, 2017.
[27] R. S. Desikan, C. C. Fan et al., “Genetic assessment of
age-associated Alzheimer disease risk: Development and
validation of a polygenic hazard score,” PLoS Medicine,
vol. 14, no. 3, p. e1002258, 2017.
[28] A. T. Marees, H. de Kluiver et al., “A tutorial on con-
ducting genome-wide association studies: Quality control
and statistical analysis,” International Journal of Methods
in Psychiatric Research, vol. 27, no. 2, p. e1608, 2018.
[29] P. G. Bagos, “Genetic model selection in genome-wide
association studies: Robust methods and the use of meta-
analysis,” Statistical Applications in Genetics and Molecular
Biology, vol. 12, no. 3, pp. 285–308, 2013.
[30] D. Thomas, “Methods for investigating gene-environment
interactions in candidate pathway and genome-wide asso-
ciation studies,” Annual Review of Public Health, vol. 31,
no. 1, pp. 21–36, 2010.
[31] J. H. Moore, “Computational analysis of gene-gene inter-
actions using multifactor dimensionality reduction,” Expert
Review of Molecular Diagnostics, vol. 4, no. 6, pp. 795–803,
2004.
[32] P. W. Wilson, J. B. Meigs, L. Sullivan, C. S. Fox, D. M.
Nathan, and R. B. D’Agostino Sr., “Prediction of incident
diabetes mellitus in middle-aged adults: The Framingham
Offspring Study,” Archives of Internal Medicine, vol. 167,
no. 10, pp. 1068–1074, 2007.
[33] B. J. Keating, “Advances in risk prediction of type 2
diabetes: Integrating genetic scores with Framingham risk
models,” Diabetes, vol. 64, no. 5, pp. 1495–1497, 2015.
[34] L. Duncan, H. Shen et al., “Analysis of polygenic risk
score usage and performance in diverse human populations,”
Nature Communications, vol. 10, no. 1, pp. 1–9, 2019.
[35] M. Khoury, “Is it time to integrate polygenic risk scores
into clinical practice? Let’s do the science first and
follow the evidence wherever it takes us,” Centers for
Disease Control and Prevention, 2019. [Online]. Available:
https://blogs.cdc.gov/genomics/2019/06/03/is-it-time/
[36] S. W. Choi, T. S. H. Mak, and P. F. O’Reilly, “A guide to
performing Polygenic Risk Score analyses,” bioRxiv, 2018.
[37] K. Wetterstrand, “The cost of sequencing a human genome,”
National Human Genome Research Institute, 2019. [On-
line]. Available: https://www.genome.gov/about-genomics/
fact-sheets/Sequencing-Human-Genome-cost
[38] NHGRI, “DNA sequencing fact sheet,” National Human
Genome Research Institute, 2015. [Online]. Avail-
able: https://www.genome.gov/about-genomics/fact-sheets/
DNA-Sequencing-Fact-Sheet.
[39] F. M. De La Vega and C. D. Bustamante, “Polygenic risk
scores: A biased prediction?” Genome Medicine, vol. 10,
no. 1, pp. 1–3, 2018.
[40] V. S. Le, K. T. Tran et al., “A Vietnamese human genetic
variation database,” Human Mutation, vol. 40, no. 10, pp.
1664–1675, 2019.
[41] M. D. Mailman, M. Feolo et al., “The NCBI dbGaP database
of genotypes and phenotypes,” Nature Genetics, vol. 39,
no. 10, pp. 1181–1186, 2007.
[42] E. M. Ramos, D. Hoffman et al., “Phenotype–Genotype
Integrator (PheGenI): Synthesizing genome-wide association
study (GWAS) data with existing genomic resources,” Euro-
pean Journal of Human Genetics, vol. 22, no. 1, pp. 144–
147, 2014.
[43] S. Purcell, B. Neale et al., “PLINK: A tool set for whole-
genome association and population-based linkage analyses,”
The American Journal of Human Genetics, vol. 81, no. 3,
pp. 559–575, 2007.
[44] U. Drepper, S. Miller, and D. Madore, “md5sum: Verify
compact digital fingerprint of a file (GNU GPL version
3 or later),” Free Software Foundation, 2010. [Online].
Available: linux.die.net/man/1/md5sum
[45] R. M. Kuhn, D. Haussler, and W. J. Kent, “The UCSC
genome browser and associated tools,” Briefings in Bioin-
formatics, vol. 14, no. 2, pp. 144–161, 2013.
[46] B. K. Bulik-Sullivan, P.-R. Loh et al., “LD score regression
distinguishes confounding from polygenicity in genome-wide
association studies,” Nature Genetics, vol. 47, no. 3, p. 291,
2015.
[47] F. Dudbridge, “Power and predictive accuracy of polygenic
risk scores,” PLoS Genetics, vol. 9, no. 3, 2013.
[48] T. S. H. Mak, R. M. Porsch, S. W. Choi, X. Zhou, and
P. C. Sham, “Polygenic scores via penalized regression on
summary statistics,” Genetic Epidemiology, vol. 41, no. 6,
pp. 469–480, 2017.
[49] B. J. Vilhjálmsson, J. Yang et al., “Modeling linkage dise-
quilibrium increases accuracy of polygenic risk scores,” The
American Journal of Human Genetics, vol. 97, no. 4, pp.
576–592, 2015.
[50] A. V. Khera, M. Chaffin et al., “Genome-wide poly-
genic scores for common diseases identify individuals with
risk equivalent to monogenic mutations,” Nature Genetics,
vol. 50, no. 9, pp. 1219–1224, 2018.
[51] S. H. Lee, M. E. Goddard, N. R. Wray, and P. M. Visscher,
“A better coefficient of determination for genetic profile
analysis,” Genetic Epidemiology, vol. 36, no. 3, pp. 214–
224, 2012.
[52] A. P. Bradley, “The use of the area under the ROC curve
in the evaluation of machine learning algorithms,” Pattern
Recognition, vol. 30, no. 7, pp. 1145–1159, 1997.
[53] P. M. Ridker, J. G. MacFadyen et al., “Rosuvastatin for
primary prevention among individuals with elevated high-
sensitivity C-reactive protein and 5% to 10% and 10% to
20% 10-year risk,” Circulation: Cardiovascular Quality and
Outcomes, vol. 3, no. 5, pp. 447–452, 2010.
Vol. 2019, No. 2, December
[54] P. C. Gøtzsche and O. Olsen, “Is screening for breast cancer
with mammography justifiable?” The Lancet, vol. 355, no.
9198, pp. 129–134, 2000.
[55] G. A. Colditz and B. Rosner, “Cumulative risk of breast
cancer to age 70 years according to risk factor status:
Data from the Nurses’ Health Study,” American Journal of
Epidemiology, vol. 152, no. 10, pp. 950–964, 2000.
[56] B. A. Goldstein, L. Yang, E. Salfati, and T. L. Assimes,
“Contemporary considerations for constructing a genetic
risk score: An empirical approach,” Genetic Epidemiology,
vol. 39, no. 6, pp. 439–445, 2015.
[57] S. Michiels, S. Koscielny, and C. Hill, “Prediction of cancer
outcome with microarrays: A multiple random validation
strategy,” The Lancet, vol. 365, no. 9458, pp. 488–492, 2005.
Nguyen Tran The Hung received his Doctor of Medicine
degree from the University of Medicine and Pharmacy at
Ho Chi Minh City (Vietnam) in 2016. He then earned a
master's degree in biomedical science from China Medical
University (Taichung, Taiwan) in 2019. His research fields
are human genetics and diabetes mellitus. He worked
briefly as a pediatrician before pursuing a career in academia, and
he has been a research scientist at Vingroup Big Data Institute
since 2019. His thesis on type 2 diabetic nephropathy and the
application of the polygenic risk score convinced him of the potential
impact that genetic research can have on healthcare.
Le Duc Hau obtained his PhD degree in Bioinformatics
from the University of Ulsan, Republic of Korea, in 2012.
He now leads the Department of Computational
Biomedicine, Vingroup Big Data Institute, Vietnam. His
work focuses on computational methods for disease- and
drug-related problems in personalized medicine, especially
the identification of disease-associated biomarkers and the
prediction of drug targets and drug response. In parallel, he
has been developing bioinformatics tools. So far, he has
published more than fifty papers in well-recognized journals
and conferences, nearly half of which are in ISI-indexed
journals. In addition, he has served as a program committee
member and reviewer for several international conferences
and journals. He is also a principal investigator and a key
member of several national and ministry-level projects. In
particular, he is the principal investigator of the biggest
genome project in Vietnam (i.e., building databases of
genomic variants for the Vietnamese population). Finally, he
has been collaborating with several well-recognized
international research institutes.