Computational personalized medicine in cancer research in the –omics data era

Abstract: Omics data (e.g., genomics, transcriptomics, proteomics, epigenomics) generated by high-throughput next-generation sequencers in large human genome and cancer genome projects have changed the way personalized medicine is studied. In the future, personalized medicine will not be limited to diagnosis and treatment based on a few known disease-associated mutations in some genes, but will rely on the whole molecular characteristics of patients by integrating their –omics data. In this study, we draw a big picture of personalized medicine research in cancer in the –omics data era, including –omics databases and the challenges of data fusion, to solve two major problems in personalized medicine, i.e., personalized diagnosis and treatment. These problems are approached as patient stratification and drug response prediction based on the –omics data by computational methods.

rent molecular profiles. Thus, the prediction of
drug response using the –omics data is an important step
for selecting the right drugs (Figure 4).
Similar to the patient stratification problem, many meth-
ods have also been proposed for predicting drug response
for patients/cell lines [30]. Drug response is typically
measured as the dose required to inhibit 50% of the
biological activity (IC50) or as the area under the
dose-response curve (AUC). Both are continuous values, so
drug response prediction is often approached with regression
techniques. However, response values can also be grouped into
levels, such as good response, no response, and bad side
effects (Figure 4), in which case the task can be formulated
as a classification problem. The main difference between the
two main
Research and Development on Information and Communication Technology
problems in personalized medicine is that patient
stratification is usually based on tumor data from
tumor/patient-based projects such as TCGA, whereas drug
response prediction uses cell line and drug response data
from drug screening projects such as CCLE and GDSC.
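
To make the two response measures concrete, the following is a minimal sketch, not taken from the cited works, of how an IC50 and a normalized AUC could be derived from a measured dose-response curve and then turned into either a regression target or a class label. The Hill-model curve, the dose units, the choice to compute the AUC over the viability curve, and the cutoff `sensitive_below` are all illustrative assumptions; the function names are hypothetical.

```python
import math


def hill_viability(dose, ic50, slope=1.0):
    """Fraction of activity remaining at `dose` under a simple Hill model."""
    return 1.0 / (1.0 + (dose / ic50) ** slope)


def estimate_ic50(doses, viability):
    """Dose at which viability crosses 0.5, interpolated on a log-dose scale."""
    points = list(zip(doses, viability))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 >= 0.5 >= v1 and v0 != v1:
            t = (v0 - 0.5) / (v0 - v1)
            return math.exp(math.log(d0) + t * (math.log(d1) - math.log(d0)))
    return None  # 50% inhibition is not reached within the tested doses


def normalized_auc(doses, viability):
    """Trapezoidal area under viability vs. log10(dose), divided by the dose range."""
    x = [math.log10(d) for d in doses]
    area = sum(
        (x[i + 1] - x[i]) * (viability[i] + viability[i + 1]) / 2.0
        for i in range(len(x) - 1)
    )
    return area / (x[-1] - x[0])


def response_class(ic50, sensitive_below):
    """Turn a continuous IC50 into a label; the cutoff is a hypothetical choice."""
    if ic50 is None or ic50 >= sensitive_below:
        return "no response"
    return "good response"


# Synthetic curve: five ascending doses (arbitrary units), true IC50 of 1.0.
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
viability = [hill_viability(d, ic50=1.0) for d in doses]

ic50 = estimate_ic50(doses, viability)             # regression target
auc = normalized_auc(doses, viability)             # alternative regression target
label = response_class(ic50, sensitive_below=5.0)  # classification target
```

A regression model would be trained to predict `ic50` or `auc` from the molecular profile, while a classifier would be trained on `label`. In this sketch a lower AUC means a more sensitive sample, which depends on the assumption that the curve records remaining viability rather than inhibition.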
Generally, machine learning- and network-based meth-
ods are often proposed for the drug response prediction.
Network-based methods are usually based on similarity net-
works of drugs and cell lines and local [31] or global [32]
graph traversal algorithms. In contrast to a few network-
based methods, many machine learning-based methods have
been proposed for the drug response prediction problem.
Indeed, a community challenge was organized for research
groups around the world [33]. Interestingly, the winner among
44 submissions was a method that integrates the –omics data
(including single point mutation, structural mutation, gene
expression by microarray and RNA-Seq technologies,
methylation, and protein data) using a multiple kernel
learning technique [34]. Other methods also show that
predicting response for multiple drugs simultaneously
achieves better performance than predicting for a single
drug, because functional and structural similarity among
drugs is taken into account [34, 35]. In addition, gene
expression data contributes more to prediction performance
than the other data types [33]. Finally, computational
methods for drug response prediction have so far been
proposed mostly for cancer cell lines. To translate them to
clinical application, a recent method built its prediction
model on cell line data from GDSC and then used that model
to predict drug response for patients in TCGA [36].
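
The idea of integrating several –omics views through kernels (sample-by-sample similarity matrices) can be illustrated with a small sketch. This is not the method of any cited work: the kernel weights are fixed by hand instead of learned, as multiple kernel learning would do, and the prediction step is a simple similarity-weighted average, loosely in the spirit of similarity-network approaches. All data values, function names, and parameters below are hypothetical.

```python
import math


def rbf_kernel(profiles, gamma=1.0):
    """Gaussian similarity between all pairs of samples within one -omics view."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [[math.exp(-gamma * sq_dist(a, b)) for b in profiles] for a in profiles]


def combine_kernels(kernels, weights):
    """Weighted sum of per-view kernels; weights are fixed here, not learned."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


def predict_response(kernel, known, target):
    """Similarity-weighted mean of known responses (`known`: sample index -> value)."""
    total = sum(kernel[target][i] for i in known)
    return sum(kernel[target][i] * y for i, y in known.items()) / total


# Toy data (hypothetical): samples 0-2 are cell lines with a measured response,
# sample 3 stands in for a sample whose response is unknown.
expression = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
mutation = [[0, 0, 1], [0, 0, 1], [1, 1, 0], [1, 1, 0]]
known_ic50 = {0: 1.0, 1: 1.2, 2: 8.0}

fused = combine_kernels(
    [rbf_kernel(expression), rbf_kernel(mutation)], weights=[0.5, 0.5]
)
predicted = predict_response(fused, known_ic50, target=3)
```

Because sample 3 resembles sample 2 in both views, its predicted value lands close to the response of sample 2. A learned weighting would instead let the data decide how much each view contributes.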
VII. CONCLUSIONS
Nowadays, the rapid development of high-throughput
technologies and large-scale genome projects has generated a
large amount of –omics data (i.e., the –omics era). This has
changed the ways to computationally approach the problems in
personalized medicine. To fully understand the biological
characteristics of patients, their molecular profiles at the
–ome scale have been studied. Thus, the –omics data has been
integrated into computational methods to solve the problems
in personalized medicine. The two major problems in medicine
(i.e., diagnosis and treatment) are formulated as two
problems in computational space (i.e., patient stratification
and drug response prediction, respectively). Current studies
of the two problems target different objects: patient
stratification mainly focuses on patient data from the
patient/tumor projects, whereas drug response prediction
mostly works with artificial patients/tumors (i.e., cell
lines). Nevertheless, both are personalized based on the
molecular profiles of each patient/tumor/cell line.
Integration of the –omics data algorithmically faces the
“small 𝑛, large 𝑝” problem, in which the number of samples 𝑛
is far smaller than the number of features 𝑝. The object
(i.e., cancer) itself is a complex disease, which is
heterogeneous between cancer types and even between cells in
the same tumor. In addition, unexpected changes in the
characteristics of cell lines during culture may limit the
translation of research results on cell lines to patients.
Fortunately, many big human genome and disease genome
projects have been launched and have made their data freely
available to the research community. In parallel,
state-of-the-art techniques in computational sciences (e.g.,
artificial intelligence, statistics) have fostered the
application of computational methods to study problems in
medicine. This could open a brighter future for personalized
medicine in cancer research in the –omics data era.
Personalized medicine is a broad area of research and
application. Indeed, besides the biological characteristics
of the patients, their clinical data, environment, and
lifestyles are also important factors in tailoring individual
treatments. In addition, personalized medicine approaches are
not limited to cancers but can also be used to diagnose and
treat other disorders, such as rare diseases, which are
strongly linked to molecular alterations. Furthermore,
besides the abovementioned –omics data, metagenomics and
metatranscriptomics are also worth studying for personalized
medicine, since there exist interactions between humans and
the microbiome.
ACKNOWLEDGMENT
This research is funded by the Vietnam National Foundation
for Science and Technology Development (NAFOSTED)
under grant number 102.01-2017.14.
REFERENCES
[1] J. C. Venter, M. D. Adams, E. W. Myers, P. W. Li, R. J.
Mural, G. G. Sutton et al., “The sequence of the human
genome,” Science, vol. 291, no. 5507, pp. 1304–1351, 2001.
[2] C. Manzoni, D. A. Kia, J. Vandrovcova, J. Hardy, N. W.
Wood, P. A. Lewis et al., “Genome, transcriptome and
proteome: The rise of omics data and their integration in
biomedical sciences,” Briefings in Bioinformatics, vol. 19,
no. 2, pp. 286–302, 2018.
[3] J. Harrow, A. Frankish, J. M. Gonzalez, E. Tapanari,
M. Diekhans, F. Kokocinski et al., “GENCODE: The refer-
ence human genome annotation for the ENCODE project,”
Genome Research, vol. 22, no. 9, pp. 1760–1774, 2012.
[4] R. P. Horgan and L. C. Kenny, “Omic technologies: Ge-
nomics, transcriptomics, proteomics and metabolomics,” The
Obstetrician & Gynaecologist, vol. 13, no. 3, pp. 189–195,
2011.
[5] G. N. Samuel and B. Farsides, “The UK’s 100,000 Genomes
Project: Manifesting policymakers’ expectations,” New Ge-
netics and Society, vol. 36, no. 4, pp. 336–353, 2017.
[6] GenomeAsia100K Consortium et al., “The GenomeAsia
100K Project enables genetic discoveries across Asia,” Na-
ture, vol. 576, no. 7785, pp. 106–111, 2019.
[7] J. N. Weinstein, E. A. Collisson, G. B. Mills, K. R. M. Shaw,
B. A. Ozenberger, K. Ellrott et al., “The cancer genome
Vol. 2020, No. 01, September
atlas pan-cancer analysis project,” Nature Genetics, vol. 45,
no. 10, p. 1113, 2013.
[8] S. Deorowicz, A. Danek, and M. Niemiec, “GDC 2: Com-
pression of large collections of genomes,” Scientific Reports,
vol. 5, p. 11565, 2015.
[9] S. A. Forbes, D. Beare, P. Gunasekaran, K. Leung, N. Bindal,
H. Boutselakis et al., “COSMIC: Exploring the world’s
knowledge of somatic mutations in human cancer,” Nucleic
Acids Research, vol. 43, no. D1, pp. D805–D811, 2015.
[10] J. Zhang, J. Baran, A. Cros, J. M. Guberman, S. Haider,
J. Hsu et al., “International Cancer Genome Consortium
Data Portal – a one-stop shop for cancer genomics data,”
Database, vol. 2011, 2011.
[11] J. Barretina, G. Caponigro, N. Stransky, K. Venkatesan,
A. A. Margolin, S. Kim et al., “The Cancer Cell Line
Encyclopedia enables predictive modelling of anticancer
drug sensitivity,” Nature, vol. 483, no. 7391, pp. 603–607,
2012.
[12] W. Yang, J. Soares, P. Greninger, E. J. Edelman, H. Light-
foot, S. Forbes et al., “Genomics of Drug Sensitivity in Can-
cer (GDSC): A resource for therapeutic biomarker discovery
in cancer cells,” Nucleic Acids Research, vol. 41, no. D1, pp.
D955–D961, 2012.
[13] S. Huang, K. Chaudhary, and L. X. Garmire, “More is better:
Recent progress in multi-omics data integration methods,”
Frontiers in Genetics, vol. 8, p. 84, 2017.
[14] Y. Li, F.-X. Wu, and A. Ngom, “A review on machine learn-
ing principles for multi-view biological data integration,”
Briefings in Bioinformatics, vol. 19, no. 2, pp. 325–340,
2018.
[15] J. Yan, S. L. Risacher, L. Shen, and A. J. Saykin, “Net-
work approaches to systems biology analysis of complex
disease: Integrative methods for multi-omics data,” Briefings
in Bioinformatics, vol. 19, no. 6, pp. 1370–1381, 2018.
[16] C. Meng, O. A. Zeleznik, G. G. Thallinger, B. Kuster,
A. M. Gholami, and A. C. Culhane, “Dimension reduction
techniques for the integrative analysis of multi-omics data,”
Briefings in Bioinformatics, vol. 17, no. 4, pp. 628–641,
2016.
[17] F. Rohart, B. Gautier, A. Singh, and K.-A. Lê Cao,
“mixOmics: An R package for ‘omics feature selection and
multiple data integration,” PLoS Computational Biology,
vol. 13, no. 11, p. e1005752, 2017.
[18] M. Bersanelli, E. Mosca, D. Remondini, E. Giampieri,
C. Sala, G. Castellani et al., “Methods for the integration
of multi-omics data: Mathematical aspects,” BMC Bioinfor-
matics, vol. 17, no. S2, p. S15, 2016.
[19] C. Dimitrakopoulos, S. K. Hindupur, L. Häfliger, J. Behr,
H. Montazeri, M. N. Hall et al., “Network-based integration
of multi-omics data for prioritizing cancer genes,” Bioinfor-
matics, vol. 34, no. 14, pp. 2441–2448, 2018.
[20] Q. Zhao, X. Shi, Y. Xie, J. Huang, B. Shia, and S. Ma,
“Combining multidimensional genomic measurements for
predicting cancer prognosis: Observations from TCGA,”
Briefings in Bioinformatics, vol. 16, no. 2, pp. 291–303,
2015.
[21] Q. Mo, F. Nikolos, F. Chen, Z. Tramel, Y.-C. Lee, K. Hayashi
et al., “Prognostic power of a tumor differentiation gene
signature for bladder urothelial carcinomas,” Journal of the
National Cancer Institute, vol. 110, no. 5, pp. 448–459,
2018.
[22] M. Cortet, A. Bertaut, F. Molinié, S. Bara, F. Beltjens,
C. Coutant et al., “Trends in molecular subtypes of breast
cancer: Description of incidence rates between 2007 and
2012 from three French registries,” BMC Cancer, vol. 18,
no. 1, p. 161, 2018.
[23] L. Zhao, V. H. Lee, M. K. Ng, H. Yan, and M. F. Bijlsma,
“Molecular subtyping of cancer: Current status and moving
toward clinical applications,” Briefings in Bioinformatics,
vol. 20, no. 2, pp. 572–584, 2019.
[24] M. Hofree, J. P. Shen, H. Carter, A. Gross, and T. Ideker,
“Network-based stratification of tumor mutations,” Nature
Methods, vol. 10, no. 11, pp. 1108–1115, 2013.
[25] Z. He, J. Zhang, X. Yuan, Z. Liu, B. Liu, S. Tuo et al.,
“Network based stratification of major cancers by integrating
somatic mutation and gene expression data,” PLoS One,
vol. 12, no. 5, 2017.
[26] B. Wang, A. M. Mezlini, F. Demir, M. Fiume, Z. Tu,
M. Brudno et al., “Similarity network fusion for aggregating
data types on a genomic scale,” Nature Methods, vol. 11,
no. 3, pp. 333–337, 2014.
[27] F. Zhang, C. Ren, K. K. Lau, Z. Zheng, G. Lu, Z. Yi et al., “A
network medicine approach to build a comprehensive atlas
for the prognosis of human cancer,” Briefings in Bioinfor-
matics, vol. 17, no. 6, pp. 1044–1059, 2016.
[28] M. Le Morvan, A. Zinovyev, and J.-P. Vert, “NetNorM:
Capturing cancer-relevant information in somatic exome
mutation data with gene networks for cancer stratification
and prognosis,” PLoS Computational Biology, vol. 13, no. 6,
p. e1005573, 2017.
[29] C. R. Planey and O. Gevaert, “CoINcIDE: A framework
for discovery of patient subtypes across multiple datasets,”
Genome Medicine, vol. 8, no. 1, pp. 1–17, 2016.
[30] G. Yu, X. Yu, and J. Wang, “Network-aided Bi-Clustering
for discovering cancer subtypes,” Scientific Reports, vol. 7,
no. 1, pp. 1–15, 2017.
[31] F. Azuaje, “Computational models for predicting drug re-
sponses in cancer research,” Briefings in Bioinformatics,
vol. 18, no. 5, pp. 820–829, 2017.
[32] N. Zhang, H. Wang, Y. Fang, J. Wang, X. Zheng, and
X. S. Liu, “Predicting anticancer drug responses using a
dual-layer integrated cell line-drug network model,” PLoS
Computational Biology, vol. 11, no. 9, 2015.
[33] D.-H. Le and V.-H. Pham, “Drug response prediction by
globally capturing drug and cell line information in a hetero-
geneous network,” Journal of Molecular Biology, vol. 430,
no. 18, pp. 2993–3004, 2018.
[34] J. C. Costello, L. M. Heiser, E. Georgii, M. Gönen, M. P.
Menden, N. J. Wang et al., “A community effort to assess
and improve drug sensitivity prediction algorithms,” Nature
Biotechnology, vol. 32, no. 12, pp. 1202–1212, 2014.
[35] M. Ammad-ud-din, S. A. Khan, D. Malani, A. Murumägi,
O. Kallioniemi, T. Aittokallio et al., “Drug response predic-
tion by inferring pathway-response associations with kernel-
ized Bayesian matrix factorization,” Bioinformatics, vol. 32,
no. 17, pp. i455–i463, 2016.
[36] D. Le and D. Nguyen-Ngoc, “Multi-task regression learning
for prediction of response against a panel of anti-cancer
drugs in personalized medicine,” in Proceedings of the In-
ternational Conference on Multimedia Analysis and Pattern
Recognition, Ho Chi Minh City, Vietnam, Apr. 2018.
Le Duc Hau obtained his PhD degree in Bioinformatics from
the University of Ulsan, Republic of Korea, in 2012. He now
leads the Department of Computational Biomedicine, Vingroup
Big Data Institute, Vietnam. He has been focusing on
proposing computational methods for disease- and drug-related
problems in personalized medicine, especially the
identification of disease-associated biomarkers and the
prediction of drug targets and response. In parallel, he has
been developing bioinformatics tools. So far, he has
published more than fifty papers in well-recognized journals
and conferences, nearly half of them in ISI-indexed journals.
In addition, he has been a program committee member and
reviewer for several international conferences/journals.
Moreover, he is a principal investigator and a key member of
some national/ministry-level projects. In particular, he is
the principal investigator of the biggest genome project in
Vietnam (i.e., building databases of genomic variants for the
Vietnamese population). Finally, he has been collaborating
with some well-recognized international research institutes.
Quynh Diep Nguyen obtained her PhD degree in Information
Technology from the Institute of Information Technology, the
Vietnam Academy of Science and Technology, in 2015. She is a
lecturer at the School of Computer Science and Engineering,
Thuyloi University. She has been focusing on computational
methods for reconstructing metabolic networks. So far, she
has published more than fifteen papers in journals and
conferences. Moreover, she is a member of some
national/ministry-level projects that research computational
methods for uncovering latent knowledge from high-throughput
biological data.