Integrative analysis of DNA methylation and gene expression in Papillary Renal Cell Carcinoma

Abstract

DNA methylation is an epigenetic mark that is significantly altered in cancer. Interpreting the functional consequences of DNA methylation requires the integration of multiple types of data. Recent advances in next-generation sequencing can help decode this relationship and aid biomarker discovery. In this study, we investigated the methylation patterns of papillary renal cell carcinoma (PRCC) and their relationship with gene expression using multi-omics data from The Cancer Genome Atlas (TCGA). We found that the promoters and gene bodies of tumor suppressor genes, microRNAs, and gene clusters and families, including cadherins, protocadherins, claudins and collagens, are hypermethylated in PRCC. Hypomethylated genes in PRCC are associated with immune function. The expression of several novel candidate genes, including the interleukin receptor IL17RE and the immune checkpoint genes HHLA2, SIRPA and HAVCR2, correlates significantly with DNA methylation. We also developed machine learning models that use features extracted from single-omics and multi-omics data to distinguish early and late stages of PRCC. We compared different feature selection algorithms, predictive models, data integration techniques and representations of methylation data. The Group Lasso (GL) model using both gene expression and DNA methylation features shows the best overall performance in distinguishing tumor stages. In summary, our study identifies PRCC driver genes and proposes predictive models based on both DNA methylation and gene expression. These results will aid targeted experiments on PRCC and provide a strategy to improve the classification accuracy of tumor stages.

Publication
Molecular Genetics and Genomics