open access publication

Article, 2021

The landscape and driver potential of site-specific hotspots across cancer genomes

NPJ GENOMIC MEDICINE, ISSN 2056-7944, Volume 6, 1, DOI 10.1038/s41525-021-00197-6

Contributors

Juul, Randi Istrup 0000-0001-7829-9071 (Corresponding author) [1]; Nielsen, Morten 0000-0002-8972-7577 [1]; Juul, Malene 0000-0001-9722-0461 [1]; Feuerbach, Lars 0000-0003-1503-437X [2] [3]; Pedersen, Jakob S 0000-0002-7236-4001 (Corresponding author) [1]

Affiliations

  1. [1] Aarhus Univ Hosp, Dept Mol Med, Aarhus, Denmark [NORA names: AU Aarhus University; University; Denmark; Europe, EU; Nordic; OECD]
  2. [2] German Canc Res Ctr, Div Appl Bioinformat, Heidelberg, Germany [NORA names: Germany; Europe, EU; OECD]
  3. [3] German Canc Res Ctr, Div Appl Bioinformat, Heidelberg, Germany [NORA names: Germany; Europe, EU; OECD]

Abstract

Large sets of whole cancer genomes make it possible to study mutation hotspots genome-wide. Here we detect, categorize, and characterize site-specific hotspots using 2279 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes project and provide a resource of annotated hotspots genome-wide. We investigate the excess of hotspots in both protein-coding and gene regulatory regions and develop measures of positive selection and functional impact for individual hotspots. Using cancer allele fractions, expression aberrations, mutational signatures, and a variety of genomic features, such as potential gain or loss of transcription factor binding sites, we annotate and prioritize all highly mutated hotspots. Genome-wide we find more high-frequency SNV and indel hotspots than expected given mutational background models. Protein-coding regions are generally enriched for SNV hotspots compared to other regions. Gene regulatory hotspots show enrichment of potential same-patient second-hit missense mutations, consistent with enrichment of hotspot driver mutations compared to singletons. For protein-coding regions, splice-sites, promoters, and enhancers, we see an excess of hotspots associated with cancer genes. Interestingly, missense hotspot mutations in tumor suppressors are associated with elevated expression, suggesting localized amino-acid changes with functional impact. For individual non-coding hotspots, only a small number show clear signs of positive selection, including known sites in the TERT promoter and the 5' UTR of TP53. Most of the new candidates have few mutations and limited driver evidence. However, a hotspot in an enhancer of the oncogene POU2AF1, which may create a transcription factor binding site, presents multiple lines of driver-consistent evidence.

Data Provider: Clarivate