A study published in the journal Cell Reports provides an online resource for studying the relationship between mutational signatures and topographical characteristics across human cancers.
Study: Topography of mutational signatures in human cancer.
Background
Various endogenous and exogenous mutational processes are vital in imprinting somatic mutations in a cancer genome. Each mutational process imprints a characteristic pattern of somatic mutations, also known as mutational signatures, which can be affected by the genome architecture.
Evidence indicates that the accumulation rate of mutations is higher in late-replicating and condensed chromatin regions than in early-replicating regions, actively transcribed regions, and open chromatin regions.
The rate varies across some cancer genomes due to differential DNA repair. Moreover, chromatin features of cells undergoing neoplastic transformation can affect the rate and distribution of somatic mutations.
In this study, scientists have explored the interactions between mutational signatures and topographical genomic features across human cancer.
They have determined the effects of topographical genomic features, including nucleosome occupancy, histone modifications, CCCTC-binding factor (CTCF) binding sites, replication timing, transcription strand asymmetry, and replication strand asymmetry on the cancer-specific accumulation of somatic mutations from distinct mutational processes.
Study design
The study used the complete set of known COSMIC (Catalogue of Somatic Mutations in Cancer) signatures and analyzed mutations in 5,120 whole-genome-sequenced tumors from 40 cancer types. These data were integrated with 516 topographical features to evaluate how genome topography affects the cancer-specific accumulation of somatic mutations from distinct mutational processes.
The COSMIC signatures database included 78 single-base substitution, 11 doublet-base substitution, and 18 insertion or deletion mutational signatures.
Important observations
The analysis of single-base substitution signature 28 (SBS28) in DNA polymerase epsilon (POLE)-deficient and POLE-proficient samples revealed similar trinucleotide patterns.
However, about 97.7% of all SBS28 mutations were detected in POLE-deficient samples, and these mutations were clearly enriched in late-replicating regions and depleted at nucleosomes and CTCF binding sites.
In POLE-deficient samples, SBS28 showed a robust replication strand bias on the leading strand and strand-coordinated mutagenesis with 11 consecutively mutated substitutions.
In POLE-proficient samples, SBS28 was enriched in early-replicating regions, and the mutations were not depleted at nucleosomes or CTCF binding sites.
While SBS28 in these samples showed a weak replication strand bias on the lagging strand, no strand-coordinated mutagenesis was observed.
Considering these topographical differences, the scientists split SBS28 into two distinct signatures: SBS28a, which is related to POLE deficiency and found in ultra-hypermutated colorectal and uterine cancers, and SBS28b, which has an unknown etiology and is found in lung and stomach cancers.
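To make the two strand-related measures above more concrete, the following is a minimal sketch and not the study's actual pipeline. It assumes each mutation has already been assigned to the leading or lagging strand, and to a "+" or "-" reference strand. The function names, the pseudocount, and the log2 ratio are illustrative assumptions; published analyses typically compare observed counts against a simulated background.

```python
import math
from typing import Iterable


def strand_bias(leading: int, lagging: int, pseudocount: float = 0.5) -> float:
    """Return the log2 ratio of leading- to lagging-strand mutation counts.

    Positive values indicate a leading-strand bias, negative values a
    lagging-strand bias. The pseudocount (an assumption of this sketch)
    avoids division by zero when one count is 0.
    """
    return math.log2((leading + pseudocount) / (lagging + pseudocount))


def longest_same_strand_run(strands: Iterable[str]) -> int:
    """Return the length of the longest run of consecutive mutations
    that fall on the same strand (a crude proxy for strand coordination).

    `strands` lists the strand label of each mutation in genomic order.
    """
    best = 0
    run = 0
    previous = None
    for strand in strands:
        # Extend the run if the strand matches the previous mutation,
        # otherwise start a new run of length 1.
        run = run + 1 if strand == previous else 1
        best = max(best, run)
        previous = strand
    return best
```

In this sketch, a strongly positive `strand_bias` together with long same-strand runs would correspond to the pattern reported for POLE-deficient samples, while a slightly negative bias and short runs would correspond to the POLE-proficient pattern.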
In 288 whole-genome-sequenced B cell malignancies, significant differences in topographical features were observed between clustered and non-clustered somatic mutations.
Specifically, topographical features of single-base substitutions were analyzed after separating them into non-clustered mutations, diffuse hypermutation of substitutions (omikli), and longer clusters of strand-coordinated substitutions (kataegis).
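The separation described above can be sketched as a grouping of mutations by the distance between neighbors. This is only an illustration under stated assumptions: the 1,000 bp gap threshold and the minimum of four mutations for kataegis are hypothetical parameters chosen for the example, not values reported in this article, and the sketch ignores the strand-coordination requirement.

```python
from typing import List


def classify_clusters(
    positions: List[int],
    max_gap: int = 1000,       # hypothetical inter-mutation distance cutoff (bp)
    kataegis_min: int = 4,     # hypothetical minimum run length for kataegis
) -> List[str]:
    """Label each mutation as 'non-clustered', 'omikli', or 'kataegis'.

    `positions` must be sorted positions on one chromosome of one sample.
    Neighboring mutations closer than `max_gap` are grouped into a run;
    runs of 2 to `kataegis_min - 1` mutations are called omikli and
    longer runs are called kataegis.
    """
    labels = ["non-clustered"] * len(positions)
    start = 0
    for i in range(1, len(positions) + 1):
        # Close the current run at the end of the list or at a large gap.
        if i == len(positions) or positions[i] - positions[i - 1] > max_gap:
            run_length = i - start
            if run_length >= kataegis_min:
                label = "kataegis"
            elif run_length >= 2:
                label = "omikli"
            else:
                label = "non-clustered"
            for j in range(start, i):
                labels[j] = label
            start = i
    return labels
```

Once each mutation carries a label, topographical features such as replication timing or nucleosome occupancy can be summarized separately for each class.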
In most cancer types, APOBEC3 deaminases are predominantly responsible for generating omikli and kataegis. However, in B cell malignancies, activation-induced deaminase was found to exclusively drive these clustered mutational events.
Compared to clustered mutations, non-clustered mutations exhibited a significantly different trinucleotide pattern. In a single malignant B cell lymphoma, non-clustered mutations showed minor periodicity with respect to nucleosome occupancy.
However, no such periodicity was observed for clustered mutations. Moreover, a slight depletion for non-clustered mutations and a very high depletion for clustered mutations were observed around CTCF binding sites.
In contrast to kataegis events, which were highly enriched in early-replicating regions, non-clustered and omikli events were more enriched in late-replicating regions.
Both omikli and kataegis mutations exhibited distinct patterns of enrichment near the promoter and enhancer sites delineated by histone marks of H3K4me3, H3K9ac, H3K27ac, H3K36me3, and H4K20me1.
Regarding transcription or replication strand asymmetries, only a slight difference was observed between clustered and non-clustered mutations across the 288 whole-genome-sequenced B cell malignancies.
Study significance
The study provides a detailed topographical analysis of mutational signatures, covering 82,890,857 somatic mutations from 40 cancer types integrated with 516 tissue-matched topographical features derived from the Encyclopedia of DNA Elements (ENCODE) project.
The integrated mutations were derived from 5,120 whole-genome-sequenced tumor samples. The ENCODE project is a public research consortium established to identify all functional elements in the human genome.
The study used topographical features derived from the previously generated ENCODE database. As a limitation, the scientists noted that “these topographical features were mapped in samples unrelated to the examined cancers and do not provide a perfect representation of the genome topography throughout the lineage of a cancer cell.”