Exploring the genetic blueprint of cerebrospinal fluid proteins, this study uncovers new markers and therapeutic targets that may unlock advancements in Alzheimer’s diagnosis and care.
Study: Proteogenomic analysis of human cerebrospinal fluid identifies neurologically relevant regulation and implicates causal proteins for Alzheimer’s disease.
In a recent study published in Nature Genetics, researchers investigated the genomic signature of the human cerebrospinal fluid (CSF) proteome.
Genome-wide association studies (GWASs) have become common over the past 15 years, with thousands of studies across many diseases and traits revealing disease-associated loci. Nonetheless, translating these associations into pathways and therapies is challenging, as identifying causal genes and their interactions requires integrating omics data and further downstream analyses.
Analyses of the genetic regulation of gene expression have described loci affecting mRNA levels; however, such analyses can miss disease-relevant biology. Moreover, mRNA levels correlate only weakly with the levels of the proteins they encode, and therefore the overlap between expression quantitative trait loci (eQTLs) and protein QTLs (pQTLs) is also low.
Studies investigating genetic associations of proteins have predominantly focused on plasma, and reports suggest little overlap between plasma and brain proteogenomics. Targeting CSF proteins has successfully elucidated causal genes at some disease loci, but such studies have been limited by small sample sizes.
The study and findings
In the present study, researchers investigated the genomic signature of the CSF proteome. First, a proteogenomic analysis of the CSF was performed using genetic and proteomic data from 3,506 unrelated Europeans. These included 1,021 subjects with late-onset Alzheimer’s disease (AD), 1,242 patients with other neurodegenerative disorders, and 1,243 cognitively typical controls. This analysis identified 2,477 pQTL associations for 2,042 aptamers.
Of these, 48.6% were trans-pQTLs, and 51.4% were cis-pQTLs. Next, because neurological disease was prevalent in the dataset, the researchers tested whether the pQTLs were consistent across cognitively healthy and affected subjects. They stratified subjects into groups based on AD-relevant biomarkers and examined associations within each group. This revealed a robust correlation between groups, suggesting that pQTLs were consistent across disease states.
Further, conditional analyses were performed on all index single nucleotide polymorphisms (SNPs) to identify independent signals within each locus. Overall, 3,885 conditionally independent associations were identified. While most proteins (54.4%) had a single association, two proteins, glutathione S-transferase μ1 and signal regulatory protein β1, had up to 16 independent cis associations.
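pQTLs are conventionally labeled cis when the variant lies near the gene encoding the protein and trans otherwise. A minimal sketch of that rule follows, assuming a 1 Mb window around the gene's transcription start site; the window size, the function name, and the coordinates are illustrative assumptions, not details taken from the study.

```python
# Assumed cis window (1 Mb on either side of the gene start); the article
# does not state the threshold the researchers used.
CIS_WINDOW_BP = 1_000_000


def classify_pqtl(variant_chrom, variant_pos, gene_chrom, gene_start):
    """Label a variant-protein association as 'cis' or 'trans'.

    An association is 'cis' when the variant is on the same chromosome as
    the protein's coding gene and within CIS_WINDOW_BP of the gene start;
    every other association is 'trans'.
    """
    same_chrom = variant_chrom == gene_chrom
    if same_chrom and abs(variant_pos - gene_start) <= CIS_WINDOW_BP:
        return "cis"
    return "trans"


# Made-up coordinates for illustration only.
examples = [
    ("19", 44_900_000, "19", 44_905_000),  # same chromosome, 5 kb away
    ("19", 44_900_000, "19", 10_000_000),  # same chromosome, far away
    ("3", 190_000_000, "19", 44_905_000),  # different chromosome
]
for variant_chrom, variant_pos, gene_chrom, gene_start in examples:
    print(classify_pqtl(variant_chrom, variant_pos, gene_chrom, gene_start))
```

With a fixed window, the cis/trans proportions depend on the threshold chosen, which is worth keeping in mind when comparing such percentages across studies.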
Next, the researchers examined the overlap of CSF pQTLs with plasma pQTLs derived from 5,000 proteins. Overall, 4,735 aptamers overlapped, covering 73.5% of CSF pQTLs. Of these, 67.6% failed to colocalize, indicating CSF-specific signals. Further, the researchers compared cis-pQTL associations to eQTLs from neurologically relevant tissues and whole blood. They noted the highest overlap with cortex/cerebellum eQTLs.
Nearly half of cis-pQTLs did not colocalize with eQTLs. Among those that overlapped, 78.9% colocalized with neurologically relevant tissue. Overall, 33.6% of CSF cis-pQTLs were new and did not colocalize with eQTLs. Next, index pQTL variants from each association were grouped using linkage disequilibrium to identify genomic regions that regulated multiple proteins. In total, 166 regions were associated with at least two proteins.
Notably, three regions were associated with over 50 proteins. These were chr19q13.32, chr3q28, and chr6p22.2-21.32.
Cell-type and pathway enrichment analyses were performed for these three genomic regions to explore the cellular context of the regulated proteins. Only one pQTL from the chr3q28 region was observed in plasma, suggesting it was a CSF-specific hotspot. The apolipoprotein E (APOE) region at chr19q13.32 was associated with the most proteins in CSF. More associations were observed in this region in CSF than in plasma.
Proteins with associations in the chr19q13.32 region included known AD biomarkers. Furthermore, the team integrated pQTL associations with AD risk using a proteome-wide association study (PWAS), Mendelian randomization (MR), and colocalization. The PWAS revealed significant associations between 125 pQTLs (for 108 proteins) and AD.
MR suggested 17 proteins as putatively causal for AD. Thirty-two proteins had QTLs that colocalized with AD risk. Of proteins prioritized by MR, colocalization, and PWAS, eight were significant across all three methods, while 38 were significant in at least two. A DrugBank search was performed to identify therapeutic compounds for AD-associated proteins.
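The prioritization step above amounts to counting how many of the three methods support each protein. The following is a minimal sketch of that tally, using hypothetical placeholder identifiers (P1 to P6) rather than proteins from the study; the function name is likewise an assumption.

```python
from collections import Counter


def tally_support(method_hits):
    """Count, for each protein, how many methods flagged it as significant."""
    counts = Counter()
    for proteins in method_hits.values():
        counts.update(set(proteins))  # each method counts at most once
    return counts


# Hypothetical placeholder identifiers, not results from the study.
method_hits = {
    "PWAS": {"P1", "P2", "P3", "P4"},
    "MR": {"P1", "P2", "P5"},
    "colocalization": {"P1", "P3", "P6"},
}

support = tally_support(method_hits)
in_all_three = sorted(p for p, n in support.items() if n == 3)
in_at_least_two = sorted(p for p, n in support.items() if n >= 2)

print("All three methods:", in_all_three)
print("At least two methods:", in_at_least_two)
```

Requiring agreement between at least two methods is a way of guarding against the weaknesses of any single approach, at the cost of dropping proteins that only one method can detect.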
Of the 38 proteins supported by at least two methods, drugs were available for 15. Finally, the team developed a proteomic risk score to select predictors of AD status in a training dataset and evaluated its predictive ability in an independent testing dataset. The prediction model accurately stratified participants in both datasets and performed better than a polygenic risk score. Model performance remained consistent across APOE genotypes and ages.
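One common way to compare two risk scores on a held-out set is the area under the ROC curve (AUC): the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below computes it directly from that definition. The article does not say which metric the researchers used, and the scores and labels here are made-up toy values that do not reflect the study's results.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case-control comparisons.

    Returns the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy held-out set: 1 = AD, 0 = control. All values are invented.
test_labels = [1, 1, 1, 0, 0, 0]
proteomic_scores = [0.91, 0.92, 0.42, 0.01, -0.70, 0.55]
polygenic_scores = [0.30, 0.10, 0.20, 0.25, 0.00, 0.40]

print("Proteomic score AUC:", auc(proteomic_scores, test_labels))
print("Polygenic score AUC:", auc(polygenic_scores, test_labels))
```

Evaluating both scores on the same independent test set, as the study did, is what makes the comparison meaningful; a score evaluated only on its training data would look better than it is.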
Conclusions
In sum, the researchers identified 3,885 significant pQTL associations for 1,883 proteins, which were highly protein- and CSF-specific. They also observed highly pleiotropic, CSF-biased genomic regions on chromosomes 19q13.32 and 3q28. Integrating pQTLs with AD risk revealed 38 putatively causal proteins and several drug-repurposing candidates for AD. In addition, a predictive model based on the AD-associated proteins outperformed a polygenic risk score (PRS), consistent with proteins being more proximal to disease than genetic variation.
Journal reference:
- Western D, Timsina J, Wang L, et al. Proteogenomic analysis of human cerebrospinal fluid identifies neurologically relevant regulation and implicates causal proteins for Alzheimer’s disease. Nature Genetics, 2024, DOI: 10.1038/s41588-024-01972-8, https://www.nature.com/articles/s41588-024-01972-8