Using MR-SPI and AlphaFold3, scientists unravel the molecular underpinnings of Alzheimer’s disease, identifying key protein changes that could reshape future treatments.
Study: Deciphering proteins in Alzheimer’s disease: A new Mendelian randomization method integrated with AlphaFold3 for 3D structure prediction.
In a recent study published in the journal Cell Genomics, a group of researchers developed Mendelian Randomization with Selection and Post-selection Inference (MR-SPI), a method integrated with AlphaFold3, to identify causal protein biomarkers and structural changes in Alzheimer’s disease.
Background
Alzheimer’s disease (AD), the leading cause of dementia worldwide, presents a significant healthcare challenge, with its etiology and pathogenesis remaining unclear. Current therapies targeting amyloid-beta (Aβ) production or aggregation offer only symptomatic relief, failing to halt disease progression.
Mendelian randomization (MR) offers a way to identify causal protein biomarkers by using genetic variants as instrumental variables. However, conventional MR methods struggle with invalid instruments and horizontal pleiotropy, both of which can bias the results.
Advanced MR techniques that address these limitations are critical for uncovering causal proteins and understanding their structural impacts. The MR-SPI method applies the "Anna Karenina principle," assuming that valid instrumental variables (IVs) behave similarly while invalid IVs each deviate in their own way. Further research is needed to translate such causal findings into effective therapies.
About the study
In two-sample Mendelian Randomization (MR) studies, genetic associations between protein quantitative trait loci (pQTLs) and phenotypic outcomes are analyzed using genome-wide association study (GWAS) summary statistics.
This process involves identifying independent pQTLs through linkage disequilibrium (LD) clumping, retaining only one representative pQTL per LD region. These pQTLs are modeled to estimate causal relationships between proteins and health outcomes while addressing potential violations of instrumental variable (IV) assumptions.
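The clumping step can be illustrated with a minimal sketch. It assumes a greedy procedure that keeps the most significant pQTL and discards anything in LD with it. The variant IDs, p-values, pairwise r² values, and the 0.01 threshold are all hypothetical and are not taken from the study.

```python
def ld_clump(pvalues, r2, r2_threshold=0.01):
    """Greedy LD clumping: keep the most significant pQTL, drop its LD partners, repeat.

    pvalues: dict mapping variant ID -> association p-value with the protein.
    r2: dict mapping frozenset({id_a, id_b}) -> pairwise LD r^2 (missing pairs count as 0).
    r2_threshold: assumed cutoff; variants at or above it are treated as the same LD region.
    """
    remaining = sorted(pvalues, key=pvalues.get)  # most significant first
    kept = []
    while remaining:
        lead = remaining.pop(0)
        kept.append(lead)
        # Remove every variant that is in LD with the lead variant just kept.
        remaining = [
            snp for snp in remaining
            if r2.get(frozenset((lead, snp)), 0.0) < r2_threshold
        ]
    return kept


# Hypothetical inputs for illustration only.
pvalues = {"rs1": 1e-12, "rs2": 3e-9, "rs3": 5e-10, "rs4": 2e-8}
r2 = {frozenset(("rs1", "rs2")): 0.85, frozenset(("rs3", "rs4")): 0.40}

independent_pqtls = ld_clump(pvalues, r2)
print(independent_pqtls)
```

In this toy input, rs2 and rs4 are each in LD with a stronger variant, so only one representative per region survives.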
MR-SPI is a novel method designed to overcome challenges in selecting valid pQTL IVs. It leverages the "plurality rule," which assumes valid IVs produce similar ratio estimates of causal effects, distinguishing them from invalid instruments.
Through a voting procedure, MR-SPI identifies the largest subset of pQTLs with consistent ratio estimates as valid IVs, supporting valid causal inference even when few pQTLs are available or some violate IV assumptions. Unlike methods that require the "majority rule" or strict assumptions such as InSIDE (Instrument Strength Independent of Direct Effect), this approach is particularly robust for proteomics data with small pQTL sets.
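A toy example may help show why the plurality rule is weaker than the majority rule. The sketch below is a simplification: it groups ratio estimates using a fixed tolerance, whereas MR-SPI's actual voting uses statistical thresholds that account for sampling error. All effect sizes and the tolerance are hypothetical.

```python
def largest_consistent_group(ratios, tol):
    """Return indices of the largest set of instruments whose ratio
    estimates all lie within `tol` of some anchor estimate."""
    best = []
    for anchor in ratios:
        group = [i for i, r in enumerate(ratios) if abs(r - anchor) <= tol]
        if len(group) > len(best):
            best = group
    return best


# Hypothetical pQTL effects on the protein and on the outcome.
gamma_protein = [0.20, 0.25, 0.30, 0.22, 0.28, 0.24, 0.26]
gamma_outcome = [0.100, 0.130, 0.147, 0.220, 0.280, -0.120, 0.390]

# Ratio estimate of the causal effect from each pQTL.
ratios = [out / prot for out, prot in zip(gamma_outcome, gamma_protein)]

valid = largest_consistent_group(ratios, tol=0.05)
print(valid, [round(ratios[i], 2) for i in valid])
```

Here only three of seven instruments agree on an effect near 0.5, so the majority rule fails. Because the four invalid instruments disagree with one another, the three consistent ones still form the largest group, which is what the plurality rule requires.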
MR-SPI estimates causal effects using zero-intercept ordinary least squares regression and constructs confidence intervals robust to finite-sample errors. By addressing the limitations of conventional MR methods, MR-SPI provides a framework for identifying causal protein biomarkers, advancing the integration of large-scale proteomics and phenotypic outcome data in causal inference studies.
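For readers unfamiliar with zero-intercept regression, the point estimate is the slope of a line through the origin fitted to the pQTL-outcome effects against the pQTL-protein effects: the sum of their products divided by the sum of squared pQTL-protein effects. The sketch below shows only that slope, with hypothetical effect sizes. It does not reproduce the study's confidence interval construction.

```python
def zero_intercept_ols(gamma_protein, gamma_outcome):
    """Slope of the regression of outcome effects on protein effects,
    with the intercept fixed at zero."""
    sxx = sum(g * g for g in gamma_protein)
    sxy = sum(g * big_g for g, big_g in zip(gamma_protein, gamma_outcome))
    return sxy / sxx


# Hypothetical effects of three selected (valid) pQTLs.
gamma_protein = [0.20, 0.25, 0.30]    # pQTL -> protein level
gamma_outcome = [0.100, 0.125, 0.150]  # pQTL -> disease outcome

beta_hat = zero_intercept_ols(gamma_protein, gamma_outcome)
print(round(beta_hat, 3))
```

Fixing the intercept at zero encodes the assumption that a valid instrument with no effect on the protein has no effect on the outcome.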
Study results
The proposed pipeline for identifying causal protein biomarkers and predicting their 3D structural alterations consists of three primary steps.
First, for each protein, the MR-SPI method is employed to select valid pQTLs as IVs. This is achieved by integrating GWAS summary data for proteomics and disease outcomes, allowing for the estimation of the causal effect of each protein on the disease.
Second, Bonferroni correction is applied to the estimated causal effects to identify statistically significant protein biomarkers. Third, AlphaFold3 is utilized to predict and compare the 3D structures of the wild-type and mutated versions of these proteins resulting from missense pQTLs.
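The Bonferroni step can be sketched as follows. The family-wise error rate of 0.05 is an assumption, the protein names and p-values are hypothetical, and the number of tests is set to the 912 plasma proteins the study reports analyzing.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


# Hypothetical p-values for three of the tested proteins.
p_values = {"protein_A": 1.2e-7, "protein_B": 3.0e-4, "protein_C": 4.0e-5}

threshold = bonferroni_threshold(alpha=0.05, n_tests=912)
significant = sorted(name for name, p in p_values.items() if p < threshold)

print(f"threshold = {threshold:.2e}")
print(significant)
```

A protein that would pass an uncorrected 0.05 cutoff, such as the hypothetical protein_B, is not declared significant once the threshold is divided by the number of tests.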
MR-SPI operates through a multi-step process. It first identifies relevant pQTLs with strong protein associations. Each relevant pQTL provides a ratio estimate of the causal effect, and other pQTLs "vote" for its validity if their degrees of assumption violations (independence and exclusion restrictions) fall below a threshold. A voting matrix is constructed to summarize mutual validation among pQTLs, with valid IVs identified via majority/plurality voting or the maximum clique method.
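The voting matrix and maximum clique idea can be illustrated with a small, simplified sketch. It assumes that two pQTLs vote for each other when their ratio estimates differ by less than a fixed tolerance; the method itself uses data-driven thresholds. The clique search is brute force, which is only practical for the small pQTL sets typical of proteomics. All numbers are hypothetical.

```python
from itertools import combinations


def voting_matrix(ratios, tol):
    """votes[i][j] is True when pQTL i's ratio estimate supports pQTL j."""
    n = len(ratios)
    return [[abs(ratios[i] - ratios[j]) <= tol for j in range(n)] for i in range(n)]


def maximum_clique(votes):
    """Largest set of pQTLs in which every pair votes for each other.
    Brute-force search from the largest subset size downward."""
    n = len(votes)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(votes[i][j] and votes[j][i] for i, j in combinations(subset, 2)):
                return list(subset)
    return []


# Hypothetical ratio estimates from seven relevant pQTLs.
ratios = [0.50, 0.52, 0.51, 0.49, 0.56, 1.00, -0.40]

votes = voting_matrix(ratios, tol=0.045)
valid_ivs = maximum_clique(votes)
print(valid_ivs)
```

In this example the fifth pQTL (index 4) receives a vote from index 1 but not from the others, so it is excluded from the clique. Requiring mutual agreement among all members is what distinguishes the clique approach from simply counting votes.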
The causal effect is then estimated using zero-intercept ordinary least squares regression, and confidence intervals are constructed to address potential finite-sample IV selection errors.
This approach was compared against several established MR methods, including inverse-variance weighting (IVW), MR-Robust Adjusted Profile Score (MR-RAPS), MR-Pleiotropy Residual Sum and Outlier (MR-PRESSO), weighted median estimation, and mode-based estimation. MR-SPI outperformed these methods in simulation studies under conditions with locally invalid IVs, demonstrating superior accuracy and robustness.
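To see why invalid instruments matter for a method such as IVW, consider a minimal sketch of the standard fixed-effect IVW estimator. The effect sizes and standard errors are hypothetical, and the example is illustrative rather than a reproduction of the study's simulations.

```python
def ivw_estimate(gamma_protein, gamma_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted estimate and its standard error."""
    num = sum(g * big_g / s**2
              for g, big_g, s in zip(gamma_protein, gamma_outcome, se_outcome))
    den = sum(g * g / s**2 for g, s in zip(gamma_protein, se_outcome))
    return num / den, (1.0 / den) ** 0.5


# Hypothetical data: the first three pQTLs imply an effect of 0.5,
# while the fourth is pleiotropic and implies 1.5.
gamma_protein = [0.20, 0.25, 0.30, 0.25]
gamma_outcome = [0.100, 0.125, 0.150, 0.375]
se_outcome = [0.02, 0.02, 0.02, 0.02]

beta_all, se_all = ivw_estimate(gamma_protein, gamma_outcome, se_outcome)
beta_valid, se_valid = ivw_estimate(gamma_protein[:3], gamma_outcome[:3], se_outcome[:3])

print(round(beta_all, 3), round(beta_valid, 3))
```

Including the single pleiotropic pQTL pulls the IVW estimate well away from 0.5, whereas restricting the analysis to the consistent instruments recovers it. Selecting valid instruments before estimation is aimed at exactly this problem.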
Applying MR-SPI to United Kingdom (UK) Biobank proteomics data and AD GWAS data identified seven significant protein biomarkers (Cluster of Differentiation (CD)33, CD55, Erythropoietin-Producing Hepatocellular Carcinoma Receptor A1 (EPHA1), Paired Immunoglobulin-Like Type 2 Receptor Alpha (PILRA), Paired Immunoglobulin-Like Type 2 Receptor Beta (PILRB), Rearranged during Transfection (RET), and Triggering Receptor Expressed on Myeloid Cells 2 (TREM2)).
Structural alterations in these proteins, predicted by AlphaFold3, revealed changes due to missense mutations in associated pQTLs. For instance, CD33 was found to undergo structural changes that may influence microglial function and amyloid plaque accumulation, highlighting its potential role in AD pathology. This finding underscores the method’s potential to link genetic variations to disease mechanisms.
Gene Ontology (GO) analysis linked these proteins to critical biological processes, including phosphorus metabolism and immune regulation. Notably, some of the identified proteins, such as CD33 and RET, are already targeted by FDA-approved drugs, suggesting potential for drug repurposing in AD treatment.
Conclusions
This study introduces a novel pipeline integrating MR-SPI and AlphaFold3 to identify causal protein biomarkers and predict 3D structural changes induced by missense pQTLs.
MR-SPI employs a voting-based approach under the plurality rule condition to select valid pQTLs and constructs confidence intervals that are robust to finite-sample IV selection errors. Applied to 912 plasma proteins, MR-SPI identified seven proteins linked to AD, with structural insights provided by AlphaFold3.
The findings also open avenues for drug development, including the possible repurposing of FDA-approved drugs that target the identified proteins, such as gemtuzumab ozogamicin (CD33) and RET inhibitors like pralsetinib, as candidate AD treatments.
Journal reference:
- Yao, M., Miller, G. W., Vardarajan, B. N., Baccarelli, A. A., Guo, Z., & Liu, Z. (2024). Deciphering proteins in Alzheimer’s disease: A new Mendelian randomization method integrated with AlphaFold3 for 3D structure prediction. Cell Genomics, 100700. DOI: 10.1016/j.xgen.2024.100700, https://www.sciencedirect.com/science/article/pii/S2666979X2400329X