In a recent study published in the journal Nature Medicine, researchers developed a proteomic age clock using plasma proteins to predict biological age and the associated health risks. They found that this clock accurately predicts age and is linked to the risk of major chronic diseases, multimorbidity, and mortality across diverse populations.
Study: Proteomic aging clock predicts mortality and risk of common age-related diseases in diverse populations.
Background
Aging is a key factor in the onset of chronic diseases such as heart disease, stroke, diabetes, and cancer, though the timing and severity vary across individuals. Biological aging influences the risk of chronic diseases, disability, and healthcare demands. Chronological age is often used as a proxy for biological aging, but it may not be an accurate surrogate measure. More accurate estimates can be obtained from 'omics data, which reflect an individual's biological functioning. Deoxyribonucleic acid methylation (DNAm) clocks have previously been used to measure biological age, but protein levels may offer more direct insights into aging mechanisms. Prior studies have developed proteomic age clocks to predict disease risk and mortality, but none have done so in large, diverse populations. The researchers in the present study addressed this gap by developing and validating a proteomic age clock across different populations and assessing its ability to predict chronic disease risk, mortality, and aging-related traits. This makes it the first study to validate a proteomic age clock across large and diverse populations, offering a robust tool for predicting age-related diseases and mortality.
About the study
In the present study, data were obtained from three large biobank cohorts: the United Kingdom Biobank (UKB), the China Kadoorie Biobank (CKB), and FinnGen. The researchers developed and validated a proteomic age clock using plasma protein measurements from the Olink Explore 3072 platform. The clock predicts a person's biological age from the levels of specific proteins, and this predicted age may differ from the person's chronological age. The difference between the two, termed "ProtAgeGap," was analyzed to explore its relationship with aging, frailty, and disease.
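As a minimal sketch of the ProtAgeGap definition above, the snippet below subtracts chronological age from protein-predicted age for a few made-up participants. The function name `prot_age_gap` and all ages are illustrative assumptions, not values from the study.

```python
def prot_age_gap(predicted_age: float, chronological_age: float) -> float:
    """Return protein-predicted age minus chronological age, in years."""
    return predicted_age - chronological_age


# Hypothetical participants: (chronological age, protein-predicted age).
participants = [(50.0, 54.5), (62.0, 59.0), (45.0, 45.0)]

for age, predicted in participants:
    gap = prot_age_gap(predicted, age)
    if gap > 0:
        label = "predicted older than chronological age"
    elif gap < 0:
        label = "predicted younger than chronological age"
    else:
        label = "no gap"
    print(f"age {age:.0f}: ProtAgeGap = {gap:+.1f} years ({label})")
```

A positive gap means the protein profile looks older than the person's calendar age; a negative gap means it looks younger.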
A total of 45,441 participants from UKB (age 39–71 years, 54% women), 3,977 from CKB (age 30–78 years, 54% women), and 1,990 from FinnGen (age 19–78 years, 52% women) were included. Proteomic data were processed and normalized across cohorts, and 2,897 proteins were selected for analysis after quality control. A gradient-boosting model (LightGBM) was used because it outperformed other machine-learning models in predicting chronological age. The Boruta feature selection algorithm identified 204 proteins relevant for predicting age, which were used in the refined model. Recursive feature elimination identified the 20 most important proteins, forming a minimal predictive model (ProtAge20) that maintained high accuracy. The model was trained and validated using fivefold cross-validation in the UKB and applied to the CKB and FinnGen cohorts to calculate the ProtAgeGap. Statistical analysis involved linear or logistic regression, Cox proportional hazards models, functional enrichment analysis, Shapley additive explanations (SHAP) interaction analysis, Kaplan-Meier survival analysis, and protein-protein interaction (PPI) network visualization.
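The fivefold cross-validation step can be sketched as follows: each participant's age is predicted by a model fitted only on the other four folds, so the resulting gap is not flattered by the model having already seen that participant. This is a sketch under loud assumptions. The data are simulated, there is a single made-up "protein," and a one-variable least-squares line stands in for the study's LightGBM model. The helper names `fit_line` and `out_of_fold_predictions` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def out_of_fold_predictions(xs, ys, n_folds=5, seed=0):
    """Predict each sample with a model trained on the other folds only."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Every index lands in exactly one fold.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    predictions = [None] * len(xs)
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in indices if i not in held_out_set]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in held_out:
            predictions[i] = slope * xs[i] + intercept
    return predictions


# Simulated cohort (NOT study data): one hypothetical protein level that
# rises with age, plus measurement noise.
rng = random.Random(42)
ages = [rng.uniform(39, 71) for _ in range(200)]
protein = [0.1 * age + rng.gauss(0, 0.5) for age in ages]

predicted_ages = out_of_fold_predictions(protein, ages)
gaps = [pred - age for pred, age in zip(predicted_ages, ages)]
mean_abs_error = sum(abs(g) for g in gaps) / len(gaps)
print(f"mean absolute error: {mean_abs_error:.2f} years")
```

The stand-in model is deliberately simple. The point is the out-of-fold bookkeeping, which stays the same whichever regressor is plugged in.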
Figure (study design): a, UKB participants were split into training and test sets at a 70:30 ratio. In the training set, a LightGBM model was trained to predict chronological age using 2,897 plasma proteins and fivefold cross-validation. We identified 204 proteins relevant for predicting chronological age using the Boruta feature selection algorithm and retrained a refined LightGBM model using these 204 proteins, which was then evaluated in the UKB test set. b, Independent data from the CKB and FinnGen were used for further independent validation of the proteomic age clock model. c, Protein-predicted age (ProtAge) was calculated in the full UKB sample using fivefold cross-validation and LightGBM. ProtAgeGap was calculated as the difference between ProtAge and chronological age. We used linear and logistic regression to test associations between ProtAgeGap and a comprehensive panel of biological aging markers and measures of frailty and physical/cognitive status. Further, we used Cox proportional hazards models to test associations between ProtAgeGap and mortality, 14 common diseases and 12 cancers. Most association analyses were carried out only in the UKB, due to the smaller sample size in the CKB and the lack of disease cases in FinnGen. Figure created with BioRender.com.
Results and discussion
During the follow-up period of 11–16 years, deaths occurred in 10.6%, 36%, and 1% of participants in the CKB, UKB, and FinnGen cohorts, respectively. A total of 204 aging-related proteins were identified, and the associations between age and these proteins were stable over time.
ProtAgeGap correlated with biological aging markers and clinical outcomes. It was a strong predictor of multimorbidity, all-cause mortality (hazard ratio [HR] = 1.15 per additional year of ProtAgeGap), and 14 non-cancer diseases, including Alzheimer's disease (HR = 1.11), chronic kidney disease (HR = 1.14), and type 2 diabetes (HR = 1.13). ProtAgeGap was also associated with cancer risk, including breast cancer (HR = 1.12), lung cancer (HR = 1.09), and prostate cancer (HR = 1.08). In addition, it was associated with various biological aging markers (e.g., telomere length, insulin-like growth factor-1) and with measures of cognitive and physical function. Sensitivity analyses, including analyses in non-smokers and normal-weight individuals, confirmed these associations.
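To make the per-year hazard ratios concrete: in a proportional hazards model where the log hazard is assumed to be linear in ProtAgeGap, a per-year HR compounds multiplicatively across years. The sketch below applies that assumption to the all-cause mortality HR of 1.15 quoted above. The function name and the chosen gaps are illustrative, and the derived values are arithmetic consequences of the assumption, not figures reported by the study.

```python
def hazard_ratio_for_gap(hr_per_year: float, gap_years: float) -> float:
    """Scale a per-year hazard ratio to a gap of `gap_years`.

    Assumes the log hazard is linear in the gap, as in a Cox model
    with the gap entered as a single continuous covariate.
    """
    return hr_per_year ** gap_years


HR_MORTALITY_PER_YEAR = 1.15  # all-cause mortality HR quoted in the article

for gap in (1, 3, 5, -3):
    hr = hazard_ratio_for_gap(HR_MORTALITY_PER_YEAR, gap)
    print(f"ProtAgeGap {gap:+d} years -> HR {hr:.2f}")
```

Under this assumption, a five-year gap corresponds to roughly a doubling of the hazard, since 1.15 raised to the fifth power is about 2.01.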
According to the study, the proteomic age clock is driven mainly by proteins involved in diverse biological functions, such as extracellular matrix interactions, immune response and inflammation, hormone regulation, reproduction, neuronal development, and differentiation. The proteomic clock showed limited overlap with DNAm clocks, highlighting new aging-related proteins and providing additional insights into aging biomarkers. A strength of the study is its use of gradient-boosting models, which allow for nonlinear associations and interactions between proteins and provide better generalizability than other models. However, the study relied solely on the Olink Explore 3072 platform, which limits protein coverage, and it lacked DNAm data for direct comparison with DNAm age clocks.
Conclusion
In conclusion, the proteomic age clock developed in this study provides a robust predictor of biological aging that can offer insights into the mechanisms of age-related diseases, frailty, and mortality. The study suggests that plasma proteomics is a reliable method for measuring biological age and could help guide the identification of drug targets, novel interventions, or lifestyle changes that may reduce premature mortality and delay the onset of major age-related health conditions.