In a recent study published in Nature Medicine, researchers systematically investigated the genetic determinants of more than 900 metabolites in nearly 20,000 women and men.
Metabolites circulating in the human body reflect human physiology and an individual's chemical uniqueness. Human metabolism is dysregulated in several diseases and is affected by multiple dietary, genetic, drug-associated, and disease-associated factors. A wide range of high-throughput biomedical technologies now enables assessment of the genetic factors affecting human physiology; however, data on the coregulation of different metabolites are limited.
About the study
In the present study, researchers investigated genetic determinants of variations in human physiology using untargeted metabolomic data.
The team analyzed the genetic architecture of 913 metabolites among >14,000 individuals. The data were used to define genetically influenced metabotypes (GIMs), that is, groups of metabolites influenced by at least one shared genetic signal. Samples from two United Kingdom (UK)-based cohort studies, INTERVAL and EPIC-Norfolk, were analyzed. The metabolites were measured by liquid chromatography and mass spectrometry and classified as related to lipid, amino acid, xenobiotic, nucleotide, peptide, carbohydrate, cofactor and vitamin, or energy metabolism.
Compounds with undetermined chemical identities were referred to as unannotated compounds. Multivariable linear regression modeling was performed for the analysis. Metabolomic measurements were made between 2015 and 2017 for the EPIC-Norfolk samples. Metabolite levels were assessed in two sets of about 6,000 samples each. The team validated regional sentinel variant–metabolite associations by meta-analyzing the discovery set and validation set data.
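The meta-analysis step can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance weighted model, which is a common choice for combining a discovery and a validation estimate but is not confirmed by the article as the study's method. The effect sizes and standard errors below are invented for illustration only.

```python
import math


def inverse_variance_meta(estimates):
    """Combine (beta, standard_error) pairs with fixed-effect
    inverse-variance weights. Returns (pooled_beta, pooled_se, z)."""
    # Each estimate is weighted by the inverse of its variance.
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical variant-metabolite association estimates (not study data).
discovery = (0.20, 0.05)   # (beta, standard error) in the discovery set
validation = (0.10, 0.05)  # (beta, standard error) in the validation set

pooled_beta, pooled_se, z_score = inverse_variance_meta([discovery, validation])
print(f"pooled beta={pooled_beta:.3f}, se={pooled_se:.4f}, z={z_score:.2f}")
```

Because the two hypothetical sets have equal standard errors, the pooled estimate is simply their average, and the pooled standard error shrinks by a factor of the square root of two.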
Among EPIC-Norfolk study participants, 5,698 and 5,841 individuals were categorized in the validation and discovery sets, respectively. Genotyping and imputation were performed, and the team imputed genetically predicted metabolite levels (‘metabolite scores’) in UK Biobank participants using weighted genetic scores and estimated their associations with 1,457 collated disease terms (‘phecodes’). A genome-wide association study (GWAS) was performed separately for each metabolite. Further, conditional analysis, colocalization analysis, and enrichment analysis for genes that cause inborn errors of metabolism (IEM) were performed.
Allelic heterogeneity was assessed, and the genetic co-regulation of different metabolites was evaluated. The team also performed phenotypic analysis for metabolite-associated genetic variants, and phenome-wide metabolic associations were determined. The results were technically validated using whole-exome sequencing (WES) data from 3,924 INTERVAL study samples.
The most likely causal genes were determined, and the novelty of each variant association was assessed by comparing the findings with those of two previously conducted studies. Based on the genetic associations identified and manually curated scientific literature, high-confidence causal genes regulating the metabolites were defined, and their clinical relevance was assessed across >1,400 phenotypes.
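A weighted genetic score of the kind mentioned above is, in its simplest form, a sum of effect-allele dosages multiplied by per-variant weights. The sketch below shows that arithmetic only. The variant names, weights, and dosages are hypothetical placeholders, and the study's actual scoring pipeline may differ.

```python
def metabolite_score(dosages, weights):
    """Weighted genetic score for one individual.

    dosages: variant id -> effect-allele dosage (0 to 2)
    weights: variant id -> per-allele effect on the metabolite level
    Variants without a weight are ignored.
    """
    return sum(weights[variant] * dosage
               for variant, dosage in dosages.items()
               if variant in weights)


# Hypothetical per-allele weights for one metabolite (not study data).
weights = {"variant_a": 0.30, "variant_b": -0.10, "variant_c": 0.05}

# Hypothetical genotype dosages for one participant.
participant = {"variant_a": 2, "variant_b": 1, "variant_c": 0, "variant_d": 1}

score = metabolite_score(participant, weights)
print(f"genetically predicted metabolite score: {score:.2f}")
```

Scores computed this way for every participant could then be tested against disease terms, for example with a regression model per phecode.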
Results
Convergence of phenotypic and metabolic presentations of rare IEM-causing genes was observed with genetic variants of the genes identified in the general population. In total, 423 GIMs were identified, comprising up to 15 genetic variants and up to 89 metabolites. For 62% (n=264) of the GIMs, one of 253 likely causal genes was assigned based on extensive data mining. GIMs such as steroid 5α-reductase 2 (SRD5A2) and dihydropyrimidine dehydrogenase (DPYD) showed important clinical implications.
Higher SRD5A2 activity was associated with greater male-pattern baldness risks. Genetic associations were consistent with lower SRD5A2 activity and lower levels of androsterone, epiandrosterone, 3α-androstanediol, and 3β-androstanediol conjugates. Shared genetic signals were observed between various androgen metabolites and male-pattern baldness, with rs112881196 as the causal variant. The fatty acid desaturase 1/2 (FADS1/FADS2) locus was associated with the most annotated metabolites.
The mean phenotypic variance explained by conditionally independent variants was 5.2%, and it was highest for the amino acid and energy classes. Lower SRD5A inhibitor levels were associated with greater depression risks, with rs62142080 as the likely causal variant. The rs72977723 variant was involved in uracil breakdown, whereas rs184097503 and rs28933981 increased thyroxine transport abilities. GIMs capturing multiple gene functions, such as those of SLC7A2 (solute carrier family 7 member 2) transporters associated with arginine or lysine levels, were observed.
An 8.0-fold enrichment of IEM-causing genes was observed, with IEM variants mapped to genes causing disorders related to mitochondria, amino acids, and fatty acids. Lower vanillylmandelate levels were associated with lower hypertension risks, with rs6271 as the causal variant. Causal genes were also identified for coronary artery disease [PCSK9 (proprotein convertase subtilisin/kexin type 9), SORT1 (sortilin 1) and LDLR (low-density lipoprotein receptor)], macular degeneration [LIPC (hepatic lipase) and apolipoprotein E (APOE)/apolipoprotein C (APOC) 1,2,4], Crohn’s disease [GCKR (glucokinase regulator) and FADS2] and chronic kidney disease [GATM (glycine amidinotransferase)].
Associations between metabolites and illnesses, such as urate levels with gout [odds ratio (OR) of 2.2], bile acids with cholelithiasis (OR of 0.6 for glycohyocholate), and complex lipids with hypercholesterolemia [OR of 1.8 for 1-dihomo-linoleoyl-GPC (20:2)], were observed. Plasma homoarginine was found to have a key role in chronic kidney disease pathology, and 3-methylglutarylcarnitine protected against the development of benign neoplasms in the colon.
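For readers unfamiliar with fold enrichment, the sketch below shows one simple way such a figure can be computed: the share of IEM-causing genes among the assigned genes divided by their share in a background gene set. This definition is an assumption, since the article does not state how the study calculated its estimate, and all counts are toy values rather than study data.

```python
def fold_enrichment(hits_in_set, set_size, hits_in_background, background_size):
    """Ratio of the hit fraction in a gene set to the hit fraction
    in the background gene set."""
    observed_fraction = hits_in_set / set_size
    expected_fraction = hits_in_background / background_size
    return observed_fraction / expected_fraction


# Toy counts (hypothetical): 30 of 200 assigned genes are IEM genes,
# versus 500 IEM genes among 20,000 background genes.
enrichment = fold_enrichment(30, 200, 500, 20_000)
print(f"fold enrichment: {enrichment:.1f}")
```

A value above 1 indicates that IEM-causing genes appear among the assigned genes more often than expected from the background rate.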
Overall, the study findings highlighted the genetic determinants of human metabolite variations and could guide future metabolome-wide association assessments.