In a recent study published in the journal Science Advances, researchers in the United States used 3D transport-based morphometry (TBM) to identify and visualize brain changes linked to 16p11.2 genetic copy number variation (CNV), enhancing prediction accuracy and advancing precision medicine in autism.
Study: Discovering the gene-brain-behavior link in autism via generative machine learning.
Background
Autism, characterized by social, communication, and behavioral impairments, is influenced by genetic and environmental factors, with heritability estimates up to 90%. Despite this, diagnosis is mainly behavioral, and genetic testing is infrequent. Over 200 autism-linked CNVs have been identified, among them deletions and duplications of the 16p11.2 region. Endophenotypes, measurable biological traits that sit between genes and behavior, can bridge the two. Emerging machine learning techniques, such as 3D TBM, have the potential to uncover gene-brain-behavior relationships, advancing precision medicine. Further research is essential to enhance understanding and develop better diagnostic and treatment approaches.
About the study
In the present study, subjects were recruited from the Simons VIP project. The Johns Hopkins Institutional Review Board reviewed the study and deemed it exempt because the data came from a preexisting database of deidentified subjects. Participants were referred by clinical genetic centers, testing laboratories, web-based networks, and self-referral. Screening and medical record reviews were conducted by Geisinger and Emory University, with 16p11.2 CNV tested via fluorescent in situ hybridization. Inclusion required recurrent breakpoints of 16p11.2 without other pathogenic CNVs or unrelated syndromes. Exclusion criteria were environmental neurocognitive impacts, severe birth asphyxia, prematurity, and lack of fluency in English.
Behavioral testing involved the Autism Diagnostic Observation Schedule, Autism Diagnostic Interview, and Social Responsiveness Scale. Core phenotyping sites included the University of Washington Medical Center, Baylor University Medical Center, and Boston Children's Hospital, using the Diagnostic and Statistical Manual of Mental Disorders, fourth edition, text revision (DSM-IV-TR) criteria. Cognitive measures assessed full-scale Intelligence Quotient (IQ) with standardized tests. High-resolution brain imaging was performed at the University of California and Children's Hospital of Philadelphia.
Controls were recruited locally near imaging sites, matched for age, sex, handedness, and nonverbal IQ, excluding major DSM-IV diagnoses, Autism Spectrum Disorder (ASD) family history, other developmental disorders, dysmorphic features, or genetic abnormalities. The study cohort included brain images from 206 individuals: controls (N = 118), deletion (N = 48), and duplication (N = 40).
T1-weighted magnetization-prepared rapid gradient-echo (MPRAGE) images were collected using standardized protocols. Preprocessing involved excluding non-brain tissues, segmenting gray and white matter, and normalizing brain size. The 3D TBM technique, based on optimal mass transport, transformed images to identify and visualize tissue patterns linked to 16p11.2 CNV, combined with machine learning for automated discovery and visualization.
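To give a feel for the optimal mass transport idea behind TBM, here is a minimal one-dimensional sketch. It is not the study's 3D method: it assumes two equal-sized sets of hypothetical positions and relies on the fact that, in 1D with a squared-distance cost, pairing sorted source points with sorted target points is the optimal transport plan. The function name and the data are illustrative only.

```python
def transport_map_1d(source, target):
    """Pair sorted source samples with sorted target samples.

    For equal-sized 1D samples and a squared-distance cost, this
    monotone pairing is the optimal transport plan.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    src = sorted(source)
    tgt = sorted(target)
    # How far each unit of mass moves from the reference to the subject.
    displacement = [t - s for s, t in zip(src, tgt)]
    # Mean squared displacement, i.e. the transport cost per unit of mass.
    cost = sum(d * d for d in displacement) / len(displacement)
    return src, displacement, cost


# Hypothetical positions, not data from the study.
reference = [0.0, 1.0, 2.0, 3.0]
subject = [3.5, 0.5, 2.5, 1.5]
positions, displacement, cost = transport_map_1d(reference, subject)
print(displacement, cost)
```

The displacement values play the role of the transport map: they describe how mass in the reference would have to shift to match the subject, which is the kind of representation the later analyses operate on.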
Study results
Duplication and deletion carriers exhibited a range of diagnoses, often multiple per individual. Analysis of variance (ANOVA) revealed significant differences in brain tissue volume among the groups, but volume alone was insufficient for cohort distinction. Deletion carriers were generally younger, likely due to earlier medical attention. Despite efforts to age-match cohorts, this difference persisted.
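For readers unfamiliar with ANOVA, the sketch below computes a one-way F statistic from scratch. The three lists are made-up numbers standing in for per-group tissue volumes, and the function name is hypothetical; the study's actual values and software are not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical volumes (arbitrary units), not data from the study.
deletion = [5.0, 6.0, 7.0]
control = [2.0, 3.0, 4.0]
duplication = [1.0, 2.0, 3.0]
f_stat = one_way_anova_f([deletion, control, duplication])
print(f_stat)
```

A large F value indicates that group means differ more than within-group scatter would suggest, but, as the study found for volume, a significant group difference does not by itself mean individuals can be reliably assigned to a group.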
Age and gender did not accurately predict 16p11.2 CNV status, nor did adding brain parenchymal volume significantly improve classification accuracy.
The study utilized T1-weighted MPRAGE images (n = 206) from the Simons VIP dataset. Images were coregistered and segmented into gray and white matter tissues using Statistical Parametric Mapping software. After normalizing tissue mass, TBM transformed each image into the transport domain relative to a reference image, generating transport maps that were analyzed.
TBM enabled efficient data representation, capturing 96% of white matter variance with 132 components and 96% of gray matter variance with 46 components, compared to 184 and 182 components, respectively, in the image domain.
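The component counts above answer the question "how many components are needed to reach a given share of the variance?" A minimal sketch of that calculation follows, assuming the per-component variances (for example, eigenvalues from a principal component analysis) are already available. The function name and the numbers are illustrative, not taken from the study.

```python
def components_needed(variances, target=0.96):
    """Smallest number of components whose cumulative variance share
    reaches the target fraction."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, variance in enumerate(ordered, start=1):
        running += variance
        if running / total >= target:
            return count
    return len(ordered)


# Hypothetical per-component variances, not values from the study.
toy_variances = [50.0, 30.0, 15.0, 4.0, 1.0]
n_components = components_needed(toy_variances, target=0.96)
print(n_components)
```

A representation that reaches the same target with fewer components, as the transport domain did here, is describing the data more compactly.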
Canonical correlation analysis revealed a significant relationship between gray and white matter distribution (Pearson correlation coefficient = 0.56, P < 0.01), justifying separate analyses. After adjusting for covariates, no significant correlation was found between brain parenchymal volume and tissue distribution for gray or white matter.
Genetic cohorts were highly separable in the transport domain using penalized linear discriminant analysis (pLDA) for both white and gray matter. Separation was greater for white matter distribution, with direction 1 showing a dose-dependent influence of 16p11.2 CNV on brain structure. In 10-fold cross-validation, test-set classification accuracy was 94.6% for white matter and 88.5% for gray matter.
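The 10-fold cross-validation procedure can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: a simple nearest-centroid classifier stands in for pLDA, the folds are shuffled but not stratified, and the features are synthetic, well-separated points invented for the example. All function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (not pLDA): mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_val_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, pool the results."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Synthetic two-feature data: three clearly separated groups of 10.
centers = {"deletion": -5.0, "control": 0.0, "duplication": 5.0}
X, y = [], []
for label, center in centers.items():
    for i in range(10):
        X.append([center + (i % 5 - 2) * 0.1, (i % 3 - 1) * 0.1])
        y.append(label)

accuracy = cross_val_accuracy(X, y, k=10)
print(accuracy)
```

The key point is that every sample is classified by a model that never saw it during training, so the pooled accuracy estimates performance on new individuals.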
3D TBM allowed direct visualization of brain endophenotypes driving CNV classification. Visualizations showed that 16p11.2 CNV impacts brain regions diffusely rather than locally, with characteristic tissue shifts highlighted by inverse TBM transformation. These shifts showed a reciprocal pattern of tissue expansion/contraction among deletion and duplication carriers.
Significant associations were found between TBM scores and articulation disorders, with direction 1 scores being highly sensitive and specific for detecting these disorders among deletion carriers. TBM scores showed a strong relationship with IQ, highlighting TBM's potential in linking brain endophenotypes with behavioral outcomes. This technique advances the understanding of gene-brain-behavior relationships and supports the development of targeted therapies.
Conclusions
To summarize, this research reveals new details regarding brain structural patterns linked to genetic CNV in autism. These patterns can accurately predict CNV from brain images alone in new individuals. Furthermore, the discovered patterns are sensitive to articulation disorders and explain some IQ variability. The results were enabled by 3D TBM, a generative machine learning approach that directly probes biological mechanisms affecting brain mass distribution. By revealing structural networks underpinning CNV-related endophenotypes, this research advances our understanding of autism's biological basis.