Genome-wide association studies are used to investigate the genetics of multifactorial diseases. Identifying the causes of such diseases can be challenging, as the contribution of each genetic variant is often very small. Pathway analysis is used to clarify the combined impact of these variants, increasing the physiological relevance and significance of the study.
How is pathway analysis carried out?
One technique used to increase the significance of a genome-wide association study is to map heritability, which can eliminate variants that are unlikely to contribute to the condition of interest. However, this is not always possible when the condition involves multiple genes whose individual effects are too small to reach significance in single-variant statistical analyses.
Pathway analysis complements single-variant analysis. By bringing together weaker, related signals from single variants, it can improve the final statistics, provided the variants are all related to the phenotype. This is useful for pilot studies, which typically use smaller sample sizes, because it allows investigators to prioritize specific variants for follow-up analysis. In addition, these studies allow the discovery of new sets of variants with related functions, which helps to explain the final data.
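As a rough illustration of how weak signals can add up, the sketch below combines several individually unremarkable p-values using Fisher's method. This is only one of several possible combination approaches and is not necessarily the one used by any study discussed here. It assumes the variants are statistically independent, which linked variants in real data often are not. The function name and the example p-values are invented for illustration.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom, where k is the number of tests.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For an even number of degrees of freedom (2k), the chi-square upper
    # tail has a closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values for five variants assigned to the same pathway.
# None would survive a strict genome-wide threshold on its own.
variant_p_values = [0.04, 0.06, 0.05, 0.03, 0.07]
combined = fisher_combined_p(variant_p_values)
print(f"smallest single p-value: {min(variant_p_values)}")
print(f"combined p-value: {combined:.2e}")
```

The combined p-value here is smaller than any of the individual ones, which is the sense in which grouping related variants can strengthen the final statistics.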
This analytical method combines the signals of a selection of genetic variants, but what is the biological significance of this technique? Biomedical research often focuses on understanding the molecular mechanisms underlying a selected disease or phenotype, and on discovering new drugs to treat these diseases. To reach these goals, the body's external changes and the inherited genetic background have to be considered collectively.
Previously, these experiments were evaluated in a reductionist manner, in which only one level of raw data was examined at a time, owing to the lack of analysis tools. For genome-wide association studies, a set of variants can be obtained by extracting the genetic variants that pass a pre-determined p-value threshold in association tests.
Nonetheless, the biological meaning of these genetic variants cannot be inferred from p-values alone. Pathway analysis can serve as an intermediary, filling this gap by inferring the relationships between the set of genes represented by these major variants, as well as the strength of those relationships. The final results of the association tests can then be interpreted more easily.
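One common way to move from a list of genes to a pathway-level result is an over-representation test, which asks whether a pathway contains more of the selected genes than chance alone would predict. The following is a minimal sketch using a hypergeometric test. The gene names, pathway contents and background size are hypothetical, and real tools apply further refinements.

```python
from math import comb


def overrepresentation_p(background_size, pathway_genes, selected_genes):
    """Upper-tail hypergeometric p-value for a pathway/gene-list overlap.

    Returns the probability of seeing at least the observed overlap when
    len(selected_genes) genes are drawn at random from the background.
    Assumes every gene in both sets belongs to the background.
    """
    pathway_genes = set(pathway_genes)
    selected_genes = set(selected_genes)
    n_pathway = len(pathway_genes)
    n_selected = len(selected_genes)
    if max(n_pathway, n_selected) > background_size:
        raise ValueError("gene sets cannot be larger than the background")

    overlap = len(pathway_genes & selected_genes)
    total_draws = comb(background_size, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(background_size - n_pathway, n_selected - k)
        for k in range(overlap, min(n_pathway, n_selected) + 1)
    )
    return tail / total_draws


# Hypothetical example: genes tagged by variants that passed a p-value
# threshold, compared against one made-up pathway.
selected = {"GENE_A", "GENE_B", "GENE_C", "GENE_X"}
pathway = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
p_value = overrepresentation_p(20000, pathway, selected)
print(f"over-representation p-value: {p_value:.2e}")
```

A small p-value suggests that the selected genes cluster in the pathway more than expected by chance, giving a biological reading that single-variant p-values cannot provide on their own.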
Pathway analysis examples
Genome-wide association studies have been used extensively to pinpoint common genetic variants associated with type 2 diabetes. At present, known variants explain less than 20% of the estimated overall genetic contribution to type 2 diabetes. Pathway analysis has been applied to type 2 diabetes genome-wide association study datasets to explore the potential biological mechanisms, and has identified some novel type 2 diabetes risk pathways. However, very few such pathways were revealed in those earlier studies.
In 2017, a pathway analysis study used the summary results from a much larger meta-analysis of type 2 diabetes genome-wide association studies to explore more genetic signals in type 2 diabetes patients. PLINK and VEGAS tests were selected to run the analysis, and WebGestalt was used to complete the pathway analysis.
A total of 8 shared KEGG (Kyoto Encyclopedia of Genes and Genomes, a pathway database) pathways were identified after correction for multiple testing in both of the methods used. This study highlighted a selection of new type 2 diabetes risk pathways, and the results could be used to learn more about the disease and how to treat or cure it.
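The idea of keeping only pathways that remain significant after correction in two separate methods can be sketched as follows. The source does not state which correction procedure was applied, so the Benjamini-Hochberg adjustment used here is an assumption, and the pathway names, p-values and the 0.05 cut-off are invented for illustration.

```python
def benjamini_hochberg(p_by_pathway):
    """Return Benjamini-Hochberg adjusted p-values keyed by pathway name."""
    ranked = sorted(p_by_pathway.items(), key=lambda item: item[1])
    m = len(ranked)
    adjusted = {}
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values never
    # decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        name, p = ranked[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted[name] = running_min
    return adjusted


def shared_significant(p_method_a, p_method_b, alpha=0.05):
    """Pathways whose adjusted p-value is below alpha in both methods."""
    adj_a = benjamini_hochberg(p_method_a)
    adj_b = benjamini_hochberg(p_method_b)
    return {
        name
        for name in adj_a.keys() & adj_b.keys()
        if adj_a[name] < alpha and adj_b[name] < alpha
    }


# Hypothetical raw pathway p-values from two different gene-based methods.
method_a = {"pathway_1": 0.001, "pathway_2": 0.02, "pathway_3": 0.6}
method_b = {"pathway_1": 0.004, "pathway_2": 0.3, "pathway_3": 0.01}
shared = shared_significant(method_a, method_b)
print(sorted(shared))
```

Requiring agreement between two methods is a conservative choice: a pathway flagged by only one of them is dropped, which reduces the chance that a reported pathway is an artifact of a single test.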
Fatty acid metabolism is thought to play a significant role in the initial stages of lung cancer, and this was recently explored in a study using pathway-based analysis. The study used a meta-analysis of previously published datasets from six genome-wide association studies, taken from the Transdisciplinary Research in Cancer of the Lung consortium, which included 12,160 patients with confirmed lung cancer and 16,838 cancer-free controls.
This study examined a total of 30,722 single-nucleotide polymorphisms from 317 genes, all of which were relevant to fatty acid metabolic pathways. The final results suggested that a potentially functional single-nucleotide polymorphism in the CYP4F3 gene might contribute to the development of lung cancer, particularly in patients who smoke.
Sources
- Kao Y.P. et al. (2017). Pathway analysis of complex diseases for GWAS, extending to consider rare variants, multi-omics and interactions. doi.org/10.1016/j.bbagen.2016.11.030
- Yang L. et al. (2017). A pathway analysis of genome-wide association study highlights novel type 2 diabetes risk pathways. doi.org/10.1038/s41598-017-12873-8
- Yin J. et al. (2017). Pathway-analysis of published genome-wide association studies of lung cancer: A potential role for the CYP4F3 locus. doi.org/10.1002/mc.22622