By linking genetic markers or SNPs to IMPs, functional consequences of genetic perturbations and mechanistic insights can be inferred. When cQTL and eQTL overlap, that is, a genetic locus or multiple genetic loci are in association with both disease phenotype and the expression levels of a gene, there is a higher probability that the gene associated with the disease-linked genetic loci is also the causal gene.
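The overlap principle described above can be sketched in a few lines. This is a minimal illustration only: the SNP and gene identifiers are hypothetical, and the function name `genes_supported_by_both` is an assumption introduced here, not part of any published tool.

```python
from collections import defaultdict

def genes_supported_by_both(cqtl_snps, eqtl_pairs):
    """Return {gene: sorted SNPs} for genes whose eSNPs are also disease-associated.

    cqtl_snps:  iterable of SNP ids associated with the clinical trait (cQTL).
    eqtl_pairs: iterable of (snp_id, gene) pairs, meaning the SNP is associated
                with the expression level of that gene (eQTL).
    """
    cqtl = set(cqtl_snps)
    hits = defaultdict(set)
    for snp, gene in eqtl_pairs:
        if snp in cqtl:  # the same locus is linked to both trait and expression
            hits[gene].add(snp)
    return {gene: sorted(snps) for gene, snps in hits.items()}

# Hypothetical identifiers, for illustration only.
cqtl_snps = ["snp_a", "snp_b", "snp_c"]
eqtl_pairs = [
    ("snp_a", "GENE1"),
    ("snp_b", "GENE1"),
    ("snp_d", "GENE2"),  # eSNP without a disease association: not prioritized
    ("snp_c", "GENE3"),
]
candidates = genes_supported_by_both(cqtl_snps, eqtl_pairs)
print(candidates)
```

In practice the match would usually be made on loci (for example, SNPs in linkage disequilibrium) rather than on identical SNP ids; exact matching is used here only to keep the sketch short.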
Based on this principle, Meng et al. The identification of SORT1 as the gene behind the chromosome 1p Tested in both transgenic and knockdown mouse models, SORT1 was successfully validated by two groups as the causal gene for LDL and CVD, acting via modulation of hepatic lipoprotein export [ 49 , 50 ]. In cases where the GWAS locus lies within an intergenic region, eSNPs can reflect genetic control of gene expression regardless of the genomic position. As IMPs can be upstream i. The three relationships can be formally tested via statistical and mathematical inference by using DNA or genetic variation information as the anchor Fig.
To this end, a likelihood-based causality model selection (LCMS) procedure was developed by Schadt et al.
In the obesity study, perturbation of eight top candidate causal genes for obesity (Zfp90, Lpl, Tgfbr2, C3ar1, Gpx3, Gas7, Lactb, and Gyk) was found to alter adiposity or fat pad mass in knockout or transgenic mouse models via modulation of genes involved in metabolic pathways and a liver network of genes involved in lipid metabolism [ 51 ]. Knockout of a candidate causal gene, C3ar1, in a mouse model was found to reduce aortic lesions. In addition, the expression levels of most causal genes in the aortic arch changed as lesions progressed in two independent atherosclerosis mouse models.
These validation experiments strongly support the validity and power of LCMS in predicting reliable causal genes for complex MetDs.
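The idea of likelihood-based model selection anchored on a genetic locus can be sketched as follows. This is a simplified illustration, not the published LCMS procedure: it assumes linear Gaussian relationships, and it compares three candidate models for a locus L, an expression trait G, and a clinical trait T, namely causal (L → G → T), reactive (L → T → G), and independent (L → G and L → T). All function names and the simulated data are assumptions made for this example.

```python
import math
import random

def gaussian_loglik(x, y):
    """Maximized log-likelihood of y ~ Normal(a + b*x, s2), a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / n  # maximum-likelihood estimate of the residual variance
    return -0.5 * n * (math.log(2 * math.pi * s2) + 1)

def select_model(locus, expr, trait):
    """Score the three candidate models; each is a product of two conditionals."""
    scores = {
        "causal": gaussian_loglik(locus, expr) + gaussian_loglik(expr, trait),
        "reactive": gaussian_loglik(locus, trait) + gaussian_loglik(trait, expr),
        "independent": gaussian_loglik(locus, expr) + gaussian_loglik(locus, trait),
    }
    return max(scores, key=scores.get), scores

# Simulated data (hypothetical): genotype drives expression, expression drives trait.
rng = random.Random(0)
locus = [rng.choice([0, 1, 2]) for _ in range(500)]
expr = [l + rng.gauss(0, 0.5) for l in locus]
trait = [g + rng.gauss(0, 0.5) for g in expr]

best, scores = select_model(locus, expr, trait)
print(best, {name: round(value, 1) for name, value in scores.items()})
```

In this simplified setup each model has the same number of parameters, so ranking by maximized likelihood is equivalent to ranking by AIC. A fuller treatment would also assess how reliable the chosen model is, for example by resampling.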
All the methodologies outlined above yield lists of molecular markers that are linked to disease development, but they offer little information on how genes and other IMPs are organized and how they operate together in complex biological systems. Networks have emerged as appealing tools to address this complexity; they depict the active agents in the systems as nodes, and their interactions as edges that connect the nodes. Notably, the edges can represent different types of relationships such as correlation, physical binding, biochemical reactions or transcriptional regulation, thereby transcending the boundaries of conventional statistics.
Networks can be constructed based on curated knowledge (knowledge-driven) or on computational modeling of large-scale genomic data (data-driven). Knowledge-driven networks can capture literature-supported relationships, but they are far from comprehensive and cannot cover novel relationships or insights. On the other hand, data-driven network reconstruction or reverse-engineering approaches systematically and objectively scan through and integrate all data points to uncover novel relationships among IMPs within a cell or a tissue, or even across tissues [ 53 ].
Information obtained from correlations, cQTLs, eQTLs, and causality inference discussed above can all be efficiently incorporated and utilized in various network reconstruction approaches Fig. The construction of a co-expression network starts with a Pearson correlation matrix between all gene pairs, followed by transformation of the correlation matrix into an adjacency matrix [ 54 , 55 ].
The adjacency matrix is further transformed into a topological overlap matrix based on the direct interactions between genes as well as the indirect interactions with all the other genes [ 73 ]. An average linkage hierarchical clustering algorithm is then applied to the topological overlap matrix, followed by a dynamic cut-tree algorithm to identify gene modules [ 74 ]. Correlations between the principal components of each module and phenotypic traits measured in the same individuals can be calculated to derive informative modules that link to the disease of interest.
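The first steps of this pipeline (correlation matrix, adjacency matrix, topological overlap matrix) can be sketched in plain Python. The sketch assumes an unsigned soft-threshold adjacency, a_ij = |cor(i, j)|^beta, and the commonly used topological overlap formula; the value of `beta`, the function names, and the toy expression values are assumptions for illustration. Hierarchical clustering and dynamic tree cutting are left out.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(i, j)| ** beta, zero diagonal."""
    genes = list(expr)
    n = len(genes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[genes[i]], expr[genes[j]])) ** beta
            adj[i][j] = adj[j][i] = a
    return genes, adj

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is zero
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom

# Toy expression profiles (hypothetical): five samples per gene.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],  # tracks g1 closely
    "g3": [2.0, 1.0, 3.0, 1.0, 2.0],   # essentially unrelated to g1
    "g4": [5.0, 4.0, 3.0, 2.0, 1.5],   # strongly anti-correlated with g1
}
genes, adj = adjacency(expr)
tom = topological_overlap(adj)
dissimilarity = [[1.0 - t for t in row] for row in tom]  # usual input to clustering
```

Because the adjacency is unsigned, the anti-correlated gene `g4` ends up close to `g1` and `g2`; a signed variant would separate them. The `dissimilarity` matrix (1 minus the topological overlap) is what a clustering step would typically take as input.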
Alternatively, co-expression networks can be constructed separately for disease cases and controls, and network modules that demonstrate differential network topology and connectivity between cases and controls can be identified [ 75 ]. In contrast to simple clustering algorithms where genes are grouped based on the strength of pair-wise correlations, WGCNA searches for higher-level co-regulation structures. Importantly, the gene memberships of a module are determined not only by their direct correlations but also by the similarity in their relationships with the other genes [ 54 , 73 ].
The derived network structure hence comprises more cohesive and biologically more meaningful modules that contain genes with shared regulatory mechanisms, involved in similar biological functions or pathways, or enriched for disease-associated genes [ 44 , 54 , 65 - 69 , 76 - 78 ]. As exemplified in two parallel studies, Chen et al. This module is termed macrophage-enriched metabolic network (MEMN).
Several novel genes in MEMN, including Lactb, Lpl, and Ppm1l, were experimentally confirmed to affect adiposity in knockout and transgenic mouse models [ 65 ].
Genes in the obesity-linked liver module are involved in adipogenesis and fatty acid metabolism whereas genes in the obesity- and T2D-associated brain subnetworks are involved in diverse processes including RNA splicing, circadian rhythm, and lipid metabolism. By constructing co-expression networks in six metabolically related tissues in a mouse population with varying T2D susceptibility, Keller et al. In two studies on a Finnish cohort, Inouye et al. Although WGCNA is highly informative for deriving the overall organization of genes or other IMPs and for linking particular co-expression modules to disease phenotypes, the detailed relationships among genes within a module or between modules can be less descriptive.
Graphical network models such as BNs can provide more granular views of the interactions and directionalities between genes. BNs define a partitioned joint conditional probability distribution over all nodes (genes or other IMPs) in a network, where the probability distribution of states of a node depends only on the states of its parent nodes [ 87 ].
Therefore, BNs are probability-based directed acyclic graphs. The conditional probabilities reflect not only relationships between genes, but also the stochastic nature of these relationships. Because an exhaustive search over all possible network structures is computationally infeasible, thousands of plausible BNs are instead generated using Markov chain Monte Carlo (MCMC) simulations [ 88 ]. The posterior probability of each BN model given observed data can be calculated using the Bayes formula.
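For reference, the two statements above can be written out explicitly. The notation is an assumption introduced here: X_1, ..., X_n denote the nodes, Pa(X_i) the parents of node X_i in the graph, M a candidate network structure, and D the observed data.

```latex
% Joint distribution factorized over the directed acyclic graph
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% Posterior probability of a network structure M given data D (Bayes formula)
P(M \mid D) = \frac{P(D \mid M)\, P(M)}{P(D)} \;\propto\; P(D \mid M)\, P(M)
```

Since P(D) is the same for every candidate structure, comparing structures only requires the product of the likelihood P(D | M) and the prior P(M); the prior is one place where external information about likely edges can enter.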
A consensus BN that contains the nodes and edges appearing in a large proportion of all plausible network models is then derived. As probability distributions are bi-directional and can lead to mathematically equivalent structures, causal directions between nodes cannot be inferred from the expression data alone. Fortunately, the BN framework can incorporate a variety of prior information, ranging from literature, genetic, and transcription factor binding data to metabolomic and proteomic data, to break the symmetry among nodes and infer causal directions [ 56 , 57 ]. As the BN algorithm imposes a heavy computing burden and only nodes and edges conserved across plausible networks are kept, BNs are sparser than co-expression networks, and not all profiled genes are included in the BN model.
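The consensus step can be sketched as an edge-frequency filter over the sampled networks. This is a minimal sketch under stated assumptions: each sampled network is given as a collection of directed edges, the 50% cutoff (`min_fraction`) is an arbitrary illustrative value, and the node names are hypothetical.

```python
from collections import Counter

def consensus_edges(networks, min_fraction=0.5):
    """Keep directed edges that appear in at least min_fraction of the networks."""
    counts = Counter(edge for net in networks for edge in set(net))
    cutoff = min_fraction * len(networks)
    return sorted(edge for edge, count in counts.items() if count >= cutoff)

# Four hypothetical sampled networks over nodes A, B, C; edges are (parent, child).
sampled = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "B")},  # same pair B-C, opposite direction
    {("A", "B"), ("B", "C"), ("A", "C")},
    {("A", "B")},
]
consensus = consensus_edges(sampled)
print(consensus)
```

Filtering edges one at a time does not by itself guarantee that the result is acyclic, so a real pipeline would need an additional step to detect and resolve cycles in the consensus graph.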
A number of studies in a variety of species have demonstrated that BNs can capture fundamental properties of molecular interactions in complex systems and can infer mechanisms [ 44 , 56 , 57 , 78 , 89 , 90 ]. In searching for the mechanisms underlying the previously discussed lipid and CVD locus 1p In addition, the neighborhood subnetworks of the three genes, particularly that of SORT1, are enriched for genes involved in multiple biological processes relevant to lipid regulation and CVD development, thus providing mechanistic support on the involvement of the candidate genes in CVD [ 44 ].
To illustrate how candidate causal genes identified via the LCMS causality test described above interact and affect obesity, causal genes were mapped to a liver BN and they were found to be highly connected in a subnetwork, with the top causal gene Zfp90 being upstream of the other causal genes [ 20 ]. In a follow-up validation study, by mapping the liver genes perturbed by the overexpression or knockout of top obesity candidate causal genes to liver BNs, a liver core subnetwork that is highly enriched for genes involved in lipid metabolism and fat cell differentiation pathways was identified, further elucidating the mechanisms underlying obesity development [ 51 ].
However, each methodology also carries intrinsic limitations. To maximize our ability to discover novel insights, higher level integrative approaches that take advantage of different combinations of the above-mentioned methods have been recently explored and we highlight two such methodologies. To harness the strengths of data-driven regulatory networks, the information from gene expression profiling, causality testing and GWAS can be overlaid onto co-expression networks and BNs to infer disease mechanisms and key regulatory genes. This inflammatome gene set was integrated with the GWAS catalog and the metabolic disease-related MEMN to confirm the causal nature of the gene signature.
The identification of the common inflammatome signature, its network architecture, and key drivers via this highly integrative approach sheds light on the shared etiology and potential therapeutic targets between MetDs and other common diseases or pathophysiological conditions, a level of mechanistic insight far beyond what IMP profiling alone could offer.
Other types of networks such as PPI networks can certainly also be integrated, as exemplified by Mori et al. By leveraging tissue-specific gene expression with PPI networks, they identified an inflammation- and immune system-related adipose subnetwork that contributes to the differences in diabetes risk. In order to explore the candidate causal genes and mechanisms behind GWAS, several novel functional genomics and network-driven methodologies have been recently developed.
Notably, most genes in the significant pathways or subnetworks showed only modest association in GWAS and were therefore missed by traditional GWAS analysis. These results support the hypothesis that a large number of genes in relevant biological processes with modest effect sizes, rather than only a handful of individual genes with strong effects, collectively contribute to disease development. In another method, Kang et al. These new methodologies not only provide mechanistic explanations for GWAS findings but also help explain a substantial portion of the missing heritability.
Systems biology approaches that leverage genetic, tissue-specific IMP profiling data, and disease phenotypes have evolved rapidly in the past decade. Through their applications in various MetDs in both animal models and human populations, these highly integrative systems biology approaches have unveiled unprecedented insights into disease etiology and uncovered a large number of candidate novel genes, pathways, and subnetworks associated with MetDs. By far, inflammation and immune response related genes and processes have been the most consistent signal across tissue types, across studies, and across MetDs, and thus convincingly represent a key shared component of MetDs.
The systems integration of tissue-specific molecular data also revealed many tissue- and disease-specific processes, such as liver-centric lipid metabolism and transport pathways for obesity and CVD; liver- and adipose-specific oxidative phosphorylation, fatty acid oxidation, PPAR signaling, fat cell differentiation for obesity, insulin resistance, and T2D; liver-specific glucagon signaling and islet-specific cell cycle regulation for T2D; and circadian rhythm and RNA splicing processes in brain for obesity and T2D.
These findings highlight how tissue-specific gene networks and their cross-tissue interactions, rather than individual genes, mediate MetD etiology. It is therefore critical to shift from a traditional view of disease mechanisms as independent actions of individual genes to a network view, where a large number of genes coordinately define a particular network state in individual tissues and the interactions of gene networks in multiple tissues ultimately lead to MetD onset.
Although proven predictive and informative, the existing systems biology methodologies are far from comprehensive or fully accurate. Further refinement of existing methods and development of more advanced approaches are thus warranted. First, incorporating next-generation sequencing, DNA methylome, microRNA, metabolomics, and other types of data into the systems biology framework has become more pressing than ever, as such data have been rapidly generated and deposited into data repositories over the past few years.
Although some of the existing methodologies can be easily adapted for additional data types, innovative approaches guided by biological insights are still in great need. For instance, based on the regulatory relationships across data types, it is necessary to develop methodologies that leverage multiple levels of IMPs simultaneously to construct more sophisticated networks such as co-regulatory microRNA-gene-metabolite networks.
Second, the involvement of multiple cell types, tissues, and organs in MetDs demands methodologies that explore cross-tissue interactions. As demonstrated elegantly in a recent study by Dutta et al. This type of relationship can only be revealed when data integration reaches an organism-wide systems level. Although the construction of cross-tissue networks has been sporadically attempted [ 84 , 96 ], such efforts have to be further expanded to increase tissue coverage and to develop more efficient methodologies. Third, most of the current methodologies capture static information that only represents snapshots of disease status at a given time.
Dynamic models that take IMP data generated from time-course experiments are therefore needed to capture the dynamic nature of disease progression. All these different levels of technical and biological challenges have to be properly addressed in the future to allow a full dissection of MetD etiology.
Only when a comprehensive understanding is achieved can effective diagnostic, preventative, and therapeutic strategies toward these disabling and deadly diseases become a reality. No potential conflicts of interest relevant to this article were reported. This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Current Cardiovascular Risk Reports. Curr Cardiovasc Risk Rep. Published online Oct
Abstract: The metabolically connected triad of obesity, diabetes, and cardiovascular diseases is a major public health threat, and is expected to worsen due to the global shift toward energy-rich and sedentary living.
Keywords: Metabolic disorders, Obesity, Diabetes, Cardiovascular diseases, Systems biology, Integrative genomics, Functional genomics, Causality inference, Network biology.
Introduction: Common metabolically connected diseases (MetDs) such as cardiovascular disease (CVD), type 2 diabetes (T2D), and obesity impose a substantial health burden worldwide, as demonstrated by the fact that both CVD and T2D are among the top ten leading causes of death in Europe and the United States.
Traditional Approaches: Association and Correlative Analyses
When a particular level of molecular data is generated, the most straightforward approach is to estimate the correlations between the molecular traits and clinical phenotypes.
Identification of Genetic Risks of MetDs by Linkage Studies and GWAS
Genetic association studies between genetic markers and disease phenotypes could infer causality to a certain degree under the central dogma that heritable disease risks flow from DNA to other downstream molecular and physiological events.
Functional Genomics
Once a genetic locus has been linked to a disease phenotype, the most intuitive step is to search for candidate genes in the neighborhood of the locus.
Table 1. Comparison of integrative methodologies discussed in the manuscript.
Papers of particular interest, published recently, have been highlighted as: Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Epigenetics and cardiovascular disease. Genome-wide association studies and systems biology: Mass spectrometry-based proteomics and network biology.
Dynamism in gene expression across multiple studies. Disease signatures are robust across tissues and experiments. Expression-based genome-wide association study links the receptor CD44 in adipose tissue with type 2 diabetes. Hong F, Breitling R.
A comparison of meta-analysis methods for detecting differentially expressed genes in microarray experiments. Comparison and meta-analysis of microarray data: Creation and implications of a phenome-genome network. Butte AJ, Chen R. Finding disease-related genomic experiments within an international repository: Evaluation and integration of 49 genome-wide experiments and the prediction of previously unknown obesity-related genes. Davey Smith G, Ebrahim S. An integrative genomics approach to infer causal associations between gene expression and disease.
Genetic and genomic analysis of a fat mass trait with complex inheritance reveals marked sex specificity.
W Russell L Hoyles. Snoopy--a unifying Petri net framework to investigate biomolecular networks. Mucosal-associated invariant T (MAIT) cells are depleted and prone to apoptosis in cardiometabolic disorders. Analysis of metabolic network reconstructions Although reconstruction itself can provide insight into the properties of a network, the biologist wants to be able to understand, and ultimately predict, the impact of perturbations on the system through simulations.
Disruption of the aortic elastic lamina and medial calcification share genetic determinants in mice. Mapping, genetic isolation, and characterization of genetic loci that determine resistance to atherosclerosis in C3H mice.
Arterioscler Thromb Vasc Biol. Positional cloning of Sorcs1, a type 2 diabetes quantitative trait locus. Genome scan for human obesity and linkage to markers in 20q Am J Hum Genet. Identification of an obesity quantitative trait locus on mouse chromosome 2 and evidence of linkage to body fat and insulin on the human homologous region 20q.