Where systems biology meets bioinformatics
July 7, 2018 | Chicago, IL
Flux balance analysis
Harvard Medical School
Advances in genomics are creating new opportunities to understand biology that require both systems modeling and bioinformatics. The third annual SysMod meeting will be a forum for discussion about combined use of systems biology modeling and bioinformatics to understand biology and disease. The meeting will take place on July 7, 2018 during the 2018 ISMB conference in Chicago. The meeting will feature several keynote talks and contributed presentations.
|8:30-9:30||ISMB keynote talk|
|9:30-10:00||Coffee break with exhibitors|
|10:15-12:40||Session I: Models of human disease
Moderator: Jonathan Karr, Icahn School of Medicine at Mount Sinai, US ✉
Jonathan Karr ✉
Icahn School of Medicine at Mount Sinai, US
|10:20-11:00||Variability and phenotype selection in invasive cancer spread
Andre Levchenko ✉
Yale University, US
Biological processes are highly variable, with individual cells capable of adopting distinct states and dividing roles within complex and structured populations. Furthermore, individual cells can also transition between different states as they perform complex functions. Here, using the example of invasive cancer spread, I introduce a new model and new analysis method, allowing one to characterize how inherent cell variability translates into a choice among several available phenotypes, and how the dynamic switching between these phenotypes allows individual cells to undergo dynamically complex processes. I will show how such single-cell behavior can result in a successful search by cancer cells for lymphatic or blood vessels, and in the initiation of metastasis. I will also suggest that this method can be broadly translated to other biological phenomena involving complex decision making on cell and population levels.
|11:00-11:20||Advancing systems immunology using a hybrid agent-based model (ABM) of Helicobacter pylori infection
Meghna Verma ✉ , Josep Bassaganya-Riera, Andrew Leber, Nuria Tubau-Juni, Vida Abedi, Xi Chen, Stefan Hoops and Raquel Hontecillas
Virginia Tech, US
Background: Helicobacter pylori, although known to cause gastric cancer in 1-2% of cases, exerts beneficial effects including protection against allergies and gastroesophageal diseases. Motivation: To examine the double-edged sword of H. pylori as a pathogen and a beneficial organism, and to investigate the immunoregulatory responses during H. pylori infection, we utilized a high-performance computing (HPC)-driven ENteric Immunity SImulator (ENISI). Method: The multiscale model simulated (> 10^cells) of the gut mucosal immune system. We performed simulations integrating various spatiotemporal scales encompassing ABM- (tissue), ODE- (cellular) and partial differential equation- (cytokine gradients) based methods. The modeling data were analyzed by building a metamodel using stochastic kriging, and the design was based on a space-filling Latin hypercube matrix. Spatiotemporal metamodel-based variance and partial rank correlation coefficient-based regression-type sensitivity analyses were conducted to analyze the parameters influencing the initiation, peak and recovery stages of the infection. Results: The data analytics methods identified the parameters related to epithelial cell death and epithelial cell proliferation, validating the findings from the experimental models, and highlighted the crucial role of IL-12 in influencing the host responses to infection. Conclusion: Thus, ENISI identified factors critical for the survival of H. pylori and lesion development.
|11:20-11:40||Data-driven modeling of cancer subtypes
Kazunari Iwamoto, Hiroaki Imoto, Shigeyuki Magi, Suxiang Zhang and Mariko Okada-Hatakeyama ✉
Osaka University, JP
Dysregulation of signaling networks is a major cause of human cancer. The behavior of these networks is highly nonlinear, so predicting the response to drugs that target them is very difficult. To solve this problem, we are developing a mathematical model based on experimental data obtained from four cancer subtypes, integrating the growth factor receptor signaling pathways, early transcriptional regulation, cell cycle and p53 regulatory networks, which allows mechanistic re-classification of cancer subtypes and prediction of combinatorial drug effects. One of the challenges of this modeling approach is large-scale parameter estimation. To minimize the estimation effort, we are also developing quantitative experimental methods to infer cell cycle stages and heterogeneous cellular responses from high-content imaging data. Using the model, we found that most of the signaling dynamics and molecular machinery are shared among different cancer subtypes, yet there exist unique molecular regulations specific to particular subtypes. Our model will be able to integrate clinical gene/protein expression data in public databases as model parameters to predict cancer malignancy.
|11:40-12:00||A computational model to track patterns of evolutionary dynamics in cancer
Neil Coleman, Anchal Sharma, Greg Riedlinger and Subhajyoti De ✉
Rutgers University, US
Cancer is a complex disease marked by somatic evolution of a clonal cell population, which can grow for decades and yet only becomes symptomatic at a relatively advanced stage. At diagnosis, histopathological and molecular assessment essentially provides only a late and limited snapshot of this evolutionary process. Common experimental model systems such as in vitro systems or mouse models do not always adequately capture the multi-scale complexity of spontaneously developed human tumors – which is typically marked by a long latency period and intra-tumor heterogeneity. We have developed Temish, a computational model of tumor evolution based on the principles of a stochastic reaction-diffusion system in 3D, which enables generating testable hypotheses regarding aspects of tumor evolution. The model is fast and scalable, and captures tumor growth dynamics during tumor initiation, progression, and treatment. We validate the utility of the model by comparing testable predictions with histopathological data. We then apply the model to lung cancer and compare the model predictions with single- and multi-region sequencing data to infer modes of cancer progression and their clinical implications. We suspect that regional differences in subclonal driver mutations, coupled with ongoing genome instability and clonal dynamics, present challenges for successful intervention in non-small cell lung cancer.
|12:00-12:20||Model driven discovery of drug targets for effective treatment of prostate cancer
Beste Turanli ✉ , Alen Lovric, Rui Benfeitas, Gholamreza Bidkhori, Cheng Zhang, Kazim Yalcin Arga and Adil Mardinoglu
KTH Royal Institute of Technology, SE
Insights from genome-scale metabolic models (GEMs), which provide information on cancer-specific metabolism, have been used to identify potential therapeutic agents and drug targets. Moreover, drug repositioning for any cancer is of utmost importance in the context of drug discovery. We aimed to reconstruct a generic prostate cancer (PRAD) specific model, not only to explore metabolism but also to repurpose new therapeutic agents. RNA-Seq data for 495 individuals suffering from PRAD as well as 52 noncancerous prostate samples from The Cancer Genome Atlas database and proteome data from the Human Protein Atlas v18 were retrieved. In addition, all personalized GEMs based on PRAD transcriptomes were acquired from the Human Pathology Atlas to reconstruct a generic model covering all individual variations as well as proteome and transcriptome. The tINIT and reporter metabolites algorithms, via the RAVEN toolbox, were used to reconstruct the model and identify reporter metabolites, respectively. Differentially expressed genes in the PRAD-specific metabolic model were used as metabolic signatures for drug repurposing. Gene expression profiles from CMap2 were analyzed and statistically evaluated. Consequently, eleven novel drug candidates were repurposed for PRAD. The reversal effects of the drug candidates are still under investigation through the PRAD-specific GEM.
|12:20-12:40||Logical model simulations predict drug synergies across different cancer cell lines
Barbara Niederdorfer ✉ , Miguel Vazquez, Liv Thommesen, Martin Kuiper, Astrid Lægreid and Åsmund Flobak
Norwegian University of Science and Technology, NO
Combination therapies are hoped to overcome the challenge of emerging cancer drug resistance by blocking malicious acquired bypass mechanisms. Due to the vast therapeutic space, undirected testing will not suffice to identify effective combinations. Computational network approaches have previously been used to successfully predict effective drug combinations (e.g. Flobak, 2015). Many of these rely on data from perturbation experiments, which are not transferable to a clinical setting. Here we present the use of logical modeling informed by baseline molecular data from unperturbed systems. We manually curated a prior knowledge network of 144 nodes encompassing 19 drug targets, which we experimentally screened in single and double perturbation across eight human cancer cell lines of different origin. Cell-line specific logical models were tailored to agree with transcriptomic and literature-curated baseline data. Our in silico modelling experiments suggest that network refinement accounting for subtle, biologically founded mechanisms increases the models’ predictive capability, and that it is more important to accurately describe the activity of nodes with high rather than low out-degree. Our work implies that models informed by baseline data could be used to economize screening efforts by enriching screening design for beneficial drug combinations.
|12:40-2:00||Lunch and community discussion|
|2:00-4:00||Session II: Comprehensive models of cells and tissues
Moderator: Andreas Dräger, DE ✉
Peter Sorger ✉
Harvard Medical School, US
|2:40-3:00||Deciphering cell’s robustness by a multi‐scale framework integrating cell cycle and metabolism in budding yeast
Lucas van der Zee, Hans Westerhoff, Jens Nielsen and Matteo Barberis ✉
University of Amsterdam, NL
Cell cycle and metabolism are coupled networks. Cell growth and division require synthesis of macromolecules, which is dependent on metabolic cues. Conversely, metabolites involved in storage metabolism have been observed to fluctuate periodically as a function of cell cycle progression. Computer models of cell cycle and metabolism have been under development for some time. However, to date no effort has been made to integrate, and to investigate the mutual regulation of, these two systems in any organism. Here, we present a multi-scale framework that integrates a Boolean cell cycle model with a constraint-based model of metabolism. Directionality and effect are incorporated for mechanistic interactions. Conversely, an evolutionary optimization algorithm has been developed to generate models that incorporate high-throughput interactions iteratively. Model results are verified against metabolic pathway activity and enzyme concentrations. The first computer model that integrates the cell cycle with metabolic networks reveals marked changes in flux distributions through different cell cycle changes, and highlights the importance of storage metabolites for metabolic changes during the growing phase of the cell cycle. Our integrative, multi-scale framework may be employed to capture the mechanistic basis of robustness of cell cycle networks by highlighting metabolic causes of cell cycle arrest.
|3:00-3:20||Towards a virtual immune system: multi-scale modeling of CD4+ T lymphocytes
Kenneth Wertheim ✉ , Bhanwar Lal Puniya, Alyssa La Fleur, Ab Rauf Shah, Matteo Barberis and Tomas Helikar
University of Nebraska–Lincoln, US
The immune system is regulated by biological and biochemical networks integrated across multiple scales (e.g., signal transduction, metabolism, etc). There are networks within each individual cell and at the cell population level. In order to understand the dynamics of the immune system under healthy and diseased conditions, multi-scale models are needed to fully leverage mathematical and computational tools. Herein, we discuss the first step we have taken towards describing the immune system in such a computational, system-level framework, exemplified by a multi-scale model of CD4+ T lymphocytes, including naive, effector (Th1, Th2, and Th17), regulatory, and memory cells. Within this framework, the following scales about CD4+ T lymphocytes are integrated: metabolism (described by constraint-based models), gene regulation and signal transduction (logical model), the population level (agent-based model), and extracellular cytokine concentrations (ordinary differential equations). Furthermore, the framework is oriented in space within three compartments, namely an infection site, a draining lymph node, and the circulatory system. The model was validated by reproducing known phenomena using a Monte Carlo method, including the phenotypic plasticity of CD4+ T lymphocytes, the effects of IL-2 on their proliferation and survival, and the effects of chronic inflammation.
|3:20-3:40||How accurate is automated gap filling of metabolic models?
Peter Karp ✉ and Mario Latendresse
SRI International, US
Reaction gap filling is a computational technique for proposing the addition of reactions to genome-scale metabolic models to permit those models to run correctly. Gap filling completes a reaction network by adding reactions that enable biosynthesis of all required metabolic products from available nutrients. The models are incomplete because they are derived from annotated genomes in which not all enzymes have been identified. We present two studies of gap-filling accuracy. In the first study we compared the results of applying an automated likelihood-based gap filler (MetaFlux) within the Pathway Tools software with the results of manually gap filling the same metabolic model. Both gap-filling exercises were applied to the same genome-derived qualitative metabolic reconstruction for Bifidobacterium longum. The MetaFlux gap filler attained recall of 61.5% and precision of 66.6%. In the second study we generated degraded versions of the EcoCyc-20.0-GEM model by randomly removing flux-carrying reactions from a growing model. We gap-filled the degraded models using several variations of MetaFlux and compared the resulting gap-filled models with the original model. The best MetaFlux variation showed a best average precision of 87% and a best average recall of 61%.
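As an illustration of how precision and recall of a gap filler can be computed against a manually curated solution, here is a minimal Python sketch. It is not taken from MetaFlux or Pathway Tools; the function name `precision_recall` and the reaction identifiers are hypothetical and chosen only for the example.

```python
def precision_recall(proposed, reference):
    """Return (precision, recall) of proposed reactions against a reference set."""
    proposed, reference = set(proposed), set(reference)
    true_positives = len(proposed & reference)
    # Precision: fraction of proposed reactions that are in the reference.
    precision = true_positives / len(proposed) if proposed else 0.0
    # Recall: fraction of reference reactions that were proposed.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reaction identifiers, for illustration only.
manual_fill = {"R1", "R2", "R3", "R4"}      # reactions added by a curator
automated_fill = {"R1", "R2", "R5"}         # reactions added by a gap filler
precision, recall = precision_recall(automated_fill, manual_fill)
```

In this toy case two of the three automated reactions match the manual solution, and two of the four manual reactions are recovered.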
|3:40-4:00||Datanator: toolkit for discovering and aggregating data for whole-cell modeling
Saahith Pochiraju ✉ , Yosef Roth, Balazs Szigeti and Jonathan Karr
Icahn School of Medicine at Mount Sinai, US
Whole-cell (WC) models are needed to guide medicine and bioengineering. These models require data about each gene, RNA, protein, complex, and reaction. Unfortunately, this data is hard to collect because it is scattered across repositories and articles; described with different formats, identifiers, and units; and obtained from different methods, organisms, and conditions. To accelerate WC modeling, we developed Datanator, an integrated database, search engine, and web interface for data for modeling. The database includes metabolite, RNA, and protein concentrations; protein complex subunit compositions; and rate laws and kinetic constants from ArrayExpress, CORUM, ECMDB, PaxDB, and SABIO-RK. The search engine finds data for modeling specific compounds, reactions, organisms and environments, including data from similar compounds, reactions, organisms, and environments. The web interface helps modelers explore the database. In addition to using Datanator to build a WC model of Mycoplasma pneumoniae, we have shown that Datanator can find missing parameters for ODE models, augment FBA models with kinetic bounds, and recalibrate models to similar organisms. We believe that Datanator will accelerate WC modeling, and enable more predictive models. To continue to accelerate WC modeling, we plan to integrate additional data sources into Datanator and integrate Datanator with model design tools.
|4:00-4:40||Coffee break with exhibitors|
|4:40-6:10||Session III: Detailed models of individual cellular processes
Moderator: Tomas Helikar, University of Nebraska-Lincoln, US ✉
|4:40-5:00||Cyclic attractor estimation in Boolean networks
Ulrike Münzner ✉ , Edda Klipp, Marcus Krantz and Tatsuya Akutsu
Kyoto University, JP
Signal transduction networks, such as the cell division cycle, are prone to combinatorial complexity. While the number of microstates increases exponentially in such a system, the empirical data describing these states tends to be scarce. These two characteristics challenge mathematical descriptions in terms of scalability and data congruence. We developed a large-scale, mechanistically detailed and executable bipartite Boolean network of the cell cycle in Saccharomyces cerevisiae. We based this network on the reaction-contingency language, which scales with and captures the measured elemental states. Analyzing the attractors of this Boolean network enables the study of phenotypes which lead to normal or abnormal growth of a cell. Determining cyclic attractors in such a Boolean network is an NP-hard problem, and hence cannot be solved exhaustively. We address this challenge by using partial information to reduce the number of trials in a heuristic search. We use this method to study the behavior of a reduced version of the original network. Such an analysis enables us to explore which components in the network control the cyclic behavior of the cell cycle network. In the future, analysis tools for mechanistically detailed Boolean models will enable the development of whole-cell models, and ultimately personalized medicine.
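To make the notion of a cyclic attractor concrete, the following minimal Python sketch enumerates all attractors of a tiny synchronous Boolean network by brute force. The three-node network and the names `step` and `find_attractors` are hypothetical, and exhaustive enumeration is exactly what does not scale to large networks; this is not the heuristic search described in the abstract.

```python
from itertools import product


def step(state):
    """Synchronous update of a hypothetical 3-node ring: a <- NOT c, b <- a, c <- b."""
    a, b, c = state
    return (1 - c, a, b)


def find_attractors(update, n_nodes):
    """Enumerate attractors by following every initial state until it repeats."""
    attractors = set()
    for start in product((0, 1), repeat=n_nodes):
        seen = {}                      # state -> index at which it was first visited
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = update(state)
        first = seen[state]            # index where the cycle begins
        ordered = sorted(seen, key=seen.get)
        cycle = ordered[first:]
        # Rotate so the smallest state comes first, giving one canonical form per cycle.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


attractors = find_attractors(step, 3)
cycle_lengths = sorted(len(cycle) for cycle in attractors)
```

For this toy network every state lies on a cycle: one attractor has six states and the other has two, and there are no fixed points. The loop visits all 2^n initial states, which is why such a search is only feasible for very small n.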
|5:00-5:05||CrossPlan: systematic planning of genetic crosses to validate mathematical models
Aditya Pratapa ✉ , Neil Adames, Pavel Kraikivski, Nicholas Franzese, Jean Peccoud, John Tyson and T. M. Murali
Virginia Tech, US
Mathematical models of cellular processes can systematically predict the phenotypes of novel combinations of multi-gene mutations. Searching for informative mutants is challenging since the number of possible combinations grows explosively. Moreover, keeping track of the genetic crosses needed to make new mutants and planning sequences of experiments is unmanageable when there are hundreds of predictions to test. We present CrossPlan, an algorithm for systematically planning genetic crosses to make a set of target mutants from a set of source mutants. We base our approach on a generic experimental workflow used in performing genetic crosses in budding yeast. CrossPlan uses an integer linear program to maximize the number of target mutants that we can make under certain experimental constraints. We apply our method to a comprehensive mathematical model of the protein regulatory network controlling cell division in budding yeast. The number of target mutants we can plan increases linearly with the number of batches planned. Interestingly, planning two or three batches at a time is nearly as optimal as planning all batches simultaneously. The experimental flow that underlies our work is quite generic and our algorithm is easy to modify. Hence, our framework should be relevant in mammalian systems as well.
|5:05-5:10||Distribution shapes govern the discovery of predictive models for gene regulation
Gregor Neuert ✉
Vanderbilt University, US
Despite substantial experimental and computational efforts, mechanistic modeling remains more predictive in engineering than in systems biology. The reason for this discrepancy is not fully understood. Although randomness and complexity of biological systems play roles in this concern, we hypothesize that significant and overlooked challenges arise due to specific features of single-molecule events that control crucial biological responses. Here we demonstrate why modern statistical tools to disentangle complexity and stochasticity, which assume normally distributed fluctuations or enormous datasets, don’t apply to the discrete, positive, and non-symmetric distributions that characterize spatiotemporal mRNA fluctuations in single-cells. As an example, we integrate single-molecule measurements and advanced computational analyses to explore Mitogen Activated Protein Kinase induction of multiple stress response genes. Through systematic comparisons of the same model to the same data, we elucidate why standard modeling approaches yield non-predictive models for single-cell gene regulation. We further explain how advanced tools recover precise, reproducible, and predictive understanding of diverse transcription regulation mechanisms, including gene activation, polymerase initiation, elongation, mRNA accumulation, spatial transport, and degradation. Our model-data integration approach should extend to any discrete dynamic process with rare events and realistically limited data.
|5:10-5:15||Metabolic modeling of human milk oligosaccharide biosynthesis and implications for maternal blood groups
Benjamin Kellman ✉ , Anne Richelle and Nathan Lewis
University of California, San Diego, US
Human Milk Oligosaccharides (HMOs) are abundant and functional components of human milk, which impact health and development of the infant. Their biosynthesis in the human mammary gland remains elusive despite nearly half a century of investigation. Here we have developed a framework for resolving ambiguous enzymes and metabolites in this process. Our approach leverages metabolic data to construct metabolic models; models are scored and selected based on their consistency with transcriptomic data. We start from a generic metabolic network describing all feasible biosynthetic pathways of 34 potential HMO structures related to the 16 most abundant (>97% by weight) oligosaccharides found in human milk. Through the integration of HMO glycoprofiling and transcriptomics, our modeling approach identifies the most likely HMO structures for uncharacterized HMOs, the associated biosynthetic reactions for those HMOs, and candidate genes for elongation, branching, fucosylation, and sialylation of HMOs. These results provide the molecular basis for HMO biosynthesis and thus can be used to guide new strategies for HMO synthesis for academic and nutritional use. Most notably, we observe unique metabolic activity as a function of maternal blood type, the determining glycosyltransferases of which are also believed to influence HMO biosynthesis.
|5:15-5:20||A hierarchical, data-driven approach to modeling single-cell populations predicts latent causes of cell-to-cell variability
Carolin Loos ✉ , Katharina Moeller, Fabian Fröhlich, Tim Hucho and Jan Hasenauer
Institute of Computational Biology, Helmholtz Zentrum München, DE
All biological systems exhibit cell-to-cell variability, and this variability often has functional implications. To gain a thorough understanding of biological processes, the latent causes and underlying mechanisms of this variability must be elucidated. Cell populations comprising multiple distinct subpopulations are commonplace in biology, yet no current methods allow the sources of variability between and within individual subpopulations to be identified. This limits the analysis of single-cell data, for example data obtained by flow cytometry and microscopy. We present a data-driven modeling framework to analyze cell populations that comprise heterogeneous subpopulations. Our approach combines mixture modeling and frameworks for distribution approximation, facilitating the integration of multiple single-cell datasets and the detection of causal differences between and within subpopulations. We demonstrated the ability of our method to capture multiple levels of heterogeneity in the analyses of simulated data and data from primary sensory neurons involved in pain initiation. Our approach predicted relative changes in TrkA and Erk1/2 expression levels, but not subgroup composition, to underlie increased NGF-responsiveness caused by exposure to different extracellular scaffolds.
|5:20-6:00||The Cell Division Cycle: A Closed Loop of Switches Embedded in Switches
John Tyson ✉
Virginia Tech, US
Well-nourished cells in a favorable environment (well supplied with growth factors and free from stresses, like ionizing radiation) will grow, replicate their genome, and divide into two daughter cells in a process (the “cell division cycle”) that repeats itself like clockwork. The cell division cycle, however, is not a clock: its function is not to measure periods of time but rather to ensure that the information (the genes) and the machinery (the proteins) of life are faithfully transmitted from one generation of cells to the next. Hence, although the processes of growth and division may exhibit a remarkable temporal periodicity, this is an epiphenomenon. The primary goal of the cell cycle is to guard the cell’s genome against damage during the replication/division process, lest the error(s) be irrevocably passed down to all future generations of progeny. Hence, cell cycle progression is closely monitored for errors, in particular DNA damage and misalignment of replicated chromosomes on the mitotic spindle. In this lecture we look closely at the molecular mechanisms that maintain genomic integrity during the cell division cycle, and find an unexpected and intriguing arrangement of concatenated and nested bistable toggle switches that are arranged in a self-perpetuating cycle. The topology of the network seems to play crucial roles in maintaining the stability of the genome during cell proliferation. Reference: B. Novak, F.S. Heldt & J.J. Tyson (2018). Genome stability during cell proliferation: a systems analysis of the molecular mechanisms controlling progression through the eukaryotic cell cycle. Curr. Opin. Syst. Biol. Vol 9, pp. 22-31.
Tomas Helikar ✉
University of Nebraska–Lincoln, US
|A Blueprint for Human Whole-cell Modeling
Arthur Goldberg, Balázs Szigeti, Yosef Roth, John Sekar, Saahith Pochiraju and Jonathan Karr
Icahn School of Medicine at Mount Sinai, US
Whole-cell (WC) computational models of human cells are a central goal of systems biology. WC models could help researchers understand cell biology and help physicians treat disease. Ongoing technological advances in experimentation and modeling are enhancing the feasibility of WC models. However, progress toward WC models remains slow. To identify the bottlenecks to WC modeling and develop a long-term plan to achieve human WC models, we surveyed the biomodeling community, reviewed the literature, and reflected on our experience prototyping WC models of bacteria. We identified four major bottlenecks: a) inadequate experimental methods and data repositories; b) inadequate tools for designing, describing, simulating, calibrating, and validating large models; c) few models of individual processes that can be combined into WC models; and d) insufficient coordination within the biomodeling community. Further, we propose a project, termed the Human Whole-Cell Modeling Project, which would overcome these bottlenecks and achieve the first human WC models. The cornerstones of the project include developing computational technologies for scalably building and simulating models, developing standard protocols and formats for collaborative modeling, collaboratively building models as a community, and focusing on a single cell line. We invite the community to join this exciting and ambitious effort.
|A cell proliferation model of human acute myeloid leukemia xenograft
Marco S. Nobile, Thalia Vlachou, Simone Spolaor, Daniela Bossi, Paolo Cazzaniga, Luisa Lanfrancone, Daniela Besozzi, Pier Giuseppe Pelicci and Giancarlo Mauri
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, IT
Acute myeloid leukemia is one of the most common hematological malignancies, characterized by high relapse and mortality rates: its inherent intra-tumor heterogeneity between subpopulations of cells is thought to play an important role in disease recurrence and resistance to chemotherapy. Current experimental methods are often not enough to quantify and assess the dynamics of these subpopulations. In order to overcome this limitation, we introduce a novel modeling and simulation framework that takes into account the inherent stochasticity of cell division events to investigate the possible occurrence of different subpopulations of cell types in acute myeloid leukemia, notably leveraging experimental data derived from human xenografts in mice. Our results highlight the role played by quiescent cells, as well as proliferating cells characterized by different rates of division, in the progression and evolution of the disease, hinting at the necessity to further characterize tumor cell subpopulations.
|A comprehensive community metabolic model of maize root system
Ab Rauf Shah, Bailee Lichter, Camila Pereira Braga, Dongdong Zhang, Bhanwar Lal Puniya, Edgar B. Cahoon, Jiri Adamec and Tomas Helikar
University of Nebraska-Lincoln, US
Roots play an important role in the absorption of water and minerals, regulation of metabolism, overall plant growth, and maintenance of homeostasis. Furthermore, roots form complex interactions with their soil microbial community. Despite its importance, root metabolism remains largely unexplored. Given the complex nature of the root system within the plant and its soil environment, a multiscale model of root metabolism and the soil microbiome community has the potential to better characterize the relationship between genotype and phenotype, and predict new interventions for crop improvement. Herein we describe a metabolic model of maize root that was developed using public transcriptomic data from 18 maize root tissues and previous maize models. This model consists of 4,917 reactions associated with 5,637 genes. In order to characterize interactions between maize root and soil microbes, we also developed metabolic models of seven microbes that are in a symbiotic relationship with maize. We are in the process of integrating these models into a multiscale model that will be able to describe the dynamic maize root – soil microbial interactions. This model will be used to understand root-microbe interplay regulating maize metabolic responses, and to predict novel pathways associated with maize root exudate and rhizobiome interactions.
|A natural language dialogue system for model-driven explanation of biological experiments
Benjamin M. Gyori, John Bachman, Patrick Greene, Funda Durupinar, Xue Zhang, James Allen, Lucian Galescu, Choh Man Teng, Ian Perera, William de Beaumont, Mark Burstein, Scott Friedman, Jeffrey Rye, Emek Demir, Brent Cochran and Peter Sorger
Harvard University, US
Finding sets of mechanisms that can explain observations in biological experiments (e.g., “How does treatment with SB431542 decrease the amount of SMURF2?”) generally involves a laborious process of information gathering and the construction and testing of a hypothesis in the form of a model. We present a system in which a user interacts with a computer partner through open-ended two-way English language dialogue to collect information on relevant mechanisms, and construct and test a mechanistic model serving as an explanation to observations of interest. The integrated system combines natural language understanding, dialog management, and the recognition of the user’s goals with the planning and execution of a variety of biological reasoning agents (Bioagents) capabilities. Bioagents report on relationships between drugs, their targets, and associations with disease; transcription factors and their targets, and mechanistic paths between proteins in pathway databases and networks assembled from literature-mining. Bioagents also interface with automated model assembly (INDRA) and simulation systems (Kappa, BioNetGen) to allow incremental model building and queries with respect to a model in the course of the dialogue. The dialogue system is embedded in a web-based interface which allows multi-modal visualization of the model being discussed.
|A Predictive Machine Learning Framework to Assess the Transcriptomic Response of Species to Environmental Toxins
Siavash Nazari and Mehrdad Hajibabaei
University of Guelph, CA
Traditional ecotoxicological methods do not detect the presence of hazardous chemicals until significant environmental damage has already been done. Motivated by the idea that essential pathways and genes are conserved across taxa, we present a framework that predicts the shared transcriptomic response of species to toxins. This framework mines chemical-gene interaction data from publicly available databases like the Comparative Toxicology Database, retrieving an initial set of affected genes. Next, it obtains the corresponding translated protein sequences and expands the dataset with their orthologs. Afterwards, it clusters these protein sequences to obtain orthogroups. Lastly, for a number of pathways with the most genes present, the pipeline learns a Bayesian Network to derive meaningful correlations between shared pathways. We ran the pipeline on a test-case dataset consisting of a group of phylogenetically distant taxa, namely Mus musculus, Danio rerio, Drosophila melanogaster and Caenorhabditis elegans, with input chemical groups of heavy metals and dioxins. A total number of 42802 affected genes were retrieved for heavy-metals and dioxins respectively. Based on retrieved protein sequences, they were clustered into 9606 orthogroups in 125 pathways. The highly affected pathways suggested by our pipeline correspond with findings of previous studies in the literature.
|Assessing temporal network signaling entropy dynamics of tissue injury
Ankit Jambusaria, Zhigang Hong, Yang Dai, Asrar Malik and Jalees Rehman
University of Illinois at Chicago, US
A key challenge in systems biology is the elucidation of the underlying molecular pathways which regulate cellular phenotype during time series experiments in which biological tissues or cells are exposed to stressors. It is possible that there are undiscovered interactions between genes or proteins in response to a given stressor which may only be identified by a novel approach to assess gene regulatory network dynamics. We propose a novel framework based on statistical mechanical principles for systems analysis and interpretation of molecular omics data. Specifically, we propose the notion of network signaling entropy (or uncertainty) as a means of elucidating novel interactions which will provide insights into underlying basic biology, disease and repair mechanisms. We describe the power of assessing network signaling entropy to discriminate cells according to their distinct states of injury or repair during a time series transcriptomic analysis. Our analyses suggest that network signaling entropy decreases in response to inflammatory stimulation, suggesting that entropy can be used to identify novel regulatory elements mediating inflammatory injury and post-injury repair. We thus propose network signaling entropy as a powerful approach for understanding signaling promiscuity during tissue injury, repair, and regeneration.
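The abstract above does not state how network signaling entropy is computed. As a rough illustration only, the following sketch uses one common definition, the normalized Shannon entropy of a node's interaction weights; the function name, the toy network, and its weights are all hypothetical and are not taken from the authors' framework.

```python
import math

def local_signaling_entropy(weights):
    """Normalized Shannon entropy of one node's interaction weights.

    Assumed definition (not from the abstract): weights are turned into
    probabilities p_j = w_j / sum(w), and the entropy -sum(p_j * log p_j)
    is divided by log(degree) so the result lies in [0, 1].
    """
    total = sum(weights)
    if total <= 0 or len(weights) < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(weights))

# Hypothetical toy network: node -> weights of its edges to four neighbors.
network = {
    "A": [1.0, 1.0, 1.0, 1.0],  # uniform weights: maximally promiscuous signaling
    "B": [9.7, 0.1, 0.1, 0.1],  # one dominant interaction: low entropy
}
entropies = {node: local_signaling_entropy(w) for node, w in network.items()}
for node, value in entropies.items():
    print(node, round(value, 3))
```

Under this definition, a drop in entropy over a time series would correspond to signaling becoming concentrated on fewer interaction partners.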
|Cellular phenotype switching and effect of gene duplication through accurate computation of probability velocity and flux fields
Anna Terebus, Chun Liu and Jie Liang
University of Illinois at Chicago, US
Biochemical reaction networks are often stochastic because of the different time scales of reactions and often low copy numbers of participating molecular species. The discrete Chemical Master Equation provides a fundamental framework for studying their time-evolving and steady-state probability landscapes. Vector fields of probability velocity and flux can further characterize the time-varying and non-equilibrium steady-state properties of these systems. Here we describe a general approach to analyzing the global flow map of probability mass in all directions of all molecular species. It takes into full account the discreteness of both states and jump reactions, and provides an exact quantification of the vector fields along the boundaries of the state space dictated by the reaction network. We apply this approach to study the toggle switch network, in which the reactions of transcription and translation are both explicitly modeled. We describe the mechanism of the transitions between important cellular states, and examine how duplication of genes in the toggle switch affects the non-equilibrium dynamics of transitions between them. We explore changes in the dynamics of the non-equilibrium probability landscape and the appearance of new cellular states, as well as changes in their locations.
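To make the notion of probability flux on a discrete state space concrete, here is a minimal sketch for a one-species birth-death process rather than the toggle switch studied in the abstract. The rate constants, the truncation of the state space at 30 molecules, and the explicit Euler time stepping are all assumptions chosen for brevity; this is not the authors' method.

```python
def net_fluxes(p, k_syn, k_deg):
    """Net probability flux from state n to state n+1 for every adjacent pair."""
    return [k_syn * p[n] - k_deg * (n + 1) * p[n + 1] for n in range(len(p) - 1)]

def step_master_equation(p, k_syn, k_deg, dt):
    """One explicit Euler step of the master equation on states 0..N.

    Synthesis moves probability n -> n+1 at rate k_syn; degradation moves
    it n+1 -> n at rate k_deg * (n+1). No synthesis is allowed out of the
    last state, so total probability is conserved.
    """
    new_p = list(p)
    for n, flux in enumerate(net_fluxes(p, k_syn, k_deg)):
        new_p[n] -= dt * flux
        new_p[n + 1] += dt * flux
    return new_p

k_syn, k_deg = 5.0, 1.0          # hypothetical rate constants
p = [1.0] + [0.0] * 30           # start with zero molecules
initial_flux = net_fluxes(p, k_syn, k_deg)
for _ in range(20000):           # integrate to t = 20
    p = step_master_equation(p, k_syn, k_deg, dt=0.001)
final_flux = net_fluxes(p, k_syn, k_deg)
mean_copy_number = sum(n * pn for n, pn in enumerate(p))
print(initial_flux[0], max(abs(f) for f in final_flux), mean_copy_number)
```

In this one-dimensional example the net flux vanishes at steady state. Networks such as the toggle switch can sustain non-zero circulating fluxes at a non-equilibrium steady state, which is the regime the abstract addresses.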
|Chimeric Protein-Protein Interaction Networks Reveal Alterations in Cancer-Specific Phenotypes
Milana Frenkel-Morgenstern, Alessandro Gorohovski, Somnath Tagore, Vaishnovi Sekar, Miguel Vazquez and Alfonso Valencia
Barcelona Supercomputing Centre BSC, ES
Chimeric proteins, comprising peptides deriving from the translation of two parental genes, are produced in cancers by chromosomal aberrations. Considering discrete protein domains as binding sites for specific domains of interacting proteins, we have catalogued the protein interaction networks for more than 11,000 cancer fusions in order to build the Chimeric Protein-Protein-Interactions (ChiPPI). Mapping the influence of fusion proteins on cell metabolism and protein interaction networks reveals that chimeric protein-protein interaction (PPI) networks often lose tumor suppressor proteins, and gain onco-proteins. We compared ChiPPI networks in different cancer phenotypes, e.g. in leukemia/lymphoma, sarcoma and solid tumors finding distinct enrichment patterns for each disease type. While certain pathways are enriched in all three diseases (Wnt, Notch, TGF beta), there are distinct patterns for leukemia (EGF receptor, DNA replication, CCKR), for sarcoma (p53 pathway, CCKR), and solid tumors (FGF and EGF signaling). We validated the predicted PPI networks using high-throughput transcriptomics and proteomics methods. More than 65% of fusions were confirmed at the unique junction sites and more than 46% of PPI networks were altered in at least two data samples. Thus, ChiPPI represents a comprehensive tool for studying skewed cellular networks produced by fusion proteins in different cancer types.
|COMPUTATIONAL IDENTIFICATION OF PROTEIN BIOMARKERS TO PREDICT EXCESSIVE SCARRING
Sridevi Nagaraja, Lin Chen, Luisa Dipietro, Jaques Reifman and Alexander Mitrophanov
Department of Defense Biotechnology High Performance Computing Software Applications Institute, US
Exaggerated cutaneous scarring is a debilitating medical problem that occurs after trauma and surgical procedures. Frequently, extreme scarring leads to permanent functional loss in the scar tissue and significant disfigurement in patients. The ability to predict the scarring outcome in advance during the early stages of the wound-healing response is key for developing successful prophylactic therapeutic interventions. We sought to computationally identify prospective protein biomarkers that would enable such predictions. Using a previously developed and validated computational model that captures the kinetics of essential cell types and proteins during injury-initiated wound healing, we generated a dataset of 120,000 simulations representing distinct wound-healing scenarios. By applying a recently published, novel computational strategy that comprised data classification, protein concentration distribution analysis, and logistic regression models, we identified diagnostic and prognostic biomarkers of excessive wound scarring. Specifically, we found that increased levels of interleukin(IL)-10, tissue inhibitor of matrix metalloproteinase (TIMP)-1, and fibronectin could predict pathological scarring with an accuracy of ~80% as early as 4 weeks in advance, and with an accuracy of ~86% if the proteins are assayed 3 weeks in advance. Clinical validation of these model-predicted biomarkers may provide prognostic tools for objective, personalized clinical assessments of traumatic and surgical wounds. Disclaimer: The opinions and assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the U.S. Army or of the U.S. Department of Defense. This abstract has been approved for public release with unlimited distribution.
|DEPICTIVE : A strategy for the quantitative discovery of sources of cell-to-cell variability
Robert Vogel, Luís Santos, Jerry Chipuk, Marc Birtwistle, Gustavo Stolovitzky and Pablo Meyer
Single cell measurements have shown that populations of cells are intrinsically diverse in their biomolecular compositions, state, and responsiveness to environmental conditions. Surprisingly, genetic variability is not necessary for establishing population diversity. In fact, non-genetic sources of cell-to-cell variability (ngCCV) are a manifestation of the physical properties of the biochemical processes of cells, and consequently represent a general property of life at the single cell level. Of particular interest to the biomedical community is how this ngCCV contributes to pathway regulation and disease. To date, a quantitative framework that specifically attributes population diversity to the observed variability in biomolecular components is lacking. To this end, we developed a method for DEtermining Parameter Influence on Cell-to-cell variability through the Inference of Variance Explained, DEPICTIVE for short. Using single cell measurements, DEPICTIVE computes the contribution of each biomolecular observable to the binary response being studied. We validated our method with both simulation data and experimental measurements of TRAIL-induced apoptosis of Jurkat cells. Our method uncovered mitochondria abundance as a novel source of ngCCV that tunes the sensitivity of individual cells to TRAIL. Indeed, ngCCV that manifests as diverse sensitivities to therapeutic intervention is an important consideration for precision medicine.
|E-Cell4 : a multi-scale, multi-algorithm simulation environment with automated modeling pipelines from bioinformatics data
Kozo Nishida, Kazunari Kaizu and Koichi Takahashi
RIKEN Center for Biosystems Dynamics Research, JP
E-Cell System version 4 (E-Cell4) is a software environment that supports cellular simulations at multiple scales (spatial / nonspatial), algorithms (deterministic / stochastic), and platforms (operating systems, high performance computing resource managers, and cloud computing). E-Cell4 also has unified APIs to combine and switch multiple algorithms independent of the model. In this poster, we introduce a new feature of E-Cell4 called model annotation that can directly link a cellular model to bioinformatics databases. Data pipelines written in Python support constructing and customizing fully annotated models based on various types of databases, and model annotations allow automatic acquisition and integration of metadata. Based on these annotations of model entities (species, reactions), users can interactively access databases with Jupyter Notebook. The annotations generate useful information related to the model, e.g., description of entities, formatted equations, a summary table of parameters, and a list of publications, as a publishable document at no extra cost. Also, the rule-based model notation facilitates the natural representation and consistent integration of interactions and reactions such as protein modification and isotopic labeling. Here, a notebook for modeling and simulation of a metabolic network based on the KEGG pathway database is demonstrated. E-Cell4 is freely available at https://github.com/ecell/ecell4 .
|From empirical biomarkers to models of disease mechanisms
Maria Peña-Chilet, Cankut Cubuk, Carlos Loucera, Kinza Rian, Marta R. Hidalgo, Isabel A. Nepomuceno-Chamorro, Helena Molina-Abril and Joaquin Dopazo
Clinical Bioinformatica Area, Fundacion Progreso y Salud, Sevilla, ES
Currently, personalized medicine is based on the identification of biomarkers that mostly consist of individual mutational events. Such biomarkers have been discovered by observing statistical associations with disease progression or treatment responses. However, despite their clinical utility, the success of biomarkers is purely probabilistic, often modest, and frequently lacks any mechanistic anchoring to the fundamental cellular processes responsible for the disease or therapeutic response. Therefore, a more comprehensive, systems-based understanding of the way in which genes interact to shape the phenotype is required. Here we show how low-informative, decontextualized gene expression and gene variation data can be integrated and transformed into mechanism-based biomarkers containing higher-level information on the molecular mechanisms that determine complex phenotypes, such as disease outcome or drug response, by means of mathematical models of signaling pathway activity. Moreover, we show how these models can be used to find cancer drivers and to propose knowledge-based, personalized therapeutic interventions.
|Genome-based simulation of a whole bacterial cell.
Kazunari Kaizu, Kozo Nishida and Koichi Takahashi
Whole-cell modeling has been one of the grand challenges of the post-genomic era. However, it remains very difficult to achieve a sustainable approach to modeling and predictive simulation of a cell. Here, we present a novel framework for automatic bottom-up modeling from a genomic sequence, and for genome-scale simulation of prokaryotic cells at single-molecule and nucleotide resolution. As an example, a whole cell simulation of Escherichia coli is demonstrated. The software accepts a genomic sequence (1), automatically annotates genomic regions, e.g. operons, open reading frames and protein domains, based on various databases (2), generates a whole cell model consisting of gene expression, protein modification, metabolism, and replication (3), and simulates the stochastic agent-based model representing individual molecules and events at single-nucleotide resolution (4). It can directly evaluate the effect of mutations and synthetic genomes just by editing DNA sequences, without detailed knowledge of the mathematical model. To evaluate the predictability, the simulation enables genome-scale computational experiments (in silico omics) that are quantitatively comparable with wet experiments like RNA-Seq and ChIP-seq. Integration of bioinformatics and systems biology based on a genomic sequence enables a sustainable approach to whole cell modeling and bridges the gap between computational and experimental biology.
|Identification of common and specific mutated driver pathways in cancer
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, CN
Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step in understanding the molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge. In this study, we propose two optimization models to discover, de novo, common driver gene sets among multiple cancer types (ComMDP) and driver gene sets specific to one or multiple cancer types relative to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes, more candidate cancer genes are also found.
|KBase: An Integrated Systems Biology Knowledgebase for Predictive Biological and Environmental Research
Adam P. Arkin, Robert Cottingham, Christopher Henry, Nomi Harris, Benjamin Allen, Jason Baumohl, Shane Canon, Stephen Chan, John-Marc Chandonia, Dylan Chivian, Paramvir Dehal, Meghan Drake, Janaka N. Edirisinghe, Jose P. Faria, Uma Ganapathy, Annette Greiner, Tian Gu, James G. Jeffryes, Marcin Joachimiak, Roy Kamimura, Keith Keller, Vivek Kumar, Sunita Kumari, Miriam Land, Sean McCorkle, Arman Mikaili, Daniel Murphy-Olson, Arfath Pasha, Erik Pearson, Gavin Price, Priya Ranjan, William Riehl, Samuel M. D. Seaver, Alan Seleman, James Thomason, Doreen Ware, Shinjae Yoo, Qizh Zhang and Diane Zheng
Lawrence Berkeley National Laboratory, Berkeley, CA, US
The DOE Systems Biology Knowledgebase (KBase) is a free, open-source software and data platform that enables researchers to collaboratively generate, test, compare, and share hypotheses about biological functions; analyze their own data along with public and collaborator data; and combine experimental evidence and conclusions to model plant and microbial physiology and community dynamics. KBase currently has over 160 analysis tools (see https://narrative.kbase.us/#appcatalog) that offer diverse scientific functionality for (meta)genome assembly, contig binning, genome annotation, sequence homology analysis, tree building, comparative genomics, metabolic modeling, community modeling, gap-filling, RNA-seq processing, and expression analysis (see Figure 1). Users can build and share sophisticated workflows by chaining together multiple apps–for example, one could predict species interactions from metagenomic data by assembling raw reads, binning assembled contigs by species, annotating genomes, aligning RNA-seq reads, and reconstructing and analyzing individual and community metabolic models. Computational experiments in KBase are saved in the form of Narratives. A finished Narrative represents a complete record of everything the authors did to complete their analysis. This recording of a user’s KBase activities within a sharable Narrative is a central pillar of KBase’s support for reproducible transparent research, simplifying the re-purposing, re-application, and extension of scientific techniques.
|Leveraging chromatin accessibility for transcriptional regulatory network inference in T Helper 17 Cells
Emily R. Miraldi, Maria Pokrovskii, Aaron Watters, Dayanne M. Castro, Nicholas De Veaux, Jason A. Hall, June-Yong Lee, Maria Ciofani, Nick Carriero, Dan R. Littman and Richard Bonneau
New York University, Flatiron Institute, US
Transcriptional regulatory networks (TRNs) provide insight into cellular behavior by describing interactions between transcription factors (TFs) and their gene targets. The Assay for Transposase Accessible Chromatin (ATAC)-seq, coupled with TF motif analysis, provides indirect evidence of chromatin binding for hundreds of TFs. Here, we propose modified LASSO regression with StARS model selection for TRN inference in a mammalian setting, using ATAC-seq data to influence gene expression modeling. We rigorously test our methods in the context of T Helper Cell Type 17 (Th17) differentiation, generating new ATAC-seq data to complement existing Th17 genomic resources (plentiful gene expression data, 25 TF knock-outs and 9 TF ChIP-seq experiments). In this resource-rich mammalian setting, we undertake quantitative, genome-scale evaluation of our methods. In addition to the context-specific ATAC-seq, we evaluate generic sources of prior information, from a curated database and other cell types. We refine and extend our previous Th17 TRN, using our new TRN inference methods to integrate all Th17 data, highlighting new TFs in Th17 gene regulation. Given the popularity of ATAC-seq, which provides high resolution with low sample input requirements, our methods will improve TRN inference in new mammalian systems, especially in vivo, for cells taken directly from humans and animal models.
|Mathematical modeling of time course single cell and pooled gene expression data in cancer with CancerInSilico
Thomas Sherman, Luciane Kagohara, Raymon Cao, Raymond Cheng, Matthew Satriano, Gabriel Krigsfeld, Ruchira Ranaweera, Yong Tang, Sandra Jablonski, Genevieve Stein-O’Brien, Daria Gaykalova, Louis Weiner, Christine Chung, Cristian Tomasetti and Elana Fertig
Johns Hopkins University, US
Bioinformatics techniques to analyze time course bulk and single cell omics data are advancing. The absence of a known ground truth of the dynamics of molecular changes challenges benchmarking their performance on real data. Realistic simulated time-course datasets are essential to assess the performance of time course bioinformatics algorithms. We develop an R/Bioconductor package CancerInSilico to simulate bulk and single cell transcriptional data from a known ground truth obtained from mathematical models of cellular systems. The core model of the package is an off-lattice, cell-center Monte Carlo mathematical model for cellular growth. We adapt this model to simulate the impact of growth suppression by targeted therapeutics in cancer and benchmark simulations against bulk in vitro experimental data. Sensitivity to parameters is evaluated and used to predict the relative impact of variation in cellular growth parameters and cell types on tumor heterogeneity in therapeutic response.
|Mathematical Models of Histone Modification Propagation During the Repair of Double-Strand DNA Breaks
Gabriel Bronk, Kevin Li, James Haber and Jane Kondev
Department of Physics, Brandeis University, US
DNA constantly undergoes double-strand breaks (DSBs), which result in cell death if the DSBs are not repaired. An early step in DSB repair is the phosphorylation of H2A histones (called gamma-H2AX) around the break site, extending up to 50 kilobases from the DSB in S. cerevisiae. Kinases Mec1 and Tel1 in S. cerevisiae (ATR and ATM in mammals) are responsible for the phosphorylation of H2A and are known to bind to the DSB site. We aim to understand how these histone modifications propagate along the chromosome from the break site. We create mathematical models of several potential propagation mechanisms, in which the kinases reach distant H2As by (1) sliding along the chromosome, (2) by diffusing in 3D from the break site to the H2As, or (3) by looping of the chromatin to bring a DSB-bound kinase into contact with a distant H2A. For each model, we derive the probability of H2A phosphorylation as a function of the distance from the DSB and time since the formation of the DSB. We quantitatively compare these theories to chromatin immunoprecipitation measurements of the kinetics of H2A phosphorylation in S. cerevisiae. We find that Tel1 undergoes sliding and Mec1 likely slides as well.
|Modelling mRNA Transfection Using Diffusion Processes
Susanne Pieschner and Christiane Fuchs
Helmholtz Zentrum Muenchen, University of Bielefeld, DE
mRNA transfection is the process of introducing mRNA into a living cell. mRNA delivery is becoming increasingly interesting for biomedical applications because it enables treatment of diseases by means of targeted expression of proteins and because it is transient, avoiding the risk of permanent integration into the genome. Despite its potential in treating diseases, many parameters of mRNA transfection are still unknown. We study mRNA transfection on the single-cell level. To that end, we model its dynamics by diffusion approximations to the discrete-state processes. Several models elaborate different aspects of the system, e.g. enzymatic degradation of the mRNA or ribosomal binding to mRNA for translation. The corresponding diffusion processes are equivalently described by stochastic differential equations (SDEs). Based on data from time-lapse fluorescence microscopy, we estimate the SDE model parameters. As observations are usually only available at rather low frequency, we use a Markov chain Monte Carlo algorithm that employs Bayesian data imputation and that can also handle latent variables and measurement error. We compare our approach to a recently published one based on ordinary differential equations (ODEs) and investigate, for example, how far problems of identifiability from the ODE setting can be overcome by our SDE approach.
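As an illustration of what a diffusion approximation of such a system can look like, the sketch below simulates a hypothetical three-reaction model (mRNA degradation, translation, protein degradation) with the Euler-Maruyama scheme. The model structure, parameter values, and function name are assumptions made for this example; the abstract's actual models and its MCMC parameter estimation are not reproduced here.

```python
import math
import random

def simulate_transfection(m0, delta, k_tl, gamma, dt, n_steps, seed=0):
    """Euler-Maruyama simulation of a chemical Langevin approximation.

    Hypothetical reactions: mRNA -> 0 (rate delta * m),
    mRNA -> mRNA + protein (rate k_tl * m), protein -> 0 (rate gamma * g).
    Each reaction gets its own independent Wiener increment.
    """
    rng = random.Random(seed)
    m, g = float(m0), 0.0
    path = [(0.0, m, g)]
    for i in range(1, n_steps + 1):
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(3)]
        a_deg_m, a_tl, a_deg_g = delta * m, k_tl * m, gamma * g
        dm = -a_deg_m * dt - math.sqrt(a_deg_m) * dw[0]
        dg = ((a_tl - a_deg_g) * dt
              + math.sqrt(a_tl) * dw[1]
              - math.sqrt(a_deg_g) * dw[2])
        # Clamp at zero so that molecule numbers stay non-negative.
        m = max(m + dm, 0.0)
        g = max(g + dg, 0.0)
        path.append((i * dt, m, g))
    return path

# Hypothetical parameter values, chosen only to produce a visible time course.
trajectory = simulate_transfection(m0=100.0, delta=0.1, k_tl=2.0,
                                   gamma=0.05, dt=0.01, n_steps=500, seed=1)
print(trajectory[-1])
```

Setting the noise terms to zero recovers the corresponding ODE model, which is one way to see how the SDE formulation carries extra information in its fluctuations.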
|ModelSEED 2.0: Improving automated model reconstruction across phylogenetically diverse microbial species
Jose P. Faria, Janaka N. Edirisinghe, Filipe Liu, Samuel M. D. Seaver, James G. Jeffryes, Qizh Zhang, Pamela Weisenhorn, Boris Sadkhin, Nidhi Gupta, Tian Gu and Christopher Henry
Argonne National Laboratory, US
The ModelSEED is a leading platform for automated genome-scale metabolic reconstruction, with over 100k models constructed since its release in 2010. Here we introduce the largest ModelSEED update since its initial release. First, we are launching a new website (www.modelseed.org), which integrates functionality from the PlantSEED resource for plant model reconstruction. This new site offers improvements to the biochemistry search and model reconstruction interfaces. Additionally, prokaryotic and plant genomes may now be annotated directly on the ModelSEED site. The ModelSEED Biochemistry Database was also updated and loaded into GitHub (https://github.com/ModelSEED/ModelSEEDDatabase). This enables users to curate the existing biochemistry and submit their own additions. A major part of this update was curation of the ModelSEED template models, fixing and expanding gene–reaction mappings. Additionally, biomass compositions were extended to include new metabolites that are essential for many organisms. We also improved the ModelSEED gap-filling algorithm to restrict the addition of thermodynamically infeasible pathways. We validated our improvements by constructing new models for a diverse set of microbial genomes, and testing model accuracy in predicting growth and knockout phenotype data. These final changes also impact the ModelSEED deployments in KBase and PATRIC.
|Prediction of infection outcome by computational modeling of Yersinia enterocolitica infection
Janina Geißert, Martin Eichner, Erwin Bohn, Reihaneh Mostolizadeh, Andreas Dräger, Ingo B. Autenrieth, Sina Beier and Monika Schütz
Institut für Medizinische Mikrobiologie und Hygiene, Universitätsklinikum Tübingen, DE
The course and outcome of gastrointestinal infections depend on the complex interplay of pathogens, their virulence and fitness factors, the host immune response, and the presence and composition of the endogenous microbiome. An expansion of pathogens within the gastrointestinal tract implies an increased risk for the development of severe systemic infections, especially in patients receiving antibiotic treatment or in an immunocompromised state. We developed a computational model to predict pathogen expansion, gut colonization, and infection outcome. For implementation and challenge of the model, oral mouse infection experiments with the enteropathogen Yersinia enterocolitica (Ye) were used. Our model calculates the bacterial population dynamics during gastrointestinal infection and accounts for specific pathogen characteristics, the host immune capacity and colonization resistance mediated by the endogenous microbiome. We calibrated the model to experimental data obtained by the infection of a healthy host. Afterward, we challenged our model by adopting scenarios where either a microbiome was lacking (mimicking antibiotic treatment of patients), or where the immune response was partially impaired. Experimental mouse infections confirmed the population dynamics predicted for these scenarios. Our model provides new hypotheses about the roles of host- and pathogen-derived factors and might be useful for developing personalized infection prevention and treatment strategies.
|Stochasticity of Coagulation and Fragmentation of Self-Assembly from Exact Computed Solution of Discrete Chemical Master Equation
Farid Manuchehrfar, Wei Tian, Tom Chou and Jie Liang
University of Illinois at Chicago, US
Coagulation and fragmentation (CF) is a fundamental process in which particles attach to each other to form clusters, while existing clusters also break into smaller clusters. This is a ubiquitous process that plays significant roles in biological problems such as brain shrinkage, Alzheimer’s disease or amyloid-beta aggregation in neurodegenerative disease. CF often occurs in confined space with a limited number of particles; thus the system can be highly stochastic. A fundamental approach to investigating CF is to solve the underlying discrete Chemical Master Equation (dCME), which provides exact descriptions of the time-evolving and steady states of the CF system. Recent theoretical models based on the dCME do not simultaneously take full account of attachment, detachment, synthesis, and degradation, as well as the effects of dimensionality. We use the newly developed Accurate Chemical Master Equation (ACME) method to solve the underlying dCME of the CF process and examine the time-evolving dynamics of the CF system at different attachment, detachment, synthesis, and degradation rates. We demonstrate how these factors can have profound effects on the CF process.
|Systematically validated transcription-factor activity inference on a whole-cell scale
Cynthia Ma and Michael R. Brent
A long-standing modeling problem is to infer the activity levels of each TF in many cell samples, given the gene expression profile of each sample and a qualitative network map indicating which TFs have the potential to regulate each gene in the genome. Accurate TF-activity (TFA) inference would be useful for identifying TFs whose activity is affected by drug treatments or cancer mutations. It would also provide models for predicting the effects of knocking out or overexpressing specific combinations of TFs. We present solutions to problems that have limited the practical utility of TFA inference: (1) a method for constructing the required qualitative TF-network maps; (2) a method for exploiting samples of cells in which a TF has been genetically deleted; (3) a combination of regularization and constraints on parameters that improves both accuracy and interpretability; (4) the first application of TFA to a large collection of expression profiles, on a whole-cell scale, without prior knowledge beyond the qualitative network map. Our systematic, objective, genome-scale evaluations of inferred activity levels, using real data, show that our approach works on a genomic scale. This opens the door to meaningful comparison of TFA inference methods and to their widespread application.
|Systems modeling of phenotypic plasticity of CD4+ T cell differentiation
Bhanwar Lal Puniya, Robert Todd, Akram Mohammed, Deborah Brown, Matteo Barberis and Tomas Helikar
University of Nebraska-Lincoln, US
CD4+ T cells provide cell-mediated immunity in response to pathogens and diseases. After activation, naïve T cells differentiate into effector T helper and regulatory subtypes. These subtypes were initially thought of as terminally differentiated; however, plasticity in T cell differentiation has been observed in recent studies. In this study, we developed a logic-based computational model of the signaling pathways that govern the differentiation of naïve T cells into T helper 1, 2, 17, and induced Treg cells. We characterized the dynamic capacity of T cell differentiation in response to varying dosages of 512 extracellular cytokine combinations. In addition to the classical phenotypes, we predicted previously reported and novel complex T cell phenotypes in which multiple lineage-specifying transcription factors (TFs) co-exist. Our results suggested that plasticity in T cell differentiation is a function of both cytokine composition and dosage. We also identified the specific patterns of extracellular environments that can lead to each T cell subtype. Based on cytokine dosage, we identified the dominant stimuli that control the transition between canonical and complex phenotypes. Finally, we predicted the optimal activity of input cytokines that maximizes the activity levels of multiple lineage-specifying TFs in complex phenotypes.
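For readers unfamiliar with logic-based models, the following toy sketch shows the general idea: each node is updated by a Boolean rule, and every combination of external inputs is scanned for the resulting steady state. The two-node network, its rules, and the synchronous update scheme are hypothetical simplifications and are far smaller than the model described in the abstract.

```python
from itertools import product

def update(state, inputs):
    """Synchronous Boolean update of a hypothetical two-TF network.

    Assumed rules: each transcription factor is switched on by its cytokine
    or by its own prior activity, and is repressed by the opposing factor.
    """
    return {
        "Tbet": (inputs["IL12"] or state["Tbet"]) and not state["GATA3"],
        "GATA3": (inputs["IL4"] or state["GATA3"]) and not state["Tbet"],
    }

def steady_state(inputs, max_steps=50):
    """Iterate from the naive state; return the fixed point, or None if none is reached."""
    state = {"Tbet": False, "GATA3": False}
    for _ in range(max_steps):
        next_state = update(state, inputs)
        if next_state == state:
            return state
        state = next_state
    return None

# Scan every on/off combination of the two input cytokines.
results = {}
for il12, il4 in product([False, True], repeat=2):
    results[(il12, il4)] = steady_state({"IL12": il12, "IL4": il4})
for combo, outcome in results.items():
    print(combo, outcome)
```

In this toy network, supplying both cytokines produces an oscillation under synchronous updating instead of a fixed point, a reminder that the choice of update scheme and the use of graded dosages matter when interpreting mixed phenotypes.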
|The JSBML project: a fully featured Java API for working with systems biological models
Nicolas Rodriguez, Thomas M. Hamm, Roman Schulte, Leandro Watanabe, Ibrahim Y. Vazirabad, Victor Kofia, Chris J. Myers, Akira Funahashi, Nicolas Le Novère, Michael Hucka and Andreas Dräger
Applied Bioinformatics Group, Center for Bioinformatics Tübingen (ZBIT), University of Tübingen, DE
SBML is the most widely used data format to encode and exchange models in systems biology. The open-source JSBML project was launched in 2009 as an international collaboration aiming to provide a feature-rich Java implementation for reading, manipulating, and writing SBML files. The JSBML project is a stable, actively developed, and well-documented software project with many contributors around the world. A growing number of applications now use JSBML as their back end for data manipulation. These cover diverse use cases, including model building and graphical display, constraint-based modeling, dynamic simulation, and annotation. JSBML supports all levels, versions, and releases of SBML and provides numerous utility functions for working with this standard. JSBML also integrates well with other Java libraries for community standards. The JSBML team actively maintains and updates the project. JSBML is used in student education and numerous research projects. Major model databases, such as BioModels or BiGG Models, use JSBML-based tools for their curation pipelines. JSBML is also regularly the subject of international student coding events. JSBML can be freely obtained under the terms of the LGPL 2.1 from https://github.com/sbmlteam/jsbml/. The users’ guide at http://sbml.org/Software/JSBML/docs/ provides further information about using JSBML. Contact: email@example.com
|Topological and Dynamical Characteristics of a Large-Scale Signaling Network by Module Analysis
Cong-Doan Truong, Tien-Tung Truong and Yung-Keun Kwon
University of Ulsan, KR
It is a challenge in systems biology to unravel modular characteristics of biological networks. In this paper, we examined a large-scale signaling network consisting of 5443 genes and 37663 interactions to discover topological or dynamical characteristics of the modules in the network. Using a module detection algorithm, we identified 17 modules in the network and classified them into two groups, Group 1 and Group 2, according to whether the module size is larger or smaller than the average value, respectively. First, we compared the proportions of important genes, such as disease, drug-target, and essential genes, between the two module groups, and found that they were significantly higher in Group 1 than in Group 2. Second, we computed five different centrality measures of the modules and observed that Group 1 is more central than Group 2. Third, we examined module-based robustness and found that the in-module robustness of Group 1 was slightly higher than that of Group 2, whereas the out-module robustness of Group 1 was smaller than that of Group 2. Finally, Groups 1 and 2 were enriched for different gene ontology terms. Taken together, the modular structure can be a useful property for understanding topological and dynamical characteristics of a large-scale signaling network.
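The grouping step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the module assignment is taken as given (the abstract does not name the detection algorithm), modules whose size equals the average are placed in Group 2, and the function names `group_modules_by_size` and `fraction_flagged` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def group_modules_by_size(module_of):
    """Split modules into Group 1 (size above the mean) and Group 2 (the rest).

    module_of maps each gene to its module label.
    """
    members = defaultdict(set)
    for gene, module in module_of.items():
        members[module].add(gene)
    avg = mean(len(genes) for genes in members.values())
    group1 = {m: g for m, g in members.items() if len(g) > avg}
    group2 = {m: g for m, g in members.items() if len(g) <= avg}
    return group1, group2

def fraction_flagged(group, flagged):
    """Fraction of the genes in a group of modules that are in `flagged`
    (for example, a set of essential or drug-target genes)."""
    genes = set().union(*group.values()) if group else set()
    return len(genes & flagged) / len(genes) if genes else 0.0
```

Comparing `fraction_flagged` between the two groups gives the kind of proportion the abstract reports; deciding whether a difference is significant would need a statistical test that this sketch leaves out.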
|Tracking and Engineering the Evolution of Organismal Fitness via Multi-Organism mRNA Translation Whole Cell Simulations
Hadas Zur, Rachel Cohen-Kupiec and Tamir Tuller
Tel Aviv University, IL
We report the first whole-cell translation simulations of several organisms, to understand their evolution. The models consider all fundamental biophysical aspects of translation dynamics and are based on parameters estimated from experimental data. We developed tools, such as ancestral parameter reconstruction, for comparing sets of whole-cell translation models and for understanding transcriptome evolution by connecting genotypes to phenotypes (translation biophysics). Among other results, we show that in S. cerevisiae our model explains 49% of the variability in the experimental data, with elongation explaining 23%. Via analyses of the inferred changes in genomic nucleotide composition and in biophysical aspects of translation, we demonstrate how various known and previously unknown patterns (e.g. sequence and structural motifs) in coding regions improve organism fitness. Based on these models, we developed a novel generic approach for improving fitness by introducing silent mutations that eliminate ribosomal traffic jams. This frees up resources, promoting improved fitness and growth rate. The approach has already been applied to S. cerevisiae using the CRISPR-Cas9 genome-editing tool, where we show that introducing silent mutations into two genes can increase the growth rate by 3-5%. The approach can be used for improving the fitness of any organism used in biotechnology, medicine, and agriculture.
|Using Bayesian Optimization to Learn Parameters of Molecular Self Assembly Systems
Sushant Patkar, Marcus Thomas, Russell Schwartz and Roded Sharan
Tel Aviv University, IL
The spontaneous self-assembly of molecules into functional complexes is central to all major cellular processes, yet self-assembly chemistry has only slowly been incorporated into systems biology modeling. This in large part results from substantial computational and experimental challenges to self-assembly modeling, simulation, and model inference compared to simpler enzymatic and transport networks. Rule-based stochastic simulation has provided a way to model and tractably simulate even highly complicated self-assembly reaction networks and to learn model parameters from experimental data via simulation-based model fitting. Nonetheless, large parameter spaces, high computational cost of simulations, and limited experimental data have so far precluded the use of Bayesian methods for characterizing uncertainty of model fits, which have become the standard for most other systems biology model inference. In the present work, we improve on prior data-driven model inference for self-assembly systems in two directions: 1) extending data-fitting to encompass small-angle scattering (SAS), a richer experimental data source than the static light scattering (SLS) used in prior work, and 2) developing an efficient Bayesian optimization framework by learning Gaussian process models as surrogate functions to capture model uncertainty. We demonstrate and validate the approach on synthetic SAS data for a virus capsid assembly model.
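The general idea of Bayesian optimization with a Gaussian process surrogate, as mentioned in the abstract above, can be sketched in a few lines. This is a minimal one-parameter illustration, not the authors' framework: the squared-exponential kernel, the lower-confidence-bound acquisition rule, the grid search over candidates, and all function names are assumptions. In the setting of the abstract, `loss` would stand in for the mismatch between simulated and measured scattering curves, which is expensive to evaluate.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)
    return mu, np.sqrt(np.clip(var, 0.0, None))

def bayes_opt(loss, lo, hi, n_init=3, n_iter=12, kappa=2.0):
    """Minimize `loss` on [lo, hi] using a lower-confidence-bound acquisition."""
    grid = np.linspace(lo, hi, 201)          # candidate parameter values
    x = np.linspace(lo, hi, n_init)          # initial design
    y = np.array([loss(v) for v in x])
    for _ in range(n_iter):
        # Fit the surrogate to centered losses, then pick the candidate
        # with the most optimistic (lowest) plausible loss.
        mu, sd = gp_posterior(x, y - y.mean(), grid)
        nxt = grid[np.argmin(mu + y.mean() - kappa * sd)]
        x = np.append(x, nxt)
        y = np.append(y, loss(nxt))
    return x[np.argmin(y)], y.min()
```

A call such as `bayes_opt(lambda v: (v - 0.3) ** 2, 0.0, 1.0)` returns the best parameter value found and its loss. The surrogate's standard deviation is what lets the search trade off exploring uncertain regions against refining promising ones, so far fewer simulations are needed than with an exhaustive sweep.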
Thursday, February 1, 2018: Abstract submission opens
Thursday, April 5, 2018: Abstract submission deadline
Friday, April 6, 2018: Late poster submission opens
Tuesday, May 1, 2018: Late poster submission deadline
Thursday, May 3, 2018: Abstract acceptance notification
Tuesday, May 22, 2018: Late poster acceptance notification
Friday-Tuesday, July 6-10, 2018: ISMB conference
Saturday, July 7, 2018: SysMod meeting
We are currently accepting abstracts for contributed oral presentations and posters. Abstracts should briefly (approximately 200 words) summarize the background/motivation, methods, results, and conclusions of your study.
We intend to provide a small number of travel scholarships to students and postdocs.
For more information, please contact the SysMod coordinators.