|
A Blueprint for Human Whole-cell Modeling
Arthur Goldberg, Balázs Szigeti, Yosef Roth, John Sekar, Saahith Pochiraju and Jonathan Karr
Icahn School of Medicine at Mount Sinai, US
Whole-cell (WC) computational models of human cells are a central goal of systems biology. WC models could help researchers understand cell biology and help physicians treat disease. Ongoing technological advances in experimentation and modeling are enhancing the feasibility of WC models. However, progress toward WC models remains slow. To identify the bottlenecks to WC modeling and develop a long-term plan to achieve human WC models, we surveyed the biomodeling community, reviewed the literature, and reflected on our experience prototyping WC models of bacteria. We identified four major bottlenecks: a) inadequate experimental methods and data repositories; b) inadequate tools for designing, describing, simulating, calibrating, and validating large models; c) few models of individual processes that can be combined into WC models; and d) insufficient coordination within the biomodeling community. Further, we propose a project, termed the Human Whole-Cell Modeling Project, which would overcome these bottlenecks and achieve the first human WC models. The cornerstones of the project include developing computational technologies for scalably building and simulating models, developing standard protocols and formats for collaborative modeling, collaboratively building models as a community, and focusing on a single cell line. We invite the community to join this exciting and ambitious effort. |
|
A cell proliferation model of human acute myeloid leukemia xenograft
Marco S. Nobile, Thalia Vlachou, Simone Spolaor, Daniela Bossi, Paolo Cazzaniga, Luisa Lanfrancone, Daniela Besozzi, Pier Giuseppe Pelicci and Giancarlo Mauri
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, IT
Acute myeloid leukemia is one of the most common hematological malignancies, characterized by high relapse and mortality rates: its inherent intra-tumor heterogeneity between subpopulations of cells is thought to play an important role in disease recurrence and resistance to chemotherapy. Current experimental methods are often insufficient to quantify and assess the dynamics of these subpopulations. To overcome this limitation, we introduce a novel modeling and simulation framework that takes into account the inherent stochasticity of cell division events to investigate the possible occurrence of different subpopulations of cell types in acute myeloid leukemia, notably leveraging experimental data derived from human xenografts in mice. Our results highlight the role played by quiescent cells, as well as proliferating cells characterized by different rates of division, in the progression and evolution of the disease, hinting at the necessity to further characterize tumor cell subpopulations. |
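The idea of treating cell division as a stochastic event in several subpopulations can be illustrated with a minimal sketch. It assumes a Gillespie-style pure-birth process; the three subpopulation names, the rates, and the function `simulate_divisions` are hypothetical and are not taken from the authors' framework.

```python
import random

def simulate_divisions(counts, rates, t_end, seed=0):
    """Gillespie-style simulation of independent cell divisions.

    counts: dict subpopulation -> initial number of cells
    rates:  dict subpopulation -> per-cell division rate (per day); 0 = quiescent
    Returns the cell counts at time t_end.
    All names and values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = dict(counts)
    t = 0.0
    while True:
        # Propensity of the next division occurring in each subpopulation.
        props = {name: rates[name] * counts[name] for name in counts}
        total = sum(props.values())
        if total == 0.0:
            break  # only quiescent cells are present
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            break
        # Pick the dividing subpopulation with probability proportional
        # to its propensity; quiescent cells (propensity 0) never divide.
        r = rng.uniform(0.0, total)
        acc = 0.0
        for name, prop in props.items():
            acc += prop
            if prop > 0.0 and r <= acc:
                counts[name] += 1
                break
    return counts

initial = {"quiescent": 50, "slow": 50, "fast": 50}
division_rates = {"quiescent": 0.0, "slow": 0.1, "fast": 0.5}
final = simulate_divisions(initial, division_rates, t_end=5.0)
print(final)
```

Repeating the run with different seeds gives a distribution of subpopulation sizes, which is the kind of quantity one would compare against xenograft data.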
|
A comprehensive community metabolic model of maize root system
Ab Rauf Shah, Bailee Lichter, Camila Pereira Braga, Dongdong Zhang, Bhanwar Lal Puniya, Edgar B. Cahoon, Jiri Adamec and Tomas Helikar
University of Nebraska-Lincoln, US
Roots play an important role in the absorption of water and minerals, regulation of metabolism, and overall plant growth and maintenance of homeostasis. Furthermore, roots form complex interactions with their soil microbial community. Despite its importance, root metabolism remains largely unexplored. Given the complex nature of the root system within the plant and its soil environment, a multiscale model of root metabolism and the soil microbiome community has the potential to better characterize the relationship between genotype and phenotype, and predict new interventions for crop improvement. Herein we describe a metabolic model of maize root that was developed using public transcriptomic data from 18 maize root tissues and previous maize models. This model consists of 4,917 reactions associated with 5,637 genes. To characterize interactions between maize root and soil microbes, we also developed metabolic models of seven microbes that are in a symbiotic relationship with maize. We are in the process of integrating these models into a multiscale model that will be able to describe the dynamic interactions between maize root and soil microbes. This model will be used to understand the root-microbe interplay regulating maize metabolic responses, and to predict novel pathways associated with maize root exudate and rhizobiome interactions. |
|
A natural language dialogue system for model-driven explanation of biological experiments
Benjamin M. Gyori, John Bachman, Patrick Greene, Funda Durupinar, Xue Zhang, James Allen, Lucian Galescu, Choh Man Teng, Ian Perera, William de Beaumont, Mark Burstein, Scott Friedman, Jeffrey Rye, Emek Demir, Brent Cochran and Peter Sorger
Harvard University, US
Finding sets of mechanisms that can explain observations in biological experiments (e.g., “How does treatment with SB431542 decrease the amount of SMURF2?”) generally involves a laborious process of information gathering and the construction and testing of a hypothesis in the form of a model. We present a system in which a user interacts with a computer partner through open-ended two-way English language dialogue to collect information on relevant mechanisms, and to construct and test a mechanistic model serving as an explanation of observations of interest. The integrated system combines natural language understanding, dialogue management, and the recognition of the user’s goals with the planning and execution of the capabilities of a variety of biological reasoning agents (Bioagents). Bioagents report on relationships between drugs, their targets, and associations with disease; transcription factors and their targets; and mechanistic paths between proteins in pathway databases and networks assembled from literature mining. Bioagents also interface with automated model assembly (INDRA) and simulation systems (Kappa, BioNetGen) to allow incremental model building and queries with respect to a model in the course of the dialogue. The dialogue system is embedded in a web-based interface which allows multi-modal visualization of the model being discussed. |
|
A Predictive Machine Learning Framework to Assess the Transcriptomic Response of Species to Environmental Toxins
Siavash Nazari and Mehrdad Hajibabaei
University of Guelph, CA
Traditional ecotoxicological methods do not detect the presence of hazardous chemicals until significant environmental damage has already been done. Motivated by the idea that essential pathways and genes are conserved across taxa, we present a framework that predicts the shared transcriptomic response of species to toxins. This framework mines chemical-gene interaction data from publicly available databases such as the Comparative Toxicology Database, retrieving an initial set of affected genes. Next, it obtains the corresponding translated protein sequences and expands the dataset with their orthologs. It then clusters these protein sequences to obtain orthogroups. Lastly, for a number of pathways with the most genes present, the pipeline learns a Bayesian network to derive meaningful correlations between shared pathways. We ran the pipeline on a test-case dataset consisting of a group of phylogenetically distant taxa, namely Mus musculus, Danio rerio, Drosophila melanogaster and Caenorhabditis elegans, with heavy metals and dioxins as input chemical groups. A total of 42802 affected genes were retrieved for heavy metals and dioxins. Based on the retrieved protein sequences, they were clustered into 9606 orthogroups in 125 pathways. The highly affected pathways suggested by our pipeline correspond with the findings of previous studies in the literature. |
|
Assessing temporal network signaling entropy dynamics of tissue injury
Ankit Jambusaria, Zhigang Hong, Yang Dai, Asrar Malik and Jalees Rehman
University of Illinois at Chicago, US
A key challenge in systems biology is the elucidation of the underlying molecular pathways which regulate cellular phenotype during time series experiments in which biological tissues or cells are exposed to stressors. It is possible that there are undiscovered interactions between genes or proteins in response to a given stressor which may only be identified by a novel approach to assess gene regulatory network dynamics. We propose a novel framework based on statistical mechanical principles for systems analysis and interpretation of molecular omics data. Specifically, we propose the notion of network signaling entropy (or uncertainty) as a means of elucidating novel interactions which will provide insights into underlying basic biology, disease and repair mechanisms. We describe the power of assessing network signaling entropy to discriminate cells according to their distinct states of injury or repair during a time series transcriptomic analysis. Our analyses suggest that network signaling entropy decreases in response to inflammatory stimulation, suggesting that entropy can be used to identify novel regulatory elements mediating inflammatory injury and post-injury repair. We thus propose network signaling entropy as a powerful approach for understanding signaling promiscuity during tissue injury, repair, and regeneration. |
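The abstract does not define network signaling entropy. As a sketch of the general idea, the snippet below assumes one common formulation: the normalized Shannon entropy of a node's interaction weights, treated as signaling probabilities. The function name, the node labels, and the weights are hypothetical.

```python
import math

def local_signaling_entropy(weights):
    """Normalized Shannon entropy of one node's interaction weights.

    weights: dict neighbor -> non-negative interaction weight
             (for example, derived from expression of both partners).
    Returns a value in [0, 1]; 1 means signaling is spread evenly over
    all neighbors (promiscuous), 0 means it is fully committed to one.
    This definition is an assumption, not the authors' formula.
    """
    total = sum(weights.values())
    degree = len(weights)
    if degree < 2 or total <= 0:
        return 0.0
    entropy = 0.0
    for w in weights.values():
        if w > 0:
            p = w / total  # probability of signaling to this neighbor
            entropy -= p * math.log(p)
    return entropy / math.log(degree)  # divide by the maximum entropy

even = local_signaling_entropy({"B": 1.0, "C": 1.0, "D": 1.0, "E": 1.0})
focused = local_signaling_entropy({"B": 10.0, "C": 0.1, "D": 0.1, "E": 0.1})
print(even, focused)
```

Under this definition, a drop in entropy over a time series corresponds to signaling becoming concentrated on fewer interaction partners.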
|
Cellular phenotype switching and effect of gene duplication through accurate computation of probability velocity and flux fields
Anna Terebus, Chun Liu and Jie Liang
University of Illinois at Chicago, US
Biochemical reaction networks are often stochastic because of the different time scales of reactions and the often low copy numbers of participating molecular species. The discrete Chemical Master Equation provides a fundamental framework for studying their time-evolving and steady-state probability landscapes. Vector fields of probability velocity and flux can further characterize the time-varying and non-equilibrium steady-state properties of these systems. Here we describe a general approach to analyzing the global flow map of probability mass in all directions of all molecular species. It takes into full account the discreteness of both states and jump reactions, and provides an exact quantification of the vector fields along the boundaries of the state space dictated by the reaction network. We apply this approach to study the toggle switch network, in which the reactions of transcription and translation are both explicitly modeled. We describe the mechanism of the transitions between important cellular states, and examine how duplication of genes in the toggle switch affects the non-equilibrium dynamics of transitions between them. We explore changes in the dynamics of the non-equilibrium probability landscape and the appearance of new cellular states, as well as changes in their locations. |
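To make the notion of probability flux on a discrete state space concrete, here is a minimal sketch for a far simpler system than the toggle switch: a single species with constant synthesis and first-order degradation, on a truncated state space. The rate constants, the truncation at 30 copies, and the explicit Euler integrator are all assumptions chosen for illustration; this is not the authors' method.

```python
def cme_step(p, k_syn, k_deg, dt):
    """One explicit Euler step of the birth-death master equation.

    p[n] is the probability of having n molecules (states 0..N).
    Returns the updated distribution and the net probability flux
    across each edge n -> n+1 (synthesis minus degradation).
    """
    n_max = len(p) - 1
    flux = [k_syn * p[n] - k_deg * (n + 1) * p[n + 1] for n in range(n_max)]
    new_p = list(p)
    for n, j in enumerate(flux):
        new_p[n] -= dt * j      # probability leaving state n
        new_p[n + 1] += dt * j  # the same amount arriving in state n+1
    return new_p, flux

n_max = 30                       # truncation of the state space (assumed)
p = [1.0] + [0.0] * n_max        # start with zero molecules
flux = []
for _ in range(2000):            # integrate to t = 20 with dt = 0.01
    p, flux = cme_step(p, k_syn=5.0, k_deg=1.0, dt=0.01)

mean_copy_number = sum(n * pn for n, pn in enumerate(p))
largest_flux = max(abs(j) for j in flux)
print(mean_copy_number, largest_flux)
```

For this one-dimensional chain the net flux vanishes at steady state (detailed balance), and the mean approaches k_syn / k_deg. Networks with cycles in state space, such as the toggle switch, can sustain non-zero steady-state fluxes, which is what makes the flux field informative there.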
|
Chimeric Protein-Protein Interaction Networks Reveal Alterations in Cancer-Specific Phenotypes
Milana Frenkel-Morgenstern, Alessandro Gorohovski, Somnath Tagore, Vaishnovi Sekar, Miguel Vazquez and Alfonso Valencia
Barcelona Supercomputing Centre (BSC), ES
Chimeric proteins, comprising peptides deriving from the translation of two parental genes, are produced in cancers by chromosomal aberrations. Considering discrete protein domains as binding sites for specific domains of interacting proteins, we have catalogued the protein interaction networks for more than 11,000 cancer fusions in order to build the Chimeric Protein-Protein-Interactions (ChiPPI). Mapping the influence of fusion proteins on cell metabolism and protein interaction networks reveals that chimeric protein-protein interaction (PPI) networks often lose tumor suppressor proteins and gain onco-proteins. We compared ChiPPI networks in different cancer phenotypes, e.g., in leukemia/lymphoma, sarcoma, and solid tumors, finding distinct enrichment patterns for each disease type. While certain pathways are enriched in all three diseases (Wnt, Notch, TGF beta), there are distinct patterns for leukemia (EGF receptor, DNA replication, CCKR), for sarcoma (p53 pathway, CCKR), and for solid tumors (FGF and EGF signaling). We validated the predicted PPI networks using high-throughput transcriptomics and proteomics methods. More than 65% of fusions were confirmed at the unique junction sites and more than 46% of PPI networks were altered in at least two data samples. Thus, ChiPPI represents a comprehensive tool for studying skewed cellular networks produced by fusion proteins in different cancer types. |
|
COMPUTATIONAL IDENTIFICATION OF PROTEIN BIOMARKERS TO PREDICT EXCESSIVE SCARRING
Sridevi Nagaraja, Lin Chen, Luisa Dipietro, Jaques Reifman and Alexander Mitrophanov
Department of Defense Biotechnology High Performance Computing Software Applications Institute, US
Exaggerated cutaneous scarring is a debilitating medical problem that occurs after trauma and surgical procedures. Frequently, extreme scarring leads to permanent functional loss in the scar tissue and significant disfigurement in patients. The ability to predict the scarring outcome in advance during the early stages of the wound-healing response is key for developing successful prophylactic therapeutic interventions. We sought to computationally identify prospective protein biomarkers that would enable such predictions. Using a previously developed and validated computational model that captures the kinetics of essential cell types and proteins during injury-initiated wound healing, we generated a dataset of 120,000 simulations representing distinct wound-healing scenarios. By applying a recently published, novel computational strategy that comprised data classification, protein concentration distribution analysis, and logistic regression models, we identified diagnostic and prognostic biomarkers of excessive wound scarring. Specifically, we found that increased levels of interleukin (IL)-10, tissue inhibitor of matrix metalloproteinase (TIMP)-1, and fibronectin could predict pathological scarring with an accuracy of ~80% as early as 4 weeks in advance, and with an accuracy of ~86% if the proteins are assayed 3 weeks in advance. Clinical validation of these model-predicted biomarkers may provide prognostic tools for objective, personalized clinical assessments of traumatic and surgical wounds. Disclaimer: The opinions and assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the U.S. Army or of the U.S. Department of Defense. This abstract has been approved for public release with unlimited distribution. |
|
DEPICTIVE : A strategy for the quantitative discovery of sources of cell-to-cell variability
Robert Vogel, Luís Santos, Jerry Chipuk, Marc Birtwistle, Gustavo Stolovitzky and Pablo Meyer
IBM, US
Single cell measurements have shown that populations of cells are intrinsically diverse in their biomolecular compositions, state, and responsiveness to environmental conditions. Surprisingly, genetic variability is not necessary for establishing population diversity. In fact, non-genetic sources of cell-to-cell variability (ngCCV) are a manifestation of the physical properties of the biochemical processes of cells, and consequently represent a general property of life at the single cell level. Of particular interest to the biomedical community is how this ngCCV contributes to pathway regulation and disease. To date, a quantitative framework that specifically attributes population diversity to the observed variability in biomolecular components is lacking. To this end, we developed a method for DEtermining Parameter Influence on Cell-to-cell variability through the Inference of Variance Explained, DEPICTIVE for short. Using single cell measurements, DEPICTIVE computes the contribution of each biomolecular observable to the binary response being studied. We validated our method with both simulation data and experimental measurements of TRAIL-induced apoptosis of Jurkat cells. Our method uncovered mitochondria abundance as a novel source of ngCCV that tunes the sensitivity of individual cells to TRAIL. Indeed, ngCCV that manifests as diverse sensitivities to therapeutic intervention is an important consideration for precision medicine. |
|
E-Cell4 : a multi-scale, multi-algorithm simulation environment with automated modeling pipelines from bioinformatics data
Kozo Nishida, Kazunari Kaizu and Koichi Takahashi
RIKEN Center for Biosystems Dynamics Research, JP
E-Cell System version 4 (E-Cell4) is a software environment that supports cellular simulations at multiple scales (spatial / nonspatial), with multiple algorithms (deterministic / stochastic), and on multiple platforms (operating systems, high performance computing resource managers, and cloud computing). E-Cell4 also has unified APIs to combine and switch multiple algorithms independently of the model. In this poster, we introduce a new feature of E-Cell4 called model annotation that can directly link a cellular model to bioinformatics databases. Data pipelines in Python support constructing and customizing fully annotated models based on various types of databases, and model annotations allow automatic acquisition and integration of metadata. Based on these annotations of model entities (species, reactions), users can interactively access databases with Jupyter Notebook. The annotations generate useful information related to the model, e.g., descriptions of entities, formatted equations, a summary table of parameters, and a list of publications, as a publishable document at no extra cost. Also, the rule-based model notation facilitates the natural representation and consistent integration of interactions and reactions such as protein modification and isotopic labeling. Here, a notebook for modeling and simulation of a metabolic network based on the KEGG pathway database is demonstrated. E-Cell4 is freely available at https://github.com/ecell/ecell4. |
|
From empirical biomarkers to models of disease mechanisms
Maria Peña-Chilet, Cankut Cubuk, Carlos Loucera, Kinza Rian, Marta R. Hidalgo, Isabel A. Nepomuceno-Chamorro, Helena Molina-Abril and Joaquin Dopazo
Clinical Bioinformatics Area, Fundacion Progreso y Salud, Sevilla, ES
Currently, personalized medicine is based on the identification of biomarkers that mostly consist of individual mutational events. Such biomarkers have been discovered by observing statistical associations with disease progression or treatment responses. However, despite their clinical utility, the success of biomarkers is purely probabilistic, often modest, and frequently lacks any mechanistic anchoring to the fundamental cellular processes responsible for the disease or therapeutic response. Therefore, a more comprehensive, systems-based understanding of the way in which genes interact to shape the phenotype is required. Here we show how low-informative, decontextualized gene expression and gene variation data can be integrated and transformed into mechanism-based biomarkers containing higher-level information on the molecular mechanisms that determine complex phenotypes, such as disease outcome or drug response, by means of mathematical models of signaling pathway activity. Moreover, we show how these models can be used to find cancer drivers and to propose knowledge-based, personalized therapeutic interventions. |
|
Genome-based simulation of a whole bacterial cell.
Kazunari Kaizu, Kozo Nishida and Koichi Takahashi
RIKEN, JP
Whole-cell modeling has been one of the grand challenges of the post-genomic era. However, it remains very difficult to achieve a sustainable way of modeling and predictive simulation of a cell. Here, we present a novel framework for automatic bottom-up modeling from a genomic sequence, and for genome-scale simulation of prokaryotic cells at single-molecule and nucleotide resolution. As an example, a whole-cell simulation of Escherichia coli is demonstrated. The software accepts a genomic sequence (1), automatically annotates genomic regions, e.g. operons, open reading frames and protein domains, based on various databases (2), generates a whole-cell model consisting of gene expression, protein modification, metabolism, and replication (3), and simulates the stochastic agent-based model representing individual molecules and events at single-nucleotide resolution (4). It can directly evaluate the effect of mutations and synthetic genomes just by editing DNA sequences, without detailed knowledge of the mathematical model. To evaluate predictive power, the simulation enables genome-scale computational experiments (in silico omics) that are quantitatively comparable with wet experiments such as RNA-Seq and ChIP-seq. Integration of bioinformatics and systems biology based on a genomic sequence enables a sustainable approach to whole-cell modeling and bridges the gap between computational and experimental biology. |
|
Identification of common and specific mutated driver pathways in cancer
Junhua Zhang
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, CN
Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step in understanding the molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge. In this study, we propose two optimization models for de novo discovery of common driver gene sets among multiple cancer types (ComMDP) and of driver gene sets specific to one or multiple cancer types relative to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes, more candidate cancer genes are also found. |
|
KBase: An Integrated Systems Biology Knowledgebase for Predictive Biological and Environmental Research
Adam P. Arkin, Robert Cottingham, Christopher Henry, Nomi Harris, Benjamin Allen, Jason Baumohl, Shane Canon, Stephen Chan, John-Marc Chandonia, Dylan Chivian, Paramvir Dehal, Meghan Drake, Janaka N. Edirisinghe, Jose P. Faria, Uma Ganapathy, Annette Greiner, Tian Gu, James G. Jeffryes, Marcin Joachimiak, Roy Kamimura, Keith Keller, Vivek Kumar, Sunita Kumari, Miriam Land, Sean McCorkle, Arman Mikaili, Daniel Murphy-Olson, Arfath Pasha, Erik Pearson, Gavin Price, Priya Ranjan, William Riehl, Samuel M. D. Seaver, Alan Seleman, James Thomason, Doreen Ware, Shinjae Yoo, Qizh Zhang and Diane Zheng
Lawrence Berkeley National Laboratory, Berkeley, CA, US
The DOE Systems Biology Knowledgebase (KBase) is a free, open-source software and data platform that enables researchers to collaboratively generate, test, compare, and share hypotheses about biological functions; analyze their own data along with public and collaborator data; and combine experimental evidence and conclusions to model plant and microbial physiology and community dynamics. KBase currently has over 160 analysis tools (see https://narrative.kbase.us/#appcatalog) that offer diverse scientific functionality for (meta)genome assembly, contig binning, genome annotation, sequence homology analysis, tree building, comparative genomics, metabolic modeling, community modeling, gap-filling, RNA-seq processing, and expression analysis (see Figure 1). Users can build and share sophisticated workflows by chaining together multiple apps; for example, one could predict species interactions from metagenomic data by assembling raw reads, binning assembled contigs by species, annotating genomes, aligning RNA-seq reads, and reconstructing and analyzing individual and community metabolic models. Computational experiments in KBase are saved in the form of Narratives. A finished Narrative represents a complete record of everything the authors did to complete their analysis. This recording of a user’s KBase activities within a sharable Narrative is a central pillar of KBase’s support for reproducible, transparent research, simplifying the re-purposing, re-application, and extension of scientific techniques. |
|
Leveraging chromatin accessibility for transcriptional regulatory network inference in T Helper 17 Cells
Emily R. Miraldi, Maria Pokrovskii, Aaron Watters, Dayanne M. Castro, Nicholas De Veaux, Jason A. Hall, June-Yong Lee, Maria Ciofani, Nick Carriero, Dan R. Littman and Richard Bonneau
New York University, Flatiron Institute, US
Transcriptional regulatory networks (TRNs) provide insight into cellular behavior by describing interactions between transcription factors (TFs) and their gene targets. The Assay for Transposase Accessible Chromatin (ATAC)-seq, coupled with TF motif analysis, provides indirect evidence of chromatin binding for hundreds of TFs. Here, we propose modified LASSO regression with StARS model selection for TRN inference in a mammalian setting, using ATAC-seq data to influence gene expression modeling. We rigorously test our methods in the context of T Helper Cell Type 17 (Th17) differentiation, generating new ATAC-seq data to complement existing Th17 genomic resources (plentiful gene expression data, 25 TF knock-outs and 9 TF ChIP-seq experiments). In this resource-rich mammalian setting, we undertake quantitative, genome-scale evaluation of our methods. In addition to the context-specific ATAC-seq, we evaluate generic sources of prior information, from a curated database and other cell types. We refine and extend our previous Th17 TRN, using our new TRN inference methods to integrate all Th17 data, highlighting new TFs in Th17 gene regulation. Given the popularity of ATAC-seq, which provides high resolution with low sample input requirements, our methods will improve TRN inference in new mammalian systems, especially in vivo, for cells taken directly from humans and animal models. |
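The abstract does not spell out how the LASSO is modified. One plausible way for prior evidence such as ATAC-seq motif hits to influence the regression is to shrink the L1 penalty on TF-gene edges that the prior supports. The sketch below implements that idea with plain coordinate descent; the function names, the toy data, and the penalty weights are hypothetical, and StARS model selection is not shown.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, lam, weights, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam * sum_j weights[j] * |b_j|.

    X: list of samples, each a list of TF activities (columns = TFs)
    y: list of target-gene expression values
    weights[j] < 1 lowers the penalty for TFs supported by the prior.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of column j with the residual that excludes j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam * weights[j]) / col_sq[j]
    return beta

# Toy data: two orthogonal TF activity profiles contribute equally to y.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [1.0, 0.0, 0.0, -1.0]
beta_plain = weighted_lasso(X, y, lam=0.6, weights=[1.0, 1.0])
beta_prior = weighted_lasso(X, y, lam=0.6, weights=[0.5, 1.0])
print(beta_plain, beta_prior)
```

With a uniform penalty both coefficients are shrunk to zero, while halving the penalty on the first TF (as if the prior supported it) lets that edge enter the model.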
|
Mathematical modeling of time course single cell and pooled gene expression data in cancer with CancerInSilico
Thomas Sherman, Luciane Kagohara, Raymon Cao, Raymond Cheng, Matthew Satriano, Gabriel Krigsfeld, Ruchira Ranaweera, Yong Tang, Sandra Jablonski, Genevieve Stein-O’Brien, Daria Gaykalova, Louis Weiner, Christine Chung, Cristian Tomasetti and Elana Fertig
Johns Hopkins University, US
Bioinformatics techniques to analyze time course bulk and single cell omics data are advancing. The absence of a known ground truth of the dynamics of molecular changes challenges benchmarking their performance on real data. Realistic simulated time-course datasets are essential to assess the performance of time course bioinformatics algorithms. We develop an R/Bioconductor package CancerInSilico to simulate bulk and single cell transcriptional data from a known ground truth obtained from mathematical models of cellular systems. The core model of the package is an off-lattice, cell-center Monte Carlo mathematical model for cellular growth. We adapt this model to simulate the impact of growth suppression by targeted therapeutics in cancer and benchmark simulations against bulk in vitro experimental data. Sensitivity to parameters is evaluated and used to predict the relative impact of variation in cellular growth parameters and cell types on tumor heterogeneity in therapeutic response. |
|
Mathematical Models of Histone Modification Propagation During the Repair of Double-Strand DNA Breaks
Gabriel Bronk, Kevin Li, James Haber and Jane Kondev
Department of Physics, Brandeis University, US
DNA constantly undergoes double-strand breaks (DSBs), which result in cell death if the DSBs are not repaired. An early step in DSB repair is the phosphorylation of H2A histones (called gamma-H2AX) around the break site, extending up to 50 kilobases from the DSB in S. cerevisiae. Kinases Mec1 and Tel1 in S. cerevisiae (ATR and ATM in mammals) are responsible for the phosphorylation of H2A and are known to bind to the DSB site. We aim to understand how these histone modifications propagate along the chromosome from the break site. We create mathematical models of several potential propagation mechanisms, in which the kinases reach distant H2As by (1) sliding along the chromosome, (2) diffusing in 3D from the break site to the H2As, or (3) looping of the chromatin to bring a DSB-bound kinase into contact with a distant H2A. For each model, we derive the probability of H2A phosphorylation as a function of the distance from the DSB and the time since the formation of the DSB. We quantitatively compare these theories to chromatin immunoprecipitation measurements of the kinetics of H2A phosphorylation in S. cerevisiae. We find that Tel1 undergoes sliding and Mec1 likely slides as well. |
|
Modelling mRNA Transfection Using Diffusion Processes
Susanne Pieschner and Christiane Fuchs
Helmholtz Zentrum Muenchen, University of Bielefeld, DE
mRNA transfection is the process of introducing mRNA into a living cell. mRNA delivery is becoming increasingly interesting for biomedical applications because it enables treatment of diseases by means of targeted expression of proteins and because it is transient, avoiding the risk of permanent integration into the genome. Despite its potential in treating diseases, many parameters of mRNA transfection are still unknown. We study mRNA transfection at the single-cell level. To that end, we model its dynamics by diffusion approximations to the discrete-state processes. Several models elaborate different aspects of the system, e.g. enzymatic degradation of the mRNA or ribosomal binding to mRNA for translation. The corresponding diffusion processes are equivalently described by stochastic differential equations (SDEs). Based on data from time-lapse fluorescence microscopy, we estimate the SDE model parameters. As observations are usually only available at rather low frequency, we use a Markov chain Monte Carlo algorithm that employs Bayesian data imputation and that can also handle latent variables and measurement error. We compare our approach to a recently published one based on ordinary differential equations (ODEs) and investigate, e.g., how far problems of identifiability from the ODE setting can be overcome by our SDE approach. |
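A diffusion approximation of this kind can be simulated with the Euler-Maruyama scheme. The sketch below assumes a hypothetical two-species model (mRNA that degrades, protein that is translated from mRNA and degrades) written as a chemical Langevin equation; the model structure, parameter values, and function name are illustrative assumptions, and parameter inference by MCMC is not shown.

```python
import math
import random

def simulate_transfection(m0, delta, k_tl, gamma, t_end, dt=0.01, seed=1):
    """Euler-Maruyama simulation of an assumed mRNA/protein SDE model.

    dm = -delta*m dt + sqrt(delta*m) dW1
    dg = (k_tl*m - gamma*g) dt + sqrt(k_tl*m + gamma*g) dW2
    m: mRNA copies, g: protein copies. Returns a list of (t, m, g).
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    m, g, t = float(m0), 0.0, 0.0
    path = [(t, m, g)]
    for _ in range(int(round(t_end / dt))):
        deg_m = delta * m   # mRNA degradation propensity
        syn_g = k_tl * m    # translation propensity
        deg_g = gamma * g   # protein degradation propensity
        dm = -deg_m * dt + math.sqrt(deg_m) * sqrt_dt * rng.gauss(0.0, 1.0)
        dg = (syn_g - deg_g) * dt + math.sqrt(syn_g + deg_g) * sqrt_dt * rng.gauss(0.0, 1.0)
        # Clip at zero so copy numbers and the square roots stay valid.
        m = max(m + dm, 0.0)
        g = max(g + dg, 0.0)
        t += dt
        path.append((t, m, g))
    return path

path = simulate_transfection(m0=200, delta=0.2, k_tl=1.0, gamma=0.05, t_end=10.0)
t_final, m_final, g_final = path[-1]
print(t_final, m_final, g_final)
```

In an inference setting, only sparse and noisy readouts of the protein trajectory would be observed, and paths like this one would be imputed between observation times.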
|
ModelSEED 2.0: Improving automated model reconstruction across phylogenetically diverse microbial species
Jose P. Faria, Janaka N. Edirisinghe, Filipe Liu, Samuel M. D. Seaver, James G. Jeffryes, Qizh Zhang, Pamela Weisenhorn, Boris Sadkhin, Nidhi Gupta, Tian Gu and Christopher Henry
Argonne National Laboratory, US
The ModelSEED is a leading platform for automated genome-scale metabolic reconstruction, with over 100k models constructed since its release in 2010. Here we introduce the largest ModelSEED update since its initial release. First, we are launching a new website (www.modelseed.org), which integrates functionality from the PlantSEED resource for plant model reconstruction. This new site offers improvements to the biochemistry search and model reconstruction interfaces. Additionally, prokaryotic and plant genomes may now be annotated directly on the ModelSEED site. The ModelSEED Biochemistry Database was also updated and loaded into GitHub (https://github.com/ModelSEED/ModelSEEDDatabase). This enables users to curate the existing biochemistry and submit their own additions. A major part of this update was curation of the ModelSEED template models, fixing and expanding gene–reaction mappings. Additionally, biomass compositions were extended to include new metabolites that are essential for many organisms. We also improved the ModelSEED gap-filling algorithm to restrict the addition of thermodynamically infeasible pathways. We validated our improvements by constructing new models for a diverse set of microbial genomes, and testing model accuracy in predicting growth and knockout phenotype data. These final changes also impact the ModelSEED deployments in KBase and PATRIC. |
|
Prediction of infection outcome by computational modeling of Yersinia enterocolitica infection
Janina Geißert, Martin Eichner, Erwin Bohn, Reihaneh Mostolizadeh, Andreas Dräger, Ingo B. Autenrieth, Sina Beier and Monika Schütz
Institut für Medizinische Mikrobiologie und Hygiene, Universitätsklinikum Tübingen, DE
The course and outcome of gastrointestinal infections depend on the complex interplay of pathogens, their virulence and fitness factors, the host immune response, and the presence and composition of the endogenous microbiome. An expansion of pathogens within the gastrointestinal tract implies an increased risk for the development of severe systemic infections, especially in patients receiving antibiotic treatment or in an immunocompromised state. We developed a computational model to predict pathogen expansion, gut colonization, and infection outcome. For implementation and challenge of the model, oral mouse infection experiments with the enteropathogen Yersinia enterocolitica (Ye) were used. Our model calculates the bacterial population dynamics during gastrointestinal infection and accounts for specific pathogen characteristics, the host immune capacity, and colonization resistance mediated by the endogenous microbiome. We calibrated the model to experimental data obtained by the infection of a healthy host. Afterward, we challenged our model by adopting scenarios where either a microbiome was lacking (mimicking antibiotic treatment of patients) or the immune response was partially impaired. Experimental mouse infections confirmed the population dynamics predicted for these scenarios. Our model provides new hypotheses about the roles of host- and pathogen-derived factors and might be useful for developing personalized infection prevention and treatment strategies. |
|
Stochasticity of Coagulation and Fragmentation of Self-Assembly from Exact Computed Solution of Discrete Chemical Master Equation
Farid Manuchehrfar, Wei Tian, Tom Chou and Jie Liang
University of Illinois at Chicago, US
Coagulation and fragmentation (CF) is a fundamental process in which particles attach to each other to form clusters, while existing clusters also break into smaller clusters. This is a ubiquitous process that plays significant roles in biological problems such as brain shrinkage, Alzheimer’s disease or amyloid-beta aggregation in neurodegenerative disease. CF often occurs in a confined space with a limited number of particles; thus the system can be highly stochastic. A fundamental approach to investigating CF is to solve the underlying discrete Chemical Master Equation (dCME), which provides exact descriptions of the time-evolving and steady states of the CF system. Recent theoretical models based on the dCME do not simultaneously take full account of attachment, detachment, synthesis, and degradation, as well as the effects of dimensionality. We use the newly developed Accurate Chemical Master Equation (ACME) method to solve the underlying dCME of the CF process and examine the time-evolving dynamics of the CF system at different attachment, detachment, synthesis, and degradation rates. We demonstrate how these factors can have profound effects on the CF process. |
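The stochastic CF process described above can be illustrated with a small simulation. The sketch below is not the ACME method, which computes the dCME solution directly. It substitutes a Gillespie stochastic simulation that draws single sample trajectories from the same kind of master equation. It assumes a closed system with no synthesis or degradation, a size-independent coagulation rate `k_coag` per cluster pair, and a fragmentation rate `k_frag` per bond. These rate laws and names are illustrative assumptions.

```python
import random


def gillespie_cf(n_monomers, k_coag, k_frag, t_end, seed=0):
    """Sample one coagulation-fragmentation trajectory up to time t_end.

    Assumed (hypothetical) rate laws:
      - any unordered pair of clusters merges at rate k_coag
      - each of the (size - 1) bonds in a cluster breaks at rate k_frag
    Returns the sorted list of cluster sizes at the end of the run.
    """
    rng = random.Random(seed)
    clusters = [1] * n_monomers  # start from free monomers
    t = 0.0
    while True:
        n = len(clusters)
        a_coag = k_coag * n * (n - 1) / 2.0
        bonds = sum(size - 1 for size in clusters)
        a_frag = k_frag * bonds
        a_total = a_coag + a_frag
        if a_total == 0.0:
            break  # no reaction can fire
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * a_total < a_coag:
            # Coagulation: merge two distinct clusters chosen at random.
            i, j = rng.sample(range(n), 2)
            merged = clusters[i] + clusters[j]
            for idx in sorted((i, j), reverse=True):
                clusters.pop(idx)
            clusters.append(merged)
        else:
            # Fragmentation: break one bond chosen uniformly at random.
            r = rng.randrange(bonds)
            for idx, size in enumerate(clusters):
                if r < size - 1:
                    clusters[idx] = r + 1
                    clusters.append(size - (r + 1))
                    break
                r -= size - 1
    return sorted(clusters)
```

Averaging many such runs approximates the probability landscape that a direct dCME solver would return. With few particles the run-to-run variation is large, which is the stochasticity the abstract refers to.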
|
Systematically validated transcription-factor activity inference on a whole-cell scale
Cynthia Ma and Michael R. Brent
wustl, US
A long-standing modeling problem is to infer the activity levels of each TF in many cell samples, given the gene expression profile of each sample and a qualitative network map indicating which TFs have the potential to regulate each gene in the genome. Accurate TF-activity (TFA) inference would be useful for identifying TFs whose activity is affected by drug treatments or cancer mutations. It would also provide models for predicting the effects of knocking out or overexpressing specific combinations of TFs. We present solutions to problems that have limited the practical utility of TFA inference: (1) a method for constructing the required qualitative TF-network maps; (2) a method for exploiting samples of cells in which a TF has been genetically deleted; (3) a combination of regularization and constraints on parameters that improves both accuracy and interpretability; (4) the first application of TFA to a large collection of expression profiles, on a whole-cell scale, without prior knowledge beyond the qualitative network map. Our systematic, objective, genome-scale evaluations of inferred activity levels, using real data, show that our approach works on a genomic scale. This opens the door to meaningful comparison of TFA inference methods and to their widespread application. |
|
Systems modeling of phenotypic plasticity of CD4+ T cell differentiation
Bhanwar Lal Puniya, Robert Todd, Akram Mohammed, Deborah Brown, Matteo Barberis and Tomas Helikar
University of Nebraska-Lincoln, US
CD4+ T cells provide cell-mediated immunity in response to pathogens and diseases. After activation, naïve T cells differentiate into effector T helper and regulatory subtypes. These subtypes were initially thought of as terminally differentiated; however, plasticity in T cell differentiation has been observed in recent studies. In this study, we developed a logic-based computational model of signaling pathways that govern the differentiation process of naïve T cells into T helper 1, 2, 17, and induced Treg cells. We characterized the dynamic capacity of T cell differentiation in response to varying dosages of 512 extracellular cytokine combinations. In addition to the classical phenotypes, we predicted previously reported and novel complex T cell phenotypes in which multiple lineage-specifying transcription factors (TFs) co-exist. Our results suggested that plasticity in T cell differentiation is a function of both cytokine composition and dosage. We also identified the specific patterns of extracellular environments that can lead to each T cell subtype. Based on cytokine dosage, we identified the dominant stimuli that control the transition between canonical and complex phenotypes. Finally, we predicted the optimal activity of input cytokines that maximizes the activity levels of multiple lineage-specifying TFs in complex phenotypes. |
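As a rough illustration of what a logic-based model over cytokine combinations looks like, the sketch below enumerates on/off input combinations and iterates a toy Boolean network to a steady state. It is not the authors' model. The number of inputs (nine, chosen because 2^9 = 512), the two update rules (mutually inhibiting T-bet and GATA3, driven by IL-12 and IL-4), and all function names are assumptions made for illustration.

```python
from itertools import product

N_INPUTS = 9  # assumption: nine on/off inputs give 2**9 = 512 combinations


def input_combinations(n=N_INPUTS):
    """All on/off assignments for n extracellular inputs."""
    return list(product((False, True), repeat=n))


def step(state, inputs):
    """One synchronous update of a toy two-factor network.

    state  = (tbet, gata3); inputs[0] = IL-12, inputs[1] = IL-4.
    The rules are hypothetical: each factor is activated by its cytokine
    or by itself, and is repressed by the opposing factor.
    """
    tbet, gata3 = state
    il12, il4 = inputs[0], inputs[1]
    new_tbet = (il12 or tbet) and not gata3
    new_gata3 = (il4 or gata3) and not tbet
    return (new_tbet, new_gata3)


def fixed_point(inputs, state=(False, False), max_steps=20):
    """Iterate until the state stops changing; None if it never settles."""
    for _ in range(max_steps):
        nxt = step(state, inputs)
        if nxt == state:
            return state
        state = nxt
    return None  # no steady state reached, e.g. an oscillation
```

Mapping every element of `input_combinations()` through `fixed_point` gives a table from environment to phenotype. In this toy network, switching on both driving cytokines produces an oscillation under synchronous updating, so `fixed_point` returns `None` for those inputs. The dosage dependence described in the abstract would need graded rather than Boolean inputs.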
|
The JSBML project: a fully featured Java API for working with systems biological models
Nicolas Rodriguez, Thomas M. Hamm, Roman Schulte, Leandro Watanabe, Ibrahim Y. Vazirabad, Victor Kofia, Chris J. Myers, Akira Funahashi, Nicolas Le Novère, Michael Hucka and Andreas Dräger
Applied Bioinformatics Group, Center for Bioinformatics Tübingen (ZBIT), University of Tübingen, DE
SBML is the most widely used data format to encode and exchange models in systems biology. The open-source JSBML project was launched in 2009 as an international collaboration aiming to provide a feature-rich Java implementation for reading, manipulating, and writing SBML files. The JSBML project is a stable, actively developed, and well-documented software project with many contributors around the world. A growing number of applications now use JSBML as their back-end for data manipulation. These cover diverse use cases, including model building and graphical display, constraint-based modeling, dynamic simulation, and annotation. JSBML supports all levels, versions, and releases of SBML and provides numerous utility functions for working with this standard. JSBML also integrates well with other Java libraries for community standards. The JSBML team actively maintains and updates the project. JSBML is used in student education and numerous research projects. Major model databases, such as BioModels or BiGG Models, use JSBML-based tools for their curation pipelines. JSBML is also regularly the subject of international student coding events. JSBML can be freely obtained under the terms of the LGPL 2.1 from https://github.com/sbmlteam/jsbml/. The users’ guide at http://sbml.org/Software/JSBML/docs/ provides further information about using JSBML. Contact: jsbml-development@googlegroups.com |
|
Topological and Dynamical Characteristics of a Large-Scale Signaling Network by Module Analysis
Cong-Doan Truong, Tien-Tung Truong and Yung-Keun Kwon
University of Ulsan, KR
It is a challenge in systems biology to unravel modular characteristics of biological networks. In this paper, we examined a large-scale signaling network consisting of 5443 genes and 37663 interactions to discover topological or dynamical characteristics of the modules in the network. Using a module detection algorithm, we identified 17 modules in the network and classified them into two groups, Group 1 and Group 2, according to whether the module size is larger or smaller than the average value, respectively. First, we compared the proportions of important genes, such as disease, drug-target, and essential genes, between the two module groups and found that they were significantly higher in Group 1 than in Group 2. Second, we computed five different centrality measures of the modules and observed that Group 1 is more central than Group 2. Third, we examined module-based robustness and found that the in-module robustness of Group 1 was slightly higher than that of Group 2, whereas the out-module robustness of Group 1 was lower than that of Group 2. Finally, Groups 1 and 2 were enriched for different gene ontology terms. Taken together, the modular structure can be a useful property for understanding topological and dynamical characteristics of a large-scale signaling network. |
|
Tracking and Engineering the Evolution of Organismal Fitness via Multi-Organism mRNA Translation Whole Cell Simulations
Hadas Zur, Rachel Cohen-Kupiec and Tamir Tuller
Tel Aviv University, IL
We report the first whole-cell translation simulations of several organisms, to understand their evolution. The models consider all fundamental biophysical aspects of translation dynamics and are based on parameters estimated from experimental data. We developed tools, such as ancestral parameter reconstruction, for comparing sets of whole-cell translation models and understanding transcriptome evolution by connecting genotypes to phenotypes (translation biophysics). Among other results, we show that in S. cerevisiae our model was able to explain 49% of the experimental data variability, with elongation explaining 23%. Through analyses of the inferred changes in the genomic nucleotide composition and biophysical aspects of translation, we demonstrated how various known and as yet unknown patterns (e.g. sequence and structural motifs) in coding regions improve organism fitness. Based on these models, we developed a novel generic approach for improving fitness by introducing silent mutations that eliminate ribosomal traffic jams. This frees up resources, promoting improved fitness and growth rate. The algorithm has already been applied to S. cerevisiae using the CRISPR-Cas9 genome editing tool, where we show that introducing silent mutations into two genes can increase the growth rate by 3-5%. The approach can be used for improving the fitness of any organism used in biotechnology, medicine, and agriculture. |
|
Using Bayesian Optimization to Learn Parameters of Molecular Self Assembly Systems
Sushant Patkar, Marcus Thomas, Russell Schwartz and Roded Sharan
Tel Aviv University, IL
The spontaneous self-assembly of molecules into functional complexes is central to all major cellular processes, yet self-assembly chemistry has only slowly been incorporated into systems biology modeling. This in large part results from substantial computational and experimental challenges to self-assembly modeling, simulation, and model inference compared to simpler enzymatic and transport networks. Rule-based stochastic simulation has provided a way to model and tractably simulate even highly complicated self-assembly reaction networks and to learn model parameters from experimental data via simulation-based model fitting. Nonetheless, large parameter spaces, high computational cost of simulations, and limited experimental data have so far precluded the use of Bayesian methods for characterizing uncertainty of model fits, which have become the standard for most other systems biology model inference. In the present work, we improve on prior data-driven model inference for self-assembly systems in two directions: 1) extending data-fitting to encompass small-angle scattering (SAS), a richer experimental data source than the static light scattering (SLS) used in prior work, and 2) developing an efficient Bayesian optimization framework by learning Gaussian process models as surrogate functions to capture model uncertainty. We demonstrate and validate the approach on synthetic SAS data for a virus capsid assembly model. |