1. Curation

BEL Commons hosts biological knowledge assemblies that are encoded in Biological Expression Language (BEL). BEL supports the assembly of context-specific qualitative causal and correlative relations between biological entities across multiple modes and scales. These relations are written in BEL Script together with provenance information, external namespace references, relation provenance (citation and evidence), and relation metadata such as the biological context (anatomy, cell, disease, etc.).

Upload and Parse

On the home page, click Parse BEL. In the form, choose a file, then click "upload".

The link to the BEL Script used in this video can be found here.

Share

Users can create projects that allow them to easily share networks with groups of other users. This is useful when multiple curators are uploading related BEL Scripts. Later, they can be assembled and queried.

Validate Namespaces

Large terminologies that are curated for projects investigating new diseases and pathologies can be validated by checking their contents against the Ontology Lookup Service provided by the EBI. This identifies duplicate names and enables better semantic integration.

2. Summarize a Knowledge Assembly

From the home page, click List Networks, find your network, and select "Summarize".

This web application organizes high-level statistical information about a network, such as the number of nodes, edges, author contributions, citation contributions, and provenance information, as well as global network statistics such as the average node degree, network density, number of weakly connected components, etc. When appropriate, it provides feedback on the syntax and semantics of the source BEL document to assist in curation.
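As a rough illustration of the global statistics mentioned above, the following sketch computes them in plain Python for a directed network given as (source, relation, target) triples. The function name and the toy edge list are invented for this example; it is not the code that BEL Commons runs.

```python
from collections import defaultdict


def summarize(edges):
    """Compute basic statistics for a directed network of (source, relation, target) triples."""
    nodes = set()
    neighbors = defaultdict(set)  # undirected adjacency, used for weak connectivity
    for source, _relation, target in edges:
        nodes.update((source, target))
        neighbors[source].add(target)
        neighbors[target].add(source)

    n, m = len(nodes), len(edges)

    # Count weakly connected components with an iterative depth-first search.
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbors[node] - seen)

    return {
        "nodes": n,
        "edges": m,
        # Every edge adds one to the degree of each of its two endpoints.
        "average_degree": 2 * m / n if n else 0.0,
        # Directed density: edges divided by the number of ordered node pairs.
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "weakly_connected_components": components,
    }


# Hypothetical toy network with two disconnected parts.
EXAMPLE_EDGES = [
    ("A", "increases", "B"),
    ("B", "decreases", "C"),
    ("D", "association", "E"),
]
stats = summarize(EXAMPLE_EDGES)
```

Here `stats` reports five nodes, three edges, and two weakly connected components.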

Finally, the summary page provides an assessment of the "Biological Grammar", or the biological validity of statements. These analyses include identification of contradictory edges, unstable biological motifs in pairs and triplets of nodes, and other information that is inferred to be missing or incomplete.
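One of these checks, the identification of contradictory edges, can be sketched as follows. The sketch assumes that a contradiction means the same ordered pair of nodes is connected by both an increasing and a decreasing relation; the exact definition used by BEL Commons may differ, and the function name is made up for the example.

```python
from collections import defaultdict

# BEL causal relations, grouped by direction of effect.
INCREASES = {"increases", "directlyIncreases"}
DECREASES = {"decreases", "directlyDecreases"}


def find_contradictions(edges):
    """Return sorted (source, target) pairs asserted to both increase and decrease."""
    relations = defaultdict(set)
    for source, relation, target in edges:
        relations[source, target].add(relation)
    return sorted(
        pair
        for pair, found in relations.items()
        if found & INCREASES and found & DECREASES
    )


# Hypothetical edges: A is said to both increase and decrease B.
edges = [
    ("A", "increases", "B"),
    ("A", "directlyDecreases", "B"),
    ("B", "decreases", "C"),
]
contradictions = find_contradictions(edges)
```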

3. Exploration of the Knowledge Base

Query Builder

A query contains three steps:

  1. Assembly: A list of networks can be assembled. This is useful to integrate many networks in the same disease area that have been produced at different times and in different places. Additionally, this allows for integration of other static resources such as gene families, orthology information, biochemical reaction databases, and other annotations to fill in the most fine-granular information extracted from structured knowledge bases. We provide several networks publicly for users to add to their assemblies.
  2. Seeding: Large assemblies of networks are very difficult to view, especially when the user has a specific point of interest. We provide several network seeding methods to create a more relevant and manageable network before performing general queries. For example, subgraphs can be seeded around a given node or list of nodes, based on edges with certain properties, or even by author or citation provenance information.
  3. Transformations: Networks can be modified using the entire suite of tools provided by PyBEL. This includes filters for nodes based on certain properties, expansion around nodes of interest, exclusion of nodes or groups of nodes, and more.

Finally, the results of queries can be summarized, downloaded in many formats, or explored.
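The three steps of a query can be pictured with a small sketch in plain Python. Networks are modeled as sets of (source, relation, target) triples, and all function names are hypothetical; in BEL Commons the corresponding operations come from PyBEL, whose API is not reproduced here.

```python
def assemble(networks):
    """Assembly: take the union of the edges of several networks."""
    merged = set()
    for network in networks:
        merged.update(network)
    return merged


def seed_by_neighborhood(edges, seeds):
    """Seeding: keep only the edges that touch at least one seed node."""
    return {(s, r, t) for s, r, t in edges if s in seeds or t in seeds}


def remove_nodes(edges, excluded):
    """Transformation: drop every edge incident to an excluded node."""
    return {(s, r, t) for s, r, t in edges if s not in excluded and t not in excluded}


# Two hypothetical networks that share one edge.
network_1 = {("A", "increases", "B"), ("B", "decreases", "C")}
network_2 = {("B", "decreases", "C"), ("C", "increases", "D"), ("E", "increases", "F")}

# Assemble both networks, seed around node C, then exclude node D.
result = remove_nodes(
    seed_by_neighborhood(assemble([network_1, network_2]), seeds={"C"}),
    excluded={"D"},
)
```

Seeding before transforming keeps the working network small, which is the motivation given for the seeding step above.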

Interactive Network Explorer

The results of a query can be explored interactively with the Biological Network Explorer. Its tools panel contains an extended query builder interface that can be used to apply additional transformations. Network algorithms such as path searches and centrality calculations can be readily applied, and external data can be overlaid on the network. These data can come from differential expression experiments, or directly from the results of the Heat Diffusion workflow, which is explained below.
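To give an idea of what a path search does, here is a minimal breadth-first search for a shortest directed path between two nodes. It is a generic textbook sketch with an invented function name and toy data, not the explorer's own implementation.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Return a shortest directed path from source to target as a list, or None."""
    successors = {}
    for s, _relation, t in edges:
        successors.setdefault(s, []).append(t)

    previous = {source: None}  # maps each visited node to the node it was reached from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessor links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for successor in successors.get(node, []):
            if successor not in previous:
                previous[successor] = node
                queue.append(successor)
    return None


# Hypothetical network with a short and a long route from A to C.
edges = [
    ("A", "increases", "B"),
    ("B", "decreases", "C"),
    ("A", "increases", "D"),
    ("D", "increases", "E"),
    ("E", "increases", "C"),
]
path = shortest_path(edges, "A", "C")
```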

4. Analyze a Knowledge Assembly

Data sets like differential gene expression can be used to quantify the perturbation amplitude of biological processes in a network using a randomized algorithm based on NPA. Candidate upstream mechanisms are generated for each biological process, and a heat diffusion algorithm is used to quantify the cumulative observed effect of upstream genes and gene products based on the differential gene expression data.
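The idea of accumulating upstream effects can be sketched in a strongly simplified form. The sketch below assumes an acyclic candidate mechanism, treats each node's final score as its own measured value plus the signed scores of its direct upstream nodes, and leaves out the randomization mentioned above. The function name, sign table, and data are illustrative assumptions, not the algorithm as implemented in BEL Commons.

```python
# Sign of the effect carried along each causal relation (assumed for this sketch).
SIGN = {
    "increases": 1,
    "directlyIncreases": 1,
    "decreases": -1,
    "directlyDecreases": -1,
}


def diffuse(edges, measured, default=0.0):
    """Propagate signed scores downstream through an acyclic causal network."""
    predecessors = {}
    nodes = set()
    for source, relation, target in edges:
        nodes.update((source, target))
        predecessors.setdefault(target, []).append((source, SIGN[relation]))

    final = {}
    in_progress = set()

    def score(node):
        if node in final:
            return final[node]
        if node in in_progress:
            raise ValueError("cycle detected; this sketch only handles acyclic networks")
        in_progress.add(node)
        # Start from the node's own measurement, then add signed upstream scores.
        total = measured.get(node, default)
        for upstream, sign in predecessors.get(node, []):
            total += sign * score(upstream)
        in_progress.discard(node)
        final[node] = total
        return total

    for node in nodes:
        score(node)
    return final


# Hypothetical mechanism upstream of a biological process,
# with made-up log fold changes for three genes.
edges = [
    ("gene1", "increases", "gene2"),
    ("gene2", "decreases", "process"),
    ("gene3", "increases", "process"),
]
log_fold_changes = {"gene1": 2.0, "gene2": -1.0, "gene3": 0.5}
scores = diffuse(edges, log_fold_changes)
```

In this toy case, `gene2` ends with a score of 1.0 (its own -1.0 plus 2.0 from `gene1`), and the process ends with -0.5 (minus 1.0 from `gene2`, plus 0.5 from `gene3`).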

This algorithm is general enough that other data types could be used, such as copy number variations on SNPs or clinical measurements of neuro-imaging features, which have been annotated to our Alzheimer Disease Knowledge assembly with NeuroMMSigDB.

Data sets can be directly uploaded and analyzed. The results of these experiments can then be overlaid on the interactive network viewer to provide a data-driven analysis of given networks or sub-networks.

About

BEL Commons is developed and maintained in an academic capacity by Charles Tapley Hoyt and Daniel Domingo-Fernández at the Fraunhofer SCAI Department of Bioinformatics with support from the IMI project, AETIONOMY. It is built on top of the open source project, PyBEL. Please feel free to contact us here to give us feedback or report any issues.