Input and Output

Input and output functions for BEL graphs.

PyBEL provides multiple lossless interchange options for BEL. Lossy output formats are also included for convenient export to other programs. Notably, a de facto interchange using Resource Description Framework (RDF) to match the ability of other existing software is excluded due to the immaturity of the BEL to RDF mapping.

pybel.load(path, **kwargs)[source]

Read a BEL graph.

Parameters
  • path (str) – The path to a BEL graph in any of the formats with extensions described below

  • kwargs – The keyword arguments are passed to the importer function

Return type

BELGraph

Returns

A BEL graph.

This is the universal loader, which means any file path can be given and PyBEL will look up the appropriate load function. Allowed extensions are:

  • bel

  • bel.nodelink.json

  • bel.cx.json

  • bel.jgif.json

The previous extensions also support gzipping. Other allowed extensions that don’t support gzip are:

  • bel.pickle / bel.gpickle / bel.pkl

  • indra.json
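The extension lookup described above can be pictured with a small standard-library sketch. This is not PyBEL's implementation: the function name `resolve_format` and the two suffix tables are hypothetical and only mirror the extensions listed here.

```python
from typing import Tuple

# Hypothetical tables mirroring the extensions listed above (not PyBEL internals).
GZIP_CAPABLE = ("bel.nodelink.json", "bel.cx.json", "bel.jgif.json", "bel")
PLAIN_ONLY = ("bel.pickle", "bel.gpickle", "bel.pkl", "indra.json")


def resolve_format(path: str) -> Tuple[str, bool]:
    """Return the matched extension and whether the path is gzipped."""
    for ext in PLAIN_ONLY:
        if path.endswith("." + ext):
            return ext, False
    gzipped = path.endswith(".gz")
    stem = path[: -len(".gz")] if gzipped else path
    for ext in GZIP_CAPABLE:
        if stem.endswith("." + ext):
            return ext, gzipped
    raise ValueError(f"unknown extension: {path}")


print(resolve_format("graph.bel.nodelink.json.gz"))
print(resolve_format("graph.bel.pkl"))
```

A real dispatcher would map each matched extension to a reader function; the sketch stops at identifying the format so that the matching rule itself stays visible.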

pybel.dump(graph, path, **kwargs)[source]

Write a BEL graph.

Parameters
  • graph (BELGraph) – A BEL graph

  • path (str) – The path to which the BEL graph is written.

  • kwargs – The keyword arguments are passed to the exporter function

This is the universal writer, which means any file path can be given and PyBEL will look up the appropriate writer function. Allowed extensions are:

  • bel

  • bel.nodelink.json

  • bel.unodelink.json

  • bel.cx.json

  • bel.jgif.json

  • bel.graphdati.json

The previous extensions also support gzipping. Other allowed extensions that don’t support gzip are:

  • bel.pickle / bel.gpickle / bel.pkl

  • indra.json

  • tsv

  • gsea

Return type

None

Import

Parsing Modes

The PyBEL parser has several modes that can be enabled and disabled. They are described below.

Allow Naked Names

By default, this is set to False. The parser does not allow identifiers that are not qualified with namespaces (naked names), like in p(YFG). A proper namespace, like p(HGNC:YFG) must be used. By setting this to True, the parser becomes permissive to naked names. In general, this is bad practice and this feature will be removed in the future.
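To make the distinction between naked and qualified names concrete, here is a minimal sketch that uses a regular expression rather than the PyBEL parser. The pattern and the function `parse_term` are hypothetical and only handle simple terms such as p(HGNC:YFG).

```python
import re
from typing import Optional, Tuple

# Simplified pattern for terms like p(HGNC:YFG) or p(YFG); real BEL is far richer.
TERM_PATTERN = re.compile(
    r"^(?P<function>[A-Za-z]+)\("
    r"(?:(?P<namespace>[A-Za-z0-9_.]+):)?"
    r"(?P<name>[^():]+)\)$"
)


def parse_term(term: str, allow_naked_names: bool = False) -> Tuple[str, Optional[str], str]:
    """Split a simple term into (function, namespace, name)."""
    match = TERM_PATTERN.match(term)
    if match is None:
        raise ValueError(f"not a simple term: {term}")
    namespace = match.group("namespace")
    if namespace is None and not allow_naked_names:
        raise ValueError(f"naked name in {term}; qualify it with a namespace")
    return match.group("function"), namespace, match.group("name")


print(parse_term("p(HGNC:YFG)"))
print(parse_term("p(YFG)", allow_naked_names=True))
```

With the default setting, the second call would raise instead of returning a term without a namespace, which mirrors the strict behavior described above.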

Allow Nested

By default, this is set to False, and the parser does not allow nested statements. See overview. By setting this to True, the parser will accept nested statements one level deep.

Citation Clearing

By default, this is set to True. While the BEL specification clearly states how the language should be used as a state machine, many BEL documents do not conform to the strict SET/UNSET rules. To guard against annotations accidentally carried from one set of statements to the next, the parser has two modes. By default, in citation clearing mode, when a SET CITATION command is reached, it will clear all other annotations (except the STATEMENT_GROUP, which has higher priority). This behavior can be disabled by setting this to False to re-enable strict parsing.
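The effect of citation clearing can be sketched with a toy state machine. The command tuples and the function `track_annotations` below are hypothetical stand-ins for the parser's internal state, chosen only to show which annotations survive a new SET CITATION.

```python
from typing import Dict, Iterable, List, Tuple

Command = Tuple[str, str, str]  # (kind, key, value); kind is SET, UNSET, or STATEMENT


def track_annotations(commands: Iterable[Command], citation_clearing: bool = True) -> List[Dict[str, str]]:
    """Replay commands and snapshot the active annotations at each statement."""
    state: Dict[str, str] = {}
    snapshots: List[Dict[str, str]] = []
    for kind, key, value in commands:
        if kind == "SET":
            if key == "Citation" and citation_clearing:
                # A new citation drops everything except the statement group.
                state = {k: v for k, v in state.items() if k == "STATEMENT_GROUP"}
            state[key] = value
        elif kind == "UNSET":
            state.pop(key, None)
        elif kind == "STATEMENT":
            snapshots.append(dict(state))
    return snapshots


commands = [
    ("SET", "STATEMENT_GROUP", "Group 1"),
    ("SET", "Citation", "PubMed:1"),
    ("SET", "Species", "9606"),
    ("STATEMENT", "", "p(HGNC:A) -> p(HGNC:B)"),
    ("SET", "Citation", "PubMed:2"),
    ("STATEMENT", "", "p(HGNC:C) -> p(HGNC:D)"),
]
cleared = track_annotations(commands)
strict = track_annotations(commands, citation_clearing=False)
print(cleared[1])
print(strict[1])
```

In the cleared run, the Species annotation does not leak into the second statement; in the strict run it persists because the document never issued an UNSET.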

Reference

pybel.from_bel_script(path, **kwargs)[source]

Load a BEL graph from a file resource. This function is a thin wrapper around from_lines().

Parameters

path (Union[str, TextIO]) – A path or file-like

The remaining keyword arguments are passed to pybel.io.line_utils.parse_lines(), which populates a BELGraph.

Return type

BELGraph

pybel.from_bel_script_url(url, **kwargs)[source]

Load a BEL graph from a URL resource.

Parameters

url (str) – A valid URL pointing to a BEL document

The remaining keyword arguments are passed to pybel.io.line_utils.parse_lines().

Return type

BELGraph

pybel.to_bel_script(graph, path, use_identifiers=True)[source]

Write the BELGraph as a canonical BEL script.

Parameters
  • graph (BELGraph) – the BEL Graph to output as a BEL Script

  • path (Union[str, TextIO]) – A path or file-like.

  • use_identifiers (bool) – Enables extended BEP-0008 syntax

Return type

None

Hetionet

Importer for Hetionet JSON.

pybel.from_hetionet_json(hetionet_dict, use_tqdm=True)[source]

Convert a Hetionet dictionary to a BEL graph.

Return type

BELGraph

pybel.from_hetionet_file(file)[source]

Get Hetionet from a JSON file.

Return type

BELGraph

pybel.from_hetionet_gz(path)[source]

Get Hetionet from its JSON GZ file.

Return type

BELGraph

pybel.get_hetionet()[source]

Get Hetionet from GitHub, cache, and convert to BEL.

Return type

BELGraph

Transport

All transport pairs are reflective and data-preserving.

Bytes

Conversion functions for BEL graphs with bytes and Python pickles.

pybel.from_bytes(bytes_graph, check_version=True)[source]

Read a graph from bytes (the result of pickling the graph).

Parameters
  • bytes_graph (bytes) – The pickled bytes of a graph, as produced by to_bytes()

  • check_version (bool) – Checks if the graph was produced by this version of PyBEL

Return type

BELGraph

pybel.to_bytes(graph, protocol=5)[source]

Convert a graph to bytes with pickle.

Note that the pickle module has some incompatibilities between Python 2 and 3. To export a universally importable pickle, choose 0, 1, or 2.

Parameters
  • graph (BELGraph) – A BEL graph

  • protocol (int) – Pickling protocol to use. Defaults to HIGHEST_PROTOCOL.

Return type

bytes

pybel.from_bytes_gz(bytes_graph)[source]

Read a graph from gzipped bytes (the result of pickling the graph).

Parameters

bytes_graph (bytes) – The gzipped, pickled bytes of a graph, as produced by to_bytes_gz()

Return type

BELGraph

pybel.to_bytes_gz(graph, protocol=5)[source]

Convert a graph to gzipped bytes with pickle.

Parameters
  • graph (BELGraph) – A BEL graph

  • protocol (int) – Pickling protocol to use. Defaults to HIGHEST_PROTOCOL.

Return type

bytes
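The bytes round trip can be illustrated with the standard library alone. The sketch below pickles a plain dictionary as a stand-in for a BEL graph (an assumption made so the example runs without PyBEL) and shows both the gzipped form and a lower protocol number for older interpreters.

```python
import gzip
import pickle

# Stand-in for a BEL graph: any picklable object works for this illustration.
graph_like = {
    "nodes": ["p(HGNC:AKT1)", "p(HGNC:GSK3B)"],
    "edges": [(0, 1, "decreases")],
}

# Pickle with the highest protocol, then gzip the resulting bytes.
raw = pickle.dumps(graph_like, protocol=pickle.HIGHEST_PROTOCOL)
compressed = gzip.compress(raw)

# Reverse the two steps to get the object back.
restored = pickle.loads(gzip.decompress(compressed))

# Protocol 2 is the highest protocol that Python 2 can also read.
portable = pickle.dumps(graph_like, protocol=2)
```

Unpickling executes code chosen by whoever produced the bytes, so only load pickles from sources you trust.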

pybel.from_pickle(path, check_version=True)[source]

Read a graph from a pickle file.

Parameters
  • path (Union[str, BinaryIO]) – File or filename to read. Filenames ending in .gz or .bz2 will be uncompressed.

  • check_version (bool) – Checks if the graph was produced by this version of PyBEL

Return type

BELGraph
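Choosing a decompressor from the filename suffix, as described for the path parameter, might look like the following sketch. The helper `open_maybe_compressed` is hypothetical and the payload is a plain dictionary rather than a BEL graph.

```python
import bz2
import gzip
import os
import pickle
import tempfile


def open_maybe_compressed(path: str, mode: str = "rb"):
    """Open a file, transparently handling .gz and .bz2 suffixes."""
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    if path.endswith(".bz2"):
        return bz2.open(path, mode)
    return open(path, mode)


payload = {"name": "example"}
results = []
with tempfile.TemporaryDirectory() as directory:
    for filename in ("graph.bel.pickle", "graph.bel.pickle.gz", "graph.bel.pickle.bz2"):
        path = os.path.join(directory, filename)
        # Write and read back through the same suffix-based opener.
        with open_maybe_compressed(path, "wb") as file:
            pickle.dump(payload, file)
        with open_maybe_compressed(path, "rb") as file:
            results.append(pickle.load(file))
```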

pybel.to_pickle(graph, path, protocol=5)[source]

Write this graph to a pickle file.

Note that the pickle module has some incompatibilities between Python 2 and 3. To export a universally importable pickle, choose 0, 1, or 2.

Parameters
  • graph (BELGraph) – A BEL graph

  • path (Union[str, BinaryIO]) – A path or file-like

  • protocol (int) – Pickling protocol to use. Defaults to HIGHEST_PROTOCOL.

Return type

None

pybel.from_pickle_gz(path)[source]

Read a graph from a gzipped pickle file.

Return type

BELGraph

pybel.to_pickle_gz(graph, path, protocol=5)[source]

Write this graph to a gzipped pickle file.

Return type

None

Streamable BEL (JSONL)

Streamable BEL as JSON.

pybel.from_sbel(it, includes_metadata=True)[source]

Load a BEL graph from an iterable of dictionaries corresponding to lines in BEL JSONL.

Parameters
  • it (Iterable[Any]) – An iterable of dictionaries.

  • includes_metadata (bool) – By default, interprets the first element of the iterable as the graph’s metadata. Switch to False to disable.

Return type

BELGraph

Returns

A BEL graph

pybel.to_sbel(graph)[source]

Create a list of JSON dictionaries corresponding to lines in BEL JSONL.

Return type

List[Any]

pybel.from_sbel_file(path)[source]

Build a graph from the BEL JSONL contained in the given file.

Parameters

path (Union[str, TextIO]) – A path or file-like

Return type

BELGraph

pybel.to_sbel_file(graph, path, separators=(',', ':'), **kwargs)[source]

Write this graph as BEL JSONL to a file.

Parameters
  • graph (BELGraph) – A BEL graph

  • separators – The separators used in json.dumps()

  • path (Union[str, TextIO]) – A path or file-like

Return type

None
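The JSONL layout, with one JSON document per line and the metadata first, can be sketched with the standard library. The record fields below are hypothetical and do not reproduce PyBEL's actual schema; only the line-per-record idea and the compact separators come from the text above.

```python
import io
import json
from typing import Any, Iterable, List, Optional, TextIO, Tuple


def write_jsonl(records: Iterable[Any], file: TextIO, separators=(",", ":")) -> None:
    """Write one compact JSON document per line."""
    for record in records:
        file.write(json.dumps(record, separators=separators) + "\n")


def read_jsonl(file: TextIO, includes_metadata: bool = True) -> Tuple[Optional[Any], List[Any]]:
    """Read all lines; optionally treat the first one as metadata."""
    rows = [json.loads(line) for line in file if line.strip()]
    if includes_metadata and rows:
        return rows[0], rows[1:]
    return None, rows


records = [
    {"name": "Example graph", "version": "0.0.1"},
    {"source": "p(HGNC:A)", "relation": "increases", "target": "p(HGNC:B)"},
    {"source": "p(HGNC:B)", "relation": "decreases", "target": "p(HGNC:C)"},
]
buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)
metadata, edges = read_jsonl(buffer)
```

Because each line is independent, a consumer can process edges one at a time without holding the whole file in memory, which is the point of a streamable format.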

pybel.from_sbel_gz(path)[source]

Read a graph as BEL JSONL from a gzip file.

Return type

BELGraph

pybel.to_sbel_gz(graph, path, separators=(',', ':'), **kwargs)[source]

Write a graph as BEL JSONL to a gzip file.

Parameters
  • graph (BELGraph) – A BEL graph

  • separators – The separators used in json.dumps()

  • path (str) – A path for a gzip file

Return type

None

Cyberinfrastructure Exchange

This module wraps conversion between pybel.BELGraph and the Cyberinfrastructure Exchange (CX) JSON.

CX is an aspect-oriented network interchange format encoded in JSON with a format inspired by the JSON-LD encoding of Resource Description Framework (RDF). It is primarily used by the Network Data Exchange (NDEx) and more recent versions of Cytoscape.

pybel.from_cx(cx)[source]

Rebuild a BELGraph from CX JSON output from PyBEL.

Parameters

cx (List[Dict]) – The CX JSON object for this graph

Return type

BELGraph

pybel.to_cx(graph)[source]

Convert a BEL Graph to a CX JSON object for use with NDEx.

Return type

List[Dict]

pybel.from_cx_jsons(graph_json_str)[source]

Read a BEL graph from a CX JSON string.

Return type

BELGraph

pybel.to_cx_jsons(graph, **kwargs)[source]

Dump this graph as a CX JSON object to a string.

Return type

str

pybel.from_cx_file(path)[source]

Read a file containing CX JSON and convert it to a BEL graph.

Parameters

path (Union[str, TextIO]) – A readable file or file-like containing the CX JSON for this graph

Return type

BELGraph

Returns

A BEL Graph representing the CX graph contained in the file

pybel.to_cx_file(graph, path, indent=2, **kwargs)[source]

Write a BEL graph to a JSON file in CX format.

Parameters
  • graph (BELGraph) – A BEL graph

  • path (Union[str, TextIO]) – A writable file or file-like

  • indent (Optional[int]) – How many spaces to use to pretty print. Change to None for no pretty printing

The example below shows how to output a BEL graph as CX to an open file.

from pybel.examples import sialic_acid_graph
from pybel import to_cx_file
with open('graph.bel.cx.json', 'w') as file:
    to_cx_file(sialic_acid_graph, file)

The example below shows how to output a BEL graph as CX to a file at a given path.

from pybel.examples import sialic_acid_graph
from pybel import to_cx_file
to_cx_file(sialic_acid_graph, 'graph.bel.cx.json')

If you have a big graph, you might consider storing it as a gzipped CX file by using to_cx_gz().

Return type

None

pybel.from_cx_gz(path)[source]

Read a graph as CX JSON from a gzip file.

Return type

BELGraph

pybel.to_cx_gz(graph, path, **kwargs)[source]

Write a graph as CX JSON to a gzip file.

Return type

None

JSON Graph Interchange Format

Conversion functions for BEL graphs with JGIF JSON.

The JSON Graph Interchange Format (JGIF) is specified similarly to the Node-Link JSON. Interchange with this format provides compatibility with other software and repositories, such as the Causal Biological Network Database.

pybel.from_jgif(graph_jgif_dict, parser_kwargs=None)[source]

Build a BEL graph from a JGIF JSON object.

Parameters

graph_jgif_dict (dict) – The JSON object representing the graph in JGIF format

Return type

BELGraph

pybel.to_jgif(graph)[source]

Build a JGIF dictionary from a BEL graph.

Parameters

graph (pybel.BELGraph) – A BEL graph

Returns

A JGIF dictionary

Return type

dict

Warning

Untested! This format is not general purpose, so little time has been invested in it. If you want to use Cytoscape.js, we suggest using pybel.to_cx() instead.

The example below shows how to output a BEL graph as a JGIF dictionary.

import pybel
from pybel.examples import sialic_acid_graph
graph_jgif_json = pybel.to_jgif(sialic_acid_graph)

If you want to write the graph directly to a file as JGIF, see to_jgif_file().

pybel.from_jgif_jsons(graph_json_str)[source]

Read a BEL graph from a JGIF JSON string.

Return type

BELGraph

pybel.to_jgif_jsons(graph, **kwargs)[source]

Dump this graph as a JGIF JSON object to a string.

Return type

str

pybel.from_jgif_file(path)[source]

Build a graph from the JGIF JSON contained in the given file.

Parameters

path (Union[str, TextIO]) – A path or file-like

Return type

BELGraph

pybel.to_jgif_file(graph, file, **kwargs)[source]

Write JGIF to a file.

Parameters
  • graph (BELGraph) – A BEL graph

  • file (Union[str, TextIO]) – A writable file or file-like

The example below shows how to output a BEL graph as JGIF to an open file.

from pybel.examples import sialic_acid_graph
from pybel import to_jgif_file
with open('graph.bel.jgif.json', 'w') as file:
    to_jgif_file(sialic_acid_graph, file)

The example below shows how to output a BEL graph as JGIF to a file at a given path.

from pybel.examples import sialic_acid_graph
from pybel import to_jgif_file
to_jgif_file(sialic_acid_graph, 'graph.bel.jgif.json')

If you have a big graph, you might consider storing it as a gzipped JGIF file by using to_jgif_gz().

Return type

None

pybel.from_jgif_gz(path)[source]

Read a graph as JGIF JSON from a gzip file.

Return type

BELGraph

pybel.to_jgif_gz(graph, path, **kwargs)[source]

Write a graph as JGIF JSON to a gzip file.

Return type

None

pybel.post_jgif(graph, url, **kwargs)[source]

Post the JGIF to a given URL.

Return type

Response

pybel.from_cbn_jgif(graph_jgif_dict)[source]

Build a BEL graph from CBN JGIF.

Map the JGIF used by the Causal Biological Network Database to standard namespaces and annotations, then build a BEL graph using pybel.from_jgif().

Parameters

graph_jgif_dict (dict) – The JSON object representing the graph in JGIF format

Return type

BELGraph

Example:

import requests
from pybel import from_cbn_jgif

apoptosis_url = 'http://causalbionet.com/Networks/GetJSONGraphFile?networkId=810385422'
graph_jgif_dict = requests.get(apoptosis_url).json()
graph = from_cbn_jgif(graph_jgif_dict)

Warning

Handling the annotations is not yet supported, since the CBN documents do not refer to the resources used to create them. This may be added in the future, but the annotations must be stripped from the graph before uploading to the network store using pybel.struct.mutation.strip_annotations().

pybel.from_cbn_jgif_file(path)[source]

Build a graph from a file containing the CBN variant of JGIF.

Parameters

path (Union[str, TextIO]) – A path or file-like

Return type

BELGraph

GraphDati

Conversion functions for BEL graphs with GraphDati.

Note that these are not exact I/O - you can’t currently use them as a round trip because the input functions expect the GraphDati format that’s output by BioDati.

pybel.to_graphdati(graph, *, use_identifiers=True, skip_unqualified=True, use_tqdm=False, metadata_extras=None)[source]

Export a GraphDati list using the nanopub.

Parameters
  • graph – A BEL graph

  • use_identifiers (bool) – use OBO-style identifiers

  • use_tqdm (bool) – Show a progress bar while generating nanopubs

  • skip_unqualified (bool) – Should unqualified edges be skipped instead of being output as nanopubs? Defaults to True.

  • metadata_extras (Optional[Mapping[str, Any]]) – Extra information to pass into the metadata part of nanopubs

Return type

List[Mapping[str, Mapping[str, Any]]]

pybel.from_graphdati(j, use_tqdm=True)[source]

Convert data from the “normal” network format.

Warning

BioDati crashes when requesting the full network format, so this isn’t yet explicitly supported

Return type

BELGraph

pybel.to_graphdati_file(graph, path, use_identifiers=True, **kwargs)[source]

Write this graph as GraphDati JSON to a file.

Parameters
  • graph (BELGraph) – A BEL graph

  • path (Union[str, TextIO]) – A path or file-like

Return type

None

pybel.from_graphdati_file(path)[source]

Load a file containing GraphDati JSON.

Parameters

path (Union[str, TextIO]) – A path or file-like

Return type

BELGraph

pybel.to_graphdati_gz(graph, path, **kwargs)[source]

Write a graph as GraphDati JSON to a gzip file.

Return type

None

pybel.from_graphdati_gz(path)[source]

Read a graph as GraphDati JSON from a gzip file.

Return type

BELGraph

pybel.to_graphdati_jsons(graph, **kwargs)[source]

Dump this graph as a GraphDati JSON object to a string.

Parameters

graph (BELGraph) – A BEL graph

Return type

str

pybel.from_graphdati_jsons(s)[source]

Load a graph from a GraphDati JSON string.

Parameters

s – A GraphDati JSON string

Return type

BELGraph

pybel.to_graphdati_jsonl(graph, file, use_identifiers=True, use_tqdm=True)[source]

Write this graph as a GraphDati JSON lines file.

Parameters

graph – A BEL graph

pybel.to_graphdati_jsonl_gz(graph, path, **kwargs)[source]

Write a graph as GraphDati JSONL to a gzip file.

Parameters

graph (BELGraph) – A BEL graph

Return type

None

INDRA

Conversion functions for BEL graphs with INDRA.

After assembling a model with INDRA, a list of indra.statements.Statement can be converted to a pybel.BELGraph with indra.assemblers.pybel.PybelAssembler.

from indra.assemblers.pybel import PybelAssembler
import pybel

stmts = [
    # A list of INDRA statements
]

pba = PybelAssembler(
    stmts,
    name='Graph Name',
    version='0.0.1',
    description='Graph Description'
)
graph = pba.make_model()

# Write to BEL file
pybel.to_bel_script(graph, 'simple_pybel.bel')

Warning

These functions are hard to unit test because they rely on a whole set of Java dependencies, and they will likely remain untested for a while.

pybel.from_indra_statements(stmts, name=None, version=None, description=None, authors=None, contact=None, license=None, copyright=None, disclaimer=None)[source]

Import a model from indra.

Parameters
  • stmts (List[indra.statements.Statement]) – A list of statements

  • name (Optional[str]) – The graph’s name

  • version (Optional[str]) – The graph’s version. Recommended to use semantic versioning or YYYYMMDD format.

  • description (Optional[str]) – The description of the graph

  • authors (Optional[str]) – The authors of this graph

  • contact (Optional[str]) – The contact email for this graph

  • license (Optional[str]) – The license for this graph

  • copyright (Optional[str]) – The copyright for this graph

  • disclaimer (Optional[str]) – The disclaimer for this graph

Return type

pybel.BELGraph

pybel.from_indra_statements_json(stmts_json, **kwargs)[source]

Get a BEL graph from INDRA statements JSON.

Return type

BELGraph

Other kwargs are passed to from_indra_statements().

pybel.from_indra_statements_json_file(file, **kwargs)[source]

Get a BEL graph from INDRA statements JSON file.

Return type

BELGraph

Other kwargs are passed to from_indra_statements().

pybel.to_indra_statements(graph)[source]

Export this graph as a list of INDRA statements using the indra.sources.pybel.PybelProcessor.

Parameters

graph (pybel.BELGraph) – A BEL graph

Return type

list[indra.statements.Statement]

pybel.to_indra_statements_json(graph)[source]

Export this graph as INDRA JSON list.

Parameters

graph (pybel.BELGraph) – A BEL graph

Return type

List[Mapping[str, Any]]

pybel.to_indra_statements_json_file(graph, path, indent=2, **kwargs)[source]

Export this graph as INDRA statement JSON.

Parameters

Other kwargs are passed to json.dump().

pybel.from_biopax(path, **kwargs)[source]

Import a model encoded in Pathway Commons BioPAX via indra.

Parameters

path (str) – Path to a BioPAX OWL file

Return type

pybel.BELGraph

Other kwargs are passed to from_indra_statements().

Warning

Not compatible with all BioPAX! See INDRA documentation.

Visualization

Jupyter

Support for displaying BEL graphs in Jupyter notebooks.

pybel.to_jupyter(graph, width=1000, height=650, color_map=None)[source]

Display a BEL graph inline in a Jupyter notebook.

To use this successfully, make sure it is run as the last statement in a cell inside a Jupyter notebook.

Parameters
  • graph (BELGraph) – A BEL graph

  • width (int) – The width of the visualization window to render

  • height (int) – The height of the visualization window to render

  • color_map (Optional[Mapping[str, str]]) – A dictionary from PyBEL internal node functions to CSS color strings like #FFEE00. Defaults to default_color_map

Returns

An IPython notebook Javascript object

Return type

IPython.display.Javascript

Analytical Services

PyNPA

Exporter for PyNPA.

pybel.to_npa_directory(graph, directory, **kwargs)[source]

Write the BEL graph to two files in the directory for PyNPA.

Return type

None

pybel.to_npa_dfs(graph, cartesian_expansion=False, nomenclature_method_first_layer=None, nomenclature_method_second_layer=None, direct_tf_only=False)[source]

Export the BEL graph as two lists of triples for the pynpa.

Parameters
  • graph (BELGraph) – A BEL graph

  • cartesian_expansion (bool) – If true, applies cartesian expansion on both reactions (reactants x products) as well as list abundances using list_abundance_cartesian_expansion() and reaction_cartesian_expansion()

  • nomenclature_method_first_layer (Optional[str]) – Either “curie”, “name” or “inodes”. Defaults to “curie”.

  • nomenclature_method_second_layer (Optional[str]) – Either “curie”, “name” or “inodes”. Defaults to “curie”.

  1. Pick out all transcription factor relationships. Protein X is a transcription factor for gene Y IFF complex(p(X), g(Y)) -> r(Y)

  2. Get all other interactions between any gene/rna/protein that are directed causal for the PPI layer

Return type

Tuple[DataFrame, DataFrame]
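The cartesian expansion of a reaction (reactants x products) mentioned for the cartesian_expansion parameter can be illustrated with itertools. This is only a sketch of the idea, not the code behind reaction_cartesian_expansion(); the helper name and the node strings are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def reaction_pairs(reactants: Sequence[str], products: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair every reactant with every product of a single reaction."""
    return list(product(reactants, products))


pairs = reaction_pairs(
    ["a(CHEBI:A)", "a(CHEBI:B)"],
    ["a(CHEBI:C)", "a(CHEBI:D)"],
)
for source, target in pairs:
    print(source, "->", target)
```

Two reactants and two products yield four pairs, so large reactions can expand into many edges.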

HiPathia

Convert a BEL graph to HiPathia inputs.

Input

SIF File
  • Text file with three columns separated by tabs.

  • Each row represents an interaction in the pathway. First column is the source node, third column the target node, and the second is the type of relation between them.

  • Only activation and inhibition interactions are allowed.

  • The name of the nodes in this file will be stored as the IDs of the nodes.

  • The nodes IDs should have the following structure: N (dash) pathway ID (dash) node ID.

  • HiPathia distinguishes between two types of nodes: simple and complex.

Simple nodes:

  • Simple nodes may include many genes, but only one is needed to perform the function of the node. This could correspond to a protein family of enzymes that all have the same function - only one of them needs to be present for the action to take place. Simple nodes are defined within

  • Node IDs from simple nodes do not include any space, i.e. N-hsa04370-11.

Complex nodes:

  • Complex nodes include different simple nodes and represent protein complexes. Each simple node within the complex represents one protein in the complex. This node requires the presence of all their simple nodes to perform its function.

  • Node IDs from complex nodes are the juxtaposition of the included simple node IDs, separated by spaces, i.e. N-hsa04370-10 26.
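The SIF conventions above can be checked with a few lines of standard-library Python. The helpers below are hypothetical and follow only the rules stated in this section: three tab-separated columns, two allowed relations, and node IDs of the form N-pathway-node with spaces separating the members of a complex.

```python
from typing import List, Tuple

ALLOWED_RELATIONS = {"activation", "inhibition"}


def parse_node_id(node_id: str) -> Tuple[str, List[str]]:
    """Split a node ID into its pathway ID and the simple node IDs it contains."""
    prefix, pathway_id, members = node_id.split("-", 2)
    if prefix != "N":
        raise ValueError(f"node ID must start with 'N': {node_id}")
    return pathway_id, members.split(" ")


def is_complex(node_id: str) -> bool:
    """A complex node lists more than one simple node ID."""
    return len(parse_node_id(node_id)[1]) > 1


def parse_sif_line(line: str) -> Tuple[str, str, str]:
    """Parse one row into (source, relation, target)."""
    source, relation, target = line.rstrip("\n").split("\t")
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unsupported relation: {relation}")
    return source, relation, target


row = parse_sif_line("N-hsa04370-11\tactivation\tN-hsa04370-10 26\n")
print(row, is_complex(row[0]), is_complex(row[2]))
```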

ATT File

Text file with twelve (12) columns separated by tabs. Each row represents a node (either simple or complex).

The columns included are:

  1. ID: Node ID as explained above.

  2. label: Name to be shown in the picture of the pathway, in HGNC nomenclature. Generally, the gene name of the first included EntrezID gene is used as label. For complex nodes, we juxtapose the gene names of the first genes of each simple node included (see genesList column below).

  3. X: The X-coordinate of the position of the node in the pathway.

  4. Y: The Y-coordinate of the position of the node in the pathway.

  5. color: The default color of the node.

  6. shape: The shape of the node. “rectangle” should be used for genes and “circle” for metabolites.

  7. type: The type of the node, either “gene” for genes or “compound” for metabolites. For complex nodes, the type of each of their included simple nodes is juxtaposed separated by commas, i.e. gene,gene.

  8. label.cex: Amount by which plotting label should be scaled relative to the default.

  9. label.color: Default color of the node.

  10. width: Default width of the node.

  11. height: Default height of the node.

  12. genesList: List of genes included in each node, with EntrezID:

  • Simple nodes: EntrezIDs of the genes included, separated by commas (“,”) and no spaces, i.e. 56848,8877 for node N-hsa04370-11.

  • Complex nodes: GenesList of the simple nodes included, separated by a slash (“/”) and no spaces, and in the same order as in the node ID. For example, node N-hsa04370-10 26 includes two simple nodes: 10 and 26. Its genesList column is 5335,5336,/,9047, meaning that the genes included in node 10 are 5335 and 5336, and the gene included in node 26 is 9047.
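The genesList encoding can be decoded with two splits, as in this sketch. The function name is hypothetical; the two inputs are the examples given above.

```python
from typing import List


def parse_genes_list(genes_list: str) -> List[List[str]]:
    """Return one list of EntrezIDs per simple node, in node ID order."""
    return [
        # Empty strings come from the commas that surround each slash.
        [gene for gene in group.split(",") if gene]
        for group in genes_list.split("/")
    ]


print(parse_genes_list("56848,8877"))
print(parse_genes_list("5335,5336,/,9047"))
```

A simple node yields a single inner list, while a complex node yields one inner list per member.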

pybel.to_hipathia(graph, directory, draw=True)[source]

Export HiPathia artifacts for the graph.

Return type

None

pybel.to_hipathia_dfs(graph, draw_directory=None)[source]

Get the ATT and SIF dataframes.

Parameters
  • graph (BELGraph) – A BEL graph

  • draw_directory (Optional[str]) – The directory in which a drawing should be output

  1. Identify nodes:
       1. Identify all proteins
       2. Identify all protein families
       3. Identify all complexes with just a protein or a protein family in them

  2. Identify interactions between any of those things that are causal

  3. Profit!

Return type

Union[Tuple[None, None], Tuple[DataFrame, DataFrame]]

pybel.from_hipathia_paths(name, att_path, sif_path)[source]

Get a BEL graph from HiPathia files.

Return type

BELGraph

pybel.from_hipathia_dfs(name, att_df, sif_df)[source]

Get a BEL graph from HiPathia dataframes.

Return type

BELGraph

SPIA

An exporter for signaling pathway impact analysis (SPIA) described by [Tarca2009].

Tarca2009

Tarca, A. L., et al (2009). A novel signaling pathway impact analysis. Bioinformatics, 25(1), 75–82.

pybel.to_spia_dfs(graph)[source]

Create an excel sheet ready to be used in SPIA software.

Parameters

graph (BELGraph) – BELGraph

Return type

Mapping[str, DataFrame]

Returns

dictionary with matrices

pybel.to_spia_excel(graph, path)[source]

Write the BEL graph as an SPIA-formatted excel sheet at the given path.

Return type

None

pybel.to_spia_tsvs(graph, directory)[source]

Write the BEL graph as a set of SPIA-formatted TSV files in a given directory.

Return type

None

PyKEEN

Entry points for PyKEEN.

PyKEEN is a machine learning library for knowledge graph embeddings that supports node clustering, link prediction, entity disambiguation, question/answering, and other tasks with knowledge graphs. It provides an interface for registering plugins using Python’s entrypoints under the pykeen.triples.extension_importer and pykeen.triples.prefix_importer groups. More specific information about how the PyBEL plugins are loaded into PyKEEN can be found in PyBEL’s setup.cfg under the [options.entry_points] header.

The following example shows how you can parse/load the triples from a BEL document with the *.bel extension.

from urllib.request import urlretrieve
url = 'https://raw.githubusercontent.com/cthoyt/selventa-knowledge/master/selventa_knowledge/small_corpus.bel'
urlretrieve(url, 'small_corpus.bel')

# Example 1A: Make triples factory
from pykeen.triples import TriplesFactory
tf = TriplesFactory(path='small_corpus.bel')

# Example 1B: Use directly in the pipeline, which automatically invokes training/testing set stratification
from pykeen.pipeline import pipeline
results = pipeline(
    dataset='small_corpus.bel',
    model='TransE',
)

The same is true for precompiled BEL documents in the node-link format with the *.bel.nodelink.json extension and the pickle format with the *.bel.pickle extension.

The following example shows how you can load/parse the triples from a BEL document stored in BEL Commons using the bel-commons prefix in combination with the network’s identifier.

# Example 2A: Make a triples factory
from pykeen.triples import TriplesFactory
# the network's identifier is 528
tf = TriplesFactory(path='bel-commons:528')

# Example 2B: Use directly in the pipeline, which automatically invokes training/testing set stratification
from pykeen.pipeline import pipeline
results = pipeline(
    dataset='bel-commons:528',
    model='TransR',
)

Currently, this relies on the default BEL Commons service provider at https://bel-commons-dev.scai.fraunhofer.de, whose location might change in the future.

pybel.io.pykeen.get_triples_from_bel(path)[source]

Get triples from a BEL file by wrapping pybel.io.tsv.api.get_triples().

Parameters

path (str) – the file path to a BEL Script

Return type

ndarray

Returns

A three column array with head, relation, and tail in each row

Get triples from a BEL Node-link JSON file by wrapping pybel.io.tsv.api.get_triples().

Parameters

path (str) – the file path to a BEL Node-link JSON file

Return type

ndarray

Returns

A three column array with head, relation, and tail in each row

pybel.io.pykeen.get_triples_from_bel_pickle(path)[source]

Get triples from a BEL pickle file by wrapping pybel.io.tsv.api.get_triples().

Parameters

path (str) – the file path to a BEL pickle file

Return type

ndarray

Returns

A three column array with head, relation, and tail in each row

pybel.io.pykeen.get_triples_from_bel_commons(network_id)[source]

Load a BEL document from BEL Commons by wrapping pybel.io.tsv.api.get_triples().

Parameters

network_id (str) – The network identifier for a graph in BEL Commons

Return type

ndarray

Returns

A three column array with head, relation, and tail in each row

Machine Learning

Export functions for Machine Learning.

While BEL is a fantastic medium for storing metadata and high granularity information on edges, machine learning algorithms cannot consume BEL graphs directly. This module provides functions that make inferences and interpretations of BEL graphs in order to interface with machine learning platforms. One example where we’ve done this is BioKEEN, which uses this module to convert BEL graphs into a format for knowledge graph embeddings.

pybel.to_triples(graph, use_tqdm=False, raise_on_none=False)[source]

Get a non-redundant list of triples representing the graph.

Parameters
  • graph (BELGraph) – A BEL graph

  • use_tqdm (bool) – Should a progress bar be shown?

  • raise_on_none (bool) – Should an exception be raised if no triples are returned?

Raises

NoTriplesValueError

Return type

List[Tuple[str, str, str]]

pybel.to_triples_file(graph, path, *, use_tqdm=False, sep='\t', raise_on_none=False)[source]

Write the graph as a TSV.

Parameters
  • graph (BELGraph) – A BEL graph

  • path (Union[str, TextIO]) – A path or file-like

  • use_tqdm (bool) – Should a progress bar be shown?

  • sep – The separator to use

  • raise_on_none (bool) – Should an exception be raised if no triples are returned?

Raises

NoTriplesValueError

Return type

None
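The shape of the TSV output (head, relation, and tail per row, with duplicates removed) can be sketched with the csv module. The helpers and example triples are hypothetical, and a plain ValueError stands in for NoTriplesValueError so that the example runs without PyBEL.

```python
import csv
import io
from typing import Iterable, List, TextIO, Tuple

Triple = Tuple[str, str, str]


def unique_triples(triples: Iterable[Triple]) -> List[Triple]:
    """Drop duplicate triples while keeping first-seen order."""
    return list(dict.fromkeys(triples))


def write_triples(triples: Iterable[Triple], file: TextIO, sep: str = "\t", raise_on_none: bool = False) -> None:
    """Write one head/relation/tail row per unique triple."""
    rows = unique_triples(triples)
    if raise_on_none and not rows:
        raise ValueError("no triples to write")
    csv.writer(file, delimiter=sep, lineterminator="\n").writerows(rows)


triples = [
    ("HGNC:AKT1", "decreases", "HGNC:GSK3B"),
    ("HGNC:AKT1", "decreases", "HGNC:GSK3B"),
    ("HGNC:TP53", "increases", "HGNC:CDKN1A"),
]
buffer = io.StringIO()
write_triples(triples, buffer)
print(buffer.getvalue(), end="")
```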

pybel.to_edgelist(graph, path, *, use_tqdm=False, sep='\t', raise_on_none=False)[source]

Write the graph as an edgelist.

Parameters
  • graph (BELGraph) – A BEL graph

  • path (Union[str, TextIO]) – A path or file-like

  • use_tqdm (bool) – Should a progress bar be shown?

  • sep – The separator to use

  • raise_on_none (bool) – Should an exception be raised if no triples are returned?

Raises

NoTriplesValueError

Return type

None

Web Services

BEL Commons

Transport functions for BEL Commons.

BEL Commons is a free, open-source platform for hosting BEL content. Because it was originally developed and published in an academic capacity at Fraunhofer SCAI, a public instance can be found at https://bel-commons-dev.scai.fraunhofer.de. However, this instance is only kept online for posterity and will not be updated. If you would like to host your own instance of BEL Commons, there are instructions on its GitHub page.

pybel.from_bel_commons(network_id, host=None)[source]

Retrieve a public network from BEL Commons.

In the future, this function may be extended to support authentication.

Parameters
  • network_id (int) – The BEL Commons network identifier

  • host (Optional[str]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST or the environment as PYBEL_REMOTE_HOST.

Raises

ValueError if host configuration can not be found

Return type

BELGraph
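The host lookup described for the host parameter can be sketched as a simple fallback chain. The function `resolve_host` is hypothetical, and the order shown (explicit argument, then configuration, then environment) is an assumption; the text above does not state which of the two fallbacks takes precedence.

```python
import os
from typing import Mapping, Optional

HOST_KEY = "PYBEL_REMOTE_HOST"


def resolve_host(
    host: Optional[str] = None,
    config: Optional[Mapping[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first configured host, or raise if none is available."""
    config = {} if config is None else config
    environ = os.environ if environ is None else environ
    # Assumed precedence: explicit argument, then config, then environment.
    for candidate in (host, config.get(HOST_KEY), environ.get(HOST_KEY)):
        if candidate:
            return candidate
    raise ValueError("no BEL Commons host configured")


print(resolve_host("https://example.org", environ={}))
```

Passing the configuration and environment as arguments keeps the function easy to test without touching the real process environment.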

pybel.to_bel_commons(graph, host=None, user=None, password=None, public=True)[source]

Send a graph to the receiver service and return the requests response object.

Parameters
  • graph (BELGraph) – A BEL graph

  • host (Optional[str]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST or the environment as PYBEL_REMOTE_HOST.

  • user (Optional[str]) – Username for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_USER or the environment as PYBEL_REMOTE_USER

  • password (Optional[str]) – Password for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_PASSWORD or the environment as PYBEL_REMOTE_PASSWORD

  • public (bool) – Should the network be made public?

Return type

Response

Returns

The response object from requests

Amazon Simple Storage Service (S3)

Transport functions for Amazon Web Services (AWS).

AWS has a cloud-based file storage service called S3 that can be programmatically accessed using the boto3 package. This module provides functions for quickly wrapping upload/download of BEL graphs using the gzipped Node-Link schema.

pybel.to_s3(graph, *, bucket, key, client=None)[source]

Save BEL to S3 as gzipped node-link JSON.

If you don’t specify an instantiated client, PyBEL will do its best to load a default one using boto3.client() like in the following example:

import pybel
from pybel.examples import sialic_acid_graph

pybel.to_s3(
    sialic_acid_graph,
    bucket='your bucket',
    key='your file name.bel.nodelink.json.gz',
)

However, if you would like to configure your own, you can do it with something like this:

import boto3
s3_client = boto3.client('s3')

import pybel
from pybel.examples import sialic_acid_graph

pybel.to_s3(
    sialic_acid_graph,
    client=s3_client,
    bucket='your bucket',
    key='your file name.bel.nodelink.json.gz',
)

Warning

This assumes you already have credentials set up on your machine.

If you don’t already have a bucket, you can create one using boto3 by following this tutorial: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-example-creating-buckets.html

Return type

None

pybel.from_s3(*, bucket, key, client=None)[source]

Get BEL from gzipped node-link JSON from Amazon S3.

If you don’t specify an instantiated client, PyBEL will do its best to load a default one using boto3.client() like in the following example:

graph = pybel.from_s3(bucket='your bucket', key='your file name.bel.nodelink.json.gz')

However, if you would like to configure your own, you can do it with something like this:

import boto3
s3_client = boto3.client('s3')

import pybel
graph = pybel.from_s3(
    client=s3_client,
    bucket='your bucket',
    key='your file name.bel.nodelink.json.gz',
)

Return type

BELGraph
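
To make "gzipped node-link JSON" more concrete, here is a minimal standard-library sketch of the round trip that such a file goes through. The dictionary is a hand-written stand-in: the nodes and links keys follow the common node-link layout and are an assumption here, so the fields PyBEL actually writes may differ. No S3 access is involved.

```python
import gzip
import json

# A tiny hand-written stand-in for node-link data (not real PyBEL output).
payload = {
    'nodes': [{'id': 'a'}, {'id': 'b'}],
    'links': [{'source': 'a', 'target': 'b'}],
}

# Serialize to JSON text, encode it, then gzip the bytes.
# These bytes are what an object with a ".json.gz" key would hold.
compressed = gzip.compress(json.dumps(payload).encode('utf-8'))

# Reverse the steps to recover the original dictionary.
restored = json.loads(gzip.decompress(compressed).decode('utf-8'))
```

Because both steps are lossless, restored equals payload, which is why this format is suitable for storage and later reloading.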

BioDati

Transport functions for BioDati.

BioDati is a paid, closed-source platform for hosting BEL content. However, they do have a demo instance running at https://studio.demo.biodati.com with which the examples in this module will be described.

As noted in the transport functions for BioDati, you should change the URLs to point to your own instance of BioDati. If you’re looking for an open source storage system for hosting your own BEL content, you may consider BEL Commons, with the caveat that it is currently maintained in an academic capacity. Disclosure: BEL Commons is developed by the developers of PyBEL.

pybel.to_biodati(graph, *, username='demo@biodati.com', password='demo', base_url='https://nanopubstore.demo.biodati.com', chunksize=None, use_tqdm=True, collections=None, overwrite=False, validate=True, email=False)[source]

Post this graph to a BioDati server.

Parameters
  • graph (BELGraph) – A BEL graph

  • username (str) – The email address to log in to BioDati. Defaults to “demo@biodati.com” for the demo server

  • password (str) – The password to log in to BioDati. Defaults to “demo” for the demo server

  • base_url (str) – The BioDati nanopub store base url. Defaults to “https://nanopubstore.demo.biodati.com” for the demo server’s nanopub store

  • chunksize (Optional[int]) – The number of nanopubs to post at a time. By default, does all.

  • use_tqdm (bool) – Should tqdm be used when iterating?

  • collections (Optional[Iterable[str]]) – Tags to add to the nanopubs for lookup on BioDati

  • overwrite (bool) – Set the BioDati upload “overwrite” setting

  • validate (bool) – Set the BioDati upload “validate” setting

  • email (Union[bool, str]) – Who should get emailed with results about the upload? If true, emails to user used for login. If string, emails to that user. If false, no email.

Return type

Response

Returns

The response from the BioDati server (last response if using chunking)

Warning

BioDati does not support large uploads (yet?).

Warning

The default public BioDati server has been put here. You should switch it to yours. It will look like https://nanopubstore.<YOUR NAME>.biodati.com.
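
The chunksize parameter is easiest to understand as batching. The following is a minimal sketch of that idea, not PyBEL's implementation: the helper name iter_chunks is hypothetical, and plain integers stand in for nanopubs.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, TypeVar

X = TypeVar('X')


def iter_chunks(items: Iterable[X], chunksize: Optional[int] = None) -> Iterator[List[X]]:
    """Hypothetical helper: yield lists of at most ``chunksize`` items.

    With ``chunksize=None`` everything is yielded as one batch, which matches
    the documented default of posting all nanopubs at once.
    """
    iterator = iter(items)
    if chunksize is None:
        yield list(iterator)
        return
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk
```

With batches like these, one request would be sent per chunk, which is consistent with the function returning only the last response when chunking is used.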

pybel.from_biodati(network_id, username='demo@biodati.com', password='demo', base_url='https://networkstore.demo.biodati.com')[source]

Get a graph from a BioDati network store based on its network identifier.

Parameters
  • network_id (str) – The internal identifier of the network you want to download.

  • username (str) – The email address to log in to BioDati. Defaults to “demo@biodati.com” for the demo server

  • password (str) – The password to log in to BioDati. Defaults to “demo” for the demo server

  • base_url (str) – The BioDati network store base url. Defaults to “https://networkstore.demo.biodati.com” for the demo server’s network store

Example usage:

from pybel import from_biodati
network_id = '01E46GDFQAGK5W8EFS9S9WMH12'  # COVID-19 graph example from Wendy Zimmermann
graph = from_biodati(
    network_id=network_id,
    username='demo@biodati.com',
    password='demo',
    base_url='https://networkstore.demo.biodati.com',
)
graph.summarize()

Warning

The default public BioDati server has been put here. You should switch it to yours. It will look like https://networkstore.<YOUR NAME>.biodati.com.

Return type

BELGraph

Fraunhofer OrientDB

Transport functions for Fraunhofer’s OrientDB.

Fraunhofer hosts an instance of OrientDB that contains BEL in a schema similar to pybel.io.umbrella_nodelink. However, they include custom relations that do not come from a controlled vocabulary, and have not made the schema, ETL scripts, or documentation available.

Unlike BioDati and BEL Commons, the Fraunhofer OrientDB does not allow for uploads, so only a single function pybel.from_fraunhofer_orientdb() is provided by PyBEL.

pybel.from_fraunhofer_orientdb(database='covid', user='covid_user', password='covid', query=None)[source]

Get a BEL graph from the Fraunhofer OrientDB.

Parameters
  • database (str) – The OrientDB database to connect to

  • user (str) – The user to connect to OrientDB

  • password (str) – The password to connect to OrientDB

  • query (Optional[str]) – The query to run. Defaults to the URL encoded version of select from E, where E is all edges in the OrientDB edge database. Likely does not need to be changed, except in the case of selecting specific subsets of edges. Make sure you URL encode it properly, because OrientDB’s RESTful API puts it in the URL’s path.

By default, this function connects to the covid database, which corresponds to the COVID-19 Knowledge Graph [0]. If other databases in the Fraunhofer OrientDB are published and demo username/password combinations are given, the following table will be updated.

Database    Username      Password
covid       covid_user    covid

The covid database can be downloaded and converted to a BEL graph like this:

import pybel
graph = pybel.from_fraunhofer_orientdb(
    database='covid',
    user='covid_user',
    password='covid',
)
graph.summarize()

However, because the source BEL scripts for the COVID-19 Knowledge Graph are available on GitHub and the authors pre-enabled it for PyBEL, it can be installed with pip install git+https://github.com/covid19kg/covid19kg.git and used with the following Python code:

import covid19kg
graph = covid19kg.get_graph()
graph.summarize()

Warning

It was initially planned to handle some of the non-standard relationships listed in the Fraunhofer OrientDB’s schema in their OrientDB Studio instance, but none of them actually appear in the only network that is accessible. If this changes, please open an issue at https://github.com/pybel/pybel/issues so it can be addressed.

[0] Domingo-Fernández, D., et al. (2020). COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology. bioRxiv 2020.04.14.040667.

Return type

BELGraph
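
Since the query parameter has to be URL encoded before it is passed in, a short sketch with the standard library may help. Only select from E comes from the description above; the narrower query with a limit clause is a hypothetical example of selecting a subset of edges.

```python
from urllib.parse import quote

# The default query named in the parameter description, before encoding.
default_query = 'select from E'
encoded_default = quote(default_query, safe='')  # 'select%20from%20E'

# Hypothetical narrower query, shown only to illustrate the encoding step.
limited_query = 'select from E limit 10'
encoded_limited = quote(limited_query, safe='')  # 'select%20from%20E%20limit%2010'
```

Setting safe='' also encodes slashes, which matters because the query ends up in the path of the URL.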

EMMAA

Ecosystem of Machine-maintained Models with Automated Analysis (EMMAA).

EMMAA is a project built on top of INDRA by the Sorger Lab at Harvard Medical School. It automatically builds knowledge graphs around pathways/indications periodically (almost daily) using the INDRA Database, which in turn is updated periodically (almost daily) with the most recent literature from MEDLINE, PubMed Central, several major publishers, and other bespoke text corpora such as CORD-19.

pybel.from_emmaa(model, *, date=None)[source]

Get an EMMAA model as a BEL graph.

Get the most recent COVID-19 model from EMMAA with the following:

import pybel

covid19_emmaa_graph = pybel.from_emmaa('covid19')
covid19_emmaa_graph.summarize()

PyBEL does its best to look up the most recent model, but if that doesn’t work, you can specify it explicitly with the date keyword argument in the form of %Y-%m-%d-%H-%M-%S like in the following:

import pybel

covid19_emmaa_graph = pybel.from_emmaa('covid19', date='2020-04-23-17-44-57')
covid19_emmaa_graph.summarize()

Return type

BELGraph
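
The date format %Y-%m-%d-%H-%M-%S can be produced and checked with the standard datetime module. This is a small sketch that only builds and parses the string; it does not contact EMMAA, and the variable names are illustrative.

```python
from datetime import datetime

# The format named in the documentation for the date keyword argument.
DATE_FORMAT = '%Y-%m-%d-%H-%M-%S'

# Build the string used in the example above from a datetime object.
moment = datetime(2020, 4, 23, 17, 44, 57)
date_string = moment.strftime(DATE_FORMAT)  # '2020-04-23-17-44-57'

# Parsing the string back is a cheap way to validate user input before a request.
parsed = datetime.strptime(date_string, DATE_FORMAT)
```

A string that does not match the format makes datetime.strptime raise ValueError, so malformed dates can be caught early.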

Databases

SQL Databases

Conversion functions for BEL graphs with a SQL database.

pybel.from_database(name, version=None, manager=None)[source]

Load a BEL graph from a database.

If name and version are given, finds it exactly with pybel.manager.Manager.get_network_by_name_version(). If just the name is given, finds the most recent with pybel.manager.Manager.get_most_recent_network_by_name().

Parameters
  • name (str) – The name of the graph

  • version (Optional[str]) – The version string of the graph. If not specified, loads most recent graph added with this name

Returns

A BEL graph loaded from the database

Return type

Optional[BELGraph]

pybel.to_database(graph, manager=None, use_tqdm=True)[source]

Store a graph in a database.

Parameters

graph (BELGraph) – A BEL graph

Returns

If successful, returns the network object from the database.

Return type

Optional[Network]

Neo4j

Output functions for BEL graphs to Neo4j.

pybel.to_neo4j(graph, neo_connection, use_tqdm=False)[source]

Upload a BEL graph to a Neo4j graph database using py2neo.

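Parameters
  • graph (BELGraph) – A BEL graph

  • neo_connection – A py2neo connection object, such as the py2neo.Graph built in the example below

  • use_tqdm (bool) – Should a progress bar be shown?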

Example Usage:

>>> import py2neo
>>> import pybel
>>> from pybel.examples import sialic_acid_graph
>>> neo_graph = py2neo.Graph("http://localhost:7474/db/data/")  # use your own connection settings
>>> pybel.to_neo4j(sialic_acid_graph, neo_graph)

Lossy Export

GraphML

Conversion functions for BEL graphs with GraphML.

pybel.to_graphml(graph, path, schema=None)[source]

Write a graph to a GraphML XML file using networkx.write_graphml().

Parameters
  • graph (BELGraph) – BEL Graph

  • path (Union[str, BinaryIO]) – Path to the new exported file

  • schema (Optional[str]) – Type of export. Currently supported: “simple” and “umbrella”.

The .graphml file extension is suggested so Cytoscape can recognize it. By default, this function exports using the PyBEL schema, which includes modifier information in the edges. As an alternative, this function can also distinguish between

Return type

None

Miscellaneous

This module contains IO functions for outputting BEL graphs to lossy formats, such as GraphML and CSV.

pybel.to_csv(graph, path, sep=None)[source]

Write the graph as a tab-separated edge list.

The resulting file will contain the following columns:

  1. Source BEL term

  2. Relation

  3. Target BEL term

  4. Edge data dictionary

See the Data Models section of the documentation for which data are stored in the edge data dictionary, such as queryable information about transforms on the subject and object and their associated metadata.

Return type

None

pybel.to_sif(graph, path, sep=None)[source]

Write the graph as a tab-separated SIF file.

The resulting file will contain the following columns:

  1. Source BEL term

  2. Relation

  3. Target BEL term

This format is simple and can be used readily with many applications, but is lossy in that it does not include relation metadata.

Return type

None
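
The three-column layout described above can be illustrated with a minimal sketch that uses only the standard library. The triples are made up for illustration, and the code shows the shape of a tab-separated file rather than what pybel.to_sif() does internally.

```python
import csv
import io

# Made-up triples for illustration; real rows come from the edges of a BEL graph.
triples = [
    ('p(HGNC:A)', 'increases', 'p(HGNC:B)'),
    ('p(HGNC:B)', 'decreases', 'a(CHEBI:C)'),
]

# Write one line per triple: source term, relation, target term, separated by tabs.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerows(triples)
sif_text = buffer.getvalue()
```

Each line holds only the source, relation, and target, which is why evidence, citations, and other edge metadata are lost in this format.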

pybel.to_gsea(graph, path)[source]

Write the genes/gene products to a GRP file for use with Gene Set Enrichment Analysis (GSEA).

Return type

None