Input and Output
Input and output functions for BEL graphs.
PyBEL provides multiple lossless interchange options for BEL. Lossy output formats are also included for convenient export to other programs. Notably, a de facto interchange using Resource Description Framework (RDF) to match the ability of other existing software is excluded due to the immaturity of the BEL to RDF mapping.
- pybel.load(path, **kwargs)[source]
Read a BEL graph.
- Parameters
path (str) – The path to a BEL graph in any of the formats with extensions described below
kwargs – The keyword arguments are passed to the importer function
- Return type
- Returns
A BEL graph.
This is the universal loader, which means any file path can be given and PyBEL will look up the appropriate load function. Allowed extensions are:
bel
bel.nodelink.json
bel.cx.json
bel.jgif.json
The previous extensions also support gzipping. Other allowed extensions that don’t support gzip are:
bel.pickle / bel.gpickle / bel.pkl
indra.json
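The extension lookup described above can be pictured with a small sketch. This is not PyBEL's implementation: the helper name guess_format and the two tuples are hypothetical, and only the extensions themselves come from the lists above.

```python
# Extensions taken from the lists above; the grouping into two tuples
# and the helper itself are assumptions made for illustration.
GZIP_CAPABLE = ("bel.nodelink.json", "bel.cx.json", "bel.jgif.json", "bel")
PLAIN_ONLY = ("bel.pickle", "bel.gpickle", "bel.pkl", "indra.json")


def guess_format(path):
    """Return (extension, is_gzipped) for a path, or raise ValueError."""
    name = path.lower()
    # Gzipped variants are only recognized for the formats that support them.
    for ext in GZIP_CAPABLE:
        if name.endswith("." + ext + ".gz"):
            return ext, True
    for ext in GZIP_CAPABLE + PLAIN_ONLY:
        if name.endswith("." + ext):
            return ext, False
    raise ValueError("unrecognized extension: " + path)


print(guess_format("graph.bel.nodelink.json.gz"))
print(guess_format("graph.bel.pickle"))
```

A universal loader would then map the returned extension to the matching import function.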
- pybel.dump(graph, path, **kwargs)[source]
Write a BEL graph.
- Parameters
This is the universal writer, which means any file path can be given and PyBEL will look up the appropriate writer function. Allowed extensions are:
bel
bel.nodelink.json
bel.unodelink.json
bel.cx.json
bel.jgif.json
bel.graphdati.json
The previous extensions also support gzipping. Other allowed extensions that don’t support gzip are:
bel.pickle / bel.gpickle / bel.pkl
indra.json
tsv
gsea
- Return type
Import
Parsing Modes
The PyBEL parser has several modes that can be enabled and disabled. They are described below.
Allow Naked Names
By default, this is set to False. The parser does not allow identifiers that are not qualified with namespaces (naked names), like in p(YFG). A proper namespace, like p(HGNC:YFG), must be used. By setting this to True, the parser becomes permissive to naked names. In general, this is bad practice and this feature will be removed in the future.
Allow Nested
By default, this is set to False. The parser does not allow nested statements. See overview. By setting this to True, the parser will accept nested statements one level deep.
Citation Clearing
By default, this is set to True. While the BEL specification clearly states how the language should be used as a state machine, many BEL documents do not conform to the strict SET/UNSET rules. To guard against annotations accidentally carried from one set of statements to the next, the parser has two modes. By default, in citation clearing mode, when a SET CITATION command is reached, it will clear all other annotations (except the STATEMENT_GROUP, which has higher priority). This behavior can be disabled by setting this to False to re-enable strict parsing.
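The difference between the two modes can be sketched with a toy annotation tracker. This is a simplified stand-in, not the PyBEL parser: the function track_annotations, its citation_clearing argument, and the reduced SET/UNSET line syntax are all assumptions for illustration.

```python
def track_annotations(lines, citation_clearing=True):
    """Yield (statement, annotations) pairs from a simplified BEL-like script."""
    annotations = {}
    for line in lines:
        line = line.strip()
        if line.startswith("SET "):
            key, _, value = line[4:].partition(" = ")
            if citation_clearing and key.upper() == "CITATION":
                # Drop everything except the statement group before
                # recording the new citation.
                annotations = {
                    k: v for k, v in annotations.items() if k == "STATEMENT_GROUP"
                }
            annotations[key] = value
        elif line.startswith("UNSET "):
            annotations.pop(line[6:].strip(), None)
        elif line:
            # Anything else is treated as a statement in this toy model.
            yield line, dict(annotations)


script = [
    'SET STATEMENT_GROUP = "Group 1"',
    'SET Citation = {"PubMed", "1"}',
    'SET Tissue = "liver"',
    "p(HGNC:A) -> p(HGNC:B)",
    'SET Citation = {"PubMed", "2"}',
    "p(HGNC:C) -> p(HGNC:D)",
]

cleared = list(track_annotations(script, citation_clearing=True))
strict = list(track_annotations(script, citation_clearing=False))
```

In this sketch, the second statement keeps the Tissue annotation only in strict mode, because the script never issued an UNSET for it.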
Reference
- pybel.from_bel_script(path: Union[str, TextIO], **kwargs) pybel.struct.graph.BELGraph [source]
Load a BEL graph from a file resource. This function is a thin wrapper around from_lines().
The remaining keyword arguments are passed to pybel.io.line_utils.parse_lines(), which populates a BELGraph.
- Return type
- pybel.from_bel_script_url(url, **kwargs)[source]
Load a BEL graph from a URL resource.
- Parameters
url (str) – A valid URL pointing to a BEL document
The remaining keyword arguments are passed to pybel.io.line_utils.parse_lines().
- Return type
Hetionet
Importer for Hetionet JSON.
Transport
All transport pairs are reflective and data-preserving.
Bytes
Conversion functions for BEL graphs with bytes and Python pickles.
- pybel.from_bytes(bytes_graph, check_version=True)[source]
Read a graph from bytes (the result of pickling the graph).
- pybel.to_bytes(graph, protocol=5)[source]
Convert a graph to bytes with pickle.
Note that the pickle module has some incompatibilities between Python 2 and 3. To export a universally importable pickle, choose 0, 1, or 2.
- Parameters
- Return type
- pybel.from_bytes_gz(bytes_graph)[source]
Read a graph from gzipped bytes (the result of pickling the graph).
- pybel.from_pickle(path: Union[str, BinaryIO], check_version: bool = True) pybel.struct.graph.BELGraph [source]
Read a graph from a pickle file.
- pybel.to_pickle(graph: pybel.struct.graph.BELGraph, path: Union[str, BinaryIO], protocol: int = 5) None [source]
Write this graph to a pickle file.
Note that the pickle module has some incompatibilities between Python 2 and 3. To export a universally importable pickle, choose 0, 1, or 2.
- Parameters
- Return type
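The effect of the protocol argument, and of gzipping the resulting bytes, can be sketched with the standard library alone. A plain dictionary stands in for a BEL graph here; that stand-in is an assumption for illustration, and the sketch does not call any PyBEL function.

```python
import gzip
import pickle

# A plain dictionary stands in for a BEL graph (assumption for illustration).
graph_like = {
    "name": "example",
    "edges": [("p(HGNC:A)", "increases", "p(HGNC:B)")],
}

# Protocol 2 is the highest protocol that Python 2 can also read.
raw = pickle.dumps(graph_like, protocol=2)

# Gzipping the pickled bytes mirrors the idea behind the *_gz functions.
compressed = gzip.compress(raw)

# Reverse both steps to recover the original object.
restored = pickle.loads(gzip.decompress(compressed))
```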
Node-Link JSON
Conversion functions for BEL graphs with node-link JSON.
- pybel.from_nodelink(graph_json_dict, check_version=True)[source]
Build a graph from node-link JSON Object.
- Return type
- pybel.from_nodelink_jsons(graph_json_str, check_version=True)[source]
Read a BEL graph from a node-link JSON string.
- Return type
- pybel.to_nodelink_jsons(graph, **kwargs)[source]
Dump this graph as a node-link JSON object to a string.
- Return type
- pybel.from_nodelink_file(path: Union[str, TextIO], check_version: bool = True) pybel.struct.graph.BELGraph [source]
Build a graph from the node-link JSON contained in the given file.
- pybel.to_nodelink_file(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO], **kwargs) None [source]
Write this graph as node-link JSON to a file.
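As a rough picture of what "node-link" means, the sketch below builds a generic node-link document with the standard library. The key names (nodes, links, source, target, relation) follow the common node-link convention and are assumptions; the exact schema written by PyBEL may differ.

```python
import json

nodes = ["p(HGNC:A)", "p(HGNC:B)"]
edges = [("p(HGNC:A)", "p(HGNC:B)", {"relation": "increases"})]

# Links refer to nodes by their position in the node list.
index = {node: i for i, node in enumerate(nodes)}
document = {
    "nodes": [{"id": node} for node in nodes],
    "links": [
        {"source": index[u], "target": index[v], **data}
        for u, v, data in edges
    ],
}

# Serialize to a string and read it back.
text = json.dumps(document, indent=2)
round_tripped = json.loads(text)
```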
Streamable BEL (JSONL)
Streamable BEL as JSON.
- pybel.from_sbel(it, includes_metadata=True)[source]
Load a BEL graph from an iterable of dictionaries corresponding to lines in BEL JSONL.
- pybel.to_sbel(graph)[source]
Create a list of JSON dictionaries corresponding to lines in BEL JSONL.
- pybel.from_sbel_file(path: Union[str, TextIO]) pybel.struct.graph.BELGraph [source]
Build a graph from the BEL JSONL contained in the given file.
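JSONL stores one JSON document per line, which is what makes the format streamable. The sketch below shows the general pattern with the standard library; the record fields and the helpers write_jsonl and read_jsonl are hypothetical and do not reflect the exact dictionaries PyBEL emits.

```python
import io
import json

# Hypothetical edge records; the real field names may differ.
records = [
    {"source": "p(HGNC:A)", "relation": "increases", "target": "p(HGNC:B)"},
    {"source": "p(HGNC:B)", "relation": "decreases", "target": "p(HGNC:C)"},
]


def write_jsonl(records, file):
    """Write one JSON document per line."""
    for record in records:
        file.write(json.dumps(record) + "\n")


def read_jsonl(file):
    """Lazily yield one dictionary per non-empty line."""
    for line in file:
        if line.strip():
            yield json.loads(line)


buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)
loaded = list(read_jsonl(buffer))
```

Because read_jsonl is a generator, a consumer can process each line as it arrives instead of loading the whole file first.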
Cyberinfrastructure Exchange
This module wraps conversion between pybel.BELGraph and the Cyberinfrastructure Exchange (CX) JSON.
CX is an aspect-oriented network interchange format encoded in JSON with a format inspired by the JSON-LD encoding of Resource Description Framework (RDF). It is primarily used by the Network Data Exchange (NDEx) and more recent versions of Cytoscape.
See also
The NDEx Data Model Specification
CX Support for Cytoscape.js on the Cytoscape App Store
- pybel.to_cx_jsons(graph, **kwargs)[source]
Dump this graph as a CX JSON object to a string.
- Return type
- pybel.from_cx_file(path: Union[str, TextIO]) pybel.struct.graph.BELGraph [source]
Read a file containing CX JSON and convert it to a BEL graph.
- pybel.to_cx_file(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO], indent: Optional[int] = 2, **kwargs) None [source]
Write a BEL graph to a JSON file in CX format.
- Parameters
The example below shows how to output a BEL graph as CX to an open file.
from pybel.examples import sialic_acid_graph
from pybel import to_cx_file

with open('graph.bel.cx.json', 'w') as file:
    to_cx_file(sialic_acid_graph, file)
The example below shows how to output a BEL graph as CX to a file at a given path.
from pybel.examples import sialic_acid_graph
from pybel import to_cx_file

to_cx_file(sialic_acid_graph, 'graph.bel.cx.json')
If you have a big graph, you might consider storing it as a gzipped CX file by using to_cx_gz().
- Return type
JSON Graph Interchange Format
Conversion functions for BEL graphs with JGIF JSON.
The JSON Graph Interchange Format (JGIF) is specified similarly to the Node-Link JSON. Interchange with this format provides compatibility with other software and repositories, such as the Causal Biological Network Database.
- pybel.from_jgif(graph_jgif_dict, parser_kwargs=None)[source]
Build a BEL graph from a JGIF JSON object.
- pybel.to_jgif(graph)[source]
Build a JGIF dictionary from a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Returns
A JGIF dictionary
- Return type
Warning
Untested! This format is not general purpose, so little time has been invested in it. If you want to use Cytoscape.js, we suggest using pybel.to_cx() instead.
The example below shows how to output a BEL graph as a JGIF dictionary.
import pybel
from pybel.examples import sialic_acid_graph

graph_jgif_json = pybel.to_jgif(sialic_acid_graph)
If you want to write the graph directly to a file as JGIF, see to_jgif_file().
- pybel.from_jgif_jsons(graph_json_str)[source]
Read a BEL graph from a JGIF JSON string.
- Return type
- pybel.to_jgif_jsons(graph, **kwargs)[source]
Dump this graph as a JGIF JSON object to a string.
- Return type
- pybel.from_jgif_file(path: Union[str, TextIO]) pybel.struct.graph.BELGraph [source]
Build a graph from the JGIF JSON contained in the given file.
- pybel.to_jgif_file(graph: pybel.struct.graph.BELGraph, file: Union[str, TextIO], **kwargs) None [source]
Write JGIF to a file.
The example below shows how to output a BEL graph as JGIF to an open file.
from pybel.examples import sialic_acid_graph
from pybel import to_jgif_file

with open('graph.bel.jgif.json', 'w') as file:
    to_jgif_file(sialic_acid_graph, file)
The example below shows how to output a BEL graph as JGIF to a file at a given path.
from pybel.examples import sialic_acid_graph
from pybel import to_jgif_file

to_jgif_file(sialic_acid_graph, 'graph.bel.jgif.json')
If you have a big graph, you might consider storing it as a gzipped JGIF file by using to_jgif_gz().
- Return type
- pybel.to_jgif_gz(graph, path, **kwargs)[source]
Write a graph as JGIF JSON to a gzip file.
- Return type
- pybel.from_cbn_jgif(graph_jgif_dict)[source]
Build a BEL graph from CBN JGIF.
Map the JGIF used by the Causal Biological Network Database to standard namespaces and annotations, then build a BEL graph using pybel.from_jgif().
- Parameters
graph_jgif_dict (dict) – The JSON object representing the graph in JGIF format
- Return type
Example:
import requests
from pybel import from_cbn_jgif

apoptosis_url = 'http://causalbionet.com/Networks/GetJSONGraphFile?networkId=810385422'
graph_jgif_dict = requests.get(apoptosis_url).json()
graph = from_cbn_jgif(graph_jgif_dict)
Warning
Handling the annotations is not yet supported, since the CBN documents do not refer to the resources used to create them. This may be added in the future, but the annotations must be stripped from the graph before uploading to the network store using pybel.struct.mutation.strip_annotations().
GraphDati
Conversion functions for BEL graphs with GraphDati.
Note that these are not exact I/O - you can’t currently use them as a round trip because the input functions expect the GraphDati format that’s output by BioDati.
- pybel.to_graphdati(graph, *, use_identifiers=True, skip_unqualified=True, use_tqdm=False, metadata_extras=None)[source]
Export a GraphDati list using the nanopub.
- Parameters
graph – A BEL graph
use_identifiers (bool) – Use OBO-style identifiers
use_tqdm (bool) – Show a progress bar while generating nanopubs
skip_unqualified (bool) – Should unqualified edges be output as nanopubs? Defaults to false.
metadata_extras (Optional[Mapping[str, Any]]) – Extra information to pass into the metadata part of nanopubs
- Return type
- pybel.from_graphdati(j, use_tqdm=True)[source]
Convert data from the “normal” network format.
Warning
BioDati crashes when requesting the full network format, so this isn’t yet explicitly supported.
- Return type
- pybel.to_graphdati_file(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO], use_identifiers: bool = True, **kwargs) None [source]
Write this graph as GraphDati JSON to a file.
- pybel.to_graphdati_gz(graph, path, **kwargs)[source]
Write a graph as GraphDati JSON to a gzip file.
- Return type
- pybel.to_graphdati_jsons(graph, **kwargs)[source]
Dump this graph as a GraphDati JSON object to a string.
- pybel.from_graphdati_jsons(s)[source]
Load a graph from a GraphDati JSON string.
- Parameters
s – A GraphDati JSON string
- Return type
INDRA
Conversion functions for BEL graphs with INDRA.
After assembling a model with INDRA, a list of indra.statements.Statement can be converted to a pybel.BELGraph with indra.assemblers.pybel.PybelAssembler.
from indra.assemblers.pybel import PybelAssembler
import pybel

stmts = [
    # A list of INDRA statements
]
pba = PybelAssembler(
    stmts,
    name='Graph Name',
    version='0.0.1',
    description='Graph Description',
)
graph = pba.make_model()

# Write to BEL file
pybel.to_bel_path(graph, 'simple_pybel.bel')
Warning
These functions are hard to unit test because they rely on a whole set of Java dependencies, and they will likely remain untested for a while.
- pybel.from_indra_statements(stmts, name=None, version=None, description=None, authors=None, contact=None, license=None, copyright=None, disclaimer=None)[source]
Import a model from indra.
- Parameters
stmts (List[indra.statements.Statement]) – A list of statements
version (Optional[str]) – The graph’s version. Recommended to use semantic versioning or YYYYMMDD format.
- Return type
- pybel.from_indra_statements_json(stmts_json, **kwargs)[source]
Get a BEL graph from INDRA statements JSON.
- Return type
Other kwargs are passed to from_indra_statements().
- pybel.from_indra_statements_json_file(file, **kwargs)[source]
Get a BEL graph from INDRA statements JSON file.
- Return type
Other kwargs are passed to from_indra_statements().
- pybel.to_indra_statements(graph)[source]
Export this graph as a list of INDRA statements using the indra.sources.pybel.PybelProcessor.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
list[indra.statements.Statement]
- pybel.to_indra_statements_json(graph)[source]
Export this graph as INDRA JSON list.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- pybel.to_indra_statements_json_file(graph, path: Union[str, TextIO], indent: Optional[int] = 2, **kwargs)[source]
Export this graph as INDRA statement JSON.
- Parameters
graph (pybel.BELGraph) – A BEL graph
Other kwargs are passed to json.dump().
- pybel.from_biopax(path, encoding=None, **kwargs)[source]
Import a model encoded in Pathway Commons BioPAX via indra.
- Parameters
path (str) – Path to a BioPAX OWL file
encoding (Optional[str]) – The encoding passed to indra.sources.biopax.process_owl(). See https://github.com/sorgerlab/indra/pull/1199.
- Return type
Other kwargs are passed to from_indra_statements().
Warning
Not compatible with all BioPAX! See INDRA documentation.
Visualization
Jupyter
Support for displaying BEL graphs in Jupyter notebooks.
- pybel.to_jupyter(graph, width=1000, height=650, color_map=None)[source]
Display a BEL graph inline in a Jupyter notebook.
To use successfully, make sure this is run as the last statement in a cell inside a Jupyter notebook.
- Parameters
graph (BELGraph) – A BEL graph
width (int) – The width of the visualization window to render
height (int) – The height of the visualization window to render
color_map (Optional[Mapping[str, str]]) – A dictionary from PyBEL internal node functions to CSS color strings like #FFEE00. Defaults to default_color_map
- Returns
An IPython notebook Javascript object
- Return type
IPython.display.Javascript
Analytical Services
PyNPA
Exporter for PyNPA.
See also
- pybel.to_npa_directory(graph, directory, **kwargs)[source]
Write the BEL file to two files in the directory for pynpa.
- Return type
- pybel.to_npa_dfs(graph, cartesian_expansion=False, nomenclature_method_first_layer=None, nomenclature_method_second_layer=None, direct_tf_only=False)[source]
Export the BEL graph as two lists of triples for pynpa.
- Parameters
graph (BELGraph) – A BEL graph
cartesian_expansion (bool) – If true, applies cartesian expansion on both reactions (reactants x products) as well as list abundances using list_abundance_cartesian_expansion() and reaction_cartesian_expansion()
nomenclature_method_first_layer (Optional[str]) – Either “curie”, “name” or “inodes”. Defaults to “curie”.
nomenclature_method_second_layer (Optional[str]) – Either “curie”, “name” or “inodes”. Defaults to “curie”.
Pick out all transcription factor relationships. Protein X is a transcription factor for gene Y IFF
complex(p(X), g(Y)) -> r(Y)
Get all other interactions between any gene/rna/protein that are directed causal for the PPI layer
- Return type
Tuple[DataFrame, DataFrame]
HiPathia
Convert a BEL graph to HiPathia inputs.
Input
SIF File
Text file with three columns separated by tabs.
Each row represents an interaction in the pathway. First column is the source node, third column the target node, and the second is the type of relation between them.
Only activation and inhibition interactions are allowed.
The name of the nodes in this file will be stored as the IDs of the nodes.
The nodes IDs should have the following structure: N (dash) pathway ID (dash) node ID.
HiPathia distinguishes between two types of nodes: simple and complex.
Simple nodes:
Simple nodes may include many genes, but only one is needed to perform the function of the node. This could correspond to a protein family of enzymes that all have the same function - only one of them needs to be present for the action to take place. Simple nodes are defined within
Node IDs from simple nodes do not include any space, i.e. N-hsa04370-11.
Complex nodes:
Complex nodes include different simple nodes and represent protein complexes. Each simple node within the complex represents one protein in the complex. This node requires the presence of all their simple nodes to perform its function.
Node IDs from complex nodes are the juxtaposition of the included simple node IDs, separated by spaces, i.e. N-hsa04370-10 26.
ATT File
Text file with twelve (12) columns separated by tabs. Each row represents a node (either simple or complex).
The columns included are:
ID: Node ID as explained above.
label: Name to be shown in the picture of the pathway, in HGNC. Generally, the gene name of the first included EntrezID gene is used as label. For complex nodes, we juxtapose the gene names of the first genes of each simple node included (see genesList column below).
X: The X-coordinate of the position of the node in the pathway.
Y: The Y-coordinate of the position of the node in the pathway.
color: The default color of the node.
shape: The shape of the node. “rectangle” should be used for genes and “circle” for metabolites.
type: The type of the node, either “gene” for genes or “compound” for metabolites. For complex nodes, the type of each of their included simple nodes is juxtaposed separated by commas, i.e. gene,gene.
label.cex: Amount by which plotting label should be scaled relative to the default.
label.color: Default color of the node.
width: Default width of the node.
height: Default height of the node.
genesList: List of genes included in each node, with EntrezID:
Simple nodes: EntrezIDs of the genes included, separated by commas (“,”) and no spaces, i.e. 56848,8877 for node N-hsa04370-11.
Complex nodes: GenesList of the simple nodes included, separated by a slash (“/”) and no spaces, and in the same order as in the node ID. For example, node N-hsa04370-10 26 includes two simple nodes: 10 and 26. Its genesList column is 5335,5336,/,9047, meaning that the genes included in node 10 are 5335 and 5336, and the gene included in node 26 is 9047.
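The naming rules for node IDs and the genesList column can be sketched as small string helpers. The helper names are hypothetical and this is not how PyBEL builds the files; the "activation" label in the SIF row is also an assumption about how the relation column is spelled.

```python
import csv
import io


def simple_node_id(pathway_id, node_id):
    """Build an ID like N-hsa04370-11 for a simple node."""
    return "N-{}-{}".format(pathway_id, node_id)


def complex_node_id(pathway_id, node_ids):
    """Join the simple node IDs with spaces, like N-hsa04370-10 26."""
    return "N-{}-{}".format(pathway_id, " ".join(node_ids))


def simple_genes_list(entrez_ids):
    """Comma-separated EntrezIDs without spaces."""
    return ",".join(entrez_ids)


def complex_genes_list(genes_per_simple_node):
    """Join each simple node's genes list with a slash, in node ID order."""
    return ",/,".join(simple_genes_list(genes) for genes in genes_per_simple_node)


# One SIF row: source node, relation, target node, separated by tabs.
sif = io.StringIO()
writer = csv.writer(sif, delimiter="\t", lineterminator="\n")
writer.writerow([
    simple_node_id("hsa04370", "11"),
    "activation",  # assumed spelling of the relation label
    complex_node_id("hsa04370", ["10", "26"]),
])

print(complex_genes_list([["5335", "5336"], ["9047"]]))
```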
- pybel.to_hipathia(graph, directory, draw=True)[source]
Export HiPathia artifacts for the graph.
- Return type
- pybel.to_hipathia_dfs(graph, draw_directory=None)[source]
Get the ATT and SIF dataframes.
- Parameters
Identify nodes:
1. Identify all proteins
2. Identify all protein families
3. Identify all complexes with just a protein or a protein family in them
Identify interactions between any of those things that are causal
Profit!
SPIA
An exporter for signaling pathway impact analysis (SPIA) described by [Tarca2009].
- Tarca2009
Tarca, A. L., et al (2009). A novel signaling pathway impact analysis. Bioinformatics, 25(1), 75–82.
PyKEEN
Entry points for PyKEEN.
PyKEEN is a machine learning library for knowledge graph embeddings that supports node clustering, link prediction, entity disambiguation, question/answering, and other tasks with knowledge graphs. It provides an interface for registering plugins using Python’s entrypoints under the pykeen.triples.extension_importer and pykeen.triples.prefix_importer groups. More specific information about how the PyBEL plugins are loaded into PyKEEN can be found in PyBEL’s setup.cfg under the [options.entry_points] header.
The following example shows how you can parse/load the triples from a BEL document with the *.bel extension.
from urllib.request import urlretrieve

url = 'https://raw.githubusercontent.com/cthoyt/selventa-knowledge/master/selventa_knowledge/small_corpus.bel'
urlretrieve(url, 'small_corpus.bel')

# Example 1A: Make triples factory
from pykeen.triples import TriplesFactory
tf = TriplesFactory(path='small_corpus.bel')

# Example 1B: Use directly in the pipeline, which automatically invokes training/testing set stratification
from pykeen.pipeline import pipeline
results = pipeline(
    dataset='small_corpus.bel',
    model='TransE',
)
The same is true for precompiled BEL documents in the node-link format with the *.bel.nodelink.json extension and the pickle format with the *.bel.pickle extension.
The following example shows how you can load/parse the triples from a BEL document stored in BEL Commons using the bel-commons prefix in combination with the network’s identifier.
# Example 2A: Make a triples factory
from pykeen.triples import TriplesFactory
# the network's identifier is 528
tf = TriplesFactory(path='bel-commons:528')

# Example 2B: Use directly in the pipeline, which automatically invokes training/testing set stratification
from pykeen.pipeline import pipeline
results = pipeline(
    dataset='bel-commons:528',
    model='TransR',
)
Currently, this relies on the default BEL Commons service provider at https://bel-commons-dev.scai.fraunhofer.de, whose location might change in the future.
- pybel.io.pykeen.get_triples_from_bel(path)[source]
Get triples from a BEL file by wrapping pybel.io.tsv.api.get_triples().
- Parameters
path (str) – the file path to a BEL Script
- Return type
ndarray
- Returns
A three column array with head, relation, and tail in each row
- pybel.io.pykeen.get_triples_from_bel_nodelink(path)[source]
Get triples from a BEL Node-link JSON file by wrapping pybel.io.tsv.api.get_triples().
- Parameters
path (str) – the file path to a BEL Node-link JSON file
- Return type
ndarray
- Returns
A three column array with head, relation, and tail in each row
- pybel.io.pykeen.get_triples_from_bel_pickle(path)[source]
Get triples from a BEL pickle file by wrapping pybel.io.tsv.api.get_triples().
- Parameters
path (str) – the file path to a BEL pickle file
- Return type
ndarray
- Returns
A three column array with head, relation, and tail in each row
- pybel.io.pykeen.get_triples_from_bel_commons(network_id)[source]
Load a BEL document from BEL Commons by wrapping pybel.io.tsv.api.get_triples().
- Parameters
network_id (str) – The network identifier for a graph in BEL Commons
- Return type
ndarray
- Returns
A three column array with head, relation, and tail in each row
Machine Learning
Export functions for Machine Learning.
While BEL is a fantastic medium for storing metadata and high granularity information on edges, machine learning algorithms can not consume BEL graphs directly. This module provides functions that make inferences and interpretations of BEL graphs in order to interface with machine learning platforms. One example where we’ve done this is BioKEEN, which uses this module to convert BEL graphs into a format for knowledge graph embeddings.
- pybel.to_triples(graph, use_tqdm=False, raise_on_none=False)[source]
Get a non-redundant list of triples representing the graph.
- pybel.to_triples_file(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO], *, use_tqdm: bool = False, sep='\t', raise_on_none: bool = False) None [source]
Write the graph as a TSV.
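The shape of the output can be sketched with the standard library: each row holds a head, a relation, and a tail, and duplicates are dropped. The helper write_triples and the sample triples are hypothetical; this does not reproduce how PyBEL derives triples from a graph.

```python
import csv
import io

# Hypothetical (head, relation, tail) triples, with one duplicate.
triples = [
    ("HGNC:A", "increases", "HGNC:B"),
    ("HGNC:B", "decreases", "HGNC:C"),
    ("HGNC:A", "increases", "HGNC:B"),
]


def write_triples(triples, file, sep="\t"):
    """Write each distinct triple once, keeping first-seen order."""
    writer = csv.writer(file, delimiter=sep, lineterminator="\n")
    seen = set()
    for triple in triples:
        if triple in seen:
            continue
        seen.add(triple)
        writer.writerow(triple)


buffer = io.StringIO()
write_triples(triples, buffer)
rows = buffer.getvalue().splitlines()
```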
Web Services
BEL Commons
Transport functions for BEL Commons.
BEL Commons is a free, open-source platform for hosting BEL content. Because it was originally developed and published in an academic capacity at Fraunhofer SCAI, a public instance can be found at https://bel-commons-dev.scai.fraunhofer.de. However, this instance is only kept online for posterity and will not be updated. If you would like to host your own instance of BEL Commons, there are instructions on its GitHub page.
- pybel.from_bel_commons(network_id, host=None)[source]
Retrieve a public network from BEL Commons.
In the future, this function may be extended to support authentication.
- Parameters
- Raises
ValueError if host configuration can not be found
- Return type
- pybel.to_bel_commons(graph, host=None, user=None, password=None, public=True)[source]
Send a graph to the receiver service and return the requests response object.
- Parameters
graph (BELGraph) – A BEL graph
host (Optional[str]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST or the environment as PYBEL_REMOTE_HOST.
user (Optional[str]) – Username for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_USER or the environment as PYBEL_REMOTE_USER
password (Optional[str]) – Password for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_PASSWORD or the environment as PYBEL_REMOTE_PASSWORD
public (bool) – Should the network be made public?
- Return type
Response
- Returns
The response object from requests
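The fallback described for host, user, and password can be sketched as follows. The helper resolve_host is hypothetical, and the order in which the explicit argument, the config, and the environment are consulted is an assumption; only the PYBEL_REMOTE_HOST name and the ValueError behavior come from the text above.

```python
import os


def resolve_host(host=None, config=None, environ=None):
    """Return the first configured host, or raise ValueError if none is found."""
    config = {} if config is None else config
    environ = os.environ if environ is None else environ
    candidates = (
        host,                                  # explicit argument
        config.get("PYBEL_REMOTE_HOST"),       # stand-in for the PyBEL config
        environ.get("PYBEL_REMOTE_HOST"),      # environment variable
    )
    for candidate in candidates:
        if candidate:
            return candidate
    raise ValueError("no BEL Commons host could be found")


print(resolve_host(environ={"PYBEL_REMOTE_HOST": "https://example.org"}))
```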
Amazon Simple Storage Service (S3)
Transport functions for Amazon Web Services (AWS).
AWS has a cloud-based file storage service called S3 that can be programmatically accessed using the boto3 package. This module provides functions for quickly wrapping upload/download of BEL graphs using the gzipped Node-Link schema.
- pybel.to_s3(graph, *, bucket, key, client=None)[source]
Save BEL to S3 as gzipped node-link JSON.
If you don’t specify an instantiated client, PyBEL will do its best to load a default one using boto3.client() like in the following example:
import pybel
from pybel.examples import sialic_acid_graph

graph = pybel.to_s3(
    sialic_acid_graph,
    bucket='your bucket',
    key='your file name.bel.nodelink.json.gz',
)
However, if you would like to configure your own, you can do it with something like this:
import boto3
s3_client = boto3.client('s3')

import pybel
from pybel.examples import sialic_acid_graph

graph = pybel.to_s3(
    sialic_acid_graph,
    client=s3_client,
    bucket='your bucket',
    key='your file name.bel.nodelink.json.gz',
)
Warning
This assumes you already have credentials set up on your machine.
If you don’t already have a bucket, you can create one using boto3 by following this tutorial: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-example-creating-buckets.html
- Return type
- pybel.from_s3(*, bucket, key, client=None)[source]
Get BEL from gzipped node-link JSON from Amazon S3.
If you don’t specify an instantiated client, PyBEL will do its best to load a default one using boto3.client() like in the following example:
graph = pybel.from_s3(bucket='your bucket', key='your file name.bel.nodelink.json.gz')
However, if you would like to configure your own, you can do it with something like this:
import boto3
s3_client = boto3.client('s3')

import pybel

graph = pybel.from_s3(
    client=s3_client,
    bucket='your bucket',
    key='your file name.bel.nodelink.json.gz',
)
- Return type
BioDati
Transport functions for BioDati.
BioDati is a paid, closed-source platform for hosting BEL content. However, they do have a demo instance running at https://studio.demo.biodati.com with which the examples in this module will be described.
As noted in the transport functions for BioDati, you should change the URLs to point to your own instance of BioDati. If you’re looking for an open source storage system for hosting your own BEL content, you may consider BEL Commons, with the caveat that it is currently maintained in an academic capacity. Disclosure: BEL Commons is developed by the developers of PyBEL.
- pybel.to_biodati(graph, *, username='demo@biodati.com', password='demo', base_url='https://nanopubstore.demo.biodati.com', chunksize=None, use_tqdm=True, collections=None, overwrite=False, validate=True, email=False)[source]
Post this graph to a BioDati server.
- Parameters
graph (BELGraph) – A BEL graph
username (str) – The email address to log in to BioDati. Defaults to “demo@biodati.com” for the demo server
password (str) – The password to log in to BioDati. Defaults to “demo” for the demo server
base_url (str) – The BioDati nanopub store base url. Defaults to “https://nanopubstore.demo.biodati.com” for the demo server’s nanopub store
chunksize (Optional[int]) – The number of nanopubs to post at a time. By default, does all.
use_tqdm (bool) – Should tqdm be used when iterating?
collections (Optional[Iterable[str]]) – Tags to add to the nanopubs for lookup on BioDati
overwrite (bool) – Set the BioDati upload “overwrite” setting
validate (bool) – Set the BioDati upload “validate” setting
email (Union[bool, str]) – Who should get emailed with results about the upload? If true, emails to user used for login. If string, emails to that user. If false, no email.
- Return type
Response
- Returns
The response from the BioDati server (last response if using chunking)
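The chunking behavior implied by chunksize can be sketched with a small generator. The helper chunked is hypothetical and makes no network requests; it only shows how a list of nanopubs might be split into batches, with a single batch when chunksize is None.

```python
from itertools import islice


def chunked(items, chunksize=None):
    """Yield lists of at most chunksize items; one list if chunksize is None."""
    iterator = iter(items)
    if chunksize is None:
        yield list(iterator)
        return
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


nanopubs = ["np1", "np2", "np3", "np4", "np5"]  # placeholder payloads
batches = list(chunked(nanopubs, chunksize=2))
```

With one request per batch, the response of the final batch would be the one returned to the caller.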
Warning
BioDati does not support large uploads (yet?).
Warning
The default public BioDati server has been put here. You should switch it to yours. It will look like https://nanopubstore.<YOUR NAME>.biodati.com.
- pybel.from_biodati(network_id, username='demo@biodati.com', password='demo', base_url='https://networkstore.demo.biodati.com')[source]
Get a graph from a BioDati network store based on its network identifier.
- Parameters
network_id (str) – The internal identifier of the network you want to download.
username (str) – The email address to log in to BioDati. Defaults to “demo@biodati.com” for the demo server
password (str) – The password to log in to BioDati. Defaults to “demo” for the demo server
base_url (str) – The BioDati network store base url. Defaults to “https://networkstore.demo.biodati.com” for the demo server’s network store
Example usage:
from pybel import from_biodati

network_id = '01E46GDFQAGK5W8EFS9S9WMH12'  # COVID-19 graph example from Wendy Zimmermann
graph = from_biodati(
    network_id=network_id,
    username='demo@biodati.com',
    password='demo',
    base_url='https://networkstore.demo.biodati.com',
)
graph.summarize()
Warning
The default public BioDati server has been put here. You should switch it to yours. It will look like https://networkstore.<YOUR NAME>.biodati.com.
- Return type
Fraunhofer OrientDB
Transport functions for Fraunhofer’s OrientDB.
Fraunhofer hosts an instance of OrientDB that contains BEL in a schema similar to pybel.io.umbrella_nodelink. However, they include custom relations that do not come from a controlled vocabulary, and have not made the schema, ETL scripts, or documentation available.
Unlike BioDati and BEL Commons, the Fraunhofer OrientDB does not allow for uploads, so only a single function pybel.from_fraunhofer_orientdb() is provided by PyBEL.
- pybel.from_fraunhofer_orientdb(database='covid', user='covid_user', password='covid', query=None)[source]
Get a BEL graph from the Fraunhofer OrientDB.
- Parameters
database (str) – The OrientDB database to connect to
user (str) – The user to connect to OrientDB
password (str) – The password to connect to OrientDB
query (Optional[str]) – The query to run. Defaults to the URL encoded version of select from E, where E is all edges in the OrientDB edge database. Likely does not need to be changed, except in the case of selecting specific subsets of edges. Make sure you URL encode it properly, because OrientDB’s RESTful API puts it in the URL’s path.
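URL encoding a query for use in a URL path can be done with the standard library. The sketch below encodes the default query named above; passing safe="" so that every reserved character is escaped is a choice made here, not something the PyBEL documentation specifies.

```python
from urllib.parse import quote, unquote

query = "select from E"

# Escape every character that is not safe inside a URL path segment.
encoded = quote(query, safe="")

# Decoding restores the original query text.
decoded = unquote(encoded)

print(encoded)
```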
By default, this function connects to the
covid
database, that corresponds to the COVID-19 Knowledge Graph 0. If other databases in the Fraunhofer OrientDB are published and demo username/password combinations are given, the following table will be updated.Database
Username
Password
covid
covid_user
covid
The
covid
database can be downloaded and converted to a BEL graph like this:import pybel graph = pybel.from_fraunhofer_orientdb( database='covid', user='covid_user', password='covid', ) graph.summarize()
However, because the source BEL scripts for the COVID-19 Knowledge Graph are available on GitHub and the authors pre-enabled it for PyBEL, it can be downloaded with

pip install git+https://github.com/covid19kg/covid19kg.git

and used with the following Python code:

import covid19kg

graph = covid19kg.get_graph()
graph.summarize()
Warning
It was initially planned to handle some of the non-standard relationships listed in the Fraunhofer OrientDB’s schema in their OrientDB Studio instance, but none of them actually appear in the only network that is accessible. If this changes, please leave an issue at https://github.com/pybel/pybel/issues so it can be addressed.
- 0
Domingo-Fernández, D., et al. (2020). COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology. bioRxiv 2020.04.14.040667.
- Return type
EMMAA
Ecosystem of Machine-maintained Models with Automated Analysis (EMMAA).
EMMAA is a project built on top of INDRA by the Sorger Lab at Harvard Medical School. It automatically builds knowledge graphs around pathways/indications periodically (almost daily) using the INDRA Database, which in turn is updated periodically (almost daily) with the most recent literature from MEDLINE, PubMed Central, several major publishers, and other bespoke text corpora such as CORD-19.
- pybel.from_emmaa(model, *, date=None, extension=None, suppress_warnings=False)[source]
Get an EMMAA model as a BEL graph.
Get the most recent COVID-19 model from EMMAA with the following:
import pybel

covid19_emmaa_graph = pybel.from_emmaa('covid19', extension='jsonl')
covid19_emmaa_graph.summarize()
PyBEL does its best to look up the most recent model, but if that doesn’t work, you can specify it explicitly with the date keyword argument in the form of %Y-%m-%d-%H-%M-%S, like in the following. Note that date is keyword-only in the signature above, so it must be passed by name:

import pybel

covid19_emmaa_graph = pybel.from_emmaa('covid19', date='2020-04-23-17-44-57', extension='jsonl')
covid19_emmaa_graph.summarize()
- Return type
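The %Y-%m-%d-%H-%M-%S form expected by the date keyword can be produced and checked with the standard datetime module. This is a small sketch: the constant and helper names are hypothetical, and only the format string itself comes from the documentation above.

```python
from datetime import datetime

# Format string taken from the documentation of the ``date`` keyword
EMMAA_DATE_FORMAT = "%Y-%m-%d-%H-%M-%S"


def format_emmaa_date(moment: datetime) -> str:
    """Render a datetime in the dash-separated form (hypothetical helper)."""
    return moment.strftime(EMMAA_DATE_FORMAT)


def is_valid_emmaa_date(text: str) -> bool:
    """Check that a string parses with the expected format (hypothetical helper)."""
    try:
        datetime.strptime(text, EMMAA_DATE_FORMAT)
    except ValueError:
        return False
    return True


date = format_emmaa_date(datetime(2020, 4, 23, 17, 44, 57))
print(date)  # 2020-04-23-17-44-57

# The resulting string would then be passed by keyword, for example:
# graph = pybel.from_emmaa('covid19', date=date, extension='jsonl')
```

Validating the string locally before the call gives an early, readable failure if a date was typed in another layout such as ISO 8601.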
Databases
SQL Databases
Conversion functions for BEL graphs with a SQL database.
- pybel.from_database(name, version=None, manager=None)[source]
Load a BEL graph from a database.
If name and version are given, finds it exactly with pybel.manager.Manager.get_network_by_name_version(). If just the name is given, finds most recent with pybel.manager.Manager.get_network_by_name_version().
Neo4j
Output functions for BEL graphs to Neo4j.
- pybel.to_neo4j(graph, neo_connection, use_tqdm=False)[source]
Upload a BEL graph to a Neo4j graph database using py2neo.
- Parameters
graph (pybel.BELGraph) – A BEL Graph
neo_connection (str or py2neo.Graph) – A py2neo connection object. Refer to the py2neo documentation for how to build this object.
Example Usage:
>>> import py2neo
>>> import pybel
>>> from pybel.examples import sialic_acid_graph
>>> neo_graph = py2neo.Graph("http://localhost:7474/db/data/")  # use your own connection settings
>>> pybel.to_neo4j(sialic_acid_graph, neo_graph)
Lossy Export
Umbrella Node-Link JSON
The Umbrella Node-Link JSON format is similar to node-link but uses full BEL terms as nodes.
Given a BEL statement describing that X phosphorylates Y, like act(p(X)) -> p(Y, pmod(Ph)), PyBEL usually stores the act() information about X as part of the relationship. In Umbrella mode, this stays as part of the node.
Note that this generates additional nodes in the network for each of the “modified” versions of the node. For example, act(p(X)) will be represented as an individual node instead of p(X), as in the standard node-link JSON exporter.
A user might want to use this exporter in the following scenarios:
Representing transitivity in activities, like in p(X, pmod(Ph)) -> act(p(X)) -> p(Y, pmod(Ph)) -> act(p(Y)), with four nodes that are more amenable to simulations (e.g., Boolean networks, Petri nets).
Visualizing networks in a similar way to the legacy BEL Cytoscape plugin from the BEL Framework (warning: now defunct) using tools like Cytoscape.
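The difference between the two representations can be illustrated with a toy model. This sketch does not use PyBEL and does not reproduce its JSON schema: BEL terms are plain strings, the "source_modifier" key is a hypothetical name, and the second statement, p(X) -> p(Z), is invented only to show where the extra node comes from.

```python
# Toy statements: (source term, source modifier or None, relation, target term)
statements = [
    ("p(X)", "act", "increases", "p(Y, pmod(Ph))"),  # act(p(X)) -> p(Y, pmod(Ph))
    ("p(X)", None, "increases", "p(Z)"),             # p(X) -> p(Z), invented example
]


def standard_edges(statements):
    """Keep the bare term as the node and store the modifier on the edge."""
    return [
        (source, target, {"relation": relation, "source_modifier": modifier})
        for source, modifier, relation, target in statements
    ]


def umbrella_edges(statements):
    """Fold the modifier into the node label so the full BEL term is the node."""
    edges = []
    for source, modifier, relation, target in statements:
        node = f"{modifier}({source})" if modifier else source
        edges.append((node, target, {"relation": relation}))
    return edges


def node_set(edges):
    """Collect every term that appears as a source or a target."""
    return {node for source, target, _ in edges for node in (source, target)}


standard_nodes = node_set(standard_edges(statements))
umbrella_nodes = node_set(umbrella_edges(statements))

print(len(standard_nodes))  # 3
print(len(umbrella_nodes))  # 4
```

In the toy standard form, p(X) is shared by both statements and the activity lives in the edge data. In the toy umbrella form, act(p(X)) becomes a node of its own next to p(X), which is the additional node described above.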
GraphML
Conversion functions for BEL graphs with GraphML.
- pybel.to_graphml(graph, path, schema=None)[source]
Write a graph to a GraphML XML file using networkx.write_graphml().
- Parameters
The .graphml file extension is suggested so Cytoscape can recognize it. By default, this function exports using the PyBEL schema of including modifier information into the edges. As an alternative, this function can also distinguish between
- Return type
Miscellaneous
This module contains IO functions for outputting BEL graphs to lossy formats, such as GraphML and CSV.
- pybel.to_csv(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO], sep: Optional[str] = None) -> None [source]
Write the graph as a tab-separated edge list.
The resulting file will contain the following columns:
Source BEL term
Relation
Target BEL term
Edge data dictionary
See the Data Models section of the documentation for which data are stored in the edge data dictionary, such as queryable information about transforms on the subject and object and their associated metadata.
- Return type
- pybel.to_sif(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO], sep: Optional[str] = None) -> None [source]
Write the graph as a tab-separated SIF file.
The resulting file will contain the following columns:
Source BEL term
Relation
Target BEL term
This format is simple and can be used readily with many applications, but is lossy in that it does not include relation metadata.
- Return type
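A three-column file like the one described for pybel.to_sif() can be read back with a few lines of plain Python. This is a sketch under assumptions: the sample contents and the read_sif helper are hypothetical, and the file is assumed to use the default tab separator with no header row.

```python
import io
from collections import defaultdict

# Hypothetical file contents in the source / relation / target layout
sif_text = (
    "p(HGNC:X)\tincreases\tp(HGNC:Y)\n"
    "p(HGNC:Y)\tdecreases\tp(HGNC:Z)\n"
    "p(HGNC:X)\tdecreases\tp(HGNC:Z)\n"
)


def read_sif(file, sep="\t"):
    """Parse lines into (source, relation, target) triples (hypothetical helper)."""
    triples = []
    for line in file:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        source, relation, target = line.split(sep)
        triples.append((source, relation, target))
    return triples


triples = read_sif(io.StringIO(sif_text))

# Build a simple adjacency mapping: source -> list of (relation, target)
adjacency = defaultdict(list)
for source, relation, target in triples:
    adjacency[source].append((relation, target))

print(len(triples))  # 3
```

Splitting on the separator directly, instead of using a CSV parser with quoting rules, avoids surprises when BEL terms themselves contain quotation marks.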
- pybel.to_gsea(graph: pybel.struct.graph.BELGraph, path: Union[str, TextIO]) -> None [source]
Write the genes/gene products to a GRP file for use with GSEA gene set enrichment analysis.
See also
GSEA publication
- Return type
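A GRP file is a newline-delimited gene set, so the output of pybel.to_gsea() can be inspected without GSEA itself. The reader below is a sketch that assumes one identifier per line and treats lines starting with # as comments; the sample contents and the read_grp helper are hypothetical.

```python
import io

# Hypothetical GRP contents: an optional comment line, then one gene per line
grp_text = (
    "# example gene set\n"
    "AKT1\n"
    "EGFR\n"
    "\n"
    "TP53\n"
)


def read_grp(file):
    """Return the gene identifiers, skipping comments and blank lines."""
    genes = []
    for line in file:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        genes.append(entry)
    return genes


genes = read_grp(io.StringIO(grp_text))
print(genes)  # ['AKT1', 'EGFR', 'TP53']
```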