Input and Output¶
Input and output functions for BEL graphs.
PyBEL provides multiple lossless interchange options for BEL. Lossy output formats are also included for convenient export to other programs. Notably, a de facto interchange using Resource Description Framework (RDF) to match the ability of other existing software is excluded due to the immaturity of the BEL to RDF mapping.
-
pybel.
load
(path, **kwargs)[source]¶ Read a BEL graph.
- Parameters
path (
str
) – The path to a BEL graph in any of the formats with extensions described below
kwargs – The keyword arguments are passed to the importer function
- Return type
BELGraph
- Returns
A BEL graph.
This is the universal loader, which means any file path can be given and PyBEL will look up the appropriate load function. Allowed extensions are:
bel
bel.nodelink.json
bel.cx.json
bel.jgif.json
The previous extensions also support gzipping. Other allowed extensions that don’t support gzip are:
bel.pickle / bel.gpickle / bel.pkl
indra.json
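The lookup by extension described above can be pictured with a small, self-contained sketch. This is not PyBEL's implementation: the function name `guess_format`, the format labels, and the assumption that gzipped files simply carry an extra `.gz` suffix are all hypothetical and only illustrate the idea.

```python
# Hypothetical sketch of extension-based dispatch; not PyBEL's actual code.
GZIP_CAPABLE = {
    ".bel.nodelink.json": "nodelink",
    ".bel.cx.json": "cx",
    ".bel.jgif.json": "jgif",
    ".bel": "bel_script",
}
PLAIN_ONLY = {
    ".bel.pickle": "pickle",
    ".bel.gpickle": "pickle",
    ".bel.pkl": "pickle",
    ".indra.json": "indra",
}


def guess_format(path):
    """Return (format_name, is_gzipped) for a path, or raise ValueError."""
    # Formats without gzip support are matched on the bare suffix only.
    for suffix, name in PLAIN_ONLY.items():
        if path.endswith(suffix):
            return name, False
    # Longer suffixes are tried first so ".bel.cx.json" wins over ".bel".
    for suffix, name in sorted(GZIP_CAPABLE.items(), key=lambda kv: -len(kv[0])):
        if path.endswith(suffix):
            return name, False
        if path.endswith(suffix + ".gz"):  # assumed gzip naming convention
            return name, True
    raise ValueError(f"unrecognized extension: {path}")


print(guess_format("corpus.bel.nodelink.json.gz"))
print(guess_format("corpus.bel.pkl"))
```

Matching the longest suffix first keeps the more specific JSON extensions from being mistaken for a plain BEL script.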
-
pybel.
dump
(graph, path, **kwargs)[source]¶ Write a BEL graph.
- Parameters
graph (
BELGraph
) – A BEL graph
path (
str
) – The path to which the BEL graph is written.
kwargs – The keyword arguments are passed to the exporter function
This is the universal writer, which means any file path can be given and PyBEL will look up the appropriate writer function. Allowed extensions are:
bel
bel.nodelink.json
bel.unodelink.json
bel.cx.json
bel.jgif.json
bel.graphdati.json
The previous extensions also support gzipping. Other allowed extensions that don’t support gzip are:
bel.pickle / bel.gpickle / bel.pkl
indra.json
tsv
gsea
- Return type
None
Import¶
Parsing Modes¶
The PyBEL parser has several modes that can be enabled and disabled. They are described below.
Allow Naked Names¶
By default, this is set to False
. The parser does not allow identifiers that are not qualified with
namespaces (naked names), like in p(YFG)
. A proper namespace, like p(HGNC:YFG)
must be used. By
setting this to True
, the parser becomes permissive to naked names. In general, this is bad practice and this
feature will be removed in the future.
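To make the distinction between naked and qualified names concrete, here is a rough sketch of such a check. It is not the PyBEL parser: the regular expression and the helper `has_naked_name` are hypothetical, and quoted names and nested terms are deliberately ignored.

```python
import re

# Simplified pattern: an identifier is "qualified" when it looks like NAMESPACE:name.
# Real BEL also allows quoted names and nested terms, which this sketch ignores.
QUALIFIED = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:.+$")


def has_naked_name(term):
    """Return True if the first argument of a simple BEL term lacks a namespace."""
    inner = term[term.index("(") + 1 : term.rindex(")")]
    first_argument = inner.split(",")[0].strip()
    return QUALIFIED.match(first_argument) is None


print(has_naked_name("p(YFG)"))
print(has_naked_name("p(HGNC:YFG)"))
```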
Allow Nested¶
By default, this is set to False
. The parser does not allow nested statements. See overview.
By setting this to True
the parser will accept nested statements one level deep.
Citation Clearing¶
By default, this is set to True
. While the BEL specification clearly states how the language should be used as
a state machine, many BEL documents do not conform to the strict SET
/UNSET
rules. To guard against
annotations accidentally carried from one set of statements to the next, the parser has two modes. By default, in
citation clearing mode, when a SET CITATION
command is reached, it will clear all other annotations (except
the STATEMENT_GROUP
, which has higher priority). This behavior can be disabled by setting this to False
to re-enable strict parsing.
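The effect of citation clearing can be illustrated with a toy state machine. This is only a sketch under simplifying assumptions: the command tuples, the `Citation` key, and the function `collect_annotations` are hypothetical stand-ins, not part of PyBEL.

```python
def collect_annotations(commands, citation_clearing=True):
    """Replay (verb, key, value) commands and snapshot annotations at each STATEMENT."""
    state = {}
    snapshots = []
    for verb, key, value in commands:
        if verb == "SET":
            if key == "Citation" and citation_clearing:
                # Keep only the statement group; every other annotation is dropped.
                state = {k: v for k, v in state.items() if k == "STATEMENT_GROUP"}
            state[key] = value
        elif verb == "UNSET":
            state.pop(key, None)
        elif verb == "STATEMENT":
            snapshots.append(dict(state))
    return snapshots


commands = [
    ("SET", "STATEMENT_GROUP", "Group 1"),
    ("SET", "Citation", "PubMed:1"),
    ("SET", "Species", "9606"),
    ("STATEMENT", None, "p(HGNC:A) -> p(HGNC:B)"),
    ("SET", "Citation", "PubMed:2"),  # Species was never UNSET
    ("STATEMENT", None, "p(HGNC:C) -> p(HGNC:D)"),
]

with_clearing = collect_annotations(commands, citation_clearing=True)
without_clearing = collect_annotations(commands, citation_clearing=False)
print(with_clearing[1])
print(without_clearing[1])
```

With clearing enabled, the second statement no longer inherits the `Species` annotation from the first citation; with clearing disabled, it does.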
Reference¶
-
pybel.
from_bel_script
(path, **kwargs)[source]¶ Load a BEL graph from a file resource. This function is a thin wrapper around
from_lines()
. The remaining keyword arguments are passed to
pybel.io.line_utils.parse_lines()
, which populates a BELGraph
.
- Return type
BELGraph
-
pybel.
from_bel_script_url
(url, **kwargs)[source]¶ Load a BEL graph from a URL resource.
- Parameters
url (
str
) – A valid URL pointing to a BEL document
The remaining keyword arguments are passed to
pybel.io.line_utils.parse_lines()
.
- Return type
BELGraph
Transport¶
All transport pairs are reflective and data-preserving.
Bytes¶
Conversion functions for BEL graphs with bytes and Python pickles.
-
pybel.
from_bytes
(bytes_graph, check_version=True)[source]¶ Read a graph from bytes (the result of pickling the graph).
-
pybel.
to_bytes
(graph, protocol=4)[source]¶ Convert a graph to bytes with pickle.
Note that the pickle module has some incompatibilities between Python 2 and 3. To export a universally importable pickle, choose 0, 1, or 2.
- Parameters
graph (
BELGraph
) – A BEL network
protocol (
int
) – Pickling protocol to use. Defaults to HIGHEST_PROTOCOL
.
- Return type
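The note about pickle protocols can be tried out with the standard library alone. The sketch below uses a plain dictionary as a stand-in for a graph (an assumption for illustration; it does not call PyBEL) and shows a round trip with protocol 2, the highest protocol that Python 2 can also read.

```python
import pickle

# Stand-in for a graph: any picklable Python object behaves the same way here.
graph_like = {
    "nodes": ["p(HGNC:A)", "p(HGNC:B)"],
    "edges": [("p(HGNC:A)", "increases", "p(HGNC:B)")],
}

# Protocol 2 is the highest protocol that Python 2 can also read.
payload = pickle.dumps(graph_like, protocol=2)
restored = pickle.loads(payload)

print(type(payload).__name__, restored == graph_like)
```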
Node-Link JSON¶
Conversion functions for BEL graphs with node-link JSON.
-
pybel.
from_nodelink
(graph_json_dict, check_version=True)[source]¶ Build a graph from node-link JSON Object.
- Return type
BELGraph
-
pybel.
from_nodelink_jsons
(graph_json_str, check_version=True)[source]¶ Read a BEL graph from a node-link JSON string.
- Return type
BELGraph
-
pybel.
to_nodelink_jsons
(graph, **kwargs)[source]¶ Dump this graph as a node-link JSON object to a string.
- Return type
-
pybel.
from_nodelink_file
(path, check_version=True)[source]¶ Build a graph from the node-link JSON contained in the given file.
-
pybel.
to_nodelink_file
(graph, path, **kwargs)[source]¶ Write this graph as node-link JSON to a file.
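For readers unfamiliar with the node-link layout, the following sketch shows its general shape using only the `json` module. The keys follow the common networkx-style convention (`nodes`, `links`, `source`, `target`); the exact fields PyBEL writes may differ, so treat the dictionary below as illustrative.

```python
import json

# Node-link layout: a list of node records plus a list of link records that
# point at nodes. The exact keys PyBEL writes may differ; these are illustrative.
document = {
    "directed": True,
    "multigraph": True,
    "graph": {"name": "Example", "version": "0.0.1"},
    "nodes": [{"id": "p(HGNC:A)"}, {"id": "p(HGNC:B)"}],
    "links": [{"source": "p(HGNC:A)", "target": "p(HGNC:B)", "relation": "increases"}],
}

text = json.dumps(document, indent=2)  # analogous to dumping to a JSON string
parsed = json.loads(text)              # analogous to reading the string back

# Sanity check: every link should reference a node that exists.
node_ids = {node["id"] for node in parsed["nodes"]}
dangling = [
    link for link in parsed["links"]
    if link["source"] not in node_ids or link["target"] not in node_ids
]
print(len(parsed["nodes"]), len(parsed["links"]), dangling)
```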
Cyberinfrastructure Exchange¶
This module wraps conversion between pybel.BELGraph
and the Cyberinfrastructure Exchange (CX) JSON.
CX is an aspect-oriented network interchange format encoded in JSON with a format inspired by the JSON-LD encoding of Resource Description Framework (RDF). It is primarily used by the Network Data Exchange (NDEx) and more recent versions of Cytoscape.
See also
The NDEx Data Model Specification
CX Support for Cytoscape.js on the Cytoscape App Store
-
pybel.
from_cx_jsons
(graph_json_str)[source]¶ Read a BEL graph from a CX JSON string.
- Return type
BELGraph
-
pybel.
to_cx_jsons
(graph, **kwargs)[source]¶ Dump this graph as a CX JSON object to a string.
- Return type
-
pybel.
to_cx_file
(graph, path, indent=2, **kwargs)[source]¶ Write a BEL graph to a JSON file in CX format.
- Parameters
Example:
>>> from pybel.examples import sialic_acid_graph
>>> from pybel import to_cx_file
>>> with open('graph.cx', 'w') as f:
...     to_cx_file(sialic_acid_graph, f)
- Return type
None
JSON Graph Interchange Format¶
Conversion functions for BEL graphs with JGIF JSON.
The JSON Graph Interchange Format (JGIF) is specified similarly to the Node-Link JSON. Interchange with this format provides compatibility with other software and repositories, such as the Causal Biological Network Database.
-
pybel.
to_jgif
(graph)[source]¶ Build a JGIF dictionary from a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Returns
A JGIF dictionary
- Return type
Warning
Untested! This format is not general purpose, so little time has been invested in it. If you want to use Cytoscape.js, we suggest using
pybel.to_cx()
instead.
Example:
>>> import pybel, os, json
>>> graph_url = 'https://arty.scai.fraunhofer.de/artifactory/bel/knowledge/selventa-small-corpus/selventa-small-corpus-20150611.bel'
>>> graph = pybel.from_bel_script_url(graph_url)
>>> graph_jgif_json = pybel.to_jgif(graph)
>>> with open(os.path.expanduser('~/Desktop/small_corpus.json'), 'w') as f:
...     json.dump(graph_jgif_json, f)
-
pybel.
from_jgif_jsons
(graph_json_str)[source]¶ Read a BEL graph from a JGIF JSON string.
- Return type
BELGraph
-
pybel.
to_jgif_jsons
(graph, **kwargs)[source]¶ Dump this graph as a JGIF JSON object to a string.
- Return type
-
pybel.
to_jgif_gz
(graph, path, **kwargs)[source]¶ Write a graph as JGIF JSON to a gzip file.
- Return type
None
-
pybel.
from_cbn_jgif
(graph_jgif_dict)[source]¶ Build a BEL graph from CBN JGIF.
Map the JGIF used by the Causal Biological Network Database to standard namespaces and annotations, then build a BEL graph using
pybel.from_jgif()
.
- Parameters
graph_jgif_dict (dict) – The JSON object representing the graph in JGIF format
- Return type
Example:
>>> import requests
>>> from pybel import from_cbn_jgif
>>> apoptosis_url = 'http://causalbionet.com/Networks/GetJSONGraphFile?networkId=810385422'
>>> graph_jgif_dict = requests.get(apoptosis_url).json()
>>> graph = from_cbn_jgif(graph_jgif_dict)
Warning
Handling the annotations is not yet supported, since the CBN documents do not refer to the resources used to create them. This may be added in the future, but the annotations must be stripped from the graph before uploading to the network store using
pybel.struct.mutation.strip_annotations()
.
HiPathia¶
Convert a BEL graph to HiPathia inputs.
Input¶
SIF File¶
Text file with three columns separated by tabs.
Each row represents an interaction in the pathway. First column is the source node, third column the target node, and the second is the type of relation between them.
Only activation and inhibition interactions are allowed.
The name of the nodes in this file will be stored as the IDs of the nodes.
The node IDs should have the following structure: N (dash) pathway ID (dash) node ID.
HiPathia distinguishes between two types of nodes: simple and complex.
Simple nodes:
Simple nodes may include many genes, but only one is needed to perform the function of the node. This could correspond to a protein family of enzymes that all have the same function - only one of them needs to be present for the action to take place. Simple nodes are defined within
Node IDs from simple nodes do not include any space, i.e. N-hsa04370-11.
Complex nodes:
Complex nodes include different simple nodes and represent protein complexes. Each simple node within the complex represents one protein in the complex. This node requires the presence of all their simple nodes to perform its function.
Node IDs from complex nodes are the juxtaposition of the included simple node IDs, separated by spaces, i.e. N-hsa04370-10 26.
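The node ID convention above is simple enough to check mechanically. The helpers below are a hypothetical sketch (the names `parse_node_id` and `is_complex` are not part of PyBEL or HiPathia) that splits an ID into its pathway and simple node parts.

```python
def parse_node_id(node_id):
    """Split a HiPathia-style node ID into (pathway_id, [simple_node_ids])."""
    prefix, pathway_id, members = node_id.split("-", 2)
    if prefix != "N":
        raise ValueError(f"node ID must start with 'N': {node_id}")
    return pathway_id, members.split(" ")


def is_complex(node_id):
    """A complex node lists more than one simple node, separated by spaces."""
    return len(parse_node_id(node_id)[1]) > 1


print(parse_node_id("N-hsa04370-11"), is_complex("N-hsa04370-11"))
print(parse_node_id("N-hsa04370-10 26"), is_complex("N-hsa04370-10 26"))
```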
ATT File¶
Text file with twelve (12) columns separated by tabs. Each row represents a node (either simple or complex).
The columns included are:
ID
: Node ID as explained above.
label
: Name to be shown in the picture of the pathway, in HGNC. Generally, the gene name of the first included EntrezID gene is used as label. For complex nodes, we juxtapose the gene names of the first genes of each simple node included (see genesList column below).
X
: The X-coordinate of the position of the node in the pathway.
Y
: The Y-coordinate of the position of the node in the pathway.
color
: The default color of the node.
shape
: The shape of the node. “rectangle” should be used for genes and “circle” for metabolites.
type
: The type of the node, either “gene” for genes or “compound” for metabolites. For complex nodes, the type of each of their included simple nodes is juxtaposed separated by commas, i.e. gene,gene.
label.cex
: Amount by which plotting label should be scaled relative to the default.
label.color
: Default color of the node.
width
: Default width of the node.
height
: Default height of the node.
genesList
: List of genes included in each node, with EntrezID:
Simple nodes: EntrezIDs of the genes included, separated by commas (“,”) and no spaces, i.e. 56848,8877 for node N-hsa04370-11.
Complex nodes: GenesList of the simple nodes included, separated by a slash (“/”) and no spaces, and in the same order as in the node ID. For example, node N-hsa04370-10 26 includes two simple nodes: 10 and 26. Its genesList column is 5335,5336,/,9047, meaning that the genes included in node 10 are 5335 and 5336, and the gene included in node 26 is 9047.
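As a worked example of the genesList encoding, the following sketch parses a cell into one list of Entrez IDs per simple node. The function name `parse_genes_list` is hypothetical; the input strings are the ones quoted in the text above.

```python
def parse_genes_list(genes_list):
    """Parse a genesList cell into one list of Entrez IDs per simple node."""
    groups = [[]]
    for token in genes_list.split(","):
        if token == "/":
            groups.append([])  # a slash starts the next simple node
        elif token:
            groups[-1].append(token)
    return groups


print(parse_genes_list("56848,8877"))        # simple node N-hsa04370-11
print(parse_genes_list("5335,5336,/,9047"))  # complex node N-hsa04370-10 26
```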
-
pybel.
to_hipathia
(graph, directory)[source]¶ Export HiPathia artifacts for the graph.
- Return type
None
-
pybel.
to_hipathia_dfs
(graph)[source]¶ Get the ATT and SIF dataframes.
Identify nodes:
1. Identify all proteins
2. Identify all protein families
3. Identify all complexes with just a protein or a protein family in them
Identify interactions between any of those things that are causal
Profit!
- Return type
Tuple
[DataFrame
,DataFrame
]
Export¶
Umbrella Node-Link JSON¶
The Umbrella Node-Link JSON format is similar to node-link but uses full BEL terms as nodes.
Given a BEL statement describing that X
phosphorylates Y
like act(p(X)) -> p(Y, pmod(Ph))
,
PyBEL usually stores the act()
information about X
as part of the relationship. In Umbrella mode,
this stays as part of the node.
Note that this generates additional nodes in the network for each of the “modified” versions of
the node. For example, act(p(X))
will be represented as an individual node instead of
p(X)
, as in the standard node-link JSON exporter.
A user might want to use this exporter in the following scenarios:
Representing transitivity in activities, as in
p(X, pmod(Ph)) -> act(p(X)) -> p(Y, pmod(Ph)) -> act(p(Y))
with four nodes that are more amenable to simulations (e.g., Boolean networks, Petri nets).
Visualizing networks in a similar way to the legacy BEL Cytoscape plugin from the BEL Framework (warning: now defunct) using tools like Cytoscape.
GraphDati¶
Conversion functions for BEL graphs with GraphDati.
-
pybel.
to_graphdati_file
(graph, path, use_identifiers=True, **kwargs)[source]¶ Write this graph as GraphDati JSON to a file.
-
pybel.
to_graphdati_gz
(graph, path, **kwargs)[source]¶ Write a graph as GraphDati JSON to a gzip file.
- Return type
None
-
pybel.
to_graphdati_jsonl
(graph, file, use_identifiers=True, use_tqdm=True)[source]¶ Write this graph as a GraphDati JSON lines file.
-
pybel.
to_graphdati_jsonl_gz
(graph, path, **kwargs)[source]¶ Write a graph as GraphDati JSONL to a gzip file.
- Return type
None
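The gzip and JSON lines variants differ from the plain file writers only in how bytes reach the disk. The sketch below shows that pattern with the standard library; the record contents are made up for illustration and do not follow the GraphDati schema.

```python
import gzip
import json
import os
import tempfile

# Made-up records; the real GraphDati nanopub schema is not reproduced here.
records = [
    {"id": 1, "assertion": "p(HGNC:A) increases p(HGNC:B)"},
    {"id": 2, "assertion": "p(HGNC:B) decreases p(HGNC:C)"},
]

path = os.path.join(tempfile.mkdtemp(), "example.jsonl.gz")

# Text mode ("wt") lets the output of json.dumps be written directly.
with gzip.open(path, "wt", encoding="utf-8") as file:
    for record in records:
        file.write(json.dumps(record) + "\n")

# Reading back: one JSON document per line.
with gzip.open(path, "rt", encoding="utf-8") as file:
    loaded = [json.loads(line) for line in file]

print(loaded == records)
```

Writing one record per line means a consumer can stream the file without loading the whole graph into memory.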
-
pybel.
to_graphdati_jsons
(graph, **kwargs)[source]¶ Dump this graph as a GraphDati JSON object to a string.
- Parameters
graph (
BELGraph
) – A BEL graph
- Return type
-
pybel.
post_graphdati
(graph, username='demo@biodati.com', password='demo', base_url='https://nanopubstore.demo.biodati.com', chunksize=None, **kwargs)[source]¶ Post this graph to a BioDati server.
- Parameters
graph (
BELGraph
) – A BEL graph
username (
str
) – The email address to log in to BioDati. Defaults to “demo@biodati.com” for the demo server
password (
str
) – The password to log in to BioDati. Defaults to “demo” for the demo server
base_url (
str
) – The BioDati server base URL. Defaults to “https://nanopubstore.demo.biodati.com” for the demo server
chunksize (
Optional
[int
]) – The number of nanopubs to post at a time. By default, does all.
Warning
The default public BioDati server has been put here. You should switch it to yours.
- Return type
Response
GraphML¶
Conversion functions for BEL graphs with GraphML.
-
pybel.
to_graphml
(graph, path, schema=None)[source]¶ Write a graph to a GraphML XML file using
networkx.write_graphml()
.
- Parameters
The .graphml file extension is suggested so Cytoscape can recognize it. By default, this function exports using the PyBEL schema of including modifier information into the edges. As an alternative, this function can also distinguish between
- Return type
None
PyNPA¶
Exporter for PyNPA.
See also
-
pybel.
to_npa_directory
(graph, directory, **kwargs)[source]¶ Write the BEL file to two files in the directory for
pynpa
.
- Return type
None
-
pybel.
to_npa_dfs
(graph, cartesian_expansion=False, nomenclature_method_first_layer=None, nomenclature_method_second_layer=None, direct_tf_only=False)[source]¶ Export the BEL graph as two lists of triples for
pynpa
.
- Parameters
graph (
BELGraph
) – A BEL graph
cartesian_expansion (
bool
) – If true, applies cartesian expansion on both reactions (reactants x products) as well as list abundances using
list_abundance_cartesian_expansion()
and
reaction_cartesian_expansion()
nomenclature_method_first_layer (
Optional
[str
]) – Either “curie”, “name” or “inodes”. Defaults to “curie”.
nomenclature_method_second_layer (
Optional
[str
]) – Either “curie”, “name” or “inodes”. Defaults to “curie”.
Pick out all transcription factor relationships. Protein X is a transcription factor for gene Y IFF
complex(p(X), g(Y)) -> r(Y)
Get all other interactions between any gene/rna/protein that are directed causal for the PPI layer
- Return type
Tuple
[DataFrame
,DataFrame
]
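The "reactants x products" expansion mentioned for `cartesian_expansion` is an ordinary Cartesian product. A minimal sketch, with a hypothetical helper name and made-up BEL terms, and without any claim about how PyBEL labels the resulting edges:

```python
from itertools import product


def reaction_pairs(reactants, products):
    """Pair every reactant with every product (reactants x products)."""
    return list(product(reactants, products))


pairs = reaction_pairs(
    ["a(CHEBI:A)", "a(CHEBI:B)"],
    ["a(CHEBI:C)", "a(CHEBI:D)"],
)
for source, target in pairs:
    print(source, "->", target)
```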
Miscellaneous¶
This module contains IO functions for outputting BEL graphs to lossy formats, such as GraphML and CSV.
-
pybel.
to_csv
(graph, path, sep=None)[source]¶ Write the graph as a tab-separated edge list.
The resulting file will contain the following columns:
Source BEL term
Relation
Target BEL term
Edge data dictionary
See the Data Models section of the documentation for which data are stored in the edge data dictionary, such as queryable information about transforms on the subject and object and their associated metadata.
- Return type
None
-
pybel.
to_sif
(graph, path, sep=None)[source]¶ Write the graph as a tab-separated SIF file.
The resulting file will contain the following columns:
Source BEL term
Relation
Target BEL term
This format is simple and can be used readily with many applications, but is lossy in that it does not include relation metadata.
- Return type
None
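The three-column layout is easy to reproduce by hand, which helps when checking what a downstream tool will receive. The sketch below writes made-up edges with the `csv` module; it does not call `to_sif()` and makes no claim about how PyBEL formats the relation column.

```python
import csv
import io

# Made-up edges: (source BEL term, relation, target BEL term).
edges = [
    ("p(HGNC:A)", "increases", "p(HGNC:B)"),
    ("p(HGNC:B)", "decreases", "p(HGNC:C)"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(edges)

sif_text = buffer.getvalue()
print(sif_text, end="")
```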
-
pybel.
to_gsea
(graph, path)[source]¶ Write the genes/gene products to a GRP file for use with GSEA gene set enrichment analysis.
See also
GSEA publication
- Return type
None
Databases¶
SQL Databases¶
Conversion functions for BEL graphs with a SQL database.
-
pybel.
from_database
(name, version=None, manager=None)[source]¶ Load a BEL graph from a database.
If name and version are given, finds it exactly with
pybel.manager.Manager.get_network_by_name_version()
. If just the name is given, finds most recent with
pybel.manager.Manager.get_network_by_name_version()
Neo4j¶
Output functions for BEL graphs to Neo4j.
-
pybel.
to_neo4j
(graph, neo_connection, use_tqdm=False)[source]¶ Upload a BEL graph to a Neo4j graph database using
py2neo
.
- Parameters
graph (pybel.BELGraph) – A BEL Graph
neo_connection (str or py2neo.Graph) – A
py2neo
connection object. Refer to the py2neo documentation for how to build this object.
Example Usage:
>>> import py2neo
>>> import pybel
>>> from pybel.examples import sialic_acid_graph
>>> neo_graph = py2neo.Graph("http://localhost:7474/db/data/")  # use your own connection settings
>>> pybel.to_neo4j(sialic_acid_graph, neo_graph)
BEL Commons¶
This module facilitates rudimentary data exchange with BEL Commons.
-
pybel.
from_web
(network_id, host=None)[source]¶ Retrieve a public network from BEL Commons.
In the future, this function may be extended to support authentication.
- Parameters
network_id (
int
) – The BEL Commons network identifier
host (
Optional
[str
]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST
or the environment as PYBEL_REMOTE_HOST
. Defaults to pybel.constants.DEFAULT_SERVICE_URL
- Return type
BELGraph
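The fallback order for the host (explicit argument, then environment, then default) can be sketched as follows. This is an assumption-laden illustration: `resolve_host` and the placeholder URL are hypothetical, and the PyBEL configuration file lookup mentioned above is left out.

```python
import os

# Placeholder only; the real default lives in pybel.constants.DEFAULT_SERVICE_URL.
HYPOTHETICAL_DEFAULT_URL = "https://bel-commons.example.org"


def resolve_host(host=None, environ=None):
    """Pick the host from the argument, then the environment, then the default."""
    environ = os.environ if environ is None else environ
    return host or environ.get("PYBEL_REMOTE_HOST") or HYPOTHETICAL_DEFAULT_URL


print(resolve_host("http://localhost:5000", environ={}))
print(resolve_host(environ={"PYBEL_REMOTE_HOST": "http://internal.example.org"}))
print(resolve_host(environ={}))
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.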
-
pybel.
to_web
(graph, host=None, user=None, password=None, public=False)[source]¶ Send a graph to the receiver service and return the
requests
response object.
- Parameters
graph (
BELGraph
) – A BEL graph
host (
Optional
[str
]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST
or the environment as PYBEL_REMOTE_HOST
. Defaults to pybel.constants.DEFAULT_SERVICE_URL
user (
Optional
[str
]) – Username for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_USER
or the environment as PYBEL_REMOTE_USER
password (
Optional
[str
]) – Password for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_PASSWORD
or the environment as PYBEL_REMOTE_PASSWORD
- Return type
Response
- Returns
The response object from
requests
INDRA¶
Conversion functions for BEL graphs with INDRA.
After assembling a model with INDRA, a list of
indra.statements.Statement
can be converted to a pybel.BELGraph
with
indra.assemblers.pybel.PybelAssembler
.
from indra.assemblers.pybel import PybelAssembler
import pybel
stmts = [
    # A list of INDRA statements
]
pba = PybelAssembler(
    stmts,
    name='Graph Name',
    version='0.0.1',
    description='Graph Description',
)
# make_model() returns the assembled BEL graph
graph = pba.make_model()
# Write to BEL file
pybel.to_bel_path(graph, 'simple_pybel.bel')
Warning
These functions are hard to unit test because they rely on a whole set of Java dependencies, and they will likely remain untested for a while.
-
pybel.
from_indra_statements
(stmts, name=None, version=None, description=None, authors=None, contact=None, license=None, copyright=None, disclaimer=None)[source]¶ Import a model from
indra
.
- Parameters
stmts (List[indra.statements.Statement]) – A list of statements
version (
Optional
[str
]) – The graph’s version. Recommended to use semantic versioning or YYYYMMDD
format.
- Return type
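The recommendation for the version string (semantic versioning or YYYYMMDD) can be checked before building a graph. The helper below is a hypothetical sketch with a deliberately simplified semantic versioning pattern; PyBEL itself is not claimed to enforce this.

```python
import re
from datetime import datetime

# Simplified semantic versioning: MAJOR.MINOR.PATCH with an optional pre-release tag.
SEMVER = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$")


def looks_like_version(version):
    """Accept simplified semantic versions or real YYYYMMDD dates."""
    if SEMVER.match(version):
        return True
    try:
        datetime.strptime(version, "%Y%m%d")
    except ValueError:
        return False
    # Guard against shorter strings that strptime would still accept.
    return len(version) == 8


print(looks_like_version("0.0.1"))
print(looks_like_version("20150611"))
print(looks_like_version("v1"))
```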
-
pybel.
from_indra_statements_json
(stmts_json, **kwargs)[source]¶ Get a BEL graph from INDRA statements JSON.
- Return type
Other kwargs are passed to
from_indra_statements()
.
-
pybel.
from_indra_statements_json_file
(file, **kwargs)[source]¶ Get a BEL graph from INDRA statements JSON file.
- Return type
Other kwargs are passed to
from_indra_statements()
.
-
pybel.
to_indra_statements
(graph)[source]¶ Export this graph as a list of INDRA statements using the
indra.sources.pybel.PybelProcessor
.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
list[indra.statements.Statement]
-
pybel.
to_indra_statements_json
(graph)[source]¶ Export this graph as INDRA JSON list.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
-
pybel.
to_indra_statements_json_file
(graph, path, indent=2, **kwargs)[source]¶ Export this graph as INDRA statement JSON.
- Parameters
graph (pybel.BELGraph) – A BEL graph
Other kwargs are passed to
json.dump()
.
-
pybel.
from_biopax
(path, **kwargs)[source]¶ Import a model encoded in Pathway Commons BioPAX via
indra
.
- Parameters
path (
str
) – Path to a BioPAX OWL file
- Return type
Other kwargs are passed to
from_indra_statements()
.
Warning
Not compatible with all BioPAX! See INDRA documentation.