PyBEL 0.14.4 Documentation¶
Biological Expression Language (BEL) is a domain-specific language that enables the expression of complex molecular relationships and their context in a machine-readable form. Its simple grammar and expressive power have led to its successful use to describe complex disease networks with several thousands of relationships.
PyBEL is a pure Python software package that parses BEL documents, validates their semantics, and facilitates data interchange between common formats and database systems like JSON, CSV, Excel, SQL, CX, and Neo4J. Its companion package, PyBEL-Tools, contains a library of functions for analysis of biological networks. For result-oriented guides, see the PyBEL-Notebooks repository.
Installation is as easy as getting the code from PyPI with python3 -m pip install pybel. See the installation documentation.
For citation information, see the references page.
PyBEL is tested on Python 3.5+ on Mac OS and Linux using Travis CI as well as on Windows using AppVeyor.
Overview¶
Background on Systems Biology Modeling¶
Biological Expression Language (BEL)¶
Biological Expression Language (BEL) is a domain specific language that enables the expression of complex molecular relationships and their context in a machine-readable form. Its simple grammar and expressive power have led to its successful use to describe complex disease networks with several thousands of relationships. For a detailed explanation, see the BEL 1.0 and 2.0 specifications.
OpenBEL Links¶
OpenBEL on Google Groups
OpenBEL Wiki
OpenBEL on GitHub
Chat on Gitter
Design Considerations¶
Missing Namespaces and Improper Names¶
The use of openly shared controlled vocabularies (namespaces) within BEL facilitates the exchange and consistency of information. Finding the correct namespace:name pair is often a difficult part of the curation process.
Outdated Namespaces¶
OpenBEL provides a variety of namespaces covering each of the BEL function types. These namespaces are generated by code found at https://github.com/OpenBEL/resource-generator and distributed at http://resources.openbel.org/belframework/.
This code has not been maintained to reflect the changes in the underlying resources, so this repository has been forked and updated at https://github.com/pybel/resource-generator to reflect the most recent versions of the underlying namespaces. The files are now distributed using the Fraunhofer SCAI Artifactory server.
Generating New Namespaces¶
In some cases, it is appropriate to design a new namespace, using the custom namespace specification provided by the OpenBEL Framework. Packages for generating namespace, annotation, and knowledge resources have been grouped in the Bio2BEL organization on GitHub.
Synonym Issues¶
Due to the huge number of terms across many namespaces, it’s difficult for curators to know the domain-specific synonyms that obscure the controlled/preferred term. However, the issue of synonym resolution and semantic searching has already been generally solved by the use of ontologies. Besides just a controlled vocabulary, they also provide a hierarchical model of knowledge, synonyms with cross-references to databases and other ontologies, and other information for semantic reasoning. Ontologies in the biomedical domain can be found at OBO and EMBL-EBI OLS.
Additionally, as a tool for curators, the EMBL Ontology Lookup Service (OLS) allows for semantic searching. Simple queries for the terms ‘mitochondrial dysfunction’ and ‘amyloid beta-peptides’ immediately returned results from relevant ontologies, and ended a long debate over how to represent these objects within BEL. EMBL-EBI also provides a programmatic API to the OLS service, for searching terms (http://www.ebi.ac.uk/ols/api/search?q=folic%20acid) and suggesting resolutions (http://www.ebi.ac.uk/ols/api/suggest?q=folic+acid)
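The two OLS endpoints mentioned above can be addressed from Python. The following is a minimal sketch that only builds the request URLs with the standard library. The helper names ols_search_url and ols_suggest_url are made up for this illustration, and the two encodings of the space simply mirror the example links in the text.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the two example links in the text.
OLS_API = "http://www.ebi.ac.uk/ols/api"


def ols_search_url(term):
    """Build a search URL, encoding spaces as %20 (hypothetical helper)."""
    return OLS_API + "/search?" + urlencode({"q": term}, quote_via=quote)


def ols_suggest_url(term):
    """Build a suggestion URL, encoding spaces as + (the urlencode default)."""
    return OLS_API + "/suggest?" + urlencode({"q": term})


search_url = ols_search_url("folic acid")
suggest_url = ols_suggest_url("folic acid")
print(search_url)
print(suggest_url)
```

The resulting strings can be passed to any HTTP client. The shape of the response is not covered here.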
Implementation¶
PyBEL is implemented using the PyParsing module. It provides flexibility and incredible speed in parsing compared to a regular expression implementation. It also allows for the addition of parsing action hooks, which allow the graph to be checked semantically at compile-time.
It uses SQLite to provide a consistent and lightweight caching system for external data, such as namespaces, annotations, and ontologies, and SQLAlchemy to provide a cross-platform interface. The same data management system is used to store graphs for high-performance querying.
Extensions to BEL¶
The PyBEL compiler is fully compliant with both BEL v1.0 and v2.0 and automatically upgrades legacy statements. Additionally, PyBEL includes several additions to the BEL specification to enable expression of important concepts in molecular biology that were previously missing and to facilitate integrating new data types. A short example is the inclusion of protein oxidation in the default BEL namespace for protein modifications. Other, more elaborate additions are outlined below.
Syntax for Epigenetics¶
PyBEL introduces the gene modification function, gmod(), as a syntax for encoding epigenetic modifications. Its usage mirrors the pmod() function for proteins and includes arguments for methylation.
For example, the methylation of NDUFB6 was found to be negatively correlated with its expression in a study of insulin resistance and Type II diabetes. This can now be expressed in BEL such as in the following statement:
g(HGNC:NDUFB6, gmod(Me)) negativeCorrelation r(HGNC:NDUFB6)
Note
This syntax is currently under consideration as BEP-0006.
Definition of Namespaces as Regular Expressions¶
BEL imposes the constraint that each identifier must be qualified with an enumerated namespace to enable semantic interoperability and data integration. However, enumerating a namespace with potentially billions of names, such as dbSNP, poses a computational issue. PyBEL introduces syntax for defining namespaces with a consistent pattern using a regular expression to overcome this issue. For these namespaces, semantic validation can be performed in post-processing against the underlying database. The dbSNP namespace can be defined with a syntax familiar to BEL annotation definitions with regular expressions as follows:
DEFINE NAMESPACE dbSNP AS PATTERN "rs[0-9]+"
Note
This syntax was proposed with BEP-0005 and has been officially accepted as part of the BEL 2.1 specification.
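To illustrate how a pattern-defined namespace can be checked without enumerating its names, here is a minimal sketch using only the standard library. It is not PyBEL's parser: the function names are hypothetical, and it assumes that a name is valid only when the whole name matches the pattern.

```python
import re

# Recognizes the pattern-definition line shown above (illustrative, not PyBEL's grammar).
DEFINE_PATTERN = re.compile(r'DEFINE NAMESPACE (\w+) AS PATTERN "(.+)"')


def parse_pattern_definition(line):
    """Return (keyword, compiled pattern) for a DEFINE NAMESPACE ... AS PATTERN line."""
    match = DEFINE_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a pattern definition: {!r}".format(line))
    keyword, pattern = match.groups()
    return keyword, re.compile(pattern)


def is_valid_name(namespace_patterns, namespace, name):
    """Check a namespace:name pair against the known patterns (assumes a full match)."""
    pattern = namespace_patterns.get(namespace)
    return pattern is not None and pattern.fullmatch(name) is not None


keyword, pattern = parse_pattern_definition('DEFINE NAMESPACE dbSNP AS PATTERN "rs[0-9]+"')
namespace_patterns = {keyword: pattern}
print(is_valid_name(namespace_patterns, "dbSNP", "rs12345"))
```

A syntactically valid name such as rs12345 may still not exist in dbSNP, which is why the text defers semantic validation to post-processing.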
Definition of Resources using OWL¶
Previous versions of PyBEL until 0.11.2 had an alternative namespace definition. Now it is recommended to either generate namespace files with reproducible build scripts following the Bio2BEL framework, or to directly add them to the database with the Bio2BEL bio2bel.manager.namespace_manager.NamespaceManagerMixin extension.
Things to Consider¶
Do All Statements Need Supporting Text?¶
Yes! All statements must be minimally qualified with a citation and evidence (now called SupportingText in BEL 2.0) to maintain provenance. Statements without evidence can’t be traced to their source or evaluated independently from the curator, so they are excluded.
Multiple Annotations¶
All single annotations are considered as single element sets. When multiple annotations are present, all are unioned and attached to a given edge.
SET Citation = {"PubMed","Example Article","12345"}
SET ExampleAnnotation1 = {"Example Value 11", "Example Value 12"}
SET ExampleAnnotation2 = {"Example Value 21", "Example Value 22"}
p(HGNC:YFG1) -> p(HGNC:YFG2)
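The behavior described above can be sketched in plain Python. This is an illustration rather than PyBEL's internal representation: collect_annotations is a hypothetical helper, and ExampleAnnotation3 is an extra made-up annotation that shows how a single value is treated as a one-element set.

```python
def collect_annotations(set_statements):
    """Turn (key, value-or-values) pairs into a mapping from key to a set of values."""
    annotations = {}
    for key, value in set_statements:
        # A single string is treated as a one-element set.
        values = {value} if isinstance(value, str) else set(value)
        annotations[key] = values
    return annotations


edge_annotations = collect_annotations([
    ("ExampleAnnotation1", ["Example Value 11", "Example Value 12"]),
    ("ExampleAnnotation2", ["Example Value 21", "Example Value 22"]),
    ("ExampleAnnotation3", "Single Value"),  # hypothetical single annotation
])
print(sorted(edge_annotations))
```

All of the collected entries would then be attached to the edge for p(HGNC:YFG1) -> p(HGNC:YFG2).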
Namespace and Annotation Name Choices¶
*.belns and *.belanno configuration files include an entry called “Keyword” in their respective [Namespace] and [AnnotationDefinition] sections. To maintain understandability between BEL documents, PyBEL warns when the names given in *.bel documents do not match their respective resources. For now, capitalization is not considered, but in the future, PyBEL will also warn when capitalization is not properly stylized, like forgetting the lowercase ‘h’ in “ChEMBL”.
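The keyword check described above can be sketched with the standard library's configparser. This is not PyBEL's implementation: the function name, the return labels, and the one-entry resource snippet are assumptions made for the example, and real *.belns files contain further sections that this sketch does not handle.

```python
import configparser

# A made-up, minimal resource snippet containing only the Keyword entry.
BELNS_SNIPPET = "[Namespace]\nKeyword=ChEMBL\n"


def compare_keyword(document_keyword, resource_text, section="Namespace"):
    """Compare the keyword used in a BEL document with the one in a resource file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(resource_text)
    resource_keyword = parser[section]["Keyword"]
    if document_keyword.lower() != resource_keyword.lower():
        return "mismatch"  # the case that currently triggers a warning
    if document_keyword != resource_keyword:
        return "capitalization"  # the stricter check planned for the future
    return "ok"


print(compare_keyword("CHEMBL", BELNS_SNIPPET))
```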
Why Not Nested Statements?¶
BEL has different relationships for modeling direct and indirect causal relations.
Direct¶
A => B means that A directly increases B through a physical process.
A =| B means that A directly decreases B through a physical process.
Indirect¶
The relationship between two entities can be coded in BEL, even if the process is not well understood.
A -> B means that A indirectly increases B. There are hidden elements in X that mediate this interaction through a pathway of direct interactions A (=> or =|) X_1 (=> or =|) ... X_n (=> or =|) B, or through a set of multiple pathways that constitute a network.
A -| B means that A indirectly decreases B. Like for A -> B, this process involves hidden components with varying activities.
Increasing Nested Relationships¶
BEL also allows the object of a relationship to be another statement.
A => (B => C) means that A increases the process by which B increases C. The example in the BEL specification, p(HGNC:GATA1) => (act(p(HGNC:ZBTB16)) => r(HGNC:MPL)), represents GATA1 directly increasing the process by which ZBTB16 directly increases MPL. Before, directly increasing was used to specify physical contact, so it’s reasonable to conclude that p(HGNC:GATA1) => act(p(HGNC:ZBTB16)). The specification cites examples when B is an activity that only is affected in the context of A and C. This is complicated enough that it is both impractical to standardize during curation and impractical to represent in a network.
A -> (B => C) can be interpreted by assuming that A indirectly increases B, and because of monotonicity, conclude that A -> C as well.
A => (B -> C) is more difficult to interpret, because it does not describe which part of process B -> C is affected by A or how. Is it that A => B, and B => C, so we conclude A -> C, or does it mean something else? Perhaps A impacts a different portion of the hidden process in B -> C. These statements are ambiguous enough that they should be written as just A => B, and B -> C. If there is no literature evidence for the statement A -> C, then it is not the job of the curator to make this inference. Identifying statements of this kind might be the goal of a bioinformatics analysis of the BEL network after compilation.
A -> (B -> C) introduces even more ambiguity, and it should not be used.
A => (B =| C) states A increases the process by which B decreases C. One interpretation of this statement might be that A => B and B =| C. An analysis could infer A -| C. Statements in the form of A -> (B =| C) can also be resolved this way, but with added ambiguity.
Decreasing Nested Relationships¶
While we could agree on usage for the previous examples, the decrease of a nested statement introduces an unreasonable amount of ambiguity.
A =| (B => C) could mean A decreases B, and B also increases C. Does this mean A decreases C, or does it mean that C is still increased, but just not as much? Which of these statements takes precedence? Or do their effects cancel? The same can be said about A -| (B => C), and with added ambiguity for indirect increases A -| (B -> C).
A =| (B =| C) could mean that A decreases B and B decreases C. We could conclude that A increases C, or could we again run into the problem of not knowing the precedence? The same is true for the indirect versions.
Recommendations for Use in PyBEL¶
Because the ambiguity of nested statements poses a great risk to clarity, PyBEL disables the usage of nested statements by default. See the Input and Output section for different parser settings. At Fraunhofer SCAI, curators resolved these statements to single statements to improve the precision and readability of our BEL documents.
While most statements in the form A rel1 (B rel2 C) can be reasonably expanded to A rel1 B and B rel2 C, the few that cannot are the difficult-to-interpret cases that we need to be careful about in our curation and later analyses.
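The expansion of A rel1 (B rel2 C) into two single statements can be sketched as follows. Statements are modeled here as plain tuples, which is an assumption of this example and not how PyBEL stores them. The sketch performs only the mechanical split, so deciding whether the split is reasonable for a given statement remains a curation question.

```python
def expand_nested(statement):
    """Split (A, rel1, (B, rel2, C)) into [(A, rel1, B), (B, rel2, C)].

    Statements whose object is not itself a statement are returned unchanged.
    """
    subject, relation, obj = statement
    if isinstance(obj, tuple):
        inner_subject, inner_relation, inner_object = obj
        return [
            (subject, relation, inner_subject),
            (inner_subject, inner_relation, inner_object),
        ]
    return [statement]


expanded = expand_nested(("A", "=>", ("B", "->", "C")))
print(expanded)
```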
Why Not RDF?¶
Current bel2rdf serialization tools build URLs with the OpenBEL Framework domain as a namespace, rather than respecting the original namespaces of the original entities. This does not follow the best practices of the semantic web, where URLs representing an object point to a real page with additional information. For example, UniProt does an exemplary job of this. Ultimately, using non-standard URLs makes harmonizing and data integration difficult.
Additionally, the RDF format does not easily allow for the annotation of edges. A simple statement in BEL that one protein up-regulates another can be easily represented in a triple in RDF, but when the annotations and citation from the BEL document need to be included, this forces RDF serialization to use approaches like representing the statement itself as a node. RDF was not intended to represent this type of information, but more properly for locating resources (hence its name). Furthermore, many blank nodes are introduced throughout the process. This makes RDF incredibly difficult to understand or work with. Later, writing queries in SPARQL becomes very difficult because the data format is complicated and the language is limited. For example, it would be incredibly complicated to write a query in SPARQL to get the objects of statements from publications by a certain author.
Installation¶
The latest stable code can be installed from PyPI with:
$ python3 -m pip install pybel
The most recent code can be installed from the source on GitHub with:
$ python3 -m pip install git+https://github.com/pybel/pybel.git
For developers, the repository can be cloned from GitHub and installed in editable mode with:
$ git clone https://github.com/pybel/pybel.git
$ cd pybel
$ python3 -m pip install -e .
Extras¶
The setup.py makes use of the extras_require argument of setuptools.setup() to make some heavy packages that support special features of PyBEL optional, which keeps the default installation lean. A single extra can be installed from PyPI like python3 -m pip install pybel[neo4j] or multiple can be installed using a list like python3 -m pip install pybel[neo4j,indra]. Likewise, for developer installation, extras can be installed in editable mode with python3 -m pip install -e .[neo4j] or multiple can be installed using a list like python3 -m pip install -e .[neo4j,indra]. The available extras are:
neo4j¶
This extension installs the py2neo package to support upload and download to Neo4j databases.
indra¶
This extra installs support for indra, the integrated network dynamical reasoner and assembler. Because it also represents biology in BEL-like statements, many statements from PyBEL can be converted to INDRA, and vice versa. This package also enables the import of BioPAX, SBML, and SBGN into BEL.
See also
pybel.from_indra_pickle()
pybel.to_indra()
jupyter¶
This extra installs support for visualizing BEL graphs in Jupyter notebooks.
See also
pybel.io.jupyter.to_html()
pybel.io.jupyter.to_jupyter()
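The extras mechanism described in this section boils down to a mapping from extra names to additional requirements. The sketch below is not PyBEL's actual setup.py: it lists only the two extras whose packages are named above, and pip_requirement is a hypothetical helper that composes the bracketed requirement string passed to pip.

```python
# Illustrative mapping in the shape expected by the extras_require argument.
# The jupyter extra is omitted because its packages are not named in the text.
EXTRAS_REQUIRE = {
    "neo4j": ["py2neo"],
    "indra": ["indra"],
}


def pip_requirement(package, extras=()):
    """Compose a requirement string such as pybel[neo4j,indra] (hypothetical helper)."""
    unknown = sorted(set(extras) - set(EXTRAS_REQUIRE))
    if unknown:
        raise ValueError("unknown extras: {}".format(", ".join(unknown)))
    if not extras:
        return package
    return "{}[{}]".format(package, ",".join(extras))


print(pip_requirement("pybel", ["neo4j", "indra"]))
```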
Caveats¶
PyBEL extends networkx for its core data structure. Many of the graphical aspects of networkx depend on matplotlib, which is an optional dependency.
If HTMLlib5 is installed, the test that’s supposed to fail on a web page being missing actually tries to parse it as RDFa, and doesn’t fail. Disregard this.
Upgrading¶
During the current development cycle, programmatic access to the definition and graph caches might become unstable. If you have any problems working with the database, try removing it with one of the following commands:
Running pybel manage drop (unix)
Running python3 -m pybel manage drop (windows)
Removing the folder ~/.pybel
PyBEL will build a new database and populate it on the next run.
Data Model¶
The pybel.struct module houses functions for handling the main data structure in PyBEL.
Because BEL expresses how biological entities interact within many different contexts, with descriptive annotations, PyBEL represents data as a directed multi-graph by sub-classing networkx.MultiDiGraph. Each node is an instance of a subclass of pybel.dsl.BaseEntity and each edge has a stable key and associated data dictionary for storing relevant contextual information.
The graph contains metadata for the PyBEL version, the BEL script metadata, the namespace definitions, the annotation definitions, and the warnings produced in analysis. Like any networkx graph, all attributes of a given object can be accessed through the graph property, like in: my_graph.graph['my key']. Convenient property definitions are given for these attributes that are outlined in the documentation for pybel.BELGraph.
This allows for much easier programmatic access to answer more complicated questions, which can be written with Python code. Because the data structure is the same in Neo4J, the data can be directly exported with pybel.to_neo4j(). Neo4J supports the Cypher querying language so that the same queries can be written in an elegant and simple way.
Constants¶
These documents refer to many aspects of the data model using constants, which can be found in the top-level module pybel.constants.
Terms describing abundances, annotations, and other internal data are designated in pybel.constants with full-caps, such as pybel.constants.FUNCTION and pybel.constants.PROTEIN.
For normal usage, we suggest referring to values in dictionaries by these constants, in case the hard-coded strings behind these constants change.
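The advice above can be illustrated without installing PyBEL. In the sketch below the four names stand in for the constants of the same name in pybel.constants. The string values assigned to them are assumptions for the example, which is exactly why code should go through the names and not the strings.

```python
# Stand-ins for pybel.constants.FUNCTION and friends; the real string values may differ.
FUNCTION = "function"
NAMESPACE = "namespace"
NAME = "name"
PROTEIN = "Protein"

node = {FUNCTION: PROTEIN, NAMESPACE: "HGNC", NAME: "GSK3B"}


def is_protein(data):
    """Check the function entry through the constants, never through literal strings."""
    return data.get(FUNCTION) == PROTEIN


print(is_protein(node))
```

If a later release changes the string behind FUNCTION, code written this way keeps working, while a lookup such as node['function'] would silently break.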
Function Nomenclature¶
The following table shows PyBEL’s internal mapping from BEL functions to its own constants. This can be accessed programmatically via pybel.parser.language.abundance_labels.
BEL Function | PyBEL Constant | PyBEL DSL |
---|---|---|
Graph¶
- class pybel.BELGraph(name=None, version=None, description=None, authors=None, contact=None, license=None, copyright=None, disclaimer=None, path=None)¶
An extension to networkx.MultiDiGraph to represent BEL.
Initialize a BEL graph with its associated metadata.
- Parameters
version (Optional[str]) – The graph’s version. Recommended to use semantic versioning or YYYYMMDD format.
- __add__(other)¶
Copy this graph and join it with another graph using pybel.struct.left_full_join().
Example usage:
>>> import pybel
>>> g = pybel.from_bel_script('...')
>>> h = pybel.from_bel_script('...')
>>> k = g + h
- __iadd__(other)¶
Join another graph into this one, in-place, using pybel.struct.left_full_join().
Example usage:
>>> import pybel
>>> g = pybel.from_bel_script('...')
>>> h = pybel.from_bel_script('...')
>>> g += h
- __and__(other)¶
Create a deep copy of this graph and left outer join another graph into it.
Uses pybel.struct.left_outer_join().
Example usage:
>>> import pybel
>>> g = pybel.from_bel_script('...')
>>> h = pybel.from_bel_script('...')
>>> k = g & h
- __iand__(other)¶
Join another graph into this one, in-place, using pybel.struct.left_outer_join().
Example usage:
>>> import pybel
>>> g = pybel.from_bel_script('...')
>>> h = pybel.from_bel_script('...')
>>> g &= h
- property name¶
The graph’s name.
Hint
Can be set with the SET DOCUMENT Name = "..." entry in the source BEL script.
- property version¶
The graph’s version.
Hint
Can be set with the SET DOCUMENT Version = "..." entry in the source BEL script.
- property description¶
The graph’s description.
Hint
Can be set with the SET DOCUMENT Description = "..." entry in the source BEL document.
- property authors¶
The graph’s authors.
Hint
Can be set with the SET DOCUMENT Authors = "..." entry in the source BEL document.
- property contact¶
The graph’s contact information.
Hint
Can be set with the SET DOCUMENT ContactInfo = "..." entry in the source BEL document.
- property license¶
The graph’s license.
Hint
Can be set with the SET DOCUMENT Licenses = "..." entry in the source BEL document.
- property copyright¶
The graph’s copyright.
Hint
Can be set with the SET DOCUMENT Copyright = "..." entry in the source BEL document.
- property disclaimer¶
The graph’s disclaimer.
Hint
Can be set with the SET DOCUMENT Disclaimer = "..." entry in the source BEL document.
- property namespace_url¶
The mapping from the keywords used in this graph to their respective BEL namespace URLs.
Hint
Can be appended with the DEFINE NAMESPACE [key] AS URL "[value]" entries in the definitions section of the source BEL document.
- property defined_namespace_keywords¶
The set of all keywords defined as namespaces in this graph.
- property namespace_pattern¶
The mapping from the namespace keywords used to create this graph to their regex patterns.
Hint
Can be appended with the DEFINE NAMESPACE [key] AS PATTERN "[value]" entries in the definitions section of the source BEL document.
- property annotation_url¶
The mapping from the annotation keywords used to create this graph to the URLs of the BELANNO files.
Hint
Can be appended with the DEFINE ANNOTATION [key] AS URL "[value]" entries in the definitions section of the source BEL document.
- property annotation_pattern¶
The mapping from the annotation keywords used to create this graph to their regex patterns as strings.
Hint
Can be appended with the DEFINE ANNOTATION [key] AS PATTERN "[value]" entries in the definitions section of the source BEL document.
- property annotation_list¶
The mapping from the keywords of locally defined annotations to their respective sets of values.
Hint
Can be appended with the DEFINE ANNOTATION [key] AS LIST {"[value]", ...} entries in the definitions section of the source BEL document.
- property defined_annotation_keywords¶
Get the set of all keywords defined as annotations in this graph.
- property pybel_version¶
The version of PyBEL with which this graph was produced as a string.
- Return type
str
- property warnings¶
A list of warnings associated with this graph.
- number_of_citations()¶
Return the number of citations contained within the graph.
- Return type
- add_unqualified_edge(u, v, relation)¶
Add a unique edge that has no annotations.
- Parameters
u (BaseEntity) – The source node
v (BaseEntity) – The target node
relation (str) – A relationship label from pybel.constants
- Return type
- Returns
The key for this edge (a unique hash)
- add_transcription(gene, rna)¶
Add a transcription relation from a gene to an RNA or miRNA node.
- add_translation(rna, protein)¶
Add a translation relation from an RNA to a protein.
- Parameters
rna (Rna) – An RNA node
protein (Protein) – A protein node
- Return type
- add_equivalence(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add two equivalence relations for the nodes.
- add_orthology(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add two orthology relations for the nodes such that u orthologousTo v and v orthologousTo u.
- add_is_a(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *, relation: str = 'isA') → str¶
Add an isA relationship such that u isA v.
- add_part_of(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *, relation: str = 'partOf') → str¶
Add a partOf relationship such that u partOf v.
- add_has_variant(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *, relation: str = 'hasVariant') → str¶
Add a hasVariant relationship such that u hasVariant v.
- add_has_reactant(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *, relation: str = 'hasReactant') → str¶
Add a hasReactant relationship such that u hasReactant v.
- add_has_product(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *, relation: str = 'hasProduct') → str¶
Add a hasProduct relationship such that u hasProduct v.
- add_qualified_edge(u, v, *, relation, evidence, citation, annotations=None, subject_modifier=None, object_modifier=None, **attr)¶
Add a qualified edge.
Qualified edges have a relation, evidence, citation, and optional annotations, subject modifications, and object modifications.
- Parameters
u – The source node
v – The target node
relation (str) – The type of relation this edge represents
evidence (str) – The evidence string from an article
citation (Union[str, Tuple[str, str], CitationDict]) – The citation data dictionary for this evidence. If a string is given, assumes it’s a PubMed identifier and auto-fills the citation type.
annotations (Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None]) – The annotations data dictionary
subject_modifier (Optional[Mapping]) – The modifiers (like activity) on the subject node. See data model documentation.
object_modifier (Optional[Mapping]) – The modifiers (like activity) on the object node. See data model documentation.
- Return type
- Returns
The hash of the edge
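The three accepted shapes of the citation argument can be pictured with a small normalization sketch. It is not PyBEL's code: normalize_citation is a hypothetical helper, and the dictionary keys "type" and "reference" are placeholders chosen for this example, not the keys of the real CitationDict.

```python
def normalize_citation(citation):
    """Bring a citation given as str, (type, reference) tuple, or mapping into one shape."""
    if isinstance(citation, str):
        # A bare string is assumed to be a PubMed identifier.
        return {"type": "PubMed", "reference": citation}
    if isinstance(citation, tuple):
        citation_type, reference = citation
        return {"type": citation_type, "reference": reference}
    return dict(citation)


print(normalize_citation("12345"))
```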
- add_binds(u, v, *, evidence, citation, annotations=None, **attr)¶
Add a “binding” relationship between the two entities such that u => complex(u, v).
- Return type
- add_increases(u, v, *, relation: str = 'increases', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = None, **attr) → str¶
Wrap add_qualified_edge() for the pybel.constants.INCREASES relation.
- add_directly_increases(u, v, *, relation: str = 'directlyIncreases', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = None, **attr) → str¶
Add a pybel.constants.DIRECTLY_INCREASES relationship with add_qualified_edge().
- add_decreases(u, v, *, relation: str = 'decreases', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = None, **attr) → str¶
Add a pybel.constants.DECREASES relationship with add_qualified_edge().
- add_directly_decreases(u, v, *, relation: str = 'directlyDecreases', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = None, **attr) → str¶
Add a pybel.constants.DIRECTLY_DECREASES relationship with add_qualified_edge().
- add_association(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add a pybel.constants.ASSOCIATION relationship with add_qualified_edge().
- add_regulates(u, v, *, relation: str = 'regulates', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = None, **attr) → str¶
Add a pybel.constants.REGULATES relationship with add_qualified_edge().
- add_correlation(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add a pybel.constants.CORRELATION relationship with add_qualified_edge().
- add_no_correlation(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add a pybel.constants.NO_CORRELATION relationship with add_qualified_edge().
- add_positive_correlation(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add a pybel.constants.POSITIVE_CORRELATION relationship with add_qualified_edge().
- add_negative_correlation(u: pybel.dsl.node_classes.BaseEntity, v: pybel.dsl.node_classes.BaseEntity, *args, **kwargs) → str¶
Add a pybel.constants.NEGATIVE_CORRELATION relationship with add_qualified_edge().
- add_causes_no_change(u, v, *, relation: str = 'causesNoChange', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = None, **attr) → str¶
Add a pybel.constants.CAUSES_NO_CHANGE relationship with add_qualified_edge().
- add_inhibits(u, v, *, relation: str = 'decreases', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = {'modifier': 'Activity'}, **attr) → str¶
Add an “inhibits” relationship.
A more specific version of add_decreases() that automatically populates the object modifier with an activity.
- add_activates(u, v, *, relation: str = 'increases', evidence: str, citation: Union[str, Tuple[str, str], pybel.utils.CitationDict], annotations: Union[Mapping[str, str], Mapping[str, Set[str]], Mapping[str, Mapping[str, bool]], None] = None, subject_modifier: Optional[Mapping] = None, object_modifier: Optional[Mapping] = {'modifier': 'Activity'}, **attr) → str¶
Add an “activates” relationship.
A more specific version of add_increases() that automatically populates the object modifier with an activity.
- get_edge_citation(u, v, key)¶
Get the citation for a given edge.
- Return type
Optional[CitationDict]
- static edge_to_bel(u, v, edge_data, sep=None, use_identifiers=False)¶
Serialize a pair of nodes and related edge data as a BEL relation.
- Return type
- iter_equivalent_nodes(node)¶
Iterate over nodes that are equivalent to the given node, including the original.
- Return type
Iterable[BaseEntity]
- get_equivalent_nodes(node)¶
Get a set of equivalent nodes to this node, excluding the given node.
- Return type
Set[BaseEntity]
- node_has_namespace(node, namespace)¶
Check if the node has the given namespace.
This also should look in the equivalent nodes.
- Return type
Nodes¶
Nodes (or entities) in a pybel.BELGraph represent physical entities’ abundances. Most contain information about the identifier for the entity using a namespace/name pair. The PyBEL parser converts BEL terms to an internal representation using an internal domain specific language (DSL) that allows for writing BEL directly in Python.
For example, after the BEL term p(HGNC:GSK3B) is parsed, it is instantiated as a Python object using the DSL function corresponding to the p() function in BEL, pybel.dsl.Protein, like:
from pybel.dsl import Protein
gsk3b_protein = Protein(namespace='HGNC', name='GSK3B')
pybel.dsl.Protein, like the others mentioned before, inherits from pybel.dsl.BaseEntity, which itself inherits from dict. Therefore, the resulting object can be used like a dict that looks like:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'GSK3B',
}
Alternatively, it can be used in more exciting ways, outlined later in the documentation for pybel.dsl.
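The idea of a node class that is also a dict can be shown with a self-contained stand-in. SketchEntity and SketchProtein below are not the real pybel.dsl classes, and the string keys are placeholder values for the constants. The sketch only demonstrates the inheritance pattern described above.

```python
class SketchEntity(dict):
    """Stand-in for a DSL base class that stores its data in the dict itself."""

    function = None  # set by subclasses

    def __init__(self, namespace, name):
        super().__init__({
            "function": self.function,  # placeholder key names
            "namespace": namespace,
            "name": name,
        })


class SketchProtein(SketchEntity):
    """Stand-in for the DSL class corresponding to the p() function."""

    function = "Protein"


gsk3b_protein = SketchProtein(namespace="HGNC", name="GSK3B")
print(gsk3b_protein["name"], isinstance(gsk3b_protein, dict))
```

Because the object is a dict, it can be compared with, serialized like, and indexed like any other dictionary.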
Variants¶
The addition of a variant tag results in an entry called ‘variants’ in the data dictionary associated with a given node. This entry is a list with dictionaries describing each of the variants. All variants have the entry ‘kind’ to identify whether it is a post-translational modification (PTM), gene modification, fragment, or HGVS variant.
Warning
The canonical ordering for the elements of the VARIANTS list corresponds to the sorted order of their corresponding node tuples using pybel.parser.canonicalize.sort_dict_list(). Rather than directly modifying the BELGraph’s structure, use pybel.BELGraph.add_node_from_data(), which takes care of automatically canonicalizing this dictionary.
HGVS Variants.
For example, the BEL term p(HGNC:GSK3B, var(p.Gly123Arg))
is translated to the following internal DSL:
from pybel.dsl import Protein, Hgvs
gsk3b_variant = Protein(namespace='HGNC', name='GSK3B', variants=Hgvs('p.Gly123Arg'))
Further, the shorthand for protein substitutions, pybel.dsl.ProteinSubstitution
, can be used to produce the
same result, as it inherits from pybel.dsl.Hgvs
:
from pybel.dsl import Protein, ProteinSubstitution
gsk3b_variant = Protein(namespace='HGNC', name='GSK3B', variants=ProteinSubstitution('Gly', 123, 'Arg'))
Either way, the resulting object can be used like a dict that looks like:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'GSK3B',
VARIANTS: [
{
KIND: HGVS,
IDENTIFIER: 'p.Gly123Arg',
},
],
}
See also
BEL 2.0 specification on variants
HGVS conventions
PyBEL module
pybel.parser.modifiers.get_hgvs_language
Gene Substitutions¶
Gene Substitutions.
Gene substitutions are legacy statements defined in BEL 1.0. BEL 2.0 recommends using HGVS strings. Luckily,
the information contained in a BEL 1.0 encoding, such as g(HGNC:APP,sub(G,275341,C))
can be
automatically translated to the appropriate HGVS g(HGNC:APP, var(c.275341G>C))
, assuming that all
substitutions are using the reference coding gene sequence for numbering and not the genomic reference.
The previous statements both produce the underlying data:
from pybel.constants import *
{
FUNCTION: GENE,
NAMESPACE: 'HGNC',
NAME: 'APP',
VARIANTS: [
{
KIND: HGVS,
IDENTIFIER: 'c.275341G>C',
},
],
}
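The translation from the legacy sub() arguments to an HGVS string is mechanical. The sketch below illustrates it under the same assumption stated above (coding-sequence numbering); the function name gene_substitution_to_hgvs is hypothetical and is not PyBEL's parser.

```python
def gene_substitution_to_hgvs(reference, position, variant):
    """Build an HGVS coding-DNA substitution string such as 'c.275341G>C'.

    Assumes the position is numbered on the reference coding sequence.
    """
    return 'c.{}{}>{}'.format(position, reference, variant)


# Mirrors g(HGNC:APP, sub(G, 275341, C))
hgvs = gene_substitution_to_hgvs('G', 275341, 'C')
print(hgvs)
```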
See also
BEL 2.0 specification on gene substitutions
PyBEL module
pybel.parser.modifiers.get_gene_substitution_language
Gene Modifications¶
Gene Modifications.
PyBEL introduces the gene modification tag, gmod(), to allow for the encoding of epigenetic modifications. Its syntax follows the same style as the pmod() tags for proteins, and can include the following values:
M
Me
methylation
A
Ac
acetylation
For example, the node g(HGNC:GSK3B, gmod(M))
is represented with the following:
from pybel.constants import *
{
FUNCTION: GENE,
NAMESPACE: 'HGNC',
NAME: 'GSK3B',
VARIANTS: [
{
KIND: GMOD,
IDENTIFIER: {
NAMESPACE: BEL_DEFAULT_NAMESPACE,
NAME: 'Me',
},
},
],
}
The addition of this function does not preclude the use of all other standard functions in BEL; however, other compilers probably won’t support these standards. If you agree that this is useful, please contribute to discussion in the OpenBEL community.
See also
PyBEL module
pybel.parser.modifiers.get_gene_modification_language()
Protein Substitutions¶
Protein Substitutions.
Protein substitutions are legacy statements defined in BEL 1.0. BEL 2.0 recommends using HGVS strings. Luckily,
the information contained in a BEL 1.0 encoding, such as p(HGNC:APP,sub(R,275,H))
can be
automatically translated to the appropriate HGVS p(HGNC:APP, var(p.Arg275His))
, assuming that all
substitutions are using the reference protein sequence for numbering and not the genomic reference.
The previous statements both produce the underlying data:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'APP',
VARIANTS: [
{
KIND: HGVS,
IDENTIFIER: 'p.Arg275His',
},
],
}
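As a rough illustration of this translation, the sketch below converts the arguments of a legacy sub() tag into an HGVS protein string. The function name and the lookup table are hypothetical helpers for this example, not part of PyBEL; the table holds the 20 standard one-letter to three-letter amino acid codes.

```python
# Standard one-letter to three-letter amino acid codes.
AMINO_ACIDS = {
    'A': 'Ala', 'R': 'Arg', 'N': 'Asn', 'D': 'Asp', 'C': 'Cys',
    'Q': 'Gln', 'E': 'Glu', 'G': 'Gly', 'H': 'His', 'I': 'Ile',
    'L': 'Leu', 'K': 'Lys', 'M': 'Met', 'F': 'Phe', 'P': 'Pro',
    'S': 'Ser', 'T': 'Thr', 'W': 'Trp', 'Y': 'Tyr', 'V': 'Val',
}


def protein_substitution_to_hgvs(reference, position, variant):
    """Build an HGVS protein substitution string such as 'p.Arg275His'.

    One-letter codes are expanded; three-letter codes pass through unchanged.
    """
    ref = AMINO_ACIDS.get(reference, reference)
    var = AMINO_ACIDS.get(variant, variant)
    return 'p.{}{}{}'.format(ref, position, var)


# Mirrors p(HGNC:APP, sub(R, 275, H))
print(protein_substitution_to_hgvs('R', 275, 'H'))
```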
See also
BEL 2.0 specification on protein substitutions
PyBEL module
pybel.parser.modifiers.get_protein_substitution_language
Protein Modifications¶
Protein Modifications.
The addition of a post-translational modification (PTM) tag results in an entry called ‘variants’ in the data dictionary associated with a given node. This entry is a list with dictionaries describing each of the variants. All variants have the entry ‘kind’ to identify whether it is a PTM, gene modification, fragment, or HGVS variant. The ‘kind’ value for PTM is ‘pmod’.
Each PMOD contains an identifier, which is a dictionary with the namespace and name, and can optionally include the position (‘pos’) and/or amino acid code (‘code’).
For example, the node p(HGNC:GSK3B, pmod(P, S, 9))
is represented with the following:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'GSK3B',
VARIANTS: [
{
KIND: PMOD,
IDENTIFIER: {
NAMESPACE: BEL_DEFAULT_NAMESPACE,
NAME: 'Ph',
},
PMOD_CODE: 'Ser',
PMOD_POSITION: 9,
},
],
}
As an additional example, in p(HGNC:MAPK1, pmod(Ph, Thr, 202), pmod(Ph, Tyr, 204))
, MAPK1 is phosphorylated
twice to become active. This results in the following:
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'MAPK1',
VARIANTS: [
{
KIND: PMOD,
IDENTIFIER: {
NAMESPACE: BEL_DEFAULT_NAMESPACE,
NAME: 'Ph',
},
PMOD_CODE: 'Thr',
PMOD_POSITION: 202
},
{
KIND: PMOD,
IDENTIFIER: {
NAMESPACE: BEL_DEFAULT_NAMESPACE,
NAME: 'Ph',
},
PMOD_CODE: 'Tyr',
PMOD_POSITION: 204
}
]
}
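Because the modifications are stored as a list of plain dictionaries, they are easy to inspect. The sketch below collects the modified residues from a node dictionary shaped like the MAPK1 example. It does not import PyBEL: the keys 'variants', 'kind', 'pmod', 'pos', and 'code' follow the text above, while 'identifier' and 'name' are assumed string values, and get_pmod_sites is a hypothetical helper.

```python
# Stand-ins for pybel.constants; 'identifier' and 'name' are assumed values.
VARIANTS, KIND, IDENTIFIER, NAME = 'variants', 'kind', 'identifier', 'name'
PMOD, PMOD_CODE, PMOD_POSITION = 'pmod', 'code', 'pos'


def get_pmod_sites(node, modification='Ph'):
    """Return (amino acid code, position) pairs for one kind of modification."""
    sites = []
    for variant in node.get(VARIANTS, []):
        if variant.get(KIND) != PMOD:
            continue  # skip HGVS variants, fragments, and gene modifications
        if variant[IDENTIFIER][NAME] != modification:
            continue
        # Code and position are optional, so missing values become None.
        sites.append((variant.get(PMOD_CODE), variant.get(PMOD_POSITION)))
    return sites


mapk1 = {
    NAME: 'MAPK1',
    VARIANTS: [
        {KIND: PMOD, IDENTIFIER: {NAME: 'Ph'}, PMOD_CODE: 'Thr', PMOD_POSITION: 202},
        {KIND: PMOD, IDENTIFIER: {NAME: 'Ph'}, PMOD_CODE: 'Tyr', PMOD_POSITION: 204},
    ],
}
print(get_pmod_sites(mapk1))
```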
See also
BEL 2.0 specification on protein modifications
PyBEL module
pybel.parser.modifiers.get_protein_modification_language
Protein Truncations¶
Truncations.
Truncations in the legacy BEL 1.0 specification are automatically translated to BEL 2.0 with HGVS nomenclature.
p(HGNC:AKT1, trunc(40))
becomes p(HGNC:AKT1, var(p.40*))
and is represented with the following
dictionary:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'AKT1',
VARIANTS: [
{
KIND: HGVS,
IDENTIFIER: 'p.40*',
},
],
}
Unfortunately, the HGVS nomenclature requires the encoding of the terminal amino acid which is exchanged
for a stop codon, and this information is not required by BEL 1.0. For this example, the proper encoding
of the truncation at position 40 also includes the information that the 40th amino acid in AKT1 is Cys. Its
BEL encoding should be p(HGNC:AKT1, var(p.Cys40*))
. Temporary support has been added to
compile these statements, but it’s recommended they are upgraded by reexamining the supporting text, or
looking up the amino acid sequence.
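The two encodings discussed here differ only in whether the replaced amino acid is known. A small sketch, using a hypothetical helper name that is not part of PyBEL, makes the difference explicit:

```python
def truncation_to_hgvs(position, amino_acid=None):
    """Build an HGVS truncation string from a legacy trunc() position.

    Without the amino acid, only the incomplete form 'p.40*' can be produced;
    with it, the complete form 'p.Cys40*' is produced.
    """
    if amino_acid is None:
        return 'p.{}*'.format(position)
    return 'p.{}{}*'.format(amino_acid, position)


print(truncation_to_hgvs(40))         # what BEL 1.0 alone can provide
print(truncation_to_hgvs(40, 'Cys'))  # after looking up the sequence
```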
See also
BEL 2.0 specification on truncations
PyBEL module
pybel.parser.modifiers.get_truncation_language
Protein Fragments¶
Fragments.
The addition of a fragment results in an entry called pybel.constants.VARIANTS
in the data dictionary associated with a given node. This entry is a list with dictionaries
describing each of the variants. All variants have the entry pybel.constants.KIND
to identify whether it is
a PTM, gene modification, fragment, or HGVS variant. The pybel.constants.KIND
value for a fragment is
pybel.constants.FRAGMENT
.
Each fragment contains either a start and a stop position (pybel.constants.FRAGMENT_START and pybel.constants.FRAGMENT_STOP) or, when the range is unknown, the key pybel.constants.FRAGMENT_MISSING.
For example, the node p(HGNC:GSK3B, frag(45_129))
is represented with the following:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'GSK3B',
VARIANTS: [
{
KIND: FRAGMENT,
FRAGMENT_START: 45,
FRAGMENT_STOP: 129,
},
],
}
Additionally, nodes can have an asterisk (*) or question mark (?) representing unbound or unknown fragments, respectively.
A fragment may also be unknown, such as in the node p(HGNC:GSK3B, frag(?))
. This
is represented with the key pybel.constants.FRAGMENT_MISSING
and the value of ‘?’ like:
from pybel.constants import *
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'GSK3B',
VARIANTS: [
{
KIND: FRAGMENT,
FRAGMENT_MISSING: '?',
},
],
}
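The two shapes of fragment dictionary can be produced from the text inside frag(...) with a few lines of code. The following is a sketch only: the string values of the constants and the function parse_fragment are assumptions for illustration and do not reproduce PyBEL's parser.

```python
# Stand-ins for pybel.constants (string values assumed for this sketch).
KIND, FRAGMENT = 'kind', 'frag'
FRAGMENT_START, FRAGMENT_STOP, FRAGMENT_MISSING = 'start', 'stop', 'missing'


def _position(token):
    """Convert digits to int; keep markers such as '?' and '*' as strings."""
    return int(token) if token.isdigit() else token


def parse_fragment(text):
    """Turn a range such as '45_129' or '?' into a fragment dictionary."""
    if text == '?':
        return {KIND: FRAGMENT, FRAGMENT_MISSING: '?'}
    start, stop = text.split('_')
    return {
        KIND: FRAGMENT,
        FRAGMENT_START: _position(start),
        FRAGMENT_STOP: _position(stop),
    }


print(parse_fragment('45_129'))
print(parse_fragment('?'))
```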
See also
BEL 2.0 specification on proteolytic fragments (2.2.3)
PyBEL module
pybel.parser.modifiers.get_fragment_language
Fusions¶
Fusions.
Gene, RNA, miRNA, and protein fusions are all represented with the same underlying data structure. Below
it is shown with uppercase letters referring to constants from pybel.constants
. For example,
g(HGNC:BCR, fus(HGNC:JAK2, 1875, 2626))
is represented as:
from pybel.constants import *
{
FUNCTION: GENE,
FUSION: {
PARTNER_5P: {NAMESPACE: 'HGNC', NAME: 'BCR'},
PARTNER_3P: {NAMESPACE: 'HGNC', NAME: 'JAK2'},
RANGE_5P: {
FUSION_REFERENCE: 'c',
FUSION_START: '?',
FUSION_STOP: 1875,
},
RANGE_3P: {
FUSION_REFERENCE: 'c',
FUSION_START: 2626,
FUSION_STOP: '?',
},
},
}
See also
BEL 2.0 specification on fusions (2.6.1)
PyBEL module
pybel.parser.modifiers.get_fusion_language
PyBEL module
pybel.parser.modifiers.get_legacy_fusion_language
Unqualified Edges¶
Unqualified edges are automatically inferred by PyBEL and do not contain citations or supporting evidence.
Variant and Modifications’ Parent Relations¶
All variants, modifications, fragments, and truncations are connected to their parent entity with an edge having
the relationship hasParent
.
For p(HGNC:GSK3B, var(p.Gly123Arg))
, the following edge is inferred:
p(HGNC:GSK3B, var(p.Gly123Arg)) hasParent p(HGNC:GSK3B)
All variants have this relationship to their reference node. BEL does not specify relationships between variants, such as the case when a given phosphorylation is necessary to make another one. This knowledge could be encoded directly in BEL, since PyBEL does not restrict users from manually asserting unqualified edges.
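Conceptually, the parent of a variant node is the same node with its variants removed. The sketch below shows that idea on plain dictionaries; the key names are assumed stand-ins for pybel.constants and get_parent is a hypothetical helper, not the method PyBEL uses internally.

```python
# Stand-ins for pybel.constants (string values assumed for this sketch).
FUNCTION, NAMESPACE, NAME, VARIANTS = 'function', 'namespace', 'name', 'variants'


def get_parent(node):
    """Return the reference node for a variant node, or None if it has no variants."""
    if not node.get(VARIANTS):
        return None
    return {key: value for key, value in node.items() if key != VARIANTS}


variant_node = {
    FUNCTION: 'Protein',
    NAMESPACE: 'HGNC',
    NAME: 'GSK3B',
    VARIANTS: [{'kind': 'hgvs', 'identifier': 'p.Gly123Arg'}],
}
parent = get_parent(variant_node)
print(parent)  # the node that the hasParent edge would point to
```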
List Abundances¶
Complexes and composites are defined by lists of their members. As of version 0.9.0, they contain a list of the data dictionaries
that describe their members. For example complex(p(HGNC:FOS), p(HGNC:JUN))
becomes:
from pybel.constants import *
{
FUNCTION: COMPLEX,
MEMBERS: [
{
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'FOS',
}, {
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'JUN',
}
]
}
The following edges are also inferred:
complex(p(HGNC:FOS), p(HGNC:JUN)) hasMember p(HGNC:FOS)
complex(p(HGNC:FOS), p(HGNC:JUN)) hasMember p(HGNC:JUN)
See also
BEL 2.0 specification on complex abundances
Similarly, composite(a(CHEBI:malonate), p(HGNC:JUN))
becomes:
from pybel.constants import *
{
FUNCTION: COMPOSITE,
MEMBERS: [
{
FUNCTION: ABUNDANCE,
NAMESPACE: 'CHEBI',
NAME: 'malonate',
}, {
FUNCTION: PROTEIN,
NAMESPACE: 'HGNC',
NAME: 'JUN',
}
]
}
The following edges are inferred:
composite(a(CHEBI:malonate), p(HGNC:JUN)) hasComponent a(CHEBI:malonate)
composite(a(CHEBI:malonate), p(HGNC:JUN)) hasComponent p(HGNC:JUN)
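The inferred edges for both kinds of list abundance follow the same pattern: one edge from the list node to each member, with the relation chosen by the node's function. A minimal sketch on plain dictionaries follows; the constant values and the function infer_member_relations are assumptions for illustration.

```python
# Stand-ins for pybel.constants (string values assumed for this sketch).
FUNCTION, NAMESPACE, NAME, MEMBERS = 'function', 'namespace', 'name', 'members'
COMPLEX, COMPOSITE = 'Complex', 'Composite'

# Relation names as they appear in the edge listings above.
RELATION_BY_FUNCTION = {COMPLEX: 'hasMember', COMPOSITE: 'hasComponent'}


def infer_member_relations(node):
    """Return (relation, member) pairs inferred for a complex or composite."""
    relation = RELATION_BY_FUNCTION[node[FUNCTION]]
    return [(relation, member) for member in node[MEMBERS]]


fos = {FUNCTION: 'Protein', NAMESPACE: 'HGNC', NAME: 'FOS'}
jun = {FUNCTION: 'Protein', NAMESPACE: 'HGNC', NAME: 'JUN'}
ap1 = {FUNCTION: COMPLEX, MEMBERS: [fos, jun]}

for relation, member in infer_member_relations(ap1):
    print(relation, member[NAME])
```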
Warning
The canonical ordering for the elements of the pybel.constants.MEMBERS
list corresponds to the sorted
order of their corresponding node tuples using pybel.parser.canonicalize.sort_dict_list()
. Rather than
directly modifying the BELGraph’s structure, use BELGraph.add_node_from_data()
, which takes care of
automatically canonicalizing this dictionary.
See also
BEL 2.0 specification on composite abundances
Reactions¶
The usage of a reaction causes many nodes and edges to be created. The following example will illustrate what is added to the network for
rxn(reactants(a(CHEBI:"(3S)-3-hydroxy-3-methylglutaryl-CoA"), a(CHEBI:"NADPH"), \
a(CHEBI:"hydron")), products(a(CHEBI:"mevalonate"), a(CHEBI:"NADP(+)")))
As of version 0.9.0, the reactants’ and products’ data dictionaries are included as sub-lists keyed REACTANTS
and
PRODUCTS
. It becomes:
from pybel.constants import *
{
FUNCTION: REACTION,
REACTANTS: [
{
FUNCTION: ABUNDANCE,
NAMESPACE: 'CHEBI',
NAME: '(3S)-3-hydroxy-3-methylglutaryl-CoA'
}, {
FUNCTION: ABUNDANCE,
NAMESPACE: 'CHEBI',
NAME: 'NADPH',
}, {
FUNCTION: ABUNDANCE,
NAMESPACE: 'CHEBI',
NAME: 'hydron',
}
],
PRODUCTS: [
{
FUNCTION: ABUNDANCE,
NAMESPACE: 'CHEBI',
NAME: 'mevalonate',
}, {
FUNCTION: ABUNDANCE,
NAMESPACE: 'CHEBI',
NAME: 'NADP(+)',
}
]
}
Warning
The canonical ordering for the elements of the REACTANTS
and PRODUCTS
lists corresponds to the sorted
order of their corresponding node tuples using pybel.parser.canonicalize.sort_dict_list()
. Rather than
directly modifying the BELGraph’s structure, use BELGraph.add_node_from_data()
, which takes care of
automatically canonicalizing this dictionary.
The following edges are inferred, where X
represents the previous reaction, for brevity:
X hasReactant a(CHEBI:"(3S)-3-hydroxy-3-methylglutaryl-CoA")
X hasReactant a(CHEBI:"NADPH")
X hasReactant a(CHEBI:"hydron")
X hasProduct a(CHEBI:"mevalonate")
X hasProduct a(CHEBI:"NADP(+)")
See also
BEL 2.0 specification on reactions
Edges¶
Design Choices¶
In the OpenBEL Framework, modifiers such as activities (kinaseActivity, etc.) and transformations (translocations,
degradations, etc.) were represented as their own nodes. In PyBEL, these modifiers are represented as a property
of the edge. In reality, an edge like sec(p(HGNC:A)) -> activity(p(HGNC:B), ma(kinaseActivity))
represents
a connection between HGNC:A
and HGNC:B
. Each of these modifiers explains the context of the relationship
between these physical entities. Further, querying a network where these modifiers are part of a relationship
is much more straightforward. For example, finding all proteins that are upregulated by the kinase activity of another
protein now can be directly queried by filtering all edges for those with a subject modifier whose modification is
molecular activity, and whose effect is kinase activity. Having fewer nodes also allows for a much easier display
and visual interpretation of a network. The information about the modifier on the subject and activity can be displayed
as a color coded source and terminus of the connecting edge.
The compiler in OpenBEL framework created nodes for molecular activities like kin(p(HGNC:YFG))
and induced an
edge like p(HGNC:YFG) actsIn kin(p(HGNC:YFG))
. For transformations, a statement like
tloc(p(HGNC:YFG), GOCC:intracellular, GOCC:"cell membrane")
also induced
tloc(p(HGNC:YFG), GOCC:intracellular, GOCC:"cell membrane") translocates p(HGNC:YFG)
.
In PyBEL, we recognize that these modifications are actually annotations to the type of relationship between the
subject’s entity and the object’s entity. p(HGNC:ABC) -> tloc(p(HGNC:YFG), GOCC:intracellular, GOCC:"cell membrane")
is about the relationship between p(HGNC:ABC)
and p(HGNC:YFG)
, while
the information about the translocation qualifies that the object is undergoing an event, and not just the abundance.
This confusion stems from the use of proteinAbundance
as a keyword, and is perhaps why many people prefer to use
just the keyword p.
Example Edge Data Structure¶
Because this data is associated with an edge, the node data for the subject and object are not included explicitly. However, information about the activities, modifiers, and transformations on the subject and object are included. Below is the “skeleton” for the edge data model in PyBEL:
from pybel.constants import *
{
SUBJECT: {
# ... modifications to the subject node. Only present if non-empty.
},
RELATION: POSITIVE_CORRELATION,
OBJECT: {
# ... modifications to the object node. Only present if non-empty.
},
EVIDENCE: ...,
CITATION : {
CITATION_TYPE: CITATION_TYPE_PUBMED,
CITATION_REFERENCE: ...,
CITATION_DATE: 'YYYY-MM-DD',
CITATION_AUTHORS: 'Jon Snow|John Doe',
},
ANNOTATIONS: {
'Disease': {
'Colorectal Cancer': True,
},
# ... additional annotations as tuple[str,dict[str,bool]] pairs
},
}
Each edge must contain the RELATION
, EVIDENCE
, and CITATION
entries. The CITATION
must minimally contain CITATION_TYPE
and CITATION_REFERENCE
since these can be used to look up additional
metadata.
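These requirements are simple to check on a plain dictionary. The sketch below reports which required entries are missing from an edge's data; the string values of the constants and the function missing_edge_entries are assumptions for illustration, and the check is meant for qualified edges only.

```python
# Stand-ins for pybel.constants (string values assumed for this sketch).
RELATION, EVIDENCE, CITATION = 'relation', 'evidence', 'citation'
CITATION_TYPE, CITATION_REFERENCE = 'type', 'reference'


def missing_edge_entries(data):
    """List the required entries that are absent from an edge data dictionary."""
    missing = [key for key in (RELATION, EVIDENCE, CITATION) if key not in data]
    if CITATION in data:
        # The citation must minimally carry its type and its reference.
        for key in (CITATION_TYPE, CITATION_REFERENCE):
            if key not in data[CITATION]:
                missing.append('{}.{}'.format(CITATION, key))
    return missing


edge = {
    RELATION: 'positiveCorrelation',
    EVIDENCE: 'Some supporting text',
    CITATION: {CITATION_TYPE: 'PubMed'},
}
print(missing_edge_entries(edge))
```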
Note
Since version 0.10.2, annotations now always appear as dictionaries, even if only one value is present.
Activities¶
Modifiers are added to this structure as well. Under this schema,
p(HGNC:GSK3B, pmod(P, S, 9)) pos act(p(HGNC:GSK3B), ma(kin))
becomes:
from pybel.constants import *
{
RELATION: POSITIVE_CORRELATION,
OBJECT: {
MODIFIER: ACTIVITY,
EFFECT: {
NAME: 'kin',
NAMESPACE: BEL_DEFAULT_NAMESPACE,
}
},
CITATION: { ... },
EVIDENCE: ...,
ANNOTATIONS: { ... },
}
Activities without molecular activity annotations do not contain a pybel.constants.EFFECT
entry. Under this
schema, p(HGNC:GSK3B, pmod(P, S, 9)) pos act(p(HGNC:GSK3B))
becomes:
from pybel.constants import *
{
RELATION: POSITIVE_CORRELATION,
OBJECT: {
MODIFIER: ACTIVITY
},
CITATION: { ... },
EVIDENCE: ...,
ANNOTATIONS: { ... },
}
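Keeping modifiers on the edge is what makes queries such as "all edges whose subject acts through its kinase activity" a simple filter, as described under Design Choices. The sketch below shows such a filter over plain (source, target, data) triples. It does not use PyBEL: the constant values and the predicate name are assumptions for illustration.

```python
# Stand-ins for pybel.constants (string values assumed for this sketch).
SUBJECT, OBJECT, RELATION = 'subject', 'object', 'relation'
MODIFIER, ACTIVITY, EFFECT, NAME = 'modifier', 'Activity', 'effect', 'name'


def subject_has_kinase_activity(data):
    """Check if the edge's subject carries an activity modifier with effect 'kin'."""
    subject = data.get(SUBJECT, {})
    if subject.get(MODIFIER) != ACTIVITY:
        return False
    # Activities without a molecular activity annotation have no EFFECT entry.
    return subject.get(EFFECT, {}).get(NAME) == 'kin'


edges = [
    ('A', 'B', {SUBJECT: {MODIFIER: ACTIVITY, EFFECT: {NAME: 'kin'}}, RELATION: 'increases'}),
    ('C', 'D', {SUBJECT: {MODIFIER: ACTIVITY}, RELATION: 'increases'}),
    ('E', 'F', {RELATION: 'decreases'}),
]
kinase_targets = [v for u, v, data in edges if subject_has_kinase_activity(data)]
print(kinase_targets)
```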
Locations¶
Locations.
Location data also is added into the information in the edge for the node (subject or object) for which it was
annotated. p(HGNC:GSK3B, pmod(P, S, 9), loc(GO:lysosome)) pos act(p(HGNC:GSK3B), ma(kin))
becomes:
from pybel.constants import *
{
SUBJECT: {
LOCATION: {
NAMESPACE: 'GO',
NAME: 'lysosome',
}
},
RELATION: POSITIVE_CORRELATION,
OBJECT: {
MODIFIER: ACTIVITY,
EFFECT: {
NAMESPACE: BEL_DEFAULT_NAMESPACE,
NAME: 'kin',
}
},
EVIDENCE: ...,
CITATION: { ... },
}
The addition of the location()
element in BEL 2.0 allows for the unambiguous expression of the differences
between the process of hypothetical HGNC:A
moving from one place to another and the existence of
hypothetical HGNC:A
in a specific location having different effects. In BEL 1.0, this action had its own node,
but this introduced unnecessary complexity to the network and made querying more difficult.
This calls for thoughtful consideration of the following two statements:
tloc(p(HGNC:A), fromLoc(GO:intracellular), toLoc(GO:"cell membrane")) -> p(HGNC:B)
p(HGNC:A, location(GO:"cell membrane")) -> p(HGNC:B)
See also
BEL 2.0 specification on cellular location (2.2.4)
PyBEL module
pybel.parser.modifiers.get_location_language
Translocations¶
Translocations have their own unique syntax. p(HGNC:YFG1) -> sec(p(HGNC:YFG2))
becomes:
from pybel.constants import *
{
RELATION: INCREASES,
OBJECT: {
MODIFIER: TRANSLOCATION,
EFFECT: {
FROM_LOC: {
NAMESPACE: 'GO',
NAME: 'intracellular',
},
TO_LOC: {
NAMESPACE: 'GO',
NAME: 'extracellular space',
}
}
},
CITATION: { ... },
EVIDENCE: ...,
ANNOTATIONS: { ... },
}
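The sec() shorthand carries no explicit locations, yet the resulting structure does: the example above shows it expanded to a translocation from GO:intracellular to GO:"extracellular space". A sketch of that expansion on plain dictionaries follows; the constant values and both helper functions are assumptions for illustration.

```python
# Stand-ins for pybel.constants (string values assumed for this sketch).
MODIFIER, TRANSLOCATION, EFFECT = 'modifier', 'Translocation', 'effect'
FROM_LOC, TO_LOC, NAMESPACE, NAME = 'fromLoc', 'toLoc', 'namespace', 'name'


def translocation(from_namespace, from_name, to_namespace, to_name):
    """Build the modifier dictionary for an explicit tloc()."""
    return {
        MODIFIER: TRANSLOCATION,
        EFFECT: {
            FROM_LOC: {NAMESPACE: from_namespace, NAME: from_name},
            TO_LOC: {NAMESPACE: to_namespace, NAME: to_name},
        },
    }


def secretion():
    """Expand the sec() shorthand into the equivalent translocation."""
    return translocation('GO', 'intracellular', 'GO', 'extracellular space')


print(secretion())
```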
See also
BEL 2.0 specification on translocations
Degradations¶
Degradations are simpler, because there’s no pybel.constants.EFFECT
entry.
p(HGNC:YFG1) -> deg(p(HGNC:YFG2))
becomes:
from pybel.constants import *
{
RELATION: INCREASES,
OBJECT: {
MODIFIER: DEGRADATION,
},
CITATION: { ... },
EVIDENCE: ...,
ANNOTATIONS: { ... },
}
See also
BEL 2.0 specification on degradations
Example Networks¶
This directory contains example networks, precompiled as BEL graphs that are appropriate to use in examples.
An example describing EGF’s effect on cellular processes.
SET Citation = {"PubMed","Clin Cancer Res 2003 Jul 9(7) 2416-25","12855613"}
SET Evidence = "This induction was not seen either when LNCaP cells were treated with flutamide or conditioned medium were pretreated with antibody to the epidermal growth factor (EGF)"
SET Species = 9606
tscript(p(HGNC:AR)) increases p(HGNC:EGF)
UNSET ALL
SET Citation = {"PubMed","Int J Cancer 1998 Jul 3 77(1) 138-45","9639405"}
SET Evidence = "DU-145 cells treated with 5000 U/ml of IFNgamma and IFN alpha, both reduced EGF production with IFN gamma reduction more significant."
SET Species = 9606
p(HGNC:IFNA1) decreases p(HGNC:EGF)
p(HGNC:IFNG) decreases p(HGNC:EGF)
UNSET ALL
SET Citation = {"PubMed","DNA Cell Biol 2000 May 19(5) 253-63","10855792"}
SET Evidence = "Although found predominantly in the cytoplasm and, less abundantly, in the nucleus, VCP can be translocated from the nucleus after stimulation with epidermal growth factor."
SET Species = 9606
p(HGNC:EGF) increases tloc(p(HGNC:VCP), GO:nucleus, GO:cytoplasm)
UNSET ALL
SET Citation = {"PubMed","J Clin Oncol 2003 Feb 1 21(3) 447-52","12560433"}
SET Evidence = "Valosin-containing protein (VCP; also known as p97) has been shown to be associated with antiapoptotic function and metastasis via activation of the nuclear factor-kappaB signaling pathway."
SET Species = 9606
cat(p(HGNC:VCP)) increases tscript(complex(p(HGNC:NFKB1), p(HGNC:NFKB2), p(HGNC:REL), p(HGNC:RELA), p(HGNC:RELB)))
tscript(complex(p(HGNC:NFKB1), p(HGNC:NFKB2), p(HGNC:REL), p(HGNC:RELA), p(HGNC:RELB))) decreases bp(MESHPP:Apoptosis)
UNSET ALL
-
pybel.examples.
egf_graph
¶
Curation of the article “Genetics ignite focus on microglial inflammation in Alzheimer’s disease”.
SET Citation = {"PubMed", "26438529"}
SET Evidence = "Sialic acid binding activates CD33, resulting in phosphorylation of the CD33
immunoreceptor tyrosine-based inhibitory motif (ITIM) domains and activation of the SHP-1 and
SHP-2 tyrosine phosphatases [66, 67]."
complex(p(HGNC:CD33),a(CHEBI:"sialic acid")) -> p(HGNC:CD33, pmod(P))
act(p(HGNC:CD33, pmod(P))) => act(p(HGNC:PTPN6), ma(phos))
act(p(HGNC:CD33, pmod(P))) => act(p(HGNC:PTPN11), ma(phos))
UNSET {Evidence, Species}
SET Evidence = "These phosphatases act on multiple substrates, including Syk, to inhibit immune
activation [68, 69]. Hence, CD33 activation leads to increased SHP-1 and SHP-2 activity that antagonizes Syk,
inhibiting ITAM-signaling proteins, possibly including TREM2/DAP12 (Fig. 1, [70, 71])."
SET Species = 9606
act(p(HGNC:PTPN6)) =| act(p(HGNC:SYK))
act(p(HGNC:PTPN11)) =| act(p(HGNC:SYK))
act(p(HGNC:SYK)) -> act(p(HGNC:TREM2))
act(p(HGNC:SYK)) -> act(p(HGNC:TYROBP))
UNSET ALL
-
pybel.examples.
sialic_acid_graph
¶
An example describing a single piece of evidence about BRAF.
SET Citation = {"PubMed", "11283246"}
SET Evidence = "Expression of both dominant negative forms, RasN17 and Rap1N17, in UT7-Mpl cells decreased
thrombopoietin-mediated Elk1-dependent transcription. This suggests that both Ras and Rap1 contribute to
thrombopoietin-induced ELK1 transcription."
SET Species = 9606
p(HGNC:THPO) increases kin(p(HGNC:BRAF))
p(HGNC:THPO) increases kin(p(HGNC:RAF1))
kin(p(HGNC:BRAF)) increases tscript(p(HGNC:ELK1))
UNSET ALL
-
pybel.examples.
braf_example_graph
¶
An example describing statins.
-
pybel.examples.
statin_graph
¶
An example describing a translocation.
SET Citation = {"PubMed", "16170185"}
SET Evidence = "These modifications render Ras functional and capable of localizing to the lipid-rich inner surface of the cell membrane. The first and most critical modification, farnesylation, which is principally catalyzed by protein FTase, adds a 15-carbon hydrobobic farnesyl isoprenyl tail to the carboxyl terminus of Ras."
SET TextLocation = Review
cat(complex(p(HGNC:FNTA),p(HGNC:FNTB))) directlyIncreases p(SFAM:"RAS Family",pmod(F))
p(SFAM:"RAS Family",pmod(F)) directlyIncreases tloc(p(SFAM:"RAS Family"),MESHCS:"Intracellular Space",MESHCS:"Cell Membrane")
-
pybel.examples.
ras_tloc_graph
¶
Summary¶
Summary functions for BEL graphs.
-
pybel.struct.summary.
get_syntax_errors
(graph)[source]¶ List the syntax errors encountered during compilation of a BEL script.
-
pybel.struct.summary.
count_error_types
(graph)[source]¶ Count the occurrence of each type of error in a graph.
-
pybel.struct.summary.
count_naked_names
(graph)[source]¶ Count the frequency of each naked name (names without namespaces).
-
pybel.struct.summary.
calculate_incorrect_name_dict
(graph)[source]¶ Get missing names grouped by namespace.
-
pybel.struct.summary.
calculate_error_by_annotation
(graph, annotation)[source]¶ Group error names by a given annotation.
-
pybel.struct.summary.
get_functions
(graph)[source]¶ Get the set of all functions used in this graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A set of functions
-
pybel.struct.summary.
count_functions
(graph)[source]¶ Count the frequency of each function present in a graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A Counter from {function: frequency}
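Most of the counting functions in this module return a collections.Counter. The sketch below shows the general pattern on a list of plain node dictionaries rather than a BELGraph; the key name 'function' and the helper count_functions_sketch are assumptions for illustration and do not reproduce PyBEL's implementation.

```python
from collections import Counter

FUNCTION = 'function'  # stand-in for pybel.constants.FUNCTION (value assumed)


def count_functions_sketch(nodes):
    """Count how often each function appears in an iterable of node dictionaries."""
    return Counter(node[FUNCTION] for node in nodes)


nodes = [
    {FUNCTION: 'Protein', 'name': 'EGF'},
    {FUNCTION: 'Protein', 'name': 'VCP'},
    {FUNCTION: 'Gene', 'name': 'APP'},
]
counts = count_functions_sketch(nodes)
print(counts.most_common())
```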
-
pybel.struct.summary.
count_namespaces
(graph)[source]¶ Count the frequency of each namespace across all nodes (that have namespaces).
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Returns
A Counter from {namespace: frequency}
- Return type
-
pybel.struct.summary.
get_namespaces
(graph)[source]¶ Get the set of all namespaces used in this graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A set of namespaces
-
pybel.struct.summary.
count_names_by_namespace
(graph, namespace)[source]¶ Count the frequency of each name in a given namespace that is in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
namespace (
str
) – A namespace keyword
- Return type
- Returns
A counter from {name: frequency}
- Raises
IndexError – if the namespace is not defined in the graph.
-
pybel.struct.summary.
get_names_by_namespace
(graph, namespace)[source]¶ Get the set of all of the names in a given namespace that are in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
namespace (
str
) – A namespace keyword
- Return type
- Returns
A set of names belonging to the given namespace that are in the given graph
- Raises
IndexError – if the namespace is not defined in the graph.
-
pybel.struct.summary.
get_unused_namespaces
(graph)[source]¶ Get the set of all namespaces that are defined in a graph, but are never used.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A set of namespaces that are included but not used
-
pybel.struct.summary.
count_variants
(graph)[source]¶ Count how many of each type of variant a graph has.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
-
pybel.struct.summary.
count_pathologies
(graph)[source]¶ Count the number of edges in which each pathology is incident.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
Counter
[BaseEntity
]
-
pybel.struct.summary.
get_top_pathologies
(graph, n=15)[source]¶ Get the top n pathologies in the graph, ranked by the number of edges in which each participates.
-
pybel.struct.summary.
iterate_pubmed_identifiers
(graph)[source]¶ Iterate over all PubMed identifiers in a graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
An iterator over the PubMed identifiers in the graph
-
pybel.struct.summary.
get_pubmed_identifiers
(graph)[source]¶ Get the set of all PubMed identifiers cited in the construction of a graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A set of all PubMed identifiers cited in the construction of this graph
-
pybel.struct.summary.
iter_annotation_value_pairs
(graph)[source]¶ Iterate over the key/value pairs, with duplicates, for each annotation used in a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
-
pybel.struct.summary.
iter_annotation_values
(graph, annotation)[source]¶ Iterate over all of the values for an annotation used in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
annotation (str) – The annotation to grab
- Return type
-
pybel.struct.summary.
get_annotation_values_by_annotation
(graph)[source]¶ Get the set of values for each annotation used in a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A dictionary of {annotation key: set of annotation values}
-
pybel.struct.summary.
get_annotation_values
(graph, annotation)[source]¶ Get all values for the given annotation.
- Parameters
graph (pybel.BELGraph) – A BEL graph
annotation (
str
) – The annotation to summarize
- Return type
- Returns
A set of all annotation values
-
pybel.struct.summary.
count_relations
(graph)[source]¶ Return a histogram over all relationships in a graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A Counter from {relation type: frequency}
-
pybel.struct.summary.
get_annotations
(graph)[source]¶ Get the set of annotations used in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A set of annotation keys
-
pybel.struct.summary.
count_annotations
(graph)[source]¶ Count how many times each annotation is used in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A Counter from {annotation key: frequency}
-
pybel.struct.summary.
get_unused_annotations
(graph)[source]¶ Get the set of all annotations that are defined in a graph, but are never used.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A set of annotations
-
pybel.struct.summary.
get_unused_list_annotation_values
(graph)[source]¶ Get all of the unused values for list annotations.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
- Returns
A dictionary of {str annotation: set of str values that aren’t used}
Operations¶
This page outlines operations that can be done to BEL graphs.
-
pybel.struct.
left_full_join
(g, h)[source]¶ Add all nodes and edges from h to g, in-place for g.
- Parameters
g (pybel.BELGraph) – A BEL graph
h (pybel.BELGraph) – A BEL graph
Example usage:
>>> import pybel
>>> from pybel.struct import left_full_join
>>> g = pybel.from_path('...')
>>> h = pybel.from_path('...')
>>> left_full_join(g, h)
- Return type
None
-
pybel.struct.
left_outer_join
(g, h)[source]¶ Only add components from h that are touching g.
Algorithm:
1. Identify all weakly connected components in h
2. Add those that have an intersection with g
Example usage:
>>> import pybel
>>> from pybel.struct import left_outer_join
>>> g = pybel.from_path('...')
>>> h = pybel.from_path('...')
>>> left_outer_join(g, h)
- Return type
None
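The two-step algorithm for left_outer_join can be sketched without PyBEL or NetworkX by representing each graph as a set of nodes and a set of directed (source, target) edges. This is an illustration of the idea only, not PyBEL's implementation; both function names are hypothetical, and the sketch assumes every edge endpoint is listed in the node set.

```python
def weakly_connected_components(nodes, edges):
    """Return the weakly connected components as a list of node sets."""
    # Ignore edge direction by recording neighbors both ways.
    neighbors = {node: set() for node in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


def left_outer_join_sketch(g_nodes, g_edges, h_nodes, h_edges):
    """Add to g, in place, each component of h that shares a node with g."""
    original = set(g_nodes)
    for component in weakly_connected_components(h_nodes, h_edges):
        if component & original:
            g_nodes |= component
            g_edges |= {(u, v) for u, v in h_edges if u in component}


g_nodes, g_edges = {'A', 'B'}, {('A', 'B')}
h_nodes, h_edges = {'B', 'C', 'D', 'E'}, {('B', 'C'), ('D', 'E')}
left_outer_join_sketch(g_nodes, g_edges, h_nodes, h_edges)
print(sorted(g_nodes), sorted(g_edges))
```

In this example the component containing B and C touches g and is merged, while the component containing D and E is left out.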
-
pybel.struct.
union
(graphs, use_tqdm=False)[source]¶ Take the union over a collection of graphs into a new graph.
Assumes iterator is longer than 2, but not infinite.
- Parameters
- Returns
A merged graph
- Return type
Example usage:
>>> import pybel
>>> from pybel.struct import union
>>> g = pybel.from_path('...')
>>> h = pybel.from_path('...')
>>> k = pybel.from_path('...')
>>> merged = union([g, h, k])
Filters¶
This module contains functions for filtering node and edge iterables.
It relies heavily on the concepts of functional programming and the concept of predicates.
-
pybel.struct.filters.
invert_edge_predicate
(edge_predicate)[source]¶ Build an edge predicate that is the inverse of the given edge predicate.
-
pybel.struct.filters.
and_edge_predicates
(edge_predicates)[source]¶ Concatenate multiple edge predicates to a new predicate that requires all predicates to be met.
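The predicate combinators follow a standard functional pattern: each takes one or more predicates and returns a new predicate. The sketch below shows that pattern for predicates that accept only an edge data dictionary. It is an illustration under that simplifying assumption, with hypothetical function names and key names; it is not PyBEL's implementation.

```python
def invert_predicate(predicate):
    """Build a predicate that passes exactly when the given one fails."""
    def inverted(data):
        return not predicate(data)
    return inverted


def and_predicates(predicates):
    """Build a predicate that passes only when all given predicates pass."""
    predicates = tuple(predicates)  # allow any iterable, including generators

    def combined(data):
        return all(predicate(data) for predicate in predicates)
    return combined


def is_causal(data):
    """Example predicate: the relation is one of two causal relations."""
    return data.get('relation') in {'increases', 'decreases'}


def has_citation(data):
    """Example predicate: the edge data carries a citation entry."""
    return 'citation' in data


causal_without_citation = and_predicates([is_causal, invert_predicate(has_citation)])
print(causal_without_citation({'relation': 'increases'}))
```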
-
pybel.struct.filters.
filter_edges
(graph, edge_predicates)[source]¶ Apply a set of filters to the edges iterator of a BEL graph.
-
pybel.struct.filters.
count_passed_edge_filter
(graph, edge_predicates)[source]¶ Return the number of edges passing a given set of predicates.
- Return type
-
pybel.struct.filters.
edge_predicate
(func)[source]¶ Decorate an edge predicate function that only takes a dictionary as its singular argument.
Apply this as a decorator to a function that takes a single argument, a PyBEL edge data dictionary, to make sure that it can also accept a BELGraph together with the edge's source node, target node, and key.
-
pybel.struct.filters.
keep_edge_permissive
(*args, **kwargs)[source]¶ Return true for all edges.
- Parameters
data (dict) – A PyBEL edge data dictionary from a
pybel.BELGraph
- Return type
- Returns
Always returns
True
-
pybel.struct.filters.
has_provenance
(edge_data)[source]¶ Check if the edge has provenance information (i.e. citation and evidence).
- Return type
-
pybel.struct.filters.
has_pubmed
(edge_data)[source]¶ Check if the edge has a PubMed citation.
- Return type
Check if the edge contains author information for its citation.
- Return type
-
pybel.struct.filters.
is_causal_relation
(edge_data)[source]¶ Check if the given relation is causal.
- Return type
-
pybel.struct.filters.
not_causal_relation
(edge_data)[source]¶ Check if the given relation is not causal.
- Return type
-
pybel.struct.filters.
is_direct_causal_relation
(edge_data)[source]¶ Check if the edge is a direct causal relation.
- Return type
-
pybel.struct.filters.
is_associative_relation
(edge_data)[source]¶ Check if the edge has an association relation.
- Return type
-
pybel.struct.filters.
edge_has_activity
(edge_data)[source]¶ Check if the edge contains an activity in either the subject or object.
- Return type
-
pybel.struct.filters.
edge_has_degradation
(edge_data)[source]¶ Check if the edge contains a degradation in either the subject or object.
- Return type
-
pybel.struct.filters.
edge_has_translocation
(edge_data)[source]¶ Check if the edge has a translocation in either the subject or object.
- Return type
-
pybel.struct.filters.
edge_has_annotation
(edge_data, key)[source]¶ Check if an edge has the given annotation.
- Parameters
- Return type
- Returns
If the annotation key is present in the current data dictionary
For example, it might be useful to print all edges that are annotated with ‘Species’:
>>> from pybel.examples import sialic_acid_graph
>>> from pybel.struct.filters import edge_has_annotation
>>> for u, v, data in sialic_acid_graph.edges(data=True):
...     if edge_has_annotation(data, 'Species'):
...         print(u, v, data)
-
pybel.struct.filters.
has_pathology_causal
(graph, u, v, k)[source]¶ Check if the subject is a pathology and has a causal relationship with a non bioprocess/pathology.
- Return type
- Returns
If the subject of this edge is a pathology and it participates in a causal reaction.
-
pybel.struct.filters.
build_annotation_dict_all_filter
(annotations)[source]¶ Build an edge predicate for edges whose annotations are super-dictionaries of the given dictionary.
If no annotations are given, will always evaluate to true.
-
pybel.struct.filters.
build_annotation_dict_any_filter
(annotations)[source]¶ Build an edge predicate that passes for edges whose data dictionaries match the given dictionary.
If the given dictionary is empty, will always evaluate to true.
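The difference between the "all" and "any" annotation filters can be sketched in plain Python. This is a minimal illustration rather than PyBEL's implementation: it assumes a simplified model in which an edge's annotations map each key to a set of values, and the helper names matches_all and matches_any are hypothetical.

```python
def matches_all(edge_annotations, query):
    """Return True if every queried key/value set is contained in the edge."""
    return all(
        values <= edge_annotations.get(key, set())
        for key, values in query.items()
    )


def matches_any(edge_annotations, query):
    """Return True if at least one queried value is present on the edge."""
    if not query:  # an empty query passes every edge
        return True
    return any(
        bool(values & edge_annotations.get(key, set()))
        for key, values in query.items()
    )


# Simplified annotations for one edge (an assumption for this sketch).
edge = {'Species': {'9606'}, 'Confidence': {'High'}}

print(matches_all(edge, {'Species': {'9606'}}))
print(matches_all(edge, {'Species': {'9606'}, 'Cell': {'T'}}))
print(matches_any(edge, {'Species': {'10090'}, 'Confidence': {'High'}}))
```

The "all" variant requires the edge's annotations to be a super-dictionary of the query, while the "any" variant passes as soon as one key/value pair matches.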
-
pybel.struct.filters.
build_upstream_edge_predicate
(nodes)[source]¶ Build an edge predicate that passes for edges for which one of the given nodes is the object.
-
pybel.struct.filters.
build_downstream_edge_predicate
(nodes)[source]¶ Build an edge predicate that passes for edges for which one of the given nodes is the subject.
-
pybel.struct.filters.
build_relation_predicate
(relations)[source]¶ Build an edge predicate that passes for edges with the given relation.
-
pybel.struct.filters.
build_pmid_inclusion_filter
(pmids)[source]¶ Build an edge predicate that passes for edges with citations from the given PubMed identifier(s).
Build an edge predicate that passes for edges with citations written by the given author(s).
-
pybel.struct.filters.
invert_node_predicate
(node_predicate)[source]¶ Build a node predicate that is the inverse of the given node predicate.
-
pybel.struct.filters.
concatenate_node_predicates
(node_predicates)[source]¶ Concatenate multiple node predicates to a new predicate that requires all predicates to be met.
Example usage:
>>> from pybel.dsl import protein, gene
>>> from pybel.struct.filters import concatenate_node_predicates
>>> from pybel.struct.filters.node_predicates import not_pathology, node_exclusion_predicate_builder
>>> app_protein = protein(name='APP', namespace='HGNC')
>>> app_gene = gene(name='APP', namespace='HGNC')
>>> app_predicate = node_exclusion_predicate_builder([app_protein, app_gene])
>>> my_predicate = concatenate_node_predicates([not_pathology, app_predicate])
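The idea behind inverting and concatenating predicates can be sketched without PyBEL. This is an illustration under stated assumptions: nodes are modelled as plain dictionaries, and the helpers invert and concatenate are hypothetical stand-ins, not the library's functions.

```python
def invert(predicate):
    """Build a predicate that returns the opposite of the given one."""
    def inverted(node):
        return not predicate(node)
    return inverted


def concatenate(predicates):
    """Build a predicate that passes only if every given predicate passes."""
    predicates = tuple(predicates)

    def combined(node):
        return all(p(node) for p in predicates)
    return combined


def is_pathology(node):
    return node['function'] == 'Pathology'


def is_app(node):
    return node['name'] == 'APP'


# Nodes are plain dictionaries here (an assumption for this sketch).
nodes = [
    {'function': 'Protein', 'name': 'APP'},
    {'function': 'Pathology', 'name': 'Dementia'},
    {'function': 'Protein', 'name': 'MAPT'},
]

keep = concatenate([invert(is_pathology), invert(is_app)])
kept_names = [node['name'] for node in nodes if keep(node)]
print(kept_names)  # ['MAPT']
```

Because each builder returns an ordinary callable, the results compose freely and can be passed to anything that expects a single predicate.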
-
pybel.struct.filters.
filter_nodes
(graph, node_predicates)[source]¶ Apply a set of predicates to the nodes iterator of a BEL graph.
- Return type
Iterable
[BaseEntity
]
-
pybel.struct.filters.
get_nodes
(graph, node_predicates)[source]¶ Get the set of all nodes that pass the predicates.
- Return type
Set
[BaseEntity
]
-
pybel.struct.filters.
count_passed_node_filter
(graph, node_predicates)[source]¶ Count how many nodes pass a given set of node predicates.
- Return type
-
pybel.struct.filters.
function_inclusion_filter_builder
(func)[source]¶ Build a filter that only passes on nodes of the given function(s).
-
pybel.struct.filters.
data_missing_key_builder
(key)[source]¶ Build a filter that passes only on nodes that don’t have the given key in their data dictionary.
-
pybel.struct.filters.
build_node_data_search
(key, data_predicate)[source]¶ Build a filter for nodes whose associated data with the given key passes the given predicate.
-
pybel.struct.filters.
build_node_graph_data_search
(key, data_predicate)[source]¶ Build a function for testing data associated with the node in the graph.
-
pybel.struct.filters.
build_node_key_search
(query, key)[source]¶ Build a node filter for nodes whose values for the given key are superstrings of the query string(s).
-
pybel.struct.filters.
build_node_name_search
(query)[source]¶ Search nodes’ names.
Is a thin wrapper around build_node_key_search() with pybel.constants.NAME.
-
pybel.struct.filters.
namespace_inclusion_builder
(namespace)[source]¶ Build a predicate for namespace inclusion.
-
pybel.struct.filters.
node_predicate
(f)[source]¶ Tag a node predicate that takes a dictionary to also accept a pair of (BELGraph, node).
Apply this as a decorator to a function that takes a single argument, a PyBEL node, to make sure that it can also accept a pair of arguments, a BELGraph and a PyBEL node as well.
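A decorator of this kind can be sketched as follows. This is a simplified illustration, not PyBEL's code: it dispatches on the number of positional arguments, and the name accepts_graph_and_node is hypothetical.

```python
from functools import wraps


def accepts_graph_and_node(f):
    """Let a one-argument node predicate also be called as f(graph, node)."""
    @wraps(f)
    def wrapped(*args):
        if len(args) == 1:
            return f(args[0])
        if len(args) == 2:
            _graph, node = args  # the graph is accepted but not needed
            return f(node)
        raise TypeError('expected (node) or (graph, node)')
    return wrapped


@accepts_graph_and_node
def is_protein(node):
    return node.get('function') == 'Protein'


node = {'function': 'Protein', 'name': 'APP'}
graph = object()  # stand-in for a BELGraph in this sketch

print(is_protein(node))         # True
print(is_protein(graph, node))  # True
```

Supporting both call shapes lets the same predicate be used with the built-in filter() over nodes and with functions that pass a graph alongside each node.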
-
pybel.struct.filters.
keep_node_permissive
(_)[source]¶ Return true for all nodes.
Given a BEL graph graph, applying keep_node_permissive() as a predicate on the nodes iterable, as in filter(keep_node_permissive, graph), will result in the same iterable as iterating directly over a BELGraph.
- Return type
-
pybel.struct.filters.
is_abundance
(node)[source]¶ Return true if the node is an abundance.
- Return type
-
pybel.struct.filters.
is_pathology
(node)[source]¶ Return true if the node is a pathology.
- Return type
-
pybel.struct.filters.
not_pathology
(node)[source]¶ Return false if the node is a pathology.
- Return type
-
pybel.struct.filters.
has_variant
(node)[source]¶ Return true if the node has any variants.
- Return type
-
pybel.struct.filters.
has_protein_modification
(node)[source]¶ Return true if the node has a protein modification variant.
- Return type
-
pybel.struct.filters.
has_gene_modification
(node)[source]¶ Return true if the node has a gene modification.
- Return type
-
pybel.struct.filters.
has_hgvs
(node)[source]¶ Return true if the node has an HGVS variant.
- Return type
-
pybel.struct.filters.
has_fragment
(node)[source]¶ Return true if the node has a fragment.
- Return type
-
pybel.struct.filters.
has_activity
(graph, node)[source]¶ Return true if over any of the node’s edges, it has a molecular activity.
- Return type
-
pybel.struct.filters.
is_degraded
(graph, node)[source]¶ Return true if over any of the node’s edges, it is degraded.
- Return type
-
pybel.struct.filters.
is_translocated
(graph, node)[source]¶ Return true if over any of the node’s edges, it is translocated.
- Return type
-
pybel.struct.filters.
has_causal_in_edges
(graph, node)[source]¶ Return true if the node contains any in_edges that are causal.
- Return type
-
pybel.struct.filters.
has_causal_out_edges
(graph, node)[source]¶ Return true if the node contains any out_edges that are causal.
- Return type
-
pybel.struct.filters.
node_inclusion_predicate_builder
(nodes)[source]¶ Build a function that returns true for the given nodes.
-
pybel.struct.filters.
node_exclusion_predicate_builder
(nodes)[source]¶ Build a node predicate that returns false for the given nodes.
-
pybel.struct.filters.
is_causal_source
(graph, node)[source]¶ Return true if the node is a causal source.
Doesn’t have any causal in edge(s)
Does have causal out edge(s)
- Return type
-
pybel.struct.filters.
is_causal_sink
(graph, node)[source]¶ Return true if the node is a causal sink.
Does have causal in edge(s)
Doesn’t have any causal out edge(s)
- Return type
-
pybel.struct.filters.
is_causal_central
(graph, node)[source]¶ Return true if the node is neither a causal sink nor a causal source.
Does have causal in edge(s)
Does have causal out edge(s)
- Return type
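The three causal roles described above (source, sink, and central) can be sketched with a small classifier. This is an illustration only: causal edges are modelled as (subject, object) pairs, and the function name classify is hypothetical.

```python
def classify(node, causal_edges):
    """Classify a node by whether it has causal in-edges and out-edges."""
    has_in = any(target == node for _, target in causal_edges)
    has_out = any(source == node for source, _ in causal_edges)
    if has_out and not has_in:
        return 'source'
    if has_in and not has_out:
        return 'sink'
    if has_in and has_out:
        return 'central'
    return 'none'  # the node takes part in no causal edges


# Causal edges as (subject, object) pairs; the node names are made up.
causal_edges = [('A', 'B'), ('B', 'C')]

roles = [classify(name, causal_edges) for name in 'ABC']
print(roles)  # ['source', 'central', 'sink']
```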
-
pybel.struct.filters.
is_isolated_list_abundance
(graph, node, cls=<class 'pybel.dsl.node_classes.ListAbundance'>)[source]¶ Return true if the node is a list abundance but has no qualified edges.
- Return type
-
pybel.struct.filters.
get_nodes_by_function
(graph, func)[source]¶ Get all nodes with the given function(s).
- Return type
Set
[BaseEntity
]
-
pybel.struct.filters.
get_nodes_by_namespace
(graph, namespaces)[source]¶ Get all nodes identified by the given namespace(s).
- Return type
Set
[BaseEntity
]
-
pybel.struct.filters.
part_has_modifier
(edge_data, part, modifier)[source]¶ Return true if the modifier is in the given subject/object part.
- Parameters
edge_data (Mapping) – PyBEL edge data dictionary
part (str) – either pybel.constants.SUBJECT or pybel.constants.OBJECT
modifier (str) – The modifier to look for
- Return type
Transformations¶
This module contains functions that mutate or make transformations on a network.
-
pybel.struct.mutation.
collapse_to_genes
(graph)[source]¶ Collapse all protein, RNA, and miRNA nodes to their corresponding gene nodes.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
collapse_pair
(graph, survivor, victim)[source]¶ Rewire all edges from the synonymous (victim) node to the survivor node, then delete the synonymous node.
Does not keep edges between the two nodes.
- Parameters
graph (pybel.BELGraph) – A BEL graph
survivor (BaseEntity) – The BEL node onto which all of the synonym's edges are collapsed
victim (BaseEntity) – The BEL node to collapse into the surviving node
- Return type
None
-
pybel.struct.mutation.
collapse_nodes
(graph, survivor_mapping)[source]¶ Collapse all nodes in values to the key nodes, in place.
- Parameters
graph (pybel.BELGraph) – A BEL graph
survivor_mapping (Mapping[BaseEntity, Set[BaseEntity]]) – A dictionary with survivors as their keys, and iterables of the corresponding victims as values.
- Return type
None
-
pybel.struct.mutation.
collapse_all_variants
(graph)[source]¶ Collapse all genes’, RNAs’, miRNAs’, and proteins’ variants to their parents.
- Parameters
graph (pybel.BELGraph) – A BEL Graph
- Return type
None
-
pybel.struct.mutation.
surviors_are_inconsistent
(survivor_mapping)[source]¶ Check the survivor mapping for transitive collapses, i.e. nodes that appear both as a survivor and as a victim.
- Return type
Set
[BaseEntity
]
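A consistency check on a survivor mapping can be sketched as below. This is one plausible reading of the check, not the library's code: nodes are plain strings, and the function name transitive_victims is hypothetical.

```python
from itertools import chain


def transitive_victims(survivor_mapping):
    """Return the victims that also appear as survivors (keys)."""
    victims = chain.from_iterable(survivor_mapping.values())
    return {victim for victim in victims if victim in survivor_mapping}


# Node names are made up for this sketch.
consistent = {'gene A': {'protein A', 'rna A'}}
inconsistent = {'gene A': {'rna A'}, 'rna A': {'protein A'}}

print(transitive_victims(consistent))    # set()
print(transitive_victims(inconsistent))  # {'rna A'}
```

An empty result means every victim can be collapsed directly into its survivor; a non-empty result flags nodes that would be deleted while other nodes are still being collapsed into them.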
-
pybel.struct.mutation.
prune_protein_rna_origins
(graph)[source]¶ Delete genes that are only connected to one node, their corresponding RNA, by a translation edge.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_filtered_edges
(graph, edge_predicates=None)[source]¶ Remove edges passing the given edge predicates.
- Parameters
graph (pybel.BELGraph) – A BEL graph
edge_predicates (None or ((pybel.BELGraph, tuple, tuple, int) -> bool) or iter[(pybel.BELGraph, tuple, tuple, int) -> bool]) – A predicate or list of predicates
- Returns
-
pybel.struct.mutation.
remove_filtered_nodes
(graph, node_predicates=None)[source]¶ Remove nodes passing the given node predicates.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_associations
(graph)[source]¶ Remove all associative relationships from the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_pathologies
(graph)[source]¶ Remove pathology nodes from the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_biological_processes
(graph)[source]¶ Remove biological process nodes from the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_isolated_list_abundances
(graph)[source]¶ Remove isolated list abundances from the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_non_causal_edges
(graph)[source]¶ Remove non-causal edges from the graph.
-
pybel.struct.mutation.
expand_node_predecessors
(universe, graph, node)[source]¶ Expand around the predecessors of the given node in the result graph.
- Parameters
universe (pybel.BELGraph) – The graph containing the stuff to add
graph (pybel.BELGraph) – The graph to add stuff to
node (
BaseEntity
) – A BEL node
- Return type
None
-
pybel.struct.mutation.
expand_node_successors
(universe, graph, node)[source]¶ Expand around the successors of the given node in the result graph.
- Parameters
universe (pybel.BELGraph) – The graph containing the stuff to add
graph (pybel.BELGraph) – The graph to add stuff to
node (
BaseEntity
) – A BEL node
- Return type
None
-
pybel.struct.mutation.
expand_node_neighborhood
(universe, graph, node)[source]¶ Expand around the neighborhoods of the given node in the result graph.
Note: expands complexes’ members
- Parameters
universe (pybel.BELGraph) – The graph containing the stuff to add
graph (pybel.BELGraph) – The graph to add stuff to
node (
BaseEntity
) – A BEL node
- Return type
None
-
pybel.struct.mutation.
expand_nodes_neighborhoods
(universe, graph, nodes)[source]¶ Expand around the neighborhoods of the given nodes in the result graph.
- Parameters
universe (pybel.BELGraph) – The graph containing the stuff to add
graph (pybel.BELGraph) – The graph to add stuff to
nodes (
Iterable
[BaseEntity
]) – Nodes from the query graph
- Return type
None
-
pybel.struct.mutation.
expand_all_node_neighborhoods
(universe, graph, filter_pathologies=False)[source]¶ Expand the neighborhoods of all nodes in the given graph.
- Parameters
universe (pybel.BELGraph) – The graph containing the stuff to add
graph (pybel.BELGraph) – The graph to add stuff to
filter_pathologies (
bool
) – Should expansion take place around pathologies?
- Return type
None
-
pybel.struct.mutation.
expand_upstream_causal
(universe, graph)[source]¶ Add the upstream causal relations to the given sub-graph.
- Parameters
universe (pybel.BELGraph) – A BEL graph representing the universe of all knowledge
graph (pybel.BELGraph) – The target BEL graph to enrich with upstream causal controllers of contained nodes
-
pybel.struct.mutation.
expand_downstream_causal
(universe, graph)[source]¶ Add the downstream causal relations to the given sub-graph.
- Parameters
universe (pybel.BELGraph) – A BEL graph representing the universe of all knowledge
graph (pybel.BELGraph) – The target BEL graph to enrich with downstream causal relations of contained nodes
-
pybel.struct.mutation.
get_subgraph_by_annotation_value
(graph, annotation, values)[source]¶ Induce a sub-graph over all edges whose annotations match the given key and value.
- Parameters
graph (pybel.BELGraph) – A BEL graph
annotation (str) – The annotation to group by
- Returns
A subgraph of the original BEL graph
- Return type
-
pybel.struct.mutation.
get_subgraph_by_annotations
(graph, annotations, or_=None)[source]¶ Induce a sub-graph given an annotations filter.
- Parameters
- Returns
A subgraph of the original BEL graph
- Return type
-
pybel.struct.mutation.
get_subgraph_by_pubmed
(graph, pubmed_identifiers)[source]¶ Induce a sub-graph over the edges retrieved from the given PubMed identifier(s).
- Parameters
graph (pybel.BELGraph) – A BEL graph
pubmed_identifiers (str or list[str]) – A PubMed identifier or list of PubMed identifiers
- Return type
Induce a sub-graph over the edges retrieved from publications by the given author(s).
- Parameters
graph (pybel.BELGraph) – A BEL graph
authors (str or list[str]) – An author or list of authors
- Return type
-
pybel.struct.mutation.
get_subgraph_by_neighborhood
(graph, nodes)[source]¶ Get a BEL graph around the neighborhoods of the given nodes.
Returns none if no nodes are in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
nodes (
Iterable
[BaseEntity
]) – An iterable of BEL nodes
- Returns
A BEL graph induced around the neighborhoods of the given nodes
- Return type
Optional[pybel.BELGraph]
-
pybel.struct.mutation.
get_nodes_in_all_shortest_paths
(graph, nodes, weight=None, remove_pathologies=False)[source]¶ Get a set of nodes in all shortest paths between the given nodes.
Thinly wraps networkx.all_shortest_paths().
- Parameters
graph (pybel.BELGraph) – A BEL graph
nodes (Iterable[BaseEntity]) – The list of nodes to use to find all shortest paths
weight (Optional[str]) – Edge data key corresponding to the edge weight. If none, uses unweighted search.
remove_pathologies (bool) – Should pathology nodes be removed first?
- Returns
A set of nodes appearing in the shortest paths between nodes in the BEL graph
Note
This can be trivially parallelized using
networkx.single_source_shortest_path()
-
pybel.struct.mutation.
get_subgraph_by_all_shortest_paths
(graph, nodes, weight=None, remove_pathologies=False)[source]¶ Induce a subgraph over the nodes in the pairwise shortest paths between all of the nodes in the given list.
- Parameters
graph (pybel.BELGraph) – A BEL graph
nodes (Iterable[BaseEntity]) – A set of nodes over which to calculate shortest paths
weight (Optional[str]) – Edge data key corresponding to the edge weight. If None, performs unweighted search
remove_pathologies (bool) – Should the pathology nodes be deleted before getting shortest paths?
- Returns
A BEL graph induced over the nodes appearing in the shortest paths between the given nodes
- Return type
Optional[pybel.BELGraph]
-
pybel.struct.mutation.
get_random_path
(graph)[source]¶ Get a random path from the graph as a list of nodes.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
List
[BaseEntity
]
-
pybel.struct.mutation.
get_graph_with_random_edges
(graph, n_edges)[source]¶ Build a new graph from a seeding of edges.
- Parameters
n_edges (int) – Number of edges to randomly select from the given graph
- Return type
-
pybel.struct.mutation.
get_random_node
(graph, node_blacklist, invert_degrees=None)[source]¶ Choose a node from the graph with probabilities based on their degrees.
-
pybel.struct.mutation.
get_random_subgraph
(graph, number_edges=None, number_seed_edges=None, seed=None, invert_degrees=None)[source]¶ Generate a random subgraph based on weighted random walks from random seed edges.
- Parameters
number_edges (Optional[int]) – Maximum number of edges. Defaults to pybel_tools.constants.SAMPLE_RANDOM_EDGE_COUNT (250).
number_seed_edges (Optional[int]) – Number of nodes to start with (which likely results in different components in large graphs). Defaults to SAMPLE_RANDOM_EDGE_SEED_COUNT (5).
seed (Optional[int]) – A seed for the random state
invert_degrees (Optional[bool]) – Should the degrees be inverted? Defaults to true.
- Return type
-
pybel.struct.mutation.
get_upstream_causal_subgraph
(graph, nbunch)[source]¶ Induce a sub-graph from all of the upstream causal entities of the nodes in the nbunch.
- Return type
-
pybel.struct.mutation.
get_downstream_causal_subgraph
(graph, nbunch)[source]¶ Induce a sub-graph from all of the downstream causal entities of the nodes in the nbunch.
- Return type
-
pybel.struct.mutation.
get_subgraph_by_edge_filter
(graph, edge_predicates=None)[source]¶ Induce a sub-graph on all edges that pass the given filters.
- Parameters
- Returns
A BEL sub-graph induced over the edges passing the given filters
- Return type
-
pybel.struct.mutation.
get_subgraph_by_induction
(graph, nodes)[source]¶ Induce a sub-graph over the given nodes or return None if none of the nodes are in the given graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
nodes (
Iterable
[BaseEntity
]) – A list of BEL nodes in the graph
- Return type
Optional[pybel.BELGraph]
-
pybel.struct.mutation.
get_multi_causal_upstream
(graph, nbunch)[source]¶ Get the union of all the 2-level deep causal upstream subgraphs from the nbunch.
- Parameters
graph (pybel.BELGraph) – A BEL graph
nbunch (
Union
[BaseEntity
,Iterable
[BaseEntity
]]) – A BEL node or list of BEL nodes
- Returns
A subgraph of the original BEL graph
- Return type
-
pybel.struct.mutation.
get_multi_causal_downstream
(graph, nbunch)[source]¶ Get the union of all of the 2-level deep causal downstream subgraphs from the nbunch.
- Parameters
graph (pybel.BELGraph) – A BEL graph
nbunch (
Union
[BaseEntity
,Iterable
[BaseEntity
]]) – A BEL node or list of BEL nodes
- Returns
A subgraph of the original BEL graph
- Return type
-
pybel.struct.mutation.
get_subgraph_by_second_neighbors
(graph, nodes, filter_pathologies=False)[source]¶ Get a graph around the neighborhoods of the given nodes and expand to the neighborhood of those nodes.
Returns none if none of the nodes are in the graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
nodes (Iterable[BaseEntity]) – An iterable of BEL nodes
filter_pathologies (bool) – Should expansion take place around pathologies?
- Returns
A BEL graph induced around the neighborhoods of the given nodes
- Return type
Optional[pybel.BELGraph]
-
pybel.struct.mutation.
enrich_rnas_with_genes
(graph)[source]¶ Add the corresponding gene node for each RNA/miRNA node and connect them with a transcription edge.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
None
-
pybel.struct.mutation.
enrich_proteins_with_rnas
(graph)[source]¶ Add the corresponding RNA node for each protein node and connect them with a translation edge.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
None
-
pybel.struct.mutation.
enrich_protein_and_rna_origins
(graph)[source]¶ Add the corresponding RNA for each protein then the corresponding gene for each RNA/miRNA.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
None
-
pybel.struct.mutation.
strip_annotations
(graph)[source]¶ Strip all the annotations from a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
None
-
pybel.struct.mutation.
add_annotation_value
(graph, annotation, value, strict=True)[source]¶ Add the given annotation/value pair to all qualified edges.
- Parameters
graph (pybel.BELGraph) –
annotation (str) –
value (str) –
strict (bool) – Should the function ensure the annotation has already been defined?
- Return type
None
-
pybel.struct.mutation.
remove_annotation_value
(graph, annotation, value)[source]¶ Remove the given annotation/value pair from all qualified edges.
- Parameters
graph (pybel.BELGraph) –
annotation (str) –
value (str) –
- Return type
None
-
pybel.struct.mutation.
remove_citation_metadata
(graph)[source]¶ Remove the metadata associated with a citation.
Best practice is to add this information programmatically.
- Return type
None
-
pybel.struct.mutation.
infer_child_relations
(graph, node)[source]¶ Propagate causal relations to children.
-
pybel.struct.mutation.
remove_isolated_nodes
(graph)[source]¶ Remove isolated nodes from the network, in place.
- Parameters
graph (pybel.BELGraph) – A BEL graph
-
pybel.struct.mutation.
remove_isolated_nodes_op
(graph)[source]¶ Build a new graph excluding the isolated nodes.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
-
pybel.struct.mutation.
expand_by_edge_filter
(source, target, edge_predicates)[source]¶ Expand a target graph by edges in the source matching the given predicates.
- Parameters
source (pybel.BELGraph) – A BEL graph
target (pybel.BELGraph) – A BEL graph
edge_predicates (Union[Callable[[BELGraph, BaseEntity, BaseEntity, str], bool], Iterable[Callable[[BELGraph, BaseEntity, BaseEntity, str], bool]]]) – An edge predicate or list of edge predicates
- Returns
A BEL sub-graph induced over the edges passing the given filters
- Return type
Grouping¶
Functions for grouping BEL graphs into sub-graphs.
-
pybel.struct.grouping.
get_subgraphs_by_annotation
(graph, annotation, sentinel=None)[source]¶ Stratify the given graph into sub-graphs based on the values for edges’ annotations.
- Parameters
graph (pybel.BELGraph) – A BEL graph
annotation (str) – The annotation to group by
sentinel (Optional[str]) – The value under which to group unannotated edges. If none, unannotated edges are not kept.
- Return type
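Stratifying edges by annotation value, including the role of the sentinel, can be sketched in plain Python. This is an illustration under assumptions: edges are (u, v, data) triples, annotations are stored as lists of values, and the name group_edges_by_annotation is hypothetical.

```python
from collections import defaultdict


def group_edges_by_annotation(edges, annotation, sentinel=None):
    """Group (u, v, data) edges by the values of one annotation."""
    groups = defaultdict(list)
    for u, v, data in edges:
        values = data.get('annotations', {}).get(annotation)
        if values:
            for value in values:
                groups[value].append((u, v))
        elif sentinel is not None:
            groups[sentinel].append((u, v))  # keep unannotated edges
    return dict(groups)


# Edge data is simplified for this sketch.
edges = [
    ('A', 'B', {'annotations': {'Subgraph': ['Apoptosis']}}),
    ('B', 'C', {'annotations': {'Subgraph': ['Apoptosis', 'Autophagy']}}),
    ('C', 'D', {}),
]

without_sentinel = group_edges_by_annotation(edges, 'Subgraph')
with_sentinel = group_edges_by_annotation(edges, 'Subgraph', sentinel='Undefined')
print(sorted(without_sentinel))
print(sorted(with_sentinel))
```

An edge with several values for the annotation lands in several groups, and unannotated edges only survive when a sentinel is given.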
Pipeline¶
-
class
pybel.
Pipeline
(protocol=None)[source]¶ Build and run analytical pipelines on BEL graphs.
Example usage:
>>> from pybel import BELGraph
>>> from pybel.struct.pipeline import Pipeline
>>> from pybel.struct.mutation import enrich_protein_and_rna_origins, prune_protein_rna_origins
>>> graph = BELGraph()
>>> example = Pipeline()
>>> example.append(enrich_protein_and_rna_origins)
>>> example.append(prune_protein_rna_origins)
>>> result = example.run(graph)
Initialize the pipeline with an optional pre-defined protocol.
- Parameters
protocol (
Optional
[Iterable
[Dict
]]) – An iterable of dictionaries describing how to transform a network
-
static
from_functions
(functions)[source]¶ Build a pipeline from a list of functions.
- Parameters
functions (iter[((pybel.BELGraph) -> pybel.BELGraph) or ((pybel.BELGraph) -> None) or str]) – A list of functions or names of functions
Example with function:
>>> from pybel.struct.pipeline import Pipeline
>>> from pybel.struct.mutation import remove_associations
>>> pipeline = Pipeline.from_functions([remove_associations])
Equivalent example with function names:
>>> from pybel.struct.pipeline import Pipeline
>>> pipeline = Pipeline.from_functions(['remove_associations'])
Lookup by name is possible for built-in functions, and those that have been registered correctly using one of the four decorators: pybel.struct.pipeline.transformation(), pybel.struct.pipeline.in_place_transformation(), pybel.struct.pipeline.uni_transformation(), and pybel.struct.pipeline.uni_in_place_transformation().
- Return type
Pipeline
-
append
(name, *args, **kwargs)[source]¶ Add a function (either as a reference, or by name) and arguments to the pipeline.
- Parameters
name (str or (pybel.BELGraph -> pybel.BELGraph)) – The function, or the name of the function
args – The positional arguments to pass to the function
kwargs – The keyword arguments to pass to the function
- Return type
Pipeline
- Returns
This pipeline for fluid query building
- Raises
MissingPipelineFunctionError – If the function is not registered
-
extend
(protocol)[source]¶ Add another pipeline to the end of the current pipeline.
- Parameters
protocol (Union[Iterable[Dict], Pipeline]) – An iterable of dictionaries (or another Pipeline)
- Return type
Pipeline
- Returns
This pipeline for fluid query building
Example:
>>> p1 = Pipeline.from_functions(['enrich_protein_and_rna_origins'])
>>> p2 = Pipeline.from_functions(['remove_pathologies'])
>>> p1.extend(p2)
-
run
(graph, universe=None)[source]¶ Run the contained protocol on a seed graph.
- Parameters
graph (pybel.BELGraph) – The seed BEL graph
universe (pybel.BELGraph) – Allows just-in-time setting of the universe in case it wasn’t set before. Defaults to the given network.
- Returns
The new graph is returned if not applied in-place
- Return type
-
static
load
(file)[source]¶ Load a protocol from JSON contained in file.
- Return type
Pipeline
- Returns
The pipeline represented by the JSON in the file
- Raises
MissingPipelineFunctionError – If any functions are not registered
-
static
loads
(s)[source]¶ Load a protocol from a JSON string.
- Parameters
s (str) – A JSON string
- Return type
Pipeline
- Returns
The pipeline represented by the JSON in the string
- Raises
MissingPipelineFunctionError – If any functions are not registered
Transformation Decorators¶
This module contains the functions for decorating transformation functions.
A transformation function takes in a pybel.BELGraph
and either returns None (in-place) or a new
pybel.BELGraph
(out-of-place).
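The registration idea behind such decorators can be sketched with a small name-to-function registry. This is a toy model, not PyBEL's implementation: a "graph" is just a set of edges, and the names REGISTRY, IN_PLACE, register_transformation, register_in_place, and run are all hypothetical.

```python
REGISTRY = {}     # maps function names to functions
IN_PLACE = set()  # names of functions that mutate the graph and return None


def register_transformation(func):
    """Register a function that returns a new graph."""
    REGISTRY[func.__name__] = func
    return func


def register_in_place(func):
    """Register a function that mutates the graph and returns None."""
    REGISTRY[func.__name__] = func
    IN_PLACE.add(func.__name__)
    return func


def run(names, graph):
    """Apply the named functions in order, looking each one up by name."""
    for name in names:
        if name not in REGISTRY:
            raise KeyError('unregistered function: ' + name)
        result = REGISTRY[name](graph)
        if name not in IN_PLACE:
            graph = result  # out-of-place functions hand back a new graph
    return graph


@register_in_place
def drop_self_loops(graph):
    for edge in [e for e in graph if e[0] == e[1]]:
        graph.remove(edge)


@register_transformation
def reversed_edges(graph):
    return {(v, u) for u, v in graph}


result = run(['drop_self_loops', 'reversed_edges'], {('A', 'A'), ('A', 'B')})
print(sorted(result))  # [('B', 'A')]
```

Recording whether a function works in-place tells the runner whether to keep the original graph object or to continue with the returned one, which is why the two kinds are registered separately.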
-
pybel.struct.pipeline.decorators.
in_place_transformation
(func)¶ A decorator for functions that modify BEL graphs in-place
-
pybel.struct.pipeline.decorators.
uni_in_place_transformation
(func)¶ A decorator for functions that require a “universe” graph and modify BEL graphs in-place
-
pybel.struct.pipeline.decorators.
uni_transformation
(func)¶ A decorator for functions that require a “universe” graph and create new BEL graphs from old BEL graphs
-
pybel.struct.pipeline.decorators.
transformation
(func)¶ A decorator for functions that create new BEL graphs from old BEL graphs
-
pybel.struct.pipeline.decorators.
get_transformation
(name)[source]¶ Get a transformation function and error if its name is not registered.
- Parameters
name (str) – The name of a function to look up
- Returns
A transformation function
- Raises
MissingPipelineFunctionError – If the given function name is not registered
Exceptions¶
Exceptions for the pybel.struct.pipeline
module.
-
exception
pybel.struct.pipeline.exc.
MissingPipelineFunctionError
[source]¶ Raised when trying to run the pipeline with a function that isn’t registered.
-
exception
pybel.struct.pipeline.exc.
MetaValueError
[source]¶ Raised when getting an invalid meta value.
-
exception
pybel.struct.pipeline.exc.
MissingUniverseError
[source]¶ Raised when running a universe function without a universe being present.
Input and Output¶
Input and output functions for BEL graphs.
PyBEL provides multiple lossless interchange options for BEL. Lossy output formats are also included for convenient export to other programs. Notably, a de facto interchange using Resource Description Framework (RDF) to match the ability of other existing software is excluded due to the immaturity of the BEL to RDF mapping.
Import¶
Parsing Modes¶
The PyBEL parser has several modes that can be enabled and disabled. They are described below.
Allow Naked Names¶
By default, this is set to False. The parser does not allow identifiers that are not qualified with namespaces (naked names), like in p(YFG). A proper namespace, like p(HGNC:YFG), must be used. By setting this to True, the parser becomes permissive to naked names. In general, this is bad practice and this feature will be removed in the future.
Allow Nested¶
By default, this is set to False. The parser does not allow nested statements. See overview. By setting this to True, the parser will accept nested statements one level deep.
Citation Clearing¶
By default, this is set to True. While the BEL specification clearly states how the language should be used as a state machine, many BEL documents do not conform to the strict SET/UNSET rules. To guard against annotations accidentally carried from one set of statements to the next, the parser has two modes. By default, in citation clearing mode, when a SET CITATION command is reached, it will clear all other annotations (except the STATEMENT_GROUP, which has higher priority). This behavior can be disabled by setting this to False to re-enable strict parsing.
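The effect of citation clearing on the parser's annotation state can be sketched with a tiny state machine. This is a simplified model, not the PyBEL parser: commands are pre-tokenized tuples, the citation is stored alongside the annotations, and the name track_annotations is hypothetical.

```python
def track_annotations(commands, citation_clearing=True):
    """Replay SET/UNSET commands and yield the state seen by each statement."""
    state = {}
    for kind, key, value in commands:
        if kind == 'SET':
            if key == 'Citation' and citation_clearing:
                # Keep only the statement group; drop every other annotation.
                state = {k: v for k, v in state.items() if k == 'STATEMENT_GROUP'}
            state[key] = value
        elif kind == 'UNSET':
            state.pop(key, None)
        elif kind == 'STATEMENT':
            yield value, dict(state)


commands = [
    ('SET', 'STATEMENT_GROUP', 'Group 1'),
    ('SET', 'Citation', 'PubMed 1'),
    ('SET', 'Species', '9606'),
    ('STATEMENT', None, 'p(HGNC:A) -> p(HGNC:B)'),
    ('SET', 'Citation', 'PubMed 2'),  # 'Species' was never UNSET
    ('STATEMENT', None, 'p(HGNC:C) -> p(HGNC:D)'),
]

cleared = list(track_annotations(commands))
strict = list(track_annotations(commands, citation_clearing=False))
print(sorted(cleared[1][1]))
print(sorted(strict[1][1]))
```

With clearing enabled, the second statement no longer carries the stale 'Species' annotation from the first citation; with clearing disabled, it does.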
Reference¶
-
pybel.
from_bel_script
(path, **kwargs)[source]¶ Load a BEL graph from a file resource. This function is a thin wrapper around from_lines().
The remaining keyword arguments are passed to pybel.io.line_utils.parse_lines(), which populates a BELGraph.
- Return type
BELGraph
-
pybel.
from_bel_script_url
(url, **kwargs)[source]¶ Load a BEL graph from a URL resource.
- Parameters
url (str) – A valid URL pointing to a BEL document
The remaining keyword arguments are passed to pybel.io.line_utils.parse_lines().
- Return type
BELGraph
Transport¶
All transport pairs are reflective and data-preserving.
Bytes¶
Conversion functions for BEL graphs with bytes and Python pickles.
-
pybel.
from_bytes
(bytes_graph, check_version=True)[source]¶ Read a graph from bytes (the result of pickling the graph).
-
pybel.
to_bytes
(graph, protocol=4)[source]¶ Convert a graph to bytes with pickle.
Note that the pickle module has some incompatibilities between Python 2 and 3. To export a universally importable pickle, choose 0, 1, or 2.
- Parameters
graph (BELGraph) – A BEL network
protocol (int) – Pickling protocol to use. Defaults to HIGHEST_PROTOCOL.
- Return type
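The note about pickle protocols can be illustrated with the standard library alone. This sketch pickles a plain dictionary as a stand-in for a graph (an assumption for illustration); it does not call PyBEL.

```python
import pickle

# A stand-in for a graph: any picklable Python object works for this sketch.
graph_like = {'nodes': ['p(HGNC:APP)'], 'edges': []}

# The highest protocol is compact but only readable by recent Python versions.
default_bytes = pickle.dumps(graph_like, protocol=pickle.HIGHEST_PROTOCOL)

# Protocol 2 can be read by both Python 2 and Python 3.
portable_bytes = pickle.dumps(graph_like, protocol=2)

restored = pickle.loads(portable_bytes)
print(restored == graph_like)  # True
```

As with any pickle, bytes should only be loaded from sources that are trusted.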
Node-Link JSON¶
Conversion functions for BEL graphs with node-link JSON.
-
pybel.
from_nodelink
(graph_json_dict, check_version=True)[source]¶ Build a graph from node-link JSON Object.
- Return type
BELGraph
-
pybel.
from_nodelink_jsons
(graph_json_str, check_version=True)[source]¶ Read a BEL graph from a node-link JSON string.
- Return type
BELGraph
-
pybel.
to_nodelink_jsons
(graph, **kwargs)[source]¶ Dump this graph as a node-link JSON object to a string.
- Return type
-
pybel.
from_nodelink_file
(path, check_version=True)[source]¶ Build a graph from the node-link JSON contained in the given file.
-
pybel.
to_nodelink_file
(graph, path, **kwargs)[source]¶ Write this graph as node-link JSON to a file.
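The general shape of node-link JSON can be sketched with a hand-written dictionary and the standard json module. The keys below follow the common node-link layout (a list of nodes plus a list of links that refer to them); the exact keys and node encoding that PyBEL emits may differ, so treat this as an assumption.

```python
import json

# A hand-written dictionary in the general node-link shape.
node_link = {
    'directed': True,
    'multigraph': True,
    'graph': {'name': 'example'},
    'nodes': [{'id': 0, 'bel': 'p(HGNC:A)'}, {'id': 1, 'bel': 'p(HGNC:B)'}],
    'links': [{'source': 0, 'target': 1, 'relation': 'increases'}],
}

text = json.dumps(node_link, indent=2)  # analogous to dumping to a string
round_tripped = json.loads(text)        # analogous to reading it back

print(round_tripped == node_link)  # True
```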
Cyberinfrastructure Exchange¶
This module wraps conversion between pybel.BELGraph
and the Cyberinfrastructure Exchange (CX) JSON.
CX is an aspect-oriented network interchange format encoded in JSON with a format inspired by the JSON-LD encoding of Resource Description Framework (RDF). It is primarily used by the Network Data Exchange (NDEx) and more recent versions of Cytoscape.
See also
The NDEx Data Model Specification
CX Support for Cytoscape.js on the Cytoscape App Store
-
pybel.
from_cx_jsons
(graph_json_str)[source]¶ Read a BEL graph from a CX JSON string.
- Return type
BELGraph
-
pybel.
to_cx_jsons
(graph, **kwargs)[source]¶ Dump this graph as a CX JSON object to a string.
- Return type
-
pybel.
to_cx_file
(graph, path, indent=2, **kwargs)[source]¶ Write a BEL graph to a JSON file in CX format.
- Parameters
Example:
>>> from pybel.examples import sialic_acid_graph
>>> from pybel import to_cx_file
>>> with open('graph.cx', 'w') as f:
...     to_cx_file(sialic_acid_graph, f)
- Return type
None
JSON Graph Interchange Format¶
Conversion functions for BEL graphs with JGIF JSON.
The JSON Graph Interchange Format (JGIF) is specified similarly to the Node-Link JSON. Interchange with this format provides compatibility with other software and repositories, such as the Causal Biological Network Database.
-
pybel.
to_jgif
(graph)[source]¶ Build a JGIF dictionary from a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Returns
A JGIF dictionary
- Return type
Warning
Untested! This format is not general purpose, so little time has been invested in it. If you want to use Cytoscape.js, we suggest using pybel.to_cx() instead.
Example:
>>> import pybel, os, json
>>> graph_url = 'https://arty.scai.fraunhofer.de/artifactory/bel/knowledge/selventa-small-corpus/selventa-small-corpus-20150611.bel'
>>> graph = pybel.from_bel_script_url(graph_url)
>>> graph_jgif_json = pybel.to_jgif(graph)
>>> with open(os.path.expanduser('~/Desktop/small_corpus.json'), 'w') as f:
...     json.dump(graph_jgif_json, f)
-
pybel.
from_jgif_jsons
(graph_json_str)[source]¶ Read a BEL graph from a JGIF JSON string.
- Return type
BELGraph
-
pybel.
to_jgif_jsons
(graph, **kwargs)[source]¶ Dump this graph as a JGIF JSON object to a string.
- Return type
-
pybel.
to_jgif_gz
(graph, path, **kwargs)[source]¶ Write a graph as JGIF JSON to a gzip file.
- Return type
None
-
pybel.
post_jgif
(graph, url=None, **kwargs)[source]¶ Post the JGIF to a given URL.
- Return type
Response
-
pybel.
from_cbn_jgif
(graph_jgif_dict)[source]¶ Build a BEL graph from CBN JGIF.
Map the JGIF used by the Causal Biological Network Database to standard namespaces and annotations, then build a BEL graph using pybel.from_jgif().
- Parameters
graph_jgif_dict (dict) – The JSON object representing the graph in JGIF format
- Return type
Example:
>>> import requests
>>> from pybel import from_cbn_jgif
>>> apoptosis_url = 'http://causalbionet.com/Networks/GetJSONGraphFile?networkId=810385422'
>>> graph_jgif_dict = requests.get(apoptosis_url).json()
>>> graph = from_cbn_jgif(graph_jgif_dict)
Warning
Handling the annotations is not yet supported, since the CBN documents do not refer to the resources used to create them. This may be added in the future, but the annotations must be stripped from the graph before uploading to the network store using
pybel.struct.mutation.strip_annotations()
.
Export¶
Umbrella Node-Link JSON¶
The Umbrella Node-Link JSON format is similar to node-link but uses full BEL terms as nodes.
Given a BEL statement describing that X
phosphorylates Y
like act(p(X)) -> p(Y, pmod(Ph))
,
PyBEL usually stores the act()
information about X
as part of the relationship. In Umbrella mode,
this stays as part of the node.
Note that this generates additional nodes in the network for each of the “modified” versions of the node. For example, act(p(X)) will be represented as an individual node instead of p(X), as in the standard node-link JSON exporter.
A user might want to use this exporter in the following scenarios:
Representing transitivity in activities, as in p(X, pmod(Ph)) -> act(p(X)) -> p(Y, pmod(Ph)) -> act(p(Y)), with four nodes that are more amenable to simulations (e.g., Boolean networks, Petri nets).
Visualizing networks in a similar way to the legacy BEL Cytoscape plugin from the BEL Framework (warning: now defunct) using tools like Cytoscape.
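The difference between the two layouts can be sketched with plain dictionaries. This is only an illustration under assumed names: the `subject_modifier` key and the `to_umbrella` helper are hypothetical and do not reflect PyBEL's internal edge schema or its exporter.

```python
def to_umbrella(edge):
    """Move a subject activity modifier from the edge into the source term."""
    source = edge["source"]
    if edge.get("subject_modifier") == "activity":
        source = "act({})".format(source)
    return {
        "source": source,
        "relation": edge["relation"],
        "target": edge["target"],
    }


# act(p(X)) -> p(Y, pmod(Ph)) in the standard layout: the activity is edge data.
standard_edge = {
    "source": "p(X)",
    "relation": "->",
    "target": "p(Y, pmod(Ph))",
    "subject_modifier": "activity",
}
umbrella_edge = to_umbrella(standard_edge)

# Node sets implied by each layout.
standard_nodes = {standard_edge["source"], standard_edge["target"]}
umbrella_nodes = {umbrella_edge["source"], umbrella_edge["target"]}
```

In the standard layout the node set contains p(X); in the umbrella layout it contains act(p(X)) instead, which is why a network with several modified forms of one entity gains nodes.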
GraphDati¶
Conversion functions for BEL graphs with GraphDati.
-
pybel.
to_graphdati_file
(graph, path, use_identifiers=True, **kwargs)[source]¶ Write this graph as GraphDati JSON to a file.
-
pybel.
to_graphdati_gz
(graph, path, **kwargs)[source]¶ Write a graph as GraphDati JSON to a gzip file.
- Return type
None
-
pybel.
to_graphdati_jsonl
(graph, file, use_identifiers=True, use_tqdm=True)[source]¶ Write this graph as a GraphDati JSON lines file.
-
pybel.
to_graphdati_jsonl_gz
(graph, path, **kwargs)[source]¶ Write a graph as GraphDati JSONL to a gzip file.
- Return type
None
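For readers unfamiliar with gzipped JSON Lines, the following standard-library sketch shows the general file shape: one JSON document per line, written through a gzip text stream. The record fields are placeholders and are not the GraphDati nanopub schema, which this section does not describe.

```python
import gzip
import json
import os
import tempfile


def write_jsonl_gz(records, path):
    """Write one JSON document per line to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as file:
        for record in records:
            file.write(json.dumps(record, sort_keys=True) + "\n")


def read_jsonl_gz(path):
    """Read the file back into a list of dictionaries, skipping blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as file:
        return [json.loads(line) for line in file if line.strip()]


# Placeholder records standing in for exported statements.
records = [
    {"subject": "p(HGNC:APP)", "relation": "increases", "object": "p(HGNC:BACE1)"},
    {"subject": "p(HGNC:BACE1)", "relation": "decreases", "object": "p(HGNC:APP)"},
]

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.jsonl.gz")
    write_jsonl_gz(records, path)
    round_tripped = read_jsonl_gz(path)
```

Because each line is an independent document, a consumer can stream such a file record by record instead of loading it whole.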
-
pybel.
to_graphdati_jsons
(graph, **kwargs)[source]¶ Dump this graph as a GraphDati JSON object to a string.
- Parameters
graph (BELGraph) – A BEL graph
- Return type
-
pybel.
post_graphdati
(graph, username='demo@biodati.com', password='demo', base_url='https://nanopubstore.demo.biodati.com', chunksize=None, **kwargs)[source]¶ Post this graph to a BioDati server.
- Parameters
graph (BELGraph) – A BEL graph
username (str) – The email address to log in to BioDati. Defaults to “demo@biodati.com” for the demo server
password (str) – The password to log in to BioDati. Defaults to “demo” for the demo server
base_url (str) – The BioDati server base URL. Defaults to “https://nanopubstore.demo.biodati.com” for the demo server
chunksize (Optional[int]) – The number of nanopubs to post at a time. By default, does all.
Warning
The defaults point to the public BioDati demo server. Switch them to your own server and credentials.
- Return type
Response
GraphML¶
Conversion functions for BEL graphs with GraphML.
-
pybel.
to_graphml
(graph, path, schema=None)[source]¶ Write a graph to a GraphML XML file using networkx.write_graphml().
- Parameters
The .graphml file extension is suggested so Cytoscape can recognize it. By default, this function exports using the PyBEL schema of including modifier information into the edges. As an alternative, this function can also distinguish between
- Return type
None
Miscellaneous¶
This module contains IO functions for outputting BEL graphs to lossy formats, such as GraphML and CSV.
-
pybel.
to_csv
(graph, path, sep=None)[source]¶ Write the graph as a tab-separated edge list.
The resulting file will contain the following columns:
Source BEL term
Relation
Target BEL term
Edge data dictionary
See the Data Models section of the documentation for which data are stored in the edge data dictionary, such as queryable information about transforms on the subject and object and their associated metadata.
- Return type
None
-
pybel.
to_sif
(graph, path, sep=None)[source]¶ Write the graph as a tab-separated SIF file.
The resulting file will contain the following columns:
Source BEL term
Relation
Target BEL term
This format is simple and can be used readily with many applications, but is lossy in that it does not include relation metadata.
- Return type
None
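The three-column layout described for SIF can be sketched without PyBEL. The helper below is hypothetical and assumes a tab as the default separator, as the description above suggests; how `sep=None` is resolved inside `to_sif` is not stated here.

```python
def edges_to_sif(edges, sep="\t"):
    """Render (source, relation, target) triples as one line per edge."""
    return "".join(sep.join(edge) + "\n" for edge in edges)


# Placeholder triples of source BEL term, relation, and target BEL term.
edges = [
    ("p(HGNC:APP)", "increases", "p(HGNC:BACE1)"),
    ("p(HGNC:BACE1)", "decreases", "a(CHEBI:cholesterol)"),
]
sif_text = edges_to_sif(edges)
```

Nothing beyond the three columns is written, which is what makes the format lossy: evidence, citations, and annotations attached to an edge have no place in the output.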
-
pybel.
to_gsea
(graph, path)[source]¶ Write the genes/gene products to a GRP file for use with GSEA gene set enrichment analysis.
See also
GSEA publication
- Return type
None
Databases¶
SQL Databases¶
Conversion functions for BEL graphs with a SQL database.
-
pybel.
from_database
(name, version=None, manager=None)[source]¶ Load a BEL graph from a database.
If name and version are given, finds it exactly with pybel.manager.Manager.get_network_by_name_version(). If just the name is given, finds most recent with pybel.manager.Manager.get_network_by_name_version()
Neo4j¶
Output functions for BEL graphs to Neo4j.
-
pybel.
to_neo4j
(graph, neo_connection, use_tqdm=False)[source]¶ Upload a BEL graph to a Neo4j graph database using py2neo.
- Parameters
graph (pybel.BELGraph) – A BEL Graph
neo_connection (str or py2neo.Graph) – A py2neo connection object. Refer to the py2neo documentation for how to build this object.
Example Usage:
>>> import py2neo
>>> import pybel
>>> from pybel.examples import sialic_acid_graph
>>> neo_graph = py2neo.Graph("http://localhost:7474/db/data/")  # use your own connection settings
>>> pybel.to_neo4j(sialic_acid_graph, neo_graph)
BEL Commons¶
This module facilitates rudimentary data exchange with BEL Commons.
-
pybel.
from_web
(network_id, host=None)[source]¶ Retrieve a public network from BEL Commons.
In the future, this function may be extended to support authentication.
- Parameters
network_id (int) – The BEL Commons network identifier
host (Optional[str]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST or the environment as PYBEL_REMOTE_HOST. Defaults to pybel.constants.DEFAULT_SERVICE_URL
- Return type
BELGraph
-
pybel.
to_web
(graph, host=None, user=None, password=None, public=False)[source]¶ Send a graph to the receiver service and return the requests response object.
- Parameters
graph (BELGraph) – A BEL graph
host (Optional[str]) – The location of the BEL Commons server. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_HOST or the environment as PYBEL_REMOTE_HOST. Defaults to pybel.constants.DEFAULT_SERVICE_URL
user (Optional[str]) – Username for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_USER or the environment as PYBEL_REMOTE_USER
password (Optional[str]) – Password for BEL Commons. Alternatively, looks up in PyBEL config with PYBEL_REMOTE_PASSWORD or the environment as PYBEL_REMOTE_PASSWORD
- Return type
Response
- Returns
The response object from requests
INDRA¶
Conversion functions for BEL graphs with INDRA.
After assembling a model with INDRA, a list of
indra.statements.Statement
can be converted to a pybel.BELGraph
with
indra.assemblers.pybel.PybelAssembler
.
from indra.assemblers.pybel import PybelAssembler
import pybel
stmts = [
# A list of INDRA statements
]
pba = PybelAssembler(
stmts,
name='Graph Name',
version='0.0.1',
description='Graph Description'
)
graph = pba.make_model()
# Write to BEL file
pybel.to_bel_path(graph, 'simple_pybel.bel')
Warning
These functions are hard to unit test because they rely on a whole set of Java dependencies, and they will likely remain untested for a while.
-
pybel.
from_indra_statements
(stmts, name=None, version=None, description=None, authors=None, contact=None, license=None, copyright=None, disclaimer=None)[source]¶ Import a model from indra.
- Parameters
stmts (List[indra.statements.Statement]) – A list of statements
version (Optional[str]) – The graph’s version. Recommended to use semantic versioning or YYYYMMDD format.
- Return type
-
pybel.
to_indra_statements
(graph)[source]¶ Export this graph as a list of INDRA statements using the indra.sources.pybel.PybelProcessor.
- Parameters
graph (pybel.BELGraph) – A BEL graph
- Return type
list[indra.statements.Statement]
-
pybel.
from_biopax
(path, name=None, version=None, description=None, authors=None, contact=None, license=None, copyright=None, disclaimer=None)[source]¶ Import a model encoded in Pathway Commons BioPAX via indra.
- Parameters
path (str) – Path to a BioPAX OWL file
- Return type
Warning
Not compatible with all BioPAX! See INDRA documentation.
Manager¶
Manager API¶
The BaseManager takes care of building and maintaining the connection to the database via SQLAlchemy.
-
class
pybel.manager.
BaseManager
(engine, session)[source]¶ A wrapper around a SQLAlchemy engine and session.
Instantiate a manager from an engine and session.
-
base
¶ alias of
sqlalchemy.ext.declarative.api.Base
-
create_all
(checkfirst=True)[source]¶ Create the PyBEL cache’s database and tables.
- Parameters
checkfirst (bool) – Check if the database exists before trying to re-make it
- Return type
None
-
The Manager collates multiple groups of functions for interacting with the database. For sake of code clarity, they are separated across multiple classes that are documented below.
-
class
pybel.manager.
Manager
(connection=None, engine=None, session=None, **kwargs)[source]¶ Bases:
pybel.manager.cache_manager._Manager
A manager for the PyBEL database.
Create a connection to database and a persistent session using SQLAlchemy.
A custom default can be set as an environment variable with the name pybel.constants.PYBEL_CONNECTION, using an RFC-1738 string. For example, a MySQL string can be given with the following form:
mysql+pymysql://<username>:<password>@<host>/<dbname>?charset=utf8[&<options>]
A SQLite connection string can be given in the form:
sqlite:///~/Desktop/cache.db
Further options and examples can be found on the SQLAlchemy documentation on engine configuration.
- Parameters
connection (Optional[str]) – An RFC-1738 database connection string. If None, tries to load from the environment variable PYBEL_CONNECTION then from the config file ~/.config/pybel/config.json whose value for PYBEL_CONNECTION defaults to pybel.constants.DEFAULT_CACHE_LOCATION.
engine – Optional engine to use. Must be specified with a session and no connection.
session – Optional session to use. Must be specified with an engine and no connection.
echo (bool) – Turn on echoing sql
autoflush (Optional[bool]) – Defaults to True if not specified in kwargs or configuration.
autocommit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.
expire_on_commit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.
scopefunc – Scoped function to pass to sqlalchemy.orm.scoped_session()
From the Flask-SQLAlchemy documentation:
An extra key 'scopefunc' can be set on the options dict to specify a custom scope function. If it’s not provided, Flask’s app context stack identity is used. This will ensure that sessions are created and removed with the request/response cycle, and should be fine in most cases.
Allowed Usages:
Instantiation with connection string as positional argument
>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(my_connection)
Instantiation with connection string as positional argument with keyword arguments
>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(my_connection, echo=True)
Instantiation with connection string as keyword argument
>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(connection=my_connection)
Instantiation with connection string as keyword argument with keyword arguments
>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(connection=my_connection, echo=True)
Instantiation with user-supplied engine and session objects as keyword arguments
>>> my_engine, my_session = ...  # magical creation! See SQLAlchemy documentation
>>> manager = Manager(engine=my_engine, session=my_session)
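The connection string forms described for Manager can be assembled programmatically. The helpers below are hypothetical and use only the standard library. They percent-encode credentials, since characters such as @ or / in a password would otherwise be misread as URL delimiters, and they expand a leading ~ explicitly on the assumption that the database driver will not do so itself.

```python
import os
from urllib.parse import quote_plus


def mysql_connection(user, password, host, database, charset="utf8"):
    """Build a mysql+pymysql URL, percent-encoding the credentials."""
    return "mysql+pymysql://{}:{}@{}/{}?charset={}".format(
        quote_plus(user), quote_plus(password), host, database, charset
    )


def sqlite_connection(path):
    """Build a SQLite URL from a file path, expanding a leading ~ first."""
    return "sqlite:///" + os.path.abspath(os.path.expanduser(path))


# Placeholder credentials and paths, not real settings.
mysql_url = mysql_connection("user", "p@ss/word", "localhost", "pybel")
sqlite_url = sqlite_connection("~/Desktop/cache.db")
```

Either string could then be passed as the `connection` argument or exported as PYBEL_CONNECTION.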
Manager Components¶
-
class
pybel.manager.
NetworkManager
(engine, session)[source]¶ Groups functions for inserting and querying networks in the database’s network store.
Instantiate a manager from an engine and session.
-
has_name_version
(name, version)[source]¶ Check if there exists a network with the name/version combination in the database.
- Return type
-
drop_network
(network)[source]¶ Drop a network, while also cleaning up any edges that are no longer part of any network.
- Return type
None
-
query_singleton_edges_from_network
(network)[source]¶ Return a query selecting all edge ids that only belong to the given network.
- Return type
-
get_network_by_name_version
(name, version)[source]¶ Load the network with the given name and version if it exists.
-
get_graph_by_name_version
(name, version)[source]¶ Load the BEL graph with the given name, or allows for specification of version.
- Return type
Optional
[BELGraph
]
-
get_networks_by_name
(name)[source]¶ Get all networks with the given name. Useful for getting all versions of a given network.
-
get_most_recent_network_by_name
(name)[source]¶ Get the most recently created network with the given name.
-
get_graph_by_most_recent
(name)[source]¶ Get the most recently created network with the given name as a pybel.BELGraph.
- Return type
Optional
[BELGraph
]
-
get_network_by_id
(network_id)[source]¶ Get a network from the database by its identifier.
- Return type
-
get_graph_by_id
(network_id)[source]¶ Get a network from the database by its identifier and convert it to a BEL graph.
- Return type
BELGraph
-
get_networks_by_ids
(network_ids)[source]¶ Get a list of networks with the given identifiers.
Note: order is not necessarily preserved.
-
-
class
pybel.manager.
QueryManager
(engine, session)[source]¶ An extension to the Manager to make queries over the database.
Instantiate a manager from an engine and session.
-
query_nodes
(bel=None, type=None, namespace=None, name=None)[source]¶ Query nodes in the database.
- Parameters
- Return type
-
get_edges_with_annotation
(annotation, value)[source]¶ Search edges with the given annotation/value pair.
-
query_edges
(bel=None, source_function=None, source=None, target_function=None, target=None, relation=None)[source]¶ Return a query over the edges in the database.
Usually this means that you should call list() or .all() on this result.
- Parameters
bel (Optional[str]) – BEL statement that represents the desired edge.
source_function (Optional[str]) – Filter source nodes with the given BEL function
source (Union[None, str, Node]) – BEL term of source node e.g. p(HGNC:APP) or Node object.
target_function (Optional[str]) – Filter target nodes with the given BEL function
target (Union[None, str, Node]) – BEL term of target node e.g. p(HGNC:APP) or Node object.
relation (Optional[str]) – The relation that should be present between source and target node.
-
query_citations
(db=None, db_id=None, name=None, author=None, date=None, evidence_text=None)[source]¶ Query citations in the database.
-
query_edges_by_pubmed_identifiers
(pubmed_identifiers)[source]¶ Get all edges annotated to the documents identified by the given PubMed identifiers.
-
Models¶
This module contains the SQLAlchemy database models that support the definition cache and graph cache.
-
class
pybel.manager.models.
Base
(**kwargs)¶ The most base type
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
class
pybel.manager.models.
Namespace
(**kwargs)[source]¶ Represents a BEL Namespace.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
uploaded
¶ The date of upload
-
keyword
¶ Keyword that is used in a BEL file to identify a specific namespace
-
pattern
¶ Contains regex pattern for value identification.
-
miriam_id
¶ MIRIAM resource identifier matching the regular expression
^MIR:001\d{5}$
-
version
¶ Version of the namespace
-
url
¶ BELNS Resource location as URL
-
name
¶ Name of the given namespace
-
domain
¶ Domain for which this namespace is valid
-
species
¶ Taxonomy identifiers for which this namespace is valid
-
description
¶ Optional short description of the namespace
-
created
¶ DateTime of the creation of the namespace definition file
-
query_url
¶ URL that can be used to query the namespace (externally from PyBEL)
The author of the namespace
-
license
¶ License information
-
contact
¶ Contact information
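The regular expression given for miriam_id can be checked directly with the standard library. This is a small sketch; the helper name is hypothetical and the sample identifiers are made up to exercise the pattern rather than taken from the MIRIAM registry.

```python
import re

# Pattern quoted in the miriam_id description: "MIR:001" followed by five digits.
MIRIAM_ID_PATTERN = re.compile(r"^MIR:001\d{5}$")


def is_valid_miriam_id(value):
    """Return True if the value has the documented MIRIAM identifier shape."""
    return MIRIAM_ID_PATTERN.match(value) is not None


checks = {
    "MIR:00100005": is_valid_miriam_id("MIR:00100005"),  # correct prefix and length
    "MIR:00000005": is_valid_miriam_id("MIR:00000005"),  # wrong prefix digits
    "MIR:0010005": is_valid_miriam_id("MIR:0010005"),    # one digit short
}
```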
-
-
class
pybel.manager.models.
NamespaceEntry
(**kwargs)[source]¶ Represents a name within a BEL namespace.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
name
¶ Name that is defined in the corresponding namespace definition file
-
identifier
¶ The database accession number
-
encoding
¶ The biological entity types for which this name is valid
-
-
class
pybel.manager.models.
Network
(**kwargs)[source]¶ Represents a collection of edges, specified by a BEL Script.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
name
¶ Name of the given Network (from the BEL file)
-
version
¶ Release version of the given Network (from the BEL file)
Authors of the underlying BEL file
-
contact
¶ Contact email from the underlying BEL file
-
description
¶ Descriptive text from the underlying BEL file
-
copyright
¶ Copyright information
-
disclaimer
¶ Disclaimer information
-
licenses
¶ License information
-
blob
¶ A pickled version of this network
-
classmethod
name_contains
(name_query)[source]¶ Build a filter for networks whose names contain the query.
-
classmethod
description_contains
(description_query)[source]¶ Build a filter for networks whose descriptions contain the query.
-
classmethod
id_in
(network_ids)[source]¶ Build a filter for networks whose identifiers appear in the given sequence.
-
store_bel
(graph)[source]¶ Insert a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL Graph
-
-
class
pybel.manager.models.
Modification
(**kwargs)[source]¶ The modifications that are present in the network are stored in this table.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
type
¶ Type of the stored modification e.g. Fusion, gmod, pmod, etc
-
variantString
¶ HGVS string if sequence modification
-
residue
¶ Three letter amino acid code if PMOD
-
position
¶ Position of PMOD or GMOD
-
-
class
pybel.manager.models.
Node
(**kwargs)[source]¶ Represents a BEL Term.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
type
¶ The type of the represented biological entity e.g. Protein or Gene
-
is_variant
¶ Identifies whether or not the given node is a variant
-
has_fusion
¶ Identifies whether or not the given node is a fusion
-
bel
¶ Canonical BEL term that represents the given node
-
data
¶ PyBEL BaseEntity as JSON
-
-
class
pybel.manager.models.
Author
(**kwargs)[source]¶ Contains all author names.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
class
pybel.manager.models.
Citation
(**kwargs)[source]¶ The information about the citations that are used to support a specific relation is stored in this table.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
db
¶ Type of the stored publication e.g. PubMed
-
db_id
¶ Reference identifier of the publication e.g. PubMed_ID
-
title
¶ Title of the publication
-
journal
¶ Journal name
-
volume
¶ Volume of the journal
-
issue
¶ Issue within the volume
-
pages
¶ Pages of the publication
-
date
¶ Publication date
-
first_id
¶ First author
-
last_id
¶ Last author
-
property
is_enriched
¶ Return if this citation has been enriched for name, title, and other metadata.
- Return type
-
-
class
pybel.manager.models.
Evidence
(**kwargs)[source]¶ This table contains the evidence text that supports a specific relationship and refers to the source that is cited.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
text
¶ Supporting text from a given publication
-
-
class
pybel.manager.models.
Property
(**kwargs)[source]¶ The property table contains additional information that is used to describe the context of a relation.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
is_subject
¶ Identifies which participant of the edge is affected by the given property
-
modifier
¶ The modifier: one of activity, degradation, location, or translocation
-
relative_key
¶ Relative key of effect e.g. to_tloc or from_tloc
-
property
side
¶ Return either pybel.constants.SUBJECT or pybel.constants.OBJECT.
- Return type
-
-
class
pybel.manager.models.
Edge
(**kwargs)[source]¶ Relationships between BEL nodes and their properties, annotations, and provenance.
A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in kwargs.
Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
bel
¶ Valid BEL statement that represents the given edge
-
md5
¶ The hash of the source, target, and associated metadata
-
data
¶ The stringified JSON representing this edge
-
to_json
(include_id=False)[source]¶ Create a dictionary of one BEL Edge that can be used to create an edge in a BELGraph.
-
insert_into_graph
(graph)[source]¶ Insert this edge into a BEL graph.
- Parameters
graph (pybel.BELGraph) – A BEL graph
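The md5 attribute is described as a hash of the source, target, and associated metadata. One way such a hash could be computed is sketched below. This is not PyBEL's actual hashing routine, which this section does not show; the function name and the canonical-JSON approach are assumptions chosen for illustration.

```python
import hashlib
import json


def hash_edge(source, target, data):
    """Hash an edge by serializing its parts to JSON with sorted keys."""
    # Sorting keys makes the digest independent of dictionary ordering.
    payload = json.dumps([source, target, data], sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


first = hash_edge("p(HGNC:APP)", "p(HGNC:BACE1)", {"relation": "increases", "evidence": "text"})
reordered = hash_edge("p(HGNC:APP)", "p(HGNC:BACE1)", {"evidence": "text", "relation": "increases"})
different = hash_edge("p(HGNC:APP)", "p(HGNC:BACE1)", {"relation": "decreases", "evidence": "text"})
```

A content hash of this kind lets a store recognize that two networks share an identical edge and keep a single row for it.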
-
Cookbook¶
An extensive set of examples can be found on the PyBEL Notebooks repository on GitHub. These notebooks contain basic usage and also make numerous references to the analytical package PyBEL Tools.
Configuration¶
The default connection string can be set as an environment variable in your ~/.bashrc
. If you’re using MySQL or
MariaDB, it could look like this:
$ export PYBEL_CONNECTION="mysql+pymysql://user:password@server_name/database_name?charset=utf8"
Prepare a Cytoscape Network¶
Load, compile, and export to Cytoscape format:
$ pybel convert --path ~/Desktop/example.bel --graphml ~/Desktop/example.graphml
In Cytoscape, open with Import > Network > From File
.
Command Line Interface¶
Note
The command line wrapper might not work on Windows. Use python3 -m pybel
if it has issues.
PyBEL automatically installs the command pybel
. This command can be used to easily compile BEL documents
and convert to other formats. See pybel --help
for usage details. This command makes logs of all conversions
and warnings to the directory ~/.pybel/
.
pybel¶
PyBEL CLI on /home/docs/checkouts/readthedocs.org/user_builds/pybel/envs/v0.14.4/bin/python
pybel [OPTIONS] COMMAND [ARGS]...
Options
-
--version
¶
Show the version and exit.
-
-c
,
--connection
<connection>
¶ Database connection string. [default: sqlite:////home/docs/.pybel/pybel_0.14.0_cache.db]
compile¶
Compile a BEL script to a graph.
pybel compile [OPTIONS] PATH
Options
-
--allow-naked-names
¶
Enable lenient parsing for naked names
-
--disallow-nested
¶
Disable lenient parsing for nested statements
-
--disallow-unqualified-translocations
¶
Disallow unqualified translocations
-
--no-identifier-validation
¶
Turn off identifier validation
-
--no-citation-clearing
¶
Turn off citation clearing
-
-r
,
--required-annotations
<required_annotations>
¶ Specify multiple required annotations
-
--upgrade-urls
¶
-
--skip-tqdm
¶
-
-v
,
--verbose
¶
Arguments
-
PATH
¶
Required argument
insert¶
Insert a graph to the database.
pybel insert [OPTIONS] path
Arguments
-
path
¶
Required argument
machine¶
Get content from the INDRA machine and upload to BEL Commons.
pybel machine [OPTIONS] [AGENTS]...
Options
-
--local
¶
Upload to local database.
-
--host
<host>
¶ URL of BEL Commons. Defaults to https://bel-commons.scai.fraunhofer.de
Arguments
-
AGENTS
¶
Optional argument(s)
manage¶
Manage the database.
pybel manage [OPTIONS] COMMAND [ARGS]...
drop¶
Drop the database.
pybel manage drop [OPTIONS]
Options
-
--yes
¶
Confirm the action without prompting.
namespaces¶
Manage namespaces.
pybel manage namespaces [OPTIONS] COMMAND [ARGS]...
drop¶
Drop a namespace by URL.
pybel manage namespaces drop [OPTIONS] URL
Arguments
-
URL
¶
Required argument
networks¶
Manage networks.
pybel manage networks [OPTIONS] COMMAND [ARGS]...
neo¶
Upload to neo4j.
pybel neo [OPTIONS] path
Options
-
--connection
<connection>
¶ Connection string for neo4j upload.
-
--password
<password>
¶
Arguments
-
path
¶
Required argument
post¶
Upload a graph to BEL Commons.
pybel post [OPTIONS] path
Options
-
--host
<host>
¶ URL of BEL Commons. Defaults to https://bel-commons.scai.fraunhofer.de
Arguments
-
path
¶
Required argument
serialize¶
Serialize a graph to various formats.
pybel serialize [OPTIONS] path
Options
-
--tsv
<tsv>
¶ Path to output a TSV file.
-
--edgelist
<edgelist>
¶ Path to output an edgelist file.
-
--sif
<sif>
¶ Path to output an SIF file.
-
--gsea
<gsea>
¶ Path to output a GRP file for gene set enrichment analysis.
-
--graphml
<graphml>
¶ Path to output a GraphML file. Use .graphml for Cytoscape.
-
--nodelink
<nodelink>
¶ Path to output a node-link JSON file.
-
--bel
<bel>
¶ Output canonical BEL.
Arguments
-
path
¶
Required argument
Plugins¶
PyBEL’s command line interface uses click-plugins to load extensions.
Constants¶
Constants for PyBEL.
This module maintains the strings used throughout the PyBEL codebase to promote consistency.
-
pybel.constants.
get_cache_connection
()[source]¶ Get the preferred RFC-1738 database connection string.
Check the environment variable PYBEL_CONNECTION
Check the PYBEL_CONNECTION key in the config file ~/.config/pybel/config.json. Optionally, this config file might be in a different place if the environment variable PYBEL_CONFIG_DIRECTORY has been set.
Return a default connection string using a SQLite database in the ~/.pybel directory. Optionally, this directory might be in a different place if the environment variable PYBEL_RESOURCE_DIRECTORY has been set.
- Return type
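The lookup order described for get_cache_connection can be sketched as a small function. This is an illustration of the precedence only, not the library's implementation: the function name, its parameters, and the placeholder values are hypothetical, and the handling of PYBEL_CONFIG_DIRECTORY and PYBEL_RESOURCE_DIRECTORY is left to the caller.

```python
import json
import os


def resolve_connection(environ, config_path, default):
    """Return the first connection string found: environment, config, default."""
    # 1. The environment variable takes precedence.
    connection = environ.get("PYBEL_CONNECTION")
    if connection:
        return connection
    # 2. Fall back to the PYBEL_CONNECTION key of the JSON config file, if any.
    if os.path.exists(config_path):
        with open(config_path) as file:
            connection = json.load(file).get("PYBEL_CONNECTION")
        if connection:
            return connection
    # 3. Otherwise use the default (a SQLite database in the resource directory).
    return default


# Placeholder values: no such config file is expected to exist.
missing_config = os.path.join("nonexistent-directory", "config.json")
from_env = resolve_connection({"PYBEL_CONNECTION": "sqlite://"}, missing_config, "default-url")
from_default = resolve_connection({}, missing_config, "default-url")
```

Passing `os.environ` as the first argument applies the same precedence to the real environment.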
-
pybel.constants.
BEL_DEFAULT_NAMESPACE
= 'bel'¶ The default namespace given to entities in the BEL language
-
pybel.constants.
CITATION_TYPES
= {'Book': None, 'DOI': 'doi', 'Journal': None, 'Online Resource': None, 'Other': None, 'PubMed': 'pmid', 'PubMed Central': 'pmc', 'URL': None}¶ The valid citation types. See also: https://wiki.openbel.org/display/BELNA/Citation
-
pybel.constants.
NAMESPACE_DOMAIN_TYPES
= {'BiologicalProcess', 'Chemical', 'Gene and Gene Products', 'Other'}¶ The valid namespace types. See also: https://wiki.openbel.org/display/BELNA/Custom+Namespaces
-
pybel.constants.
CITATION_DB
= 'db'¶ Represents the key for the citation type in a citation dictionary
-
pybel.constants.
CITATION_IDENTIFIER
= 'db_id'¶ Represents the key for the citation reference in a citation dictionary
-
pybel.constants.
CITATION_DB_NAME
= 'db_name'¶ Represents the key for the optional PyBEL citation title entry in a citation dictionary
-
pybel.constants.
CITATION_DATE
= 'date'¶ Represents the key for the citation date in a citation dictionary
-
pybel.constants.
CITATION_AUTHORS
= 'authors'¶ Represents the key for the citation authors in a citation dictionary
-
pybel.constants.
CITATION_JOURNAL
= 'db_name'¶ Represents the key for the citation comment in a citation dictionary
-
pybel.constants.
CITATION_VOLUME
= 'volume'¶ Represents the key for the optional PyBEL citation volume entry in a citation dictionary
-
pybel.constants.
CITATION_ISSUE
= 'issue'¶ Represents the key for the optional PyBEL citation issue entry in a citation dictionary
-
pybel.constants.
CITATION_PAGES
= 'pages'¶ Represents the key for the optional PyBEL citation pages entry in a citation dictionary
-
pybel.constants.
CITATION_FIRST_AUTHOR
= 'first'¶ Represents the key for the optional PyBEL citation first author entry in a citation dictionary
-
pybel.constants.
CITATION_LAST_AUTHOR
= 'last'¶ Represents the key for the optional PyBEL citation last author entry in a citation dictionary
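To show how these keys fit together, the sketch below builds a citation dictionary from locally defined copies of the documented constants, so that it runs without PyBEL installed. The `make_citation` helper is hypothetical and the PubMed identifier is a placeholder, not a real reference.

```python
# Local copies of the documented values from pybel.constants.
CITATION_DB = "db"
CITATION_IDENTIFIER = "db_id"
CITATION_DATE = "date"
CITATION_AUTHORS = "authors"
CITATION_TYPES = {
    "Book": None, "DOI": "doi", "Journal": None, "Online Resource": None,
    "Other": None, "PubMed": "pmid", "PubMed Central": "pmc", "URL": None,
}


def make_citation(db, db_id, date=None, authors=None):
    """Build a citation dictionary, rejecting unknown citation types."""
    if db not in CITATION_TYPES:
        raise ValueError("invalid citation type: {}".format(db))
    citation = {CITATION_DB: db, CITATION_IDENTIFIER: db_id}
    # Optional entries are only added when a value is supplied.
    if date is not None:
        citation[CITATION_DATE] = date
    if authors is not None:
        citation[CITATION_AUTHORS] = authors
    return citation


minimal = make_citation("PubMed", "12345678")  # placeholder identifier
dated = make_citation("PubMed", "12345678", date="2019-01-01")
```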
-
pybel.constants.
FUNCTION
= 'function'¶ The node data key specifying the node’s function (e.g. GENE, MIRNA, BIOPROCESS, etc.)
-
pybel.constants.
CONCEPT
= 'concept'¶ The key specifying a concept
-
pybel.constants.
NAMESPACE
= 'namespace'¶ The key specifying an identifier dictionary’s namespace. Used for nodes, activities, and transformations.
-
pybel.constants.
NAME
= 'name'¶ The key specifying an identifier dictionary’s name. Used for nodes, activities, and transformations.
-
pybel.constants.
IDENTIFIER
= 'identifier'¶ The key specifying an identifier dictionary
-
pybel.constants.
LABEL
= 'label'¶ The key specifying an optional label for the node
-
pybel.constants.
DESCRIPTION
= 'description'¶ The key specifying an optional description for the node
-
pybel.constants.
XREFS
= 'xref'¶ The key specifying xrefs
-
pybel.constants.
MEMBERS
= 'members'¶ The key representing the nodes that are a member of a composite or complex
-
pybel.constants.
REACTANTS
= 'reactants'¶ The key representing the nodes appearing in the reactant side of a biochemical reaction
-
pybel.constants.
PRODUCTS
= 'products'¶ The key representing the nodes appearing in the product side of a biochemical reaction
-
pybel.constants.
PARTNER_3P
= 'partner_3p'¶ The key specifying the identifier dictionary of the fusion’s 3-Prime partner
-
pybel.constants.
PARTNER_5P
= 'partner_5p'¶ The key specifying the identifier dictionary of the fusion’s 5-Prime partner
-
pybel.constants.
RANGE_3P
= 'range_3p'¶ The key specifying the range dictionary of the fusion’s 3-Prime partner
-
pybel.constants.
RANGE_5P
= 'range_5p'¶ The key specifying the range dictionary of the fusion’s 5-Prime partner
-
pybel.constants.
VARIANTS
= 'variants'¶ The key specifying the node has a list of associated variants
-
pybel.constants.
KIND
= 'kind'¶ The key representing what kind of variation is being represented
-
pybel.constants.
PYBEL_NODE_DATA_KEYS
= {'function', 'fusion', 'identifier', 'members', 'name', 'namespace', 'products', 'reactants', 'variants'}¶ The group of all BEL-provided keys for node data dictionaries, used for hashing.
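To illustrate what "used for hashing" means in practice, the sketch below keeps only the BEL-provided keys of a node data dictionary and drops everything else (such as a label). The flat dictionary layout and the helper name `bel_node_data` are assumptions made for illustration; PyBEL's own node classes may structure this data differently.

```python
# Copied from the PYBEL_NODE_DATA_KEYS entry documented above.
PYBEL_NODE_DATA_KEYS = {
    'function', 'fusion', 'identifier', 'members', 'name',
    'namespace', 'products', 'reactants', 'variants',
}


def bel_node_data(data):
    """Return only the BEL-provided entries of a node data dictionary.

    Hypothetical helper: it shows which keys would contribute to a hash,
    not how PyBEL actually computes one.
    """
    return {key: value for key, value in data.items() if key in PYBEL_NODE_DATA_KEYS}


# An assumed, flat node dictionary with one non-BEL key ('label').
node = {
    'function': 'Protein',
    'namespace': 'HGNC',
    'name': 'AKT1',
    'label': 'AKT serine/threonine kinase 1',
}
core = bel_node_data(node)
print(sorted(core))
```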
-
pybel.constants.
DIRTY
= 'dirty'¶ Used as a namespace when none is given when lenient parsing mode is turned on. Not recommended!
-
pybel.constants.
ABUNDANCE
= 'Abundance'¶ Represents the BEL abundance, abundance()
-
pybel.constants.
GENE
= 'Gene'¶ Represents the BEL abundance, geneAbundance(). See also: http://openbel.org/language/version_2.0/bel_specification_version_2.0.html#Xabundancea
-
pybel.constants.
RNA
= 'RNA'¶ Represents the BEL abundance, rnaAbundance()
-
pybel.constants.
MIRNA
= 'miRNA'¶ Represents the BEL abundance, microRNAAbundance()
-
pybel.constants.
PROTEIN
= 'Protein'¶ Represents the BEL abundance, proteinAbundance()
-
pybel.constants.
BIOPROCESS
= 'BiologicalProcess'¶ Represents the BEL function, biologicalProcess()
-
pybel.constants.
PATHOLOGY
= 'Pathology'¶ Represents the BEL function, pathology()
-
pybel.constants.
POPULATION
= 'Population'¶ Represents the BEL function, populationAbundance()
-
pybel.constants.
COMPOSITE
= 'Composite'¶ Represents the BEL abundance, compositeAbundance()
-
pybel.constants.
COMPLEX
= 'Complex'¶ Represents the BEL abundance, complexAbundance()
-
pybel.constants.
REACTION
= 'Reaction'¶ Represents the BEL transformation, reaction()
-
pybel.constants.
PYBEL_NODE_FUNCTIONS
= {'Abundance', 'BiologicalProcess', 'Complex', 'Composite', 'Gene', 'Pathology', 'Population', 'Protein', 'RNA', 'Reaction', 'miRNA'}¶ A set of all of the valid PyBEL node functions
-
pybel.constants.
rev_abundance_labels
= {'Abundance': 'a', 'BiologicalProcess': 'bp', 'Complex': 'complex', 'Composite': 'composite', 'Gene': 'g', 'Pathology': 'path', 'Population': 'pop', 'Protein': 'p', 'RNA': 'r', 'miRNA': 'm'}¶ The mapping from PyBEL node functions to BEL strings
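As a rough illustration of how this mapping can be used, the sketch below renders a simple named node as a short-form BEL term. The mapping is copied from the entry above; `short_bel_term` is a hypothetical helper that ignores quoting rules, variants, fusions, and list-based nodes such as complexes with members.

```python
# Copied from the rev_abundance_labels entry documented above.
rev_abundance_labels = {
    'Abundance': 'a', 'BiologicalProcess': 'bp', 'Complex': 'complex',
    'Composite': 'composite', 'Gene': 'g', 'Pathology': 'path',
    'Population': 'pop', 'Protein': 'p', 'RNA': 'r', 'miRNA': 'm',
}


def short_bel_term(function, namespace, name):
    """Render a simple named node as a short-form BEL term (illustrative only)."""
    return '{}({}:{})'.format(rev_abundance_labels[function], namespace, name)


protein_term = short_bel_term('Protein', 'HGNC', 'AKT1')
process_term = short_bel_term('BiologicalProcess', 'GO', 'apoptosis')
print(protein_term)
print(process_term)
```

Note that 'Reaction' has no entry in the mapping, so this helper raises a `KeyError` for reaction nodes.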
-
pybel.constants.
RELATION
= 'relation'¶ The key for an internal edge data dictionary for the relation string
-
pybel.constants.
CITATION
= 'citation'¶ The key for an internal edge data dictionary for the citation dictionary
-
pybel.constants.
EVIDENCE
= 'evidence'¶ The key for an internal edge data dictionary for the evidence string
-
pybel.constants.
ANNOTATIONS
= 'annotations'¶ The key for an internal edge data dictionary for the annotations dictionary
-
pybel.constants.
SUBJECT
= 'subject'¶ The key for an internal edge data dictionary for the subject modifier dictionary
-
pybel.constants.
OBJECT
= 'object'¶ The key for an internal edge data dictionary for the object modifier dictionary
-
pybel.constants.
LINE
= 'line'¶ The key for an internal edge data dictionary for the line number
-
pybel.constants.
HASH
= 'hash'¶ The key representing the hash of the other edge data
-
pybel.constants.
PYBEL_EDGE_DATA_KEYS
= {'annotations', 'citation', 'evidence', 'object', 'relation', 'subject'}¶ The group of all BEL-provided keys for edge data dictionaries, used for hashing.
-
pybel.constants.
PYBEL_EDGE_METADATA_KEYS
= {'hash', 'line'}¶ The group of all PyBEL-specific keys for edge data dictionaries, not used for hashing.
-
pybel.constants.
PYBEL_EDGE_ALL_KEYS
= {'annotations', 'citation', 'evidence', 'hash', 'line', 'object', 'relation', 'subject'}¶ The group of all PyBEL annotated keys for edge data dictionaries
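The three key groups are related by a simple set union, which the sketch below checks before splitting an example edge dictionary into its BEL-provided part and its PyBEL-specific metadata. The sets are copied from the entries above; `split_edge_data` and the example edge contents are hypothetical.

```python
# Copied from the entries documented above.
PYBEL_EDGE_DATA_KEYS = {'annotations', 'citation', 'evidence', 'object', 'relation', 'subject'}
PYBEL_EDGE_METADATA_KEYS = {'hash', 'line'}
PYBEL_EDGE_ALL_KEYS = PYBEL_EDGE_DATA_KEYS | PYBEL_EDGE_METADATA_KEYS


def split_edge_data(data):
    """Split an edge data dictionary into (BEL data, PyBEL metadata).

    Hypothetical helper for illustration; not part of PyBEL.
    """
    bel = {k: v for k, v in data.items() if k in PYBEL_EDGE_DATA_KEYS}
    meta = {k: v for k, v in data.items() if k in PYBEL_EDGE_METADATA_KEYS}
    return bel, meta


# Placeholder edge contents.
edge = {
    'relation': 'increases',
    'evidence': 'Example evidence text',
    'citation': {'db': 'PubMed', 'db_id': '12345678'},
    'line': 42,
    'hash': 'abc123',
}
bel_part, meta_part = split_edge_data(edge)
print(sorted(bel_part), sorted(meta_part))
```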
-
pybel.constants.
HAS_REACTANT
= 'hasReactant'¶ A BEL relationship
-
pybel.constants.
HAS_PRODUCT
= 'hasProduct'¶ A BEL relationship
-
pybel.constants.
HAS_VARIANT
= 'hasVariant'¶ A BEL relationship
-
pybel.constants.
TRANSCRIBED_TO
= 'transcribedTo'¶
-
pybel.constants.
TRANSLATED_TO
= 'translatedTo'¶
-
pybel.constants.
INCREASES
= 'increases'¶ A BEL relationship
-
pybel.constants.
DIRECTLY_INCREASES
= 'directlyIncreases'¶ A BEL relationship
-
pybel.constants.
DECREASES
= 'decreases'¶ A BEL relationship
-
pybel.constants.
DIRECTLY_DECREASES
= 'directlyDecreases'¶ A BEL relationship
-
pybel.constants.
CAUSES_NO_CHANGE
= 'causesNoChange'¶ A BEL relationship
-
pybel.constants.
REGULATES
= 'regulates'¶ A BEL relationship
-
pybel.constants.
BINDS
= 'binds'¶ A BEL relationship
-
pybel.constants.
CORRELATION
= 'correlation'¶ A BEL relationship
-
pybel.constants.
NO_CORRELATION
= 'noCorrelation'¶ A BEL relationship
-
pybel.constants.
NEGATIVE_CORRELATION
= 'negativeCorrelation'¶ A BEL relationship
-
pybel.constants.
POSITIVE_CORRELATION
= 'positiveCorrelation'¶ A BEL relationship
-
pybel.constants.
ASSOCIATION
= 'association'¶ A BEL relationship
-
pybel.constants.
ORTHOLOGOUS
= 'orthologous'¶ A BEL relationship
-
pybel.constants.
ANALOGOUS_TO
= 'analogousTo'¶ A BEL relationship
-
pybel.constants.
IS_A
= 'isA'¶ A BEL relationship
-
pybel.constants.
RATE_LIMITING_STEP_OF
= 'rateLimitingStepOf'¶ A BEL relationship
-
pybel.constants.
SUBPROCESS_OF
= 'subProcessOf'¶ A BEL relationship
-
pybel.constants.
BIOMARKER_FOR
= 'biomarkerFor'¶ A BEL relationship
-
pybel.constants.
PROGONSTIC_BIOMARKER_FOR
= 'prognosticBiomarkerFor'¶ A BEL relationship
-
pybel.constants.
EQUIVALENT_TO
= 'equivalentTo'¶ A BEL relationship, added by PyBEL
-
pybel.constants.
PART_OF
= 'partOf'¶ A BEL relationship, added by PyBEL
-
pybel.constants.
CAUSAL_INCREASE_RELATIONS
= {'directlyIncreases', 'increases'}¶ A set of all causal relationships that have an increasing effect
-
pybel.constants.
CAUSAL_DECREASE_RELATIONS
= {'decreases', 'directlyDecreases'}¶ A set of all causal relationships that have a decreasing effect
-
pybel.constants.
DIRECT_CAUSAL_RELATIONS
= {'directlyDecreases', 'directlyIncreases'}¶ A set of direct causal relations
-
pybel.constants.
INDIRECT_CAUSAL_RELATIONS
= {'decreases', 'increases', 'regulates'}¶ A set of indirect causal relations
-
pybel.constants.
CAUSAL_POLAR_RELATIONS
= {'decreases', 'directlyDecreases', 'directlyIncreases', 'increases'}¶ A set of causal relationships that are polar
-
pybel.constants.
CAUSAL_RELATIONS
= {'decreases', 'directlyDecreases', 'directlyIncreases', 'increases', 'regulates'}¶ A set of all causal relationships
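The causal relation sets above compose from one another, as the following sketch shows. The set values are copied from the documented entries; `causal_sign` is a hypothetical helper that maps a relation to +1, -1, 0 (causal but not polar), or `None` (not causal).

```python
# Copied from the entries documented above.
CAUSAL_INCREASE_RELATIONS = {'directlyIncreases', 'increases'}
CAUSAL_DECREASE_RELATIONS = {'decreases', 'directlyDecreases'}
DIRECT_CAUSAL_RELATIONS = {'directlyDecreases', 'directlyIncreases'}
INDIRECT_CAUSAL_RELATIONS = {'decreases', 'increases', 'regulates'}

# The larger sets are unions of the smaller ones.
CAUSAL_POLAR_RELATIONS = CAUSAL_INCREASE_RELATIONS | CAUSAL_DECREASE_RELATIONS
CAUSAL_RELATIONS = DIRECT_CAUSAL_RELATIONS | INDIRECT_CAUSAL_RELATIONS


def causal_sign(relation):
    """Return +1, -1, 0, or None for a relation string (illustrative helper)."""
    if relation in CAUSAL_INCREASE_RELATIONS:
        return 1
    if relation in CAUSAL_DECREASE_RELATIONS:
        return -1
    if relation in CAUSAL_RELATIONS:
        return 0  # causal, but without a direction of effect (regulates)
    return None


for rel in ('directlyIncreases', 'decreases', 'regulates', 'association'):
    print(rel, causal_sign(rel))
```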
-
pybel.constants.
CORRELATIVE_RELATIONS
= {'correlation', 'negativeCorrelation', 'noCorrelation', 'positiveCorrelation'}¶ A set of all correlative relationships
-
pybel.constants.
POLAR_RELATIONS
= {'decreases', 'directlyDecreases', 'directlyIncreases', 'increases', 'negativeCorrelation', 'positiveCorrelation'}¶ A set of polar relations
-
pybel.constants.
TWO_WAY_RELATIONS
= {'analogousTo', 'association', 'binds', 'correlation', 'equivalentTo', 'negativeCorrelation', 'noCorrelation', 'orthologous', 'positiveCorrelation'}¶ A set of all relationships that are inherently directionless, and are therefore added to the graph twice
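The phrase "added to the graph twice" can be pictured with a plain edge list. This sketch uses a Python list of `(source, target, relation)` tuples instead of a real `pybel.BELGraph`, and `add_relation` is a hypothetical helper; it is meant only to show the effect of a relation being in this set.

```python
# Copied from the TWO_WAY_RELATIONS entry documented above.
TWO_WAY_RELATIONS = {
    'analogousTo', 'association', 'binds', 'correlation', 'equivalentTo',
    'negativeCorrelation', 'noCorrelation', 'orthologous', 'positiveCorrelation',
}


def add_relation(edges, source, target, relation):
    """Append an edge, plus its reverse if the relation is directionless."""
    edges.append((source, target, relation))
    if relation in TWO_WAY_RELATIONS:
        edges.append((target, source, relation))


edges = []
add_relation(edges, 'p(HGNC:A)', 'p(HGNC:B)', 'increases')    # one edge
add_relation(edges, 'p(HGNC:A)', 'p(HGNC:B)', 'association')  # two edges
print(edges)
```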
-
pybel.constants.
UNQUALIFIED_EDGES
= {'equivalentTo', 'hasProduct', 'hasReactant', 'hasVariant', 'isA', 'orthologous', 'partOf', 'transcribedTo', 'translatedTo'}¶ A set of relationship types that don’t require annotations or evidence
-
pybel.constants.
GRAPH_METADATA
= 'document_metadata'¶ The key for the document metadata dictionary. Can be accessed by
graph.graph[GRAPH_METADATA]
, or by using the property built into the pybel.BELGraph class,
pybel.BELGraph.document()
-
pybel.constants.
METADATA_NAME
= 'name'¶ The key for the document name. Can be accessed by
graph.document[METADATA_NAME]
or by using the property built into the pybel.BELGraph class,
pybel.BELGraph.name()
-
pybel.constants.
METADATA_VERSION
= 'version'¶ The key for the document version. Can be accessed by
graph.document[METADATA_VERSION]
-
pybel.constants.
METADATA_DESCRIPTION
= 'description'¶ The key for the document description. Can be accessed by
graph.document[METADATA_DESCRIPTION]
-
pybel.constants.
METADATA_AUTHORS
= 'authors'¶ The key for the document authors. Can be accessed by
graph.document[METADATA_AUTHORS]
-
pybel.constants.
METADATA_CONTACT
= 'contact'¶ The key for the document contact email. Can be accessed by
graph.document[METADATA_CONTACT]
-
pybel.constants.
METADATA_LICENSES
= 'licenses'¶ The key for the document licenses. Can be accessed by
graph.document[METADATA_LICENSES]
-
pybel.constants.
METADATA_COPYRIGHT
= 'copyright'¶ The key for the document copyright information. Can be accessed by
graph.document[METADATA_COPYRIGHT]
-
pybel.constants.
METADATA_DISCLAIMER
= 'disclaimer'¶ The key for the document disclaimer. Can be accessed by
graph.document[METADATA_DISCLAIMER]
-
pybel.constants.
METADATA_PROJECT
= 'project'¶ The key for the document project. Can be accessed by
graph.document[METADATA_PROJECT]
-
pybel.constants.
DOCUMENT_KEYS
= {'Authors': 'authors', 'ContactInfo': 'contact', 'Copyright': 'copyright', 'Description': 'description', 'Disclaimer': 'disclaimer', 'Licenses': 'licenses', 'Name': 'name', 'Project': 'project', 'Version': 'version'}¶ Provides a mapping from BEL language keywords to internal PyBEL strings
-
pybel.constants.
METADATA_INSERT_KEYS
= {'authors', 'contact', 'copyright', 'description', 'disclaimer', 'licenses', 'name', 'version'}¶ The keys to use when inserting a graph to the cache
-
pybel.constants.
INVERSE_DOCUMENT_KEYS
= {'authors': 'Authors', 'contact': 'ContactInfo', 'copyright': 'Copyright', 'description': 'Description', 'disclaimer': 'Disclaimer', 'licenses': 'Licenses', 'name': 'Name', 'project': 'Project', 'version': 'Version'}¶ Provides a mapping from internal PyBEL strings to BEL language keywords. Is the inverse of
DOCUMENT_KEYS
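The inverse mapping can be derived directly from DOCUMENT_KEYS, and it is what one would use to write metadata back out as BEL script. The sketch below copies the mapping from the entry above; `to_document_lines` is a hypothetical helper, and the `SET DOCUMENT <Keyword> = "<value>"` line format follows the BEL script convention for document headers.

```python
# Copied from the DOCUMENT_KEYS entry documented above.
DOCUMENT_KEYS = {
    'Authors': 'authors', 'ContactInfo': 'contact', 'Copyright': 'copyright',
    'Description': 'description', 'Disclaimer': 'disclaimer',
    'Licenses': 'licenses', 'Name': 'name', 'Project': 'project',
    'Version': 'version',
}

# Inverting the mapping yields the documented INVERSE_DOCUMENT_KEYS.
INVERSE_DOCUMENT_KEYS = {internal: keyword for keyword, internal in DOCUMENT_KEYS.items()}


def to_document_lines(metadata):
    """Serialize internal metadata keys back to SET DOCUMENT lines (illustrative)."""
    return [
        'SET DOCUMENT {} = "{}"'.format(INVERSE_DOCUMENT_KEYS[key], value)
        for key, value in sorted(metadata.items())
    ]


document_lines = to_document_lines({'name': 'Example Graph', 'version': '1.0.0'})
print('\n'.join(document_lines))
```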
-
pybel.constants.
REQUIRED_METADATA
= {'authors', 'contact', 'description', 'name', 'version'}¶ A set representing the required metadata during BEL document parsing
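A check against this set could be sketched as follows. The set is copied from the entry above; `missing_metadata` is a hypothetical helper, and treating empty values as missing is an assumption of this sketch rather than documented PyBEL behavior.

```python
# Copied from the REQUIRED_METADATA entry documented above.
REQUIRED_METADATA = {'authors', 'contact', 'description', 'name', 'version'}


def missing_metadata(document):
    """Return the sorted required keys that are absent or empty (illustrative)."""
    present = {key for key, value in document.items() if value}
    return sorted(REQUIRED_METADATA - present)


# Placeholder document metadata.
incomplete = {'name': 'Example Graph', 'version': '1.0.0', 'authors': ''}
print(missing_metadata(incomplete))
```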
-
pybel.constants.
FRAGMENT_START
= 'start'¶ The key for the starting position of a fragment range
-
pybel.constants.
FRAGMENT_STOP
= 'stop'¶ The key for the stopping position of a fragment range
-
pybel.constants.
FRAGMENT_MISSING
= 'missing'¶ The key signifying that there is neither a start nor stop position defined
-
pybel.constants.
FRAGMENT_DESCRIPTION
= 'description'¶ The key for any additional descriptive data about a fragment
-
pybel.constants.
GMOD_ORDER
= ['kind', 'identifier']¶ The order for serializing gene modification data
-
pybel.constants.
GSUB_REFERENCE
= 'reference'¶ The key for the reference nucleotide in a gene substitution. Only used during parsing since this is converted to HGVS.
-
pybel.constants.
GSUB_POSITION
= 'position'¶ The key for the position of a gene substitution. Only used during parsing since this is converted to HGVS
-
pybel.constants.
GSUB_VARIANT
= 'variant'¶ The key for the effect of a gene substitution. Only used during parsing since this is converted to HGVS
-
pybel.constants.
PMOD_CODE
= 'code'¶ The key for the protein modification code.
-
pybel.constants.
PMOD_POSITION
= 'pos'¶ The key for the protein modification position.
-
pybel.constants.
PMOD_ORDER
= ['kind', 'identifier', 'code', 'pos']¶ The order for serializing information about a protein modification
-
pybel.constants.
PSUB_REFERENCE
= 'reference'¶ The key for the reference amino acid in a protein substitution. Only used during parsing since this is converted to HGVS
-
pybel.constants.
PSUB_POSITION
= 'position'¶ The key for the position of a protein substitution. Only used during parsing since this is converted to HGVS.
-
pybel.constants.
PSUB_VARIANT
= 'variant'¶ The key for the variant of a protein substitution. Only used during parsing since this is converted to HGVS.
-
pybel.constants.
TRUNCATION_POSITION
= 'position'¶ The key for the position at which a protein is truncated
-
pybel.constants.
belns_encodings
= {'A': {'Abundance', 'Complex', 'Gene', 'Protein', 'RNA', 'miRNA'}, 'B': {'BiologicalProcess', 'Pathology'}, 'C': {'Complex'}, 'G': {'Gene'}, 'M': {'miRNA'}, 'O': {'Pathology'}, 'P': {'Protein'}, 'R': {'RNA', 'miRNA'}}¶ The mapping from BEL namespace codes to PyBEL internal abundance constants. See also: https://wiki.openbel.org/display/BELNA/Assignment+of+Encoding+%28Allowed+Functions%29+for+BEL+Namespaces
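A namespace entry's encoding is a string of these one-letter codes, so the set of functions it may be used with is the union over its letters. The sketch below copies the mapping from the entry above; `allowed_functions` is a hypothetical helper, and the encoding string `'GRP'` is just an example.

```python
# Copied from the belns_encodings entry documented above.
belns_encodings = {
    'A': {'Abundance', 'Complex', 'Gene', 'Protein', 'RNA', 'miRNA'},
    'B': {'BiologicalProcess', 'Pathology'},
    'C': {'Complex'},
    'G': {'Gene'},
    'M': {'miRNA'},
    'O': {'Pathology'},
    'P': {'Protein'},
    'R': {'RNA', 'miRNA'},
}


def allowed_functions(encoding):
    """Return the union of node functions permitted by an encoding string."""
    functions = set()
    for code in encoding:
        functions |= belns_encodings[code]
    return functions


grp_functions = allowed_functions('GRP')
print(sorted(grp_functions))
```

Because 'R' also covers 'miRNA', an entry encoded as 'GRP' is valid for four functions rather than three.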
-
pybel.constants.
DEFAULT_SERVICE_URL
= 'https://bel-commons.scai.fraunhofer.de'¶ The default location of PyBEL Web
Language constants for BEL.
This module contains mappings between PyBEL’s internal constants and BEL language keywords.
-
class
pybel.language.
Entity
(*, namespace, name=None, identifier=None)[source]¶ Represents a named entity with a namespace and name/identifier.
Create a dictionary representing a reference to an entity.
- Parameters
-
pybel.language.
activity_labels
= {'cat': 'cat', 'catalyticActivity': 'cat', 'chap': 'chap', 'chaperoneActivity': 'chap', 'gap': 'gap', 'gef': 'gef', 'gtp': 'gtp', 'gtpBoundActivity': 'gtp', 'gtpaseActivatingProteinActivity': 'gap', 'guanineNucleotideExchangeFactorActivity': 'gef', 'kin': 'kin', 'kinaseActivity': 'kin', 'molecularActivity': 'molecularActivity', 'pep': 'pep', 'peptidaseActivity': 'pep', 'phos': 'phos', 'phosphataseActivity': 'phos', 'ribo': 'ribo', 'ribosylationActivity': 'ribo', 'tport': 'tport', 'transcriptionalActivity': 'tscript', 'transportActivity': 'tport', 'tscript': 'tscript'}¶ A dictionary of activity labels used in the ma() function in activity(p(X), ma(Y))
-
pybel.language.
activity_mapping
= {'cat': {'identifier': 'GO:0003824', 'name': 'catalytic activity', 'namespace': 'GO'}, 'chap': {'identifier': 'GO:0044183', 'name': 'protein binding involved in protein folding', 'namespace': 'GO'}, 'gap': {'identifier': 'GO:0032794', 'name': 'GTPase activating protein binding', 'namespace': 'GO'}, 'gef': {'identifier': 'GO:0005085', 'name': 'guanyl-nucleotide exchange factor activity', 'namespace': 'GO'}, 'gtp': {'identifier': 'GO:0005525', 'name': 'GTP binding', 'namespace': 'GO'}, 'kin': {'identifier': 'GO:0016301', 'name': 'kinase activity', 'namespace': 'GO'}, 'molecularActivity': {'identifier': 'GO:0003674', 'name': 'molecular_function', 'namespace': 'GO'}, 'pep': {'identifier': 'GO:0008233', 'name': 'peptidase activity', 'namespace': 'GO'}, 'phos': {'identifier': 'GO:0016791', 'name': 'phosphatase activity', 'namespace': 'GO'}, 'ribo': {'identifier': 'GO:0003956', 'name': 'NAD(P)+-protein-arginine ADP-ribosyltransferase activity', 'namespace': 'GO'}, 'tport': {'identifier': 'GO:0005215', 'name': 'transporter activity', 'namespace': 'GO'}, 'tscript': {'identifier': 'GO:0001071', 'name': 'nucleic acid binding transcription factor activity', 'namespace': 'GO'}}¶ Maps the default BEL molecular activities to Gene Ontology Molecular Functions
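The two activity dictionaries chain together: a long or short activity label is first normalized, then looked up in the Gene Ontology mapping. The sketch below uses small excerpts of both dictionaries, copied from the entries above, and a hypothetical helper `activity_to_go`.

```python
# Excerpts copied from activity_labels and activity_mapping documented above.
activity_labels = {
    'cat': 'cat', 'catalyticActivity': 'cat',
    'kin': 'kin', 'kinaseActivity': 'kin',
}
activity_mapping = {
    'cat': {'identifier': 'GO:0003824', 'name': 'catalytic activity', 'namespace': 'GO'},
    'kin': {'identifier': 'GO:0016301', 'name': 'kinase activity', 'namespace': 'GO'},
}


def activity_to_go(label):
    """Resolve a BEL activity label to its Gene Ontology entry (illustrative)."""
    return activity_mapping[activity_labels[label]]


kinase = activity_to_go('kinaseActivity')
print(kinase['identifier'], kinase['name'])
```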
-
pybel.language.
compartment_mapping
= {'cell surface': {'identifier': 'GO:0009986', 'name': 'cell surface', 'namespace': 'GO'}, 'cytoplasm': {'identifier': 'GO:0005737', 'name': 'cytoplasm', 'namespace': 'GO'}, 'extracellular space': {'identifier': 'GO:0005615', 'name': 'extracellular space', 'namespace': 'GO'}, 'intracellular': {'identifier': 'GO:0005622', 'name': 'intracellular', 'namespace': 'GO'}, 'nucleus': {'identifier': 'GO:0005634', 'name': 'nucleus', 'namespace': 'GO'}}¶ Maps the default BEL cellular components to Gene Ontology Cellular Components
-
pybel.language.
abundance_labels
= {'a': 'Abundance', 'abundance': 'Abundance', 'biologicalProcess': 'BiologicalProcess', 'bp': 'BiologicalProcess', 'complex': 'Complex', 'complexAbundance': 'Complex', 'composite': 'Composite', 'compositeAbundance': 'Composite', 'g': 'Gene', 'geneAbundance': 'Gene', 'm': 'miRNA', 'microRNAAbundance': 'miRNA', 'p': 'Protein', 'path': 'Pathology', 'pathology': 'Pathology', 'proteinAbundance': 'Protein', 'r': 'RNA', 'rnaAbundance': 'RNA'}¶ Provides a mapping from BEL terms to PyBEL internal constants
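Both the long and short BEL function names resolve to the same internal constant, which the following sketch demonstrates on an excerpt of the mapping above. `normalize_function` is a hypothetical helper, and raising `ValueError` for unknown keywords is a choice made for this example.

```python
# Excerpt copied from the abundance_labels entry documented above.
abundance_labels = {
    'p': 'Protein', 'proteinAbundance': 'Protein',
    'g': 'Gene', 'geneAbundance': 'Gene',
    'bp': 'BiologicalProcess', 'biologicalProcess': 'BiologicalProcess',
}


def normalize_function(keyword):
    """Map a BEL function keyword to the internal constant (illustrative)."""
    try:
        return abundance_labels[keyword]
    except KeyError:
        raise ValueError('unknown BEL function: {}'.format(keyword)) from None


print(normalize_function('p'), normalize_function('proteinAbundance'))
```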
-
pybel.language.
abundance_sbo_mapping
= {'BiologicalProcess': {'identifier': 'SBO:0000375', 'name': 'process', 'namespace': 'SBO'}, 'Complex': {'identifier': 'SBO:0000297', 'name': 'protein complex', 'namespace': 'SBO'}, 'Gene': {'identifier': 'SBO:0000243', 'name': 'gene', 'namespace': 'SBO'}, 'Pathology': {'identifier': 'SBO:0000358', 'name': 'phenotype', 'namespace': 'SBO'}, 'RNA': {'identifier': 'SBO:0000278', 'name': 'messenger RNA', 'namespace': 'SBO'}, 'miRNA': {'identifier': 'SBO:0000316', 'name': 'microRNA', 'namespace': 'SBO'}}¶ Maps the BEL abundance types to the Systems Biology Ontology
-
pybel.language.
pmod_namespace
= {'ADP-ribosylation': 'ADPRib', 'ADPRib': 'ADPRib', 'Ac': 'Ac', 'Farn': 'Farn', 'Gerger': 'Gerger', 'Glyco': 'Glyco', 'Hy': 'Hy', 'ISG': 'ISG', 'ISG15-protein conjugation': 'ISG', 'ISGylation': 'ISG', 'Lysine 48-linked polyubiquitination': 'UbK48', 'Lysine 63-linked polyubiquitination': 'UbK63', 'Me': 'Me', 'Me1': 'Me1', 'Me2': 'Me2', 'Me3': 'Me3', 'Myr': 'Myr', 'N-linked glycosylation': 'NGlyco', 'NGlyco': 'NGlyco', 'NO': 'NO', 'Nedd': 'Nedd', 'Nitrosylation': 'NO', 'O-linked glycosylation': 'OGlyco', 'OGlyco': 'OGlyco', 'Ox': 'Ox', 'Palm': 'Palm', 'Ph': 'Ph', 'SUMOylation': 'Sumo', 'Sulf': 'Sulf', 'Sumo': 'Sumo', 'Ub': 'Ub', 'UbK48': 'UbK48', 'UbK63': 'UbK63', 'UbMono': 'UbMono', 'UbPoly': 'UbPoly', 'acetylation': 'Ac', 'adenosine diphosphoribosyl': 'ADPRib', 'di-methylation': 'Me2', 'dimethylation': 'Me2', 'farnesylation': 'Farn', 'geranylgeranylation': 'Gerger', 'glycosylation': 'Glyco', 'hydroxylation': 'Hy', 'methylation': 'Me', 'mono-methylation': 'Me1', 'monomethylation': 'Me1', 'monoubiquitination': 'UbMono', 'myristoylation': 'Myr', 'neddylation': 'Nedd', 'oxidation': 'Ox', 'palmitoylation': 'Palm', 'phosphorylation': 'Ph', 'polyubiquitination': 'UbPoly', 'sulfation': 'Sulf', 'sulfonation': 'sulfonation', 'sulfur addition': 'Sulf', 'sulphation': 'Sulf', 'sulphonation': 'sulfonation', 'sulphur addition': 'Sulf', 'tri-methylation': 'Me3', 'trimethylation': 'Me3', 'ubiquitination': 'Ub', 'ubiquitinylation': 'Ub', 'ubiquitylation': 'Ub'}¶ A dictionary of default protein modifications to their preferred value
-
pybel.language.
pmod_mappings
= {'ADPRib': {'synonyms': ['ADPRib', 'ADP-ribosylation', 'ADPRib', 'ADP-rybosylation', 'adenosine diphosphoribosyl'], 'xrefs': [{'namespace': 'GO', 'name': 'protein ADP-ribosylation', 'identifier': 'GO:0006471'}, {'namespace': 'MOD', 'name': 'adenosine diphosphoribosyl (ADP-ribosyl) modified residue', 'identifier': 'MOD:00752'}, {'namespace': 'MOP', 'name': 'adenosinediphosphoribosylation', 'identifier': 'MOP:0000220'}]}, 'Ac': {'synonyms': ['Ac', 'acetylation'], 'xrefs': [{'namespace': 'SBO', 'name': 'acetylation', 'identifier': 'SBO:0000215'}, {'namespace': 'GO', 'name': 'protein acetylation', 'identifier': 'GO:0006473'}, {'namespace': 'MOD', 'name': 'acetylated residue', 'identifier': 'MOD:00394'}, {'namespace': 'MOP', 'name': 'acetylation', 'identifier': 'MOP:0000030'}]}, 'Farn': {'synonyms': ['Farn', 'farnesylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein farnesylation', 'identifier': 'GO:0018343'}, {'namespace': 'MOD', 'name': 'farnesylated residue', 'identifier': 'MOD:00437'}, {'namespace': 'MOP', 'name': 'farnesylation', 'identifier': 'MOP:0000429'}]}, 'Gerger': {'synonyms': ['Gerger', 'geranylgeranylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein geranylgeranylation', 'identifier': 'GO:0018344'}, {'namespace': 'MOD', 'name': 'geranylgeranylated residue ', 'identifier': 'MOD:00441'}, {'namespace': 'MOP', 'name': 'geranylgeranylation', 'identifier': 'MOP:0000431'}]}, 'Glyco': {'synonyms': ['Glyco', 'glycosylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein glycosylation', 'identifier': 'GO:0006486'}, {'namespace': 'MOD', 'name': 'glycosylated residue', 'identifier': 'MOD:00693'}, {'namespace': 'MOP', 'name': 'glycosylation', 'identifier': 'MOP:0000162'}]}, 'Hy': {'synonyms': ['Hyhydroxylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein hydroxylation', 'identifier': 'GO:0018126'}, {'namespace': 'MOD', 'name': 'hydroxylated residue', 'identifier': 'MOD:00677'}, {'namespace': 'MOP', 'name': 'hydroxylation', 'identifier': 
'MOP:0000673'}]}, 'ISG': {'activities': [{'namespace': 'GO', 'name': 'ISG15 transferase activity', 'identifier': 'GO:0042296'}], 'synonyms': ['ISG', 'ISGylation', 'ISG15-protein conjugation'], 'xrefs': [{'namespace': 'GO', 'name': 'ISG15-protein conjugation', 'identifier': 'GO:0032020'}]}, 'Me': {'synonyms': ['Me', 'methylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein methylation', 'identifier': 'GO:0006479'}, {'namespace': 'MOD', 'name': 'methylated residue', 'identifier': 'MOD:00427'}]}, 'Me1': {'is_a': ['Me'], 'synonyms': ['Me1', 'monomethylation', 'mono-methylation'], 'xrefs': [{'namespace': 'MOD', 'name': 'monomethylated residue', 'identifier': 'MOD:00599'}]}, 'Me2': {'is_a': ['Me'], 'synonyms': ['Me2', 'dimethylation', 'di-methylation'], 'xrefs': [{'namespace': 'MOD', 'name': 'dimethylated residue', 'identifier': 'MOD:00429'}]}, 'Me3': {'is_a': ['Me'], 'synonyms': ['Me3', 'trimethylation', 'tri-methylation'], 'xrefs': [{'namespace': 'MOD', 'name': 'trimethylated residue', 'identifier': 'MOD:00430'}]}, 'Myr': {'synonyms': ['Myr', 'myristoylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein myristoylation', 'identifier': 'GO:0018377'}, {'namespace': 'MOD', 'name': 'myristoylated residue', 'identifier': 'MOD:00438'}]}, 'NGlyco': {'is_a': ['Glyco'], 'synonyms': ['NGlyco', 'N-linked glycosylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein N-linked glycosylation', 'identifier': 'GO:0006487'}, {'namespace': 'MOD', 'name': 'N-glycosylated residue', 'identifier': 'MOD:00006'}, {'namespace': 'MOP', 'name': 'N-glycosylation', 'identifier': 'MOP:0002162'}]}, 'NO': {'synonyms': ['NO', 'Nitrosylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein nitrosylation', 'identifier': 'GO:0017014'}]}, 'Nedd': {'synonyms': ['Nedd', 'neddylation', 'RUB1-protein conjugation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein neddylation', 'identifier': 'GO:0045116'}, {'namespace': 'MOD', 'name': 'neddylated lysine', 'identifier': 'MOD:01150'}]}, 'OGlyco': 
{'is_a': ['Glyco'], 'synonyms': ['OGlyco', 'O-linked glycosylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein O-linked glycosylation', 'identifier': 'GO:0006493'}, {'namespace': 'MOD', 'name': 'O-glycosylated residue', 'identifier': 'MOD:00396'}, {'namespace': 'MOP', 'name': 'O-glycosylation', 'identifier': 'MOP:0003162'}]}, 'Ox': {'synonyms': ['Ox', 'oxidation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein oxidation', 'identifier': 'GO:0018158'}]}, 'Palm': {'synonyms': ['Palm', 'palmitoylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein palmitoylation', 'identifier': 'GO:0018345'}, {'namespace': 'MOD', 'name': 'palmitoylated residue', 'identifier': 'MOD:00440'}]}, 'Ph': {'synonyms': ['Ph', 'phosphorylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein phosphorylation', 'identifier': 'GO:0006468'}, {'namespace': 'MOD', 'identifier': 'MOD:00696'}]}, 'Sulf': {'synonyms': ['Sulf', 'sulfation', 'sulphation', 'sulfur addition', 'sulphur addition', 'sulfonation', 'sulphonation'], 'target': [{'namespace': 'CHEBI', 'name': 'sulfo group', 'identifier': 'CHEBI:29922'}], 'xrefs': [{'namespace': 'GO', 'name': 'protein sulfation', 'identifier': 'GO:0006477'}, {'namespace': 'MOD', 'name': 'sulfated residue', 'identifier': 'MOD:00695'}, {'namespace': 'MOP', 'name': 'sulfonation', 'identifier': 'MOP:0000559'}]}, 'Sumo': {'activities': [{'namespace': 'GO', 'name': 'SUMO transferase activity', 'identifier': 'GO:0019789'}], 'synonyms': ['Sumo', 'SUMOylation', 'Sumoylation'], 'xrefs': [{'namespace': 'GO', 'name': 'protein sumoylation', 'identifier': 'GO:0016925'}, {'namespace': 'MOD', 'name': 'sumoylated lysine', 'identifier': 'MOD:01149'}]}, 'Ub': {'synonyms': ['Ub', 'ubiquitination', 'ubiquitinylation', 'ubiquitylation'], 'xrefs': [{'namespace': 'SBO', 'name': 'ubiquitination', 'identifier': 'SBO:0000224'}, {'namespace': 'GO', 'name': 'protein ubiquitination', 'identifier': 'GO:0016567'}, {'namespace': 'MOD', 'name': 'ubiquitinylated lysine', 'identifier': 
'MOD:01148'}]}, 'UbK48': {'synonyms': ['UbK48', 'Lysine 48-linked polyubiquitination'], 'xrefs': [{'namespace': 'GO', 'name': 'protein K48-linked ubiquitination', 'identifier': 'GO:0070936'}]}, 'UbK63': {'synonyms': ['UbK63', 'Lysine 63-linked polyubiquitination'], 'xrefs': [{'namespace': 'GO', 'name': 'protein K63-linked ubiquitination', 'identifier': 'GO:0070534'}]}, 'UbMono': {'synonyms': ['UbMono', 'monoubiquitination'], 'xrefs': [{'namespace': 'GO', 'name': 'protein monoubiquitination', 'identifier': 'GO:0006513'}]}, 'UbPoly': {'synonyms': ['UbPoly', 'polyubiquitination'], 'xrefs': [{'namespace': 'GO', 'name': 'protein polyubiquitination', 'identifier': 'GO:0000209'}]}}¶ Use Gene Ontology children of GO_0006464: “cellular protein modification process”
-
pybel.language.
pmod_legacy_labels
= {'A': 'Ac', 'F': 'Farn', 'G': 'Glyco', 'H': 'Hy', 'M': 'Me', 'O': 'Ox', 'P': 'Ph', 'R': 'ADPRib', 'S': 'Sumo', 'U': 'Ub'}¶ A dictionary of legacy (BEL 1.0) default namespace protein modifications to their BEL 2.0 preferred value
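One way to combine the legacy labels with the default protein modification namespace is sketched below. The legacy dictionary is copied in full from the entry above, while `pmod_namespace` is only an excerpt; the lookup order in the hypothetical helper `normalize_pmod` is an assumption for illustration and may not match how PyBEL's parser resolves these labels.

```python
# Copied from the pmod_legacy_labels entry documented above.
pmod_legacy_labels = {
    'A': 'Ac', 'F': 'Farn', 'G': 'Glyco', 'H': 'Hy', 'M': 'Me',
    'O': 'Ox', 'P': 'Ph', 'R': 'ADPRib', 'S': 'Sumo', 'U': 'Ub',
}
# Excerpt copied from the pmod_namespace entry documented above.
pmod_namespace = {
    'Ph': 'Ph', 'phosphorylation': 'Ph',
    'Ac': 'Ac', 'acetylation': 'Ac',
    'Ub': 'Ub', 'ubiquitination': 'Ub',
}


def normalize_pmod(label):
    """Resolve a BEL 1.0 or 2.0 pmod label to its preferred value (illustrative)."""
    if label in pmod_namespace:
        return pmod_namespace[label]
    if label in pmod_legacy_labels:
        return pmod_legacy_labels[label]
    raise KeyError(label)


print(normalize_pmod('P'), normalize_pmod('phosphorylation'), normalize_pmod('Ph'))
```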
-
pybel.language.
gmod_namespace
= {'ADPRib': 'ADPRib', 'M': 'Me', 'Me': 'Me', 'methylation': 'Me'}¶ A dictionary of default gene modifications. This is a PyBEL variant to the BEL specification.
-
pybel.language.
gmod_mappings
= {'ADPRib': {'synonyms': ['ADPRib'], 'xrefs': [{'namespace': 'GO', 'name': 'DNA ADP-ribosylation', 'identifier': 'GO:0030592'}]}, 'Me': {'synonyms': ['Me', 'M', 'methylation'], 'xrefs': [{'namespace': 'GO', 'name': 'DNA methylation', 'identifier': 'GO:0006306'}]}}¶ Use Gene Ontology children of GO_0006304: “DNA modification”
Parsers¶
This page is for users who want to squeeze the most bizarre possibilities out of PyBEL. Most users will not need this reference.
PyBEL makes extensive use of the PyParsing module. The code is organized into different modules to reflect the different facets of the BEL language. These parsers support BEL 2.0 and have some backwards compatibility for rewriting BEL v1.0 statements as BEL v2.0. The biologist and bioinformatician using this software will likely never need to read this page, but a developer seeking to extend the language will be interested to see the inner workings of these parsers.
See: https://github.com/OpenBEL/language/blob/master/version_2.0/MIGRATE_BEL1_BEL2.md
BEL Parser¶
-
class
pybel.parser.parse_bel.
BELParser
(graph, namespace_to_term_to_encoding=None, namespace_to_pattern=None, annotation_to_term=None, annotation_to_pattern=None, annotation_to_local=None, allow_naked_names=False, disallow_nested=False, disallow_unqualified_translocations=False, citation_clearing=True, skip_validation=False, autostreamline=True, required_annotations=None)[source]¶ Build a parser backed by a given dictionary of namespaces.
Build a BEL parser.
- Parameters
graph (pybel.BELGraph) – The BEL Graph to use to store the network
namespace_to_term_to_encoding (Optional[Mapping[str, Mapping[Tuple[Optional[str], str], str]]]) – A dictionary of {namespace: {name: encoding}}. Delegated to pybel.parser.parse_identifier.IdentifierParser
namespace_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {namespace: regular expression strings}. Delegated to pybel.parser.parse_identifier.IdentifierParser
annotation_to_term (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of values}. Delegated to pybel.parser.ControlParser
annotation_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {annotation: regular expression strings}. Delegated to pybel.parser.ControlParser
annotation_to_local (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of values}. Delegated to pybel.parser.ControlParser
allow_naked_names (bool) – If true, turn off naked namespace failures. Delegated to pybel.parser.parse_identifier.IdentifierParser
disallow_nested (bool) – If true, turn on nested statement failures. Delegated to pybel.parser.parse_identifier.IdentifierParser
disallow_unqualified_translocations (bool) – If true, disallow translocations without TO and FROM clauses.
citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations? Delegated to pybel.parser.ControlParser
autostreamline (bool) – Should the parser be streamlined on instantiation?
required_annotations (Optional[List[str]]) – Optional list of required annotations
-
gmod
= None¶ PyBEL BEL Specification variant
-
clear
()[source]¶ Clear the graph and all control parser data (current citation, annotations, and statement group).
-
handle_nested_relation
(line, position, tokens)[source]¶ Handle nested statements.
If
self.disallow_nested
is True, raises a NestedRelationWarning.
- Raises
NestedRelationWarning
-
check_function_semantics
(line, position, tokens)[source]¶ Raise an exception if the function used on the tokens is wrong.
- Raises
InvalidFunctionSemantic
- Return type
ParseResults
-
handle_term
(_, __, tokens)[source]¶ Handle BEL terms (the subject and object of BEL relations).
- Return type
ParseResults
-
handle_has_members
(_, __, tokens)[source]¶ Handle list relations like p(X) hasMembers list(p(Y), p(Z), ...).
- Return type
ParseResults
-
handle_has_components
(_, __, tokens)[source]¶ Handle list relations like p(X) hasComponents list(p(Y), p(Z), ...).
- Return type
ParseResults
-
handle_unqualified_relation
(_, __, tokens)[source]¶ Handle unqualified relations.
- Return type
ParseResults
-
handle_inverse_unqualified_relation
(_, __, tokens)[source]¶ Handle unqualified relations that should be added in reverse.
- Return type
ParseResults
-
pybel.io.line_utils.
parse_lines
(graph, lines, manager=None, disallow_nested=False, citation_clearing=True, use_tqdm=False, tqdm_kwargs=None, no_identifier_validation=False, disallow_unqualified_translocations=False, allow_redefinition=False, allow_definition_failures=False, allow_naked_names=False, required_annotations=None, upgrade_urls=False)[source]¶ Parse an iterable of lines into this graph.
Delegates to parse_document(), parse_definitions(), and parse_statements().
- Parameters
graph (BELGraph) – A BEL graph
lines (Iterable[str]) – An iterable over lines of BEL script
manager (Optional[Manager]) – A PyBEL database manager
disallow_nested (bool) – If true, turns on nested statement failures
citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations? Delegated to pybel.parser.ControlParser
use_tqdm (bool) – Use tqdm to show a progress bar?
tqdm_kwargs (Optional[Mapping[str, Any]]) – Keywords to pass to tqdm
disallow_unqualified_translocations (bool) – If true, disallow translocations without TO and FROM clauses.
required_annotations (Optional[List[str]]) – Annotations that are required for all statements
upgrade_urls (bool) – Automatically upgrade old namespace URLs. Defaults to false.
Warning
These options allow concessions for parsing BEL that is either WRONG or UNSCIENTIFIC. Use them at risk to reproducibility and validity of your results.
- Parameters
no_identifier_validation (bool) – If true, turns off namespace validation
allow_naked_names (bool) – If true, turns off naked namespace failures
allow_redefinition (bool) – If true, doesn’t fail on second definition of same name or annotation
allow_definition_failures (bool) – If true, allows parsing to continue if a terminology file download/parse fails
- Return type
None
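The delegation described above can be pictured as a three-way split of the script. The following is a minimal sketch under stated assumptions, not PyBEL's implementation: the helper split_bel_sections is hypothetical, and it assumes that document lines start with SET DOCUMENT, definition lines start with DEFINE, and every other non-blank, non-comment line belongs to the statements section.

```python
from typing import Iterable, List, Tuple


def split_bel_sections(lines: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Split BEL script lines into document, definition, and statement sections.

    Hypothetical helper for illustration only; it is not part of PyBEL.
    """
    document: List[str] = []
    definitions: List[str] = []
    statements: List[str] = []
    for raw_line in lines:
        line = raw_line.strip()
        if not line or line.startswith('#'):
            continue  # assumption: blank lines and '#' comments carry no content
        if line.startswith('SET DOCUMENT'):
            document.append(line)
        elif line.startswith('DEFINE '):
            definitions.append(line)
        else:
            # control statements (SET/UNSET) and BEL statements end up here
            statements.append(line)
    return document, definitions, statements


example_script = [
    'SET DOCUMENT Name = "Example"',
    'DEFINE NAMESPACE HGNC AS PATTERN ".+"',
    '',
    '# a comment',
    'SET Citation = {"PubMed", "Example title", "12345"}',
    'p(HGNC:AKT1) increases p(HGNC:TP53)',
]
document, definitions, statements = split_bel_sections(example_script)
```

Each of the three lists would then be handed to the corresponding stage, in that order, so that definitions are known before any statement is validated.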
Metadata Parser¶
-
class
pybel.parser.parse_metadata.
MetadataParser
(manager, namespace_to_term_to_encoding=None, namespace_to_pattern=None, annotation_to_term=None, annotation_to_pattern=None, annotation_to_local=None, default_namespace=None, allow_redefinition=False, skip_validation=False, upgrade_urls=False)[source]¶ A parser for the document and definitions section of a BEL document.
See also
BEL 1.0 Specification for the DEFINE keyword
Build a metadata parser.
- Parameters
manager – A cache manager
namespace_to_term_to_encoding (Optional[Mapping[str, Mapping[Tuple[Optional[str], str], str]]]) – An enumerated namespace mapping from {namespace keyword: {(identifier, name): encoding}}
namespace_to_pattern (Optional[Mapping[str, Pattern]]) – A regular expression namespace mapping from {namespace keyword: regex string}
annotation_to_term (Optional[Mapping[str, Set[str]]]) – Enumerated annotation mapping from {annotation keyword: set of valid values}
annotation_to_pattern (Optional[Mapping[str, Pattern]]) – Regular expression annotation mapping from {annotation keyword: regex string}
default_namespace (Optional[Set[str]]) – A set of strings that can be used without a namespace
skip_validation (bool) – If true, don’t download and cache namespaces/annotations
-
manager
= None¶ This metadata parser’s internal definition cache manager
-
namespace_to_term_to_encoding
= None¶ A dictionary of cached {namespace keyword: {(identifier, name): encoding}}
-
uncachable_namespaces
= None¶ A set of namespace URLs that can’t be cached
-
namespace_to_pattern
= None¶ A dictionary of {namespace keyword: regular expression string}
-
default_namespace
= None¶ A set of names that can be used without a namespace
-
annotation_to_term
= None¶ A dictionary of cached {annotation keyword: set of values}
-
annotation_to_pattern
= None¶ A dictionary of {annotation keyword: regular expression string}
-
annotation_to_local
= None¶ A dictionary of cached {annotation keyword: set of values}
-
document_metadata
= None¶ A dictionary containing the document metadata
-
namespace_url_dict
= None¶ A dictionary from {namespace keyword: BEL namespace URL}
-
annotation_url_dict
= None¶ A dictionary from {annotation keyword: BEL annotation URL}
-
handle_document
(line, position, tokens)[source]¶ Handle statements like SET DOCUMENT X = "Y".
- Raises
InvalidMetadataException
- Raises
VersionFormatWarning
- Return type
ParseResults
-
raise_for_redefined_namespace
(line, position, namespace)[source]¶ Raise an exception if a namespace is already defined.
- Raises
RedefinedNamespaceError
- Return type
None
-
handle_namespace_url
(line, position, tokens)[source]¶ Handle statements like DEFINE NAMESPACE X AS URL "Y".
- Raises
RedefinedNamespaceError
- Raises
pybel.resources.exc.ResourceError
- Return type
ParseResults
-
handle_namespace_pattern
(line, position, tokens)[source]¶ Handle statements like DEFINE NAMESPACE X AS PATTERN "Y".
- Raises
RedefinedNamespaceError
- Return type
ParseResults
-
raise_for_redefined_annotation
(line, position, annotation)[source]¶ Raise an exception if the given annotation is already defined.
- Raises
RedefinedAnnotationError
- Return type
None
-
handle_annotations_url
(line, position, tokens)[source]¶ Handle statements like DEFINE ANNOTATION X AS URL "Y".
- Raises
RedefinedAnnotationError
- Return type
ParseResults
-
handle_annotation_list
(line, position, tokens)[source]¶ Handle statements like DEFINE ANNOTATION X AS LIST {"Y","Z", ...}.
- Raises
RedefinedAnnotationError
- Return type
ParseResults
-
handle_annotation_pattern
(line, position, tokens)[source]¶ Handle statements like DEFINE ANNOTATION X AS PATTERN "Y".
- Raises
RedefinedAnnotationError
- Return type
ParseResults
-
has_enumerated_annotation
(annotation)[source]¶ Check if this annotation is defined by an enumeration.
- Return type
bool
-
has_regex_annotation
(annotation)[source]¶ Check if this annotation is defined by a regular expression.
- Return type
bool
-
has_local_annotation
(annotation)[source]¶ Check if this annotation is defined locally.
- Return type
bool
-
has_enumerated_namespace
(namespace)[source]¶ Check if this namespace is defined by an enumeration.
- Return type
bool
-
has_regex_namespace
(namespace)[source]¶ Check if this namespace is defined by a regular expression.
- Return type
bool
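The difference between enumerated and regular-expression definitions matters when a value is checked. The following is a minimal sketch, assuming the two mappings have the shapes documented above; the function is_valid_annotation_value is hypothetical, and requiring the pattern to match the whole value is an assumption rather than documented PyBEL behavior.

```python
import re
from typing import Mapping, Set


def is_valid_annotation_value(
    annotation: str,
    value: str,
    annotation_to_term: Mapping[str, Set[str]],
    annotation_to_pattern: Mapping[str, 're.Pattern'],
) -> bool:
    """Check a value against an enumerated or regex-defined annotation.

    Hypothetical helper for illustration only; it is not part of PyBEL.
    """
    if annotation in annotation_to_term:
        # enumerated: the value must be one of the listed terms
        return value in annotation_to_term[annotation]
    if annotation in annotation_to_pattern:
        # regex: assumption that the whole value has to match the pattern
        return annotation_to_pattern[annotation].fullmatch(value) is not None
    raise KeyError('undefined annotation: {}'.format(annotation))


# Example definitions (made up for this sketch)
example_terms = {'Species': {'9606', '10090'}}
example_patterns = {'Dose': re.compile(r'\d+(\.\d+)? mg')}
```

An undefined annotation raises instead of returning False, which mirrors the split in the documentation between undefined annotations and illegal values.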
Control Parser¶
-
class
pybel.parser.parse_control.
ControlParser
(annotation_to_term=None, annotation_to_pattern=None, annotation_to_local=None, citation_clearing=True, required_annotations=None)[source]¶ A parser for BEL control statements.
See also
BEL 1.0 specification on control records
Initialize the control statement parser.
- Parameters
annotation_to_term (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of valid values} defined with URL for parsing
annotation_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {annotation: regular expression string}
annotation_to_local (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of valid values} for parsing defined with LIST
citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations?
required_annotations (Optional[List[str]]) – Annotations that are required
-
has_enumerated_annotation
(annotation)[source]¶ Check if the annotation is defined as an enumeration.
- Return type
bool
-
has_regex_annotation
(annotation)[source]¶ Check if the annotation is defined as a regular expression.
- Return type
bool
-
raise_for_undefined_annotation
(line, position, annotation)[source]¶ Raise an exception if the annotation is not defined.
- Raises
UndefinedAnnotationWarning
- Return type
None
-
raise_for_invalid_annotation_value
(line, position, key, value)[source]¶ Raise an exception if the value is not valid for the given annotation.
- Raises
IllegalAnnotationValueWarning or MissingAnnotationRegexWarning
- Return type
None
-
raise_for_missing_citation
(line, position)[source]¶ Raise an exception if there is no citation present in the parser.
- Raises
MissingCitationException
- Return type
None
-
handle_annotation_key
(line, position, tokens)[source]¶ Handle an annotation key before parsing to validate that it’s defined either as an enumeration or as a regex.
- Raises
MissingCitationException or UndefinedAnnotationWarning
- Return type
ParseResults
-
handle_set_statement_group
(_, __, tokens)[source]¶ Handle a SET STATEMENT_GROUP = "X" statement.
- Return type
ParseResults
-
handle_set_citation
(line, position, tokens)[source]¶ Handle a SET Citation = {"X", "Y", "Z", ...} statement.
- Return type
ParseResults
-
handle_set_evidence
(_, __, tokens)[source]¶ Handle a SET Evidence = "" statement.
- Return type
ParseResults
-
handle_set_command
(line, position, tokens)[source]¶ Handle a SET X = "Y" statement.
- Return type
ParseResults
-
handle_set_command_list
(line, position, tokens)[source]¶ Handle a SET X = {"Y", "Z", ...} statement.
- Return type
ParseResults
-
handle_unset_statement_group
(line, position, tokens)[source]¶ Unset the statement group, or raise an exception if it is not set.
- Raises
MissingAnnotationKeyWarning
- Return type
ParseResults
-
handle_unset_citation
(line, position, tokens)[source]¶ Unset the citation, or raise an exception if it is not set.
- Raises
MissingAnnotationKeyWarning
- Return type
ParseResults
-
handle_unset_evidence
(line, position, tokens)[source]¶ Unset the evidence, or raise an exception if it is not already set.
The value for
tokens[EVIDENCE]
corresponds to which alternate of SupportingText or Evidence was used in the BEL script.
- Raises
MissingAnnotationKeyWarning
- Return type
ParseResults
-
validate_unset_command
(line, position, annotation)[source]¶ Raise an exception when trying to UNSET X if X is not already set.
- Raises
MissingAnnotationKeyWarning
- Return type
None
-
handle_unset_command
(line, position, tokens)[source]¶ Handle an UNSET X statement, or raise an exception if X is not already set.
- Raises
MissingAnnotationKeyWarning
- Return type
ParseResults
-
handle_unset_list
(line, position, tokens)[source]¶ Handle UNSET {A, B, ...}, or raise an exception if any of them are not present.
Consider that all unsets are in peril if just one of them is wrong!
- Raises
MissingAnnotationKeyWarning
- Return type
ParseResults
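The SET and UNSET behavior documented for the control parser amounts to a small piece of mutable state. The following is a minimal sketch, not PyBEL's implementation: the class ControlState and its method names are hypothetical, and KeyError stands in for the library's own warnings. It shows how citation_clearing affects SET Citation, and why a list UNSET is checked as a whole before anything is removed.

```python
from typing import Dict, Iterable, Optional


class ControlState:
    """Hypothetical stand-in for the state tracked while reading control statements."""

    def __init__(self, citation_clearing: bool = True) -> None:
        self.citation_clearing = citation_clearing
        self.citation: Optional[Dict[str, str]] = None
        self.evidence: Optional[str] = None
        self.annotations: Dict[str, str] = {}

    def set_citation(self, citation: Dict[str, str]) -> None:
        """SET Citation: optionally clear the evidence and all annotations first."""
        if self.citation_clearing:
            self.evidence = None
            self.annotations.clear()
        self.citation = citation

    def set_annotation(self, key: str, value: str) -> None:
        """SET X = "Y"."""
        self.annotations[key] = value

    def unset_annotation(self, key: str) -> None:
        """UNSET X: fail if X is not currently set."""
        if key not in self.annotations:
            raise KeyError('annotation is not set: {}'.format(key))
        del self.annotations[key]

    def unset_list(self, keys: Iterable[str]) -> None:
        """UNSET {A, B, ...}: validate every key before removing any of them."""
        keys = list(keys)
        missing = [key for key in keys if key not in self.annotations]
        if missing:
            raise KeyError('annotations are not set: {}'.format(missing))
        for key in keys:
            del self.annotations[key]
```

Validating first means that one wrong key leaves the remaining annotations untouched, which is one way to read the note that all unsets are in peril if just one of them is wrong.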
Concept Parser¶
-
class
pybel.parser.parse_concept.
ConceptParser
(namespace_to_term_to_encoding=None, namespace_to_pattern=None, default_namespace=None, allow_naked_names=False)[source]¶ A parser for concepts in the form of
namespace:name or namespace:identifier!name. Can be made more lenient when given a default namespace or enabling the use of naked names.
Initialize the concept parser.
- Parameters
namespace_to_term_to_encoding (Optional[Mapping[str, Mapping[Tuple[Optional[str], str], str]]]) – A dictionary of {namespace: {(identifier, name): encoding}}
namespace_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {namespace: regular expression string} to compile
default_namespace (Optional[Set[str]]) – A set of strings that can be used without a namespace
allow_naked_names (bool) – If true, turn off naked namespace failures
-
has_enumerated_namespace
(namespace)[source]¶ Check that the namespace has been defined by an enumeration.
- Return type
bool
-
has_regex_namespace
(namespace)[source]¶ Check that the namespace has been defined by a regular expression.
- Return type
bool
-
has_namespace
(namespace)[source]¶ Check that the namespace has either been defined by an enumeration or a regular expression.
- Return type
bool
-
has_enumerated_namespace_name
(namespace, name)[source]¶ Check that the namespace is defined by an enumeration and that the name is a member.
- Return type
bool
-
has_regex_namespace_name
(namespace, name)[source]¶ Check that the namespace is defined as a regular expression and the name matches it.
- Return type
bool
-
has_namespace_name
(line, position, namespace, name)[source]¶ Check that the namespace is defined and has the given name.
- Return type
bool
-
raise_for_missing_namespace
(line, position, namespace, name)[source]¶ Raise an exception if the namespace is not defined.
- Return type
None
-
raise_for_missing_name
(line, position, namespace, name)[source]¶ Raise an exception if the namespace is not defined or if it does not validate the given name.
- Return type
None
-
raise_for_missing_default
(line, position, name)[source]¶ Raise an exception if the name does not belong to the default namespace.
- Return type
None
-
handle_identifier_qualified
(line, position, tokens)[source]¶ Handle parsing a qualified identifier.
- Return type
ParseResults
-
handle_namespace_default
(line, position, tokens)[source]¶ Handle parsing an identifier for the default namespace.
- Return type
ParseResults
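The two concept forms, namespace:name and namespace:identifier!name, can be told apart with plain string handling. The following is a rough sketch, not the grammar PyBEL uses: split_concept is a hypothetical helper, and treating a string without a colon as a naked name and stripping double quotes are assumptions made for the example.

```python
from typing import Optional, Tuple


def split_concept(text: str) -> Tuple[Optional[str], Optional[str], str]:
    """Split a concept into (namespace, identifier, name).

    Hypothetical helper for illustration only; it is not part of PyBEL.
    """
    namespace, colon, rest = text.partition(':')
    if not colon:
        # no namespace at all: a "naked name"
        return None, None, text.strip().strip('"')
    identifier, bang, name = rest.partition('!')
    if not bang:
        # namespace:name
        return namespace, None, rest.strip().strip('"')
    # namespace:identifier!name
    return namespace, identifier.strip(), name.strip().strip('"')
```

A naked name comes back with no namespace, which is the case that a default namespace or allow_naked_names would have to cover.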
Sub-Parsers¶
Parsers for modifications to abundances.
Internal DSL¶
An internal domain-specific language (DSL) for BEL.
Logging Messages¶
Errors¶
This module contains base exceptions that are shared through the package.
Parse Exceptions¶
Exceptions for the BEL parser.
A message for “General Parser Failure” is displayed when a problem was caused due to an unforeseen error. The line number and original statement are printed for the user to debug.
-
exception
pybel.parser.exc.
BELParserWarning
(line_number, line, position, *args)[source]¶ The base PyBEL parser exception, which holds the line and position where a parsing problem occurred.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
BELSyntaxError
(line_number, line, position, *args)[source]¶ For general syntax errors.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
InconsistentDefinitionError
(line_number, line, position, definition)[source]¶ Base PyBEL error for redefinition.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
RedefinedNamespaceError
(line_number, line, position, definition)[source]¶ Raised when a namespace is redefined.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
RedefinedAnnotationError
(line_number, line, position, definition)[source]¶ Raised when an annotation is redefined.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
NameWarning
(line_number, line, position, name, *args)[source]¶ The base class for errors related to nomenclature.
Build a warning wrapping a given name.
-
exception
pybel.parser.exc.
NakedNameWarning
(line_number, line, position, name, *args)[source]¶ Raised when there is an identifier without a namespace. Enable lenient mode to suppress.
Build a warning wrapping a given name.
-
exception
pybel.parser.exc.
MissingDefaultNameWarning
(line_number, line, position, name, *args)[source]¶ Raised if reference to value not in default namespace.
Build a warning wrapping a given name.
-
exception
pybel.parser.exc.
NamespaceIdentifierWarning
(line_number, line, position, namespace, name)[source]¶ The base class for warnings related to namespace:name identifiers.
Initialize the namespace identifier warning.
-
exception
pybel.parser.exc.
UndefinedNamespaceWarning
(line_number, line, position, namespace, name)[source]¶ Raised if reference made to undefined namespace.
Initialize the namespace identifier warning.
-
exception
pybel.parser.exc.
MissingNamespaceNameWarning
(line_number, line, position, namespace, name)[source]¶ Raised if reference to value not in namespace.
Initialize the namespace identifier warning.
-
exception
pybel.parser.exc.
MissingNamespaceRegexWarning
(line_number, line, position, namespace, name)[source]¶ Raised if reference not matching regex.
Initialize the namespace identifier warning.
-
exception
pybel.parser.exc.
AnnotationWarning
(line_number, line, position, annotation, *args)[source]¶ Base exception for annotation warnings.
Build an AnnotationWarning.
-
exception
pybel.parser.exc.
UndefinedAnnotationWarning
(line_number, line, position, annotation, *args)[source]¶ Raised when an undefined annotation is used.
Build an AnnotationWarning.
-
exception
pybel.parser.exc.
MissingAnnotationKeyWarning
(line_number, line, position, annotation, *args)[source]¶ Raised when trying to unset an annotation that is not set.
Build an AnnotationWarning.
-
exception
pybel.parser.exc.
AnnotationIdentifierWarning
(line_number, line, position, annotation, value)[source]¶ Base exception for annotation:value pairs.
Build an AnnotationWarning.
-
exception
pybel.parser.exc.
IllegalAnnotationValueWarning
(line_number, line, position, annotation, value)[source]¶ Raised when an annotation has a value that does not belong to the original set of valid annotation values.
Build an AnnotationWarning.
-
exception
pybel.parser.exc.
MissingAnnotationRegexWarning
(line_number, line, position, annotation, value)[source]¶ Raised if annotation doesn’t match regex.
Build an AnnotationWarning.
-
exception
pybel.parser.exc.
VersionFormatWarning
(line_number, line, position, version_string)[source]¶ Raised if the version string doesn’t adhere to semantic versioning or YYYYMMDD format.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
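The version check behind this warning can be approximated with the standard library. This is a minimal sketch under stated assumptions: is_valid_version is a hypothetical helper, the regular expression is a simplified reading of semantic versioning, and rejecting impossible calendar dates is an assumption about what YYYYMMDD format means here.

```python
import re
from datetime import datetime

# Simplified semantic version: MAJOR.MINOR.PATCH with optional pre-release and build parts
SEMVER_RE = re.compile(r'\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?')
DATE_RE = re.compile(r'\d{8}')


def is_valid_version(version: str) -> bool:
    """Return True for a semantic version or a YYYYMMDD date string.

    Hypothetical helper for illustration only; it is not part of PyBEL.
    """
    if SEMVER_RE.fullmatch(version):
        return True
    if DATE_RE.fullmatch(version):
        try:
            datetime.strptime(version, '%Y%m%d')  # assumption: must be a real calendar date
        except ValueError:
            return False
        return True
    return False
```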
-
exception
pybel.parser.exc.
MetadataException
(line_number, line, position, *args)[source]¶ Base exception for issues with document metadata.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
MalformedMetadataException
(line_number, line, position, *args)[source]¶ Raised when an invalid metadata line is encountered.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
InvalidMetadataException
(line_number, line, position, key, value)[source]¶ Raised when an incorrect document metadata key is used.
Hint
Valid document metadata keys are:
Authors
ContactInfo
Copyright
Description
Disclaimer
Licenses
Name
Version
See also
BEL specification on the properties section
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
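A check against the valid keys listed above might look like the following. It is only a sketch: parse_document_line and the regular expression are hypothetical, and ValueError and KeyError stand in for MalformedMetadataException and InvalidMetadataException.

```python
import re
from typing import Tuple

# The keys listed in the hint above
VALID_DOCUMENT_KEYS = frozenset({
    'Authors', 'ContactInfo', 'Copyright', 'Description',
    'Disclaimer', 'Licenses', 'Name', 'Version',
})

# Assumed shape of a document line: SET DOCUMENT Key = "Value"
DOCUMENT_RE = re.compile(r'SET\s+DOCUMENT\s+(\w+)\s*=\s*"(.*)"')


def parse_document_line(line: str) -> Tuple[str, str]:
    """Return (key, value) for a document metadata line.

    Hypothetical helper for illustration only; it is not part of PyBEL.
    """
    match = DOCUMENT_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError('malformed metadata line: {}'.format(line))
    key, value = match.group(1), match.group(2)
    if key not in VALID_DOCUMENT_KEYS:
        raise KeyError('invalid document metadata key: {}'.format(key))
    return key, value
```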
-
exception
pybel.parser.exc.
MissingMetadataException
(line_number, line, position, key)[source]¶ Raised when a BEL Script is missing critical metadata.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
-
exception
pybel.parser.exc.
InvalidCitationLengthException
(line_number, line, position, *args)[source]¶ Base exception raised when the format for a citation is wrong.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
CitationTooShortException
(line_number, line, position, *args)[source]¶ Raised when a citation does not have the minimum of {type, name, reference}.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
CitationTooLongException
(line_number, line, position, *args)[source]¶ Raised when a citation has more than the allowed entries, {type, name, reference, date, authors, comments}.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
MissingCitationException
(line_number, line, position, *args)[source]¶ Raised when trying to parse a BEL statement, but no citation is currently set.
This might be due to a previous error in the formatting of a citation.
Though it’s not a best practice, some BEL curators set other annotations before the citation. If this is the case in your BEL document, and you’re absolutely sure that all UNSET statements are correctly written, you can use citation_clearing=False as a keyword argument in any of the IO functions pybel.from_lines(), pybel.from_url(), or pybel.from_path().
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
MissingSupportWarning
(line_number, line, position, *args)[source]¶ Raised when trying to parse a BEL statement, but no evidence is currently set.
All BEL statements must be qualified with evidence.
If your data is serialized from a database and provenance information is not readily accessible, consider referencing the publication for the database, or a url pointing to the data from either a programmatically or human-readable endpoint.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
MissingAnnotationWarning
(line_number, line, position, required_annotations)[source]¶ Raised when trying to parse a BEL statement and a required annotation is not present.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
-
exception
pybel.parser.exc.
InvalidCitationType
(line_number, line, position, citation_type)[source]¶ Raised when a citation is set with an incorrect type.
Hint
Valid citation types include:
Book
PubMed
Journal
Online Resource
URL
DOI
Other
See also
OpenBEL wiki on citations
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
-
exception
pybel.parser.exc.
InvalidPubMedIdentifierWarning
(line_number, line, position, reference)[source]¶ Raised when a citation is set whose type is PubMed but whose database identifier is not a valid integer.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
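The citation rules described by CitationTooShortException, CitationTooLongException, InvalidCitationType, and InvalidPubMedIdentifierWarning can be read together as one check. The following is a minimal sketch, not PyBEL's implementation: check_citation is hypothetical, ValueError stands in for the specific exceptions, the set of types is taken from the hint above (which may not be exhaustive), and reading "valid integer" as digits only is an assumption.

```python
from typing import Dict, Sequence

# Field order taken from the CitationTooLongException description
CITATION_FIELDS = ('type', 'name', 'reference', 'date', 'authors', 'comments')

# Types listed in the InvalidCitationType hint; possibly not exhaustive
VALID_CITATION_TYPES = frozenset({
    'Book', 'PubMed', 'Journal', 'Online Resource', 'URL', 'DOI', 'Other',
})


def check_citation(values: Sequence[str]) -> Dict[str, str]:
    """Validate the entries of a SET Citation statement and label them.

    Hypothetical helper for illustration only; it is not part of PyBEL.
    """
    if len(values) < 3:
        raise ValueError('citation too short: need type, name, and reference')
    if len(values) > len(CITATION_FIELDS):
        raise ValueError('citation too long: at most six entries are allowed')
    citation_type, reference = values[0], values[2]
    if citation_type not in VALID_CITATION_TYPES:
        raise ValueError('invalid citation type: {}'.format(citation_type))
    if citation_type == 'PubMed' and not reference.isdigit():
        # assumption: a valid PubMed identifier consists of digits only
        raise ValueError('invalid PubMed identifier: {}'.format(reference))
    return dict(zip(CITATION_FIELDS, values))
```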
-
exception
pybel.parser.exc.
MalformedTranslocationWarning
(line_number, line, position, tokens)[source]¶ Raised when there is a translocation statement without location information.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
-
exception
pybel.parser.exc.
PlaceholderAminoAcidWarning
(line_number, line, position, code)[source]¶ Raised when an invalid amino acid code is given.
One example might be the usage of X, which is a colloquial signifier for a truncation in a given position. Text mining efforts for knowledge extraction make this mistake often. X might also signify a placeholder amino acid.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
-
exception
pybel.parser.exc.
NestedRelationWarning
(line_number, line, position, *args)[source]¶ Raised when encountering a nested statement.
See the docs for an explanation of why we explicitly do not support nested statements.
Initialize the BEL parser warning.
-
exception
pybel.parser.exc.
InvalidEntity
(line_number, line, position, namespace, name)[source]¶ Raised when using a non-entity name for a name.
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
-
exception
pybel.parser.exc.
InvalidFunctionSemantic
(line_number, line, position, func, namespace, name, allowed_functions)[source]¶ Raised when an invalid function is used for a given node.
For example, an HGNC symbol for a protein-coding gene YFG cannot be referenced as an miRNA with m(HGNC:YFG).
Initialize the BEL parser warning.
- Parameters
line_number – The line number on which this warning occurred
line – The content of the line
position – The position within the line where the warning occurred
References¶
If you find PyBEL useful for your work, please consider citing [1]:
[1] Hoyt, C. T., et al. (2017). PyBEL: a Computational Framework for Biological Expression Language. Bioinformatics, 34(December), 1–2.
Software using PyBEL¶
Roadmap¶
This project road map documents not only the PyBEL repository, but the PyBEL Tools and BEL Commons repositories as well as the Bio2BEL project.
PyBEL¶
- Performance improvements
Parallelization of parsing
On-the-fly validation with OLS or MIRIAM
Bio2BEL¶
- Generation of new namespaces, equivalencies, and hierarchical knowledge (isA and partOf relations)
FlyBase
InterPro (Done)
UniProt (Done)
Human Phenotype Ontology
Uber Anatomy Ontology
HGNC Gene Families (Done)
Enzyme Classification (Done)
- Integration of knowledge sources
ChEMBL
Comparative Toxicogenomics Database (Done)
BRENDA
MetaCyc
Protein complex definitions
Data2BEL¶
Integration of analytical pipelines to convert data to BEL
LD Block Analysis
Gene Co-expression Analysis
Differential Gene Expression Analysis
PyBEL Tools¶
- Biological Grammar
Network motif identification
Stability analysis
- Prior knowledge comparison
Molecular activity annotation
SNP Impact
- Implementation of standard BEL Algorithms
RCR
NPA
SST
- Development of new algorithms
Heat diffusion algorithms
Cart Before the Horse
Metapath analysis
Reasoning and inference rules
Subgraph Expansion application in NeuroMMSigDB
Chemical Enrichment in NeuroMMSigDB
BEL Commons¶
Integration with BELIEF
Integration with NeuroMMSigDB (Done)
Import and export from NDEx
Current Issues¶
Speed¶
Speed is still an issue, because documents above 100K lines still take a couple of minutes to parse. This issue is exacerbated by (optionally) logging output to the console, which can slow parsing down by a factor of 3 to 4 or more.
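One way to limit the cost of console logging is to raise the log level before parsing. The snippet below is a sketch that uses only the standard logging module; it assumes that PyBEL's loggers live under the name pybel, which is the usual result of calling logging.getLogger(__name__) inside a package but is not confirmed by this page.

```python
import logging

# Assumption: PyBEL's loggers are children of a logger named 'pybel'
pybel_logger = logging.getLogger('pybel')

# Only warnings and errors pass; debug and info records are dropped early
pybel_logger.setLevel(logging.WARNING)
```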
Namespaces¶
The default namespaces from OpenBEL do not follow a standard file format. They are similar to INI config files, but do not use consistent delimiters. Also, many of the namespaces don’t respect the rule that the delimiter should not be used in the namespace names. There are also lots of names with strange characters, which may have been caused by carelessly copying from a data source that used specific escape characters.
Testing¶
Testing was very difficult because the example documents on the OpenBEL website had many semantic errors, such as using names and annotation values that were not defined within their respective namespace and annotation definition files. They also contained syntax errors like naked names, which are not only syntactically incorrect but also lead to bad science, and improper usage of activities, like illegally nesting an activity within a composite statement.
Technology¶
This page is meant to describe the development stack for PyBEL, and should be a useful introduction for contributors.
Versioning¶
PyBEL is versioned on GitHub so changes in its code can be tracked over time and to make use of the variety of software development plugins. Code is produced following the Git Flow philosophy, which means that new features are coded in branches off of the development branch and merged after they are triaged. Finally, develop is merged into master for releases. If there are bugs in releases that need to be fixed quickly, “hot fix” branches from master can be made, then merged back to master and develop after fixing the problem.
Testing in PyBEL¶
PyBEL is written with extensive unit testing and integration testing. Whenever possible, test-driven development is practiced. This means that new ideas for functions and features are first encoded as blank classes/functions, and tests are then written directly for the desired output. After tests have been written that define how the code should work, the implementation can be written.
Test-driven development requires us to think about design before making quick and dirty implementations. This results in better code. Additionally, thorough testing suites make it possible to catch when changes break existing functionality.
Tests are written with the standard unittest library.
Unit Testing¶
Unit tests check that the different parts of PyBEL work independently.
An example unit test can be found in tests.test_parse_bel.TestAbundance.test_short_abundance. It ensures that the parser is able to handle a given string describing the abundance of a chemical/other entity in BEL. It tests that the parser produces the correct output and that the BEL statement is converted to the correct internal representation. In this example, this is a tuple describing the abundance of oxygen atoms. Finally, it tests that this representation is added as a node in the underlying BEL graph with the appropriate attributes added.
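The shape of such a unit test is sketched below with the standard unittest library. It is not the actual PyBEL test: parse_short_abundance is a hypothetical stand-in for the parser, the statement a(CHEBI:"oxygen atom") and the tuple layout are assumptions, and a plain dictionary stands in for the BEL graph.

```python
import unittest


def parse_short_abundance(statement):
    """Hypothetical stand-in for the parser under test; not PyBEL code."""
    prefix, suffix = 'a(', ')'
    if not (statement.startswith(prefix) and statement.endswith(suffix)):
        raise ValueError('not a short-form abundance: {}'.format(statement))
    namespace, _, name = statement[len(prefix):-len(suffix)].partition(':')
    return 'Abundance', namespace, name.strip('"')


class TestAbundance(unittest.TestCase):
    """Checks the parser output, then the node stored for it."""

    def setUp(self):
        self.nodes = {}  # stand-in for the underlying graph

    def test_short_abundance(self):
        result = parse_short_abundance('a(CHEBI:"oxygen atom")')

        # 1. the statement is converted to the expected internal representation
        self.assertEqual(('Abundance', 'CHEBI', 'oxygen atom'), result)

        # 2. the representation is stored as a node with its attributes
        self.nodes[result] = {'function': result[0], 'namespace': result[1], 'name': result[2]}
        self.assertIn(result, self.nodes)
        self.assertEqual('CHEBI', self.nodes[result]['namespace'])
```

Saved in a test module, a case like this would be collected by python -m unittest or by pytest.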
Integration Testing¶
Integration tests are higher level, and ensure that the software accomplishes more complicated goals by using many components. An example integration test is found in tests.test_import.TestImport.test_from_fileURL. This test ensures that a BEL script can be read and results in a NetworkX object that contains all of the information described in the script.
Tox¶
While IDEs like PyCharm provide excellent testing tools, they are not programmatic. Tox is a Python package that provides a command-line interface for running automated testing procedures (as well as other build functions that aren’t important to explain here). In PyBEL, it is used to run the unit tests in the tests folder with the pytest harness. It also runs check-manifest, builds the documentation with sphinx, and computes the code coverage of the tests. The entire procedure is defined in tox.ini. Tox also allows tests to be run on many different versions of Python.
Continuous Integration¶
Continuous integration is a philosophy of automatically testing code as it changes. PyBEL makes use of the Travis CI server to perform testing because of its tight integration with GitHub. Travis automatically installs git hooks inside GitHub so it knows when a new commit is made. Upon each commit, Travis downloads the newest commit from GitHub and runs the tests configured in the .travis.yml file in the top level of the PyBEL repository. This file effectively instructs the Travis CI server to run Tox. It also allows environment variables to be modified, which PyBEL uses to test many different versions of Python.
Code Coverage¶
After building, Travis sends code coverage results to codecov.io. This site helps visualize untested code and track the improvement of testing coverage over time. It also integrates with GitHub to show which feature branches are inadequately tested. In development of PyBEL, inadequately tested code is not allowed to be merged into develop.
Versioning¶
PyBEL uses semantic versioning. In general, the project’s version string will have a -dev suffix, as in 0.3.4-dev, throughout the development cycle. After code is merged from feature branches to develop and it is time to deploy, this suffix is removed and the develop branch is merged into master.
The version string appears in multiple places throughout the project, so BumpVersion is used to automate the updating of these version strings. See .bumpversion.cfg for more information.
Deployment¶
PyBEL is also distributed through PyPI (pronounced Py-Pee-Eye).
Travis CI has a wonderful integration with PyPI, so any time a tag is made on the master branch (and also assuming the tests pass), a new distribution is packed and sent to PyPI. Refer to the “deploy” section at the bottom of the .travis.yml file for more information, or the Travis CI PyPI deployment documentation.
As a side note, Travis CI has an encryption tool so the password for the PyPI account can be stored publicly on GitHub in encrypted form. Travis decrypts it before performing the upload to PyPI.
Steps¶
bumpversion release on the develop branch
Push to git
After tests pass, merge develop into master
After tests pass, create a tag on GitHub with the same name as the version number (on master)
Travis will automatically deploy to PyPI after tests pass. After checking that deployment has been successful, switch to develop and run bumpversion patch
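The effect of the two bumpversion commands on the version string can be modeled in a few lines. This is an illustrative sketch only: bump_release and bump_patch are hypothetical functions, and the exact behavior of the real commands depends on .bumpversion.cfg, which is not reproduced here. The sketch assumes that release drops the -dev suffix and that patch increments the last number and restores the suffix.

```python
import re

# MAJOR.MINOR.PATCH with an optional -dev suffix, as in 0.3.4-dev
VERSION_RE = re.compile(r'(\d+)\.(\d+)\.(\d+)(-dev)?')


def bump_release(version: str) -> str:
    """Model of `bumpversion release`: 0.3.4-dev becomes 0.3.4."""
    match = VERSION_RE.fullmatch(version)
    if match is None or match.group(4) is None:
        raise ValueError('expected a development version, got {}'.format(version))
    return '{}.{}.{}'.format(match.group(1), match.group(2), match.group(3))


def bump_patch(version: str) -> str:
    """Model of `bumpversion patch`: 0.3.4 becomes 0.3.5-dev."""
    match = VERSION_RE.fullmatch(version)
    if match is None or match.group(4) is not None:
        raise ValueError('expected a release version, got {}'.format(version))
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    return '{}.{}.{}-dev'.format(major, minor, patch + 1)
```

Applied in sequence, bump_release('0.3.4-dev') gives the string to tag on master, and bump_patch on that result gives the next development version on develop.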