Manager

Manager API

The BaseManager takes care of building and maintaining the connection to the database via SQLAlchemy.

class pybel.manager.BaseManager(engine, session)

A wrapper around a SQLAlchemy engine and session.

Instantiate a manager from an engine and session.

classmethod from_connection(connection, echo=False, autoflush=None, autocommit=None, expire_on_commit=None, scopefunc=None)

Create a connection to the database and a persistent session using SQLAlchemy.

A custom default can be set as an environment variable with the name pybel.constants.PYBEL_CONNECTION, using an RFC-1738 string. For example, a MySQL string can be given with the following form:

mysql+pymysql://<username>:<password>@<host>/<dbname>?charset=utf8[&<options>]

A SQLite connection string can be given in the form:

sqlite:///~/Desktop/cache.db

Further options and examples can be found in the SQLAlchemy documentation on engine configuration.
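As an illustration of how such an RFC-1738 string breaks down, the following sketch splits one with Python's standard urllib.parse module. The username, password, host, and database name are placeholder values chosen for the example; they are not PyBEL defaults.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder credentials and host, for illustration only.
connection = 'mysql+pymysql://alice:secret@localhost/pybel?charset=utf8'

parts = urlsplit(connection)

# The scheme carries the database dialect and, after the plus sign, the driver.
dialect, _, driver = parts.scheme.partition('+')

print(dialect, driver)
print(parts.username, parts.hostname)
print(parts.path.lstrip('/'))   # database name
print(parse_qs(parts.query))    # extra options such as charset
```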

Parameters:
  • connection (str) – An RFC-1738 database connection string.
  • echo (bool) – Turn on echoing of SQL statements
  • autoflush (Optional[bool]) – Defaults to True if not specified in kwargs or configuration.
  • autocommit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.
  • expire_on_commit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.
  • scopefunc – Scoped function to pass to sqlalchemy.orm.scoped_session()

From the Flask-SQLAlchemy documentation:

An extra key 'scopefunc' can be set on the options dict to specify a custom scope function. If it’s not provided, Flask’s app context stack identity is used. This will ensure that sessions are created and removed with the request/response cycle, and should be fine in most cases.

create_all(checkfirst=True)

Create the PyBEL cache’s database and tables.

Parameters: checkfirst (bool) – Check if the database exists before trying to re-make it

drop_all(checkfirst=True)

Drop all data, tables, and databases for the PyBEL cache.

Parameters: checkfirst (bool) – Check if the database exists before trying to drop it
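The checkfirst flag can be pictured as a guard around the create or drop statement. The sketch below imitates that behaviour against an in-memory SQLite database using only the standard library. The helper names (table_exists, create_table, drop_table) are hypothetical, and this is an analogy rather than PyBEL's or SQLAlchemy's implementation.

```python
import sqlite3

connection = sqlite3.connect(':memory:')


def table_exists(name):
    """Return True if a table with this name is present in the database."""
    row = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def create_table(name, checkfirst=True):
    """Create a table; with checkfirst, skip the statement if it already exists."""
    if checkfirst and table_exists(name):
        return False
    connection.execute('CREATE TABLE "{}" (id INTEGER PRIMARY KEY)'.format(name))
    return True


def drop_table(name, checkfirst=True):
    """Drop a table; with checkfirst, skip the statement if it is absent."""
    if checkfirst and not table_exists(name):
        return False
    connection.execute('DROP TABLE "{}"'.format(name))
    return True
```

With checkfirst=True, repeated calls are harmless no-ops. With checkfirst=False, a second create_table call for the same name raises sqlite3.OperationalError in this sketch.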

The Manager collates multiple groups of functions for interacting with the database. For the sake of code clarity, they are separated across multiple classes that are documented below.

class pybel.manager.Manager(connection=None, engine=None, session=None, **kwargs)

Bases: pybel.manager.cache_manager._Manager

A manager for the PyBEL database.

Create a connection to the database and a persistent session using SQLAlchemy.

A custom default can be set as an environment variable with the name pybel.constants.PYBEL_CONNECTION, using an RFC-1738 string. For example, a MySQL string can be given with the following form:

mysql+pymysql://<username>:<password>@<host>/<dbname>?charset=utf8[&<options>]

A SQLite connection string can be given in the form:

sqlite:///~/Desktop/cache.db

Further options and examples can be found in the SQLAlchemy documentation on engine configuration.

Parameters:
  • connection (Optional[str]) – An RFC-1738 database connection string. If None, tries to load from the environment variable PYBEL_CONNECTION then from the config file ~/.config/pybel/config.json whose value for PYBEL_CONNECTION defaults to pybel.constants.DEFAULT_CACHE_LOCATION.
  • engine – Optional engine to use. Must be specified with a session and no connection.
  • session – Optional session to use. Must be specified with an engine and no connection.
  • echo (bool) – Turn on echoing of SQL statements
  • autoflush (Optional[bool]) – Defaults to True if not specified in kwargs or configuration.
  • autocommit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.
  • expire_on_commit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.
  • scopefunc – Scoped function to pass to sqlalchemy.orm.scoped_session()

From the Flask-SQLAlchemy documentation:

An extra key 'scopefunc' can be set on the options dict to specify a custom scope function. If it’s not provided, Flask’s app context stack identity is used. This will ensure that sessions are created and removed with the request/response cycle, and should be fine in most cases.
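The lookup order described for the connection parameter can be sketched roughly as follows. This is an illustration, not PyBEL's actual code: the resolve_connection helper is hypothetical, and FALLBACK_CONNECTION is a placeholder standing in for pybel.constants.DEFAULT_CACHE_LOCATION, whose value is not given here.

```python
import json
import os

CONFIG_PATH = os.path.expanduser('~/.config/pybel/config.json')

# Placeholder: stands in for pybel.constants.DEFAULT_CACHE_LOCATION,
# whose actual value is not stated in this documentation.
FALLBACK_CONNECTION = 'sqlite:///pybel_cache.db'


def resolve_connection(connection=None, environ=None, config_path=CONFIG_PATH):
    """Pick a connection string: explicit argument, then environment, then config file."""
    if connection is not None:
        return connection

    environ = os.environ if environ is None else environ
    if 'PYBEL_CONNECTION' in environ:
        return environ['PYBEL_CONNECTION']

    if os.path.exists(config_path):
        with open(config_path) as file:
            config = json.load(file)
        return config.get('PYBEL_CONNECTION', FALLBACK_CONNECTION)

    return FALLBACK_CONNECTION
```

Passing environ and config_path as arguments is a design choice of the sketch that makes the lookup easy to test without touching the real environment.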

Allowed Usages:

Instantiation with connection string as positional argument

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(my_connection)

Instantiation with connection string as positional argument with keyword arguments

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(my_connection, echo=True)

Instantiation with connection string as keyword argument

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(connection=my_connection)

Instantiation with connection string as keyword argument with keyword arguments

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(connection=my_connection, echo=True)

Instantiation with user-supplied engine and session objects as keyword arguments

>>> my_engine, my_session = ...  # magical creation! See SQLAlchemy documentation
>>> manager = Manager(engine=my_engine, session=my_session)
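One plausible way to create the engine and session for the last usage is shown below, using SQLAlchemy's create_engine, sessionmaker, and scoped_session. The in-memory SQLite URL and the session options are assumptions chosen to mirror the defaults listed in the parameters above, so adapt them to your setup. The sketch requires SQLAlchemy to be installed.

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# An in-memory SQLite database keeps the sketch free of side effects.
my_engine = create_engine('sqlite://', echo=False)

# These options mirror the documented defaults (autoflush on,
# expire_on_commit off); they are assumptions, not requirements.
session_factory = sessionmaker(
    bind=my_engine,
    autoflush=True,
    expire_on_commit=False,
)

# scoped_session hands out one session per scope (per thread by default).
my_session = scoped_session(session_factory)

# The pair can then be handed to the manager:
# manager = Manager(engine=my_engine, session=my_session)
```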

Manager Components

class pybel.manager.NetworkManager(use_namespace_cache=False, *args, **kwargs)

Groups functions for inserting and querying networks in the database’s network store.

Parameters: use_namespace_cache – Should namespaces be cached in-memory?

count_networks()

Counts the number of networks in the cache

Return type: int

list_networks()

Lists all networks in the cache

Return type: list[Network]

list_recent_networks()

Lists the most recently created version of each network (by name)

Return type: list[Network]
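The idea of keeping only the newest version per name can be illustrated with plain Python. The records below are hypothetical dictionaries standing in for Network rows, and the created field name is an assumption made for the example.

```python
from datetime import datetime

# Hypothetical stand-ins for rows in the network store.
networks = [
    {'name': 'alzheimers', 'version': '1.0.0', 'created': datetime(2017, 1, 5)},
    {'name': 'alzheimers', 'version': '1.1.0', 'created': datetime(2017, 6, 2)},
    {'name': 'parkinsons', 'version': '0.9.0', 'created': datetime(2017, 3, 14)},
]


def most_recent_by_name(records):
    """Keep, for each name, the record with the latest creation time."""
    latest = {}
    for record in records:
        current = latest.get(record['name'])
        if current is None or record['created'] > current['created']:
            latest[record['name']] = record
    return list(latest.values())


recent = most_recent_by_name(networks)
```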
has_name_version(name, version)

Checks if the name/version combination is already in the database

Parameters:
  • name (str) – The network name
  • version (str) – The network version
Return type: bool

static iterate_singleton_edges_from_network(network)

Gets all edges that only belong to the given network

Return type: iter[Edge]

drop_network(network)

Drops a network, while also cleaning up any edges that are no longer part of any network.
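The clean-up of orphaned edges can be pictured with a small in-memory model. The mapping from edge identifiers to network names below is hypothetical, as are the helper functions; the real manager works on database rows.

```python
# Hypothetical mapping from edge identifier to the networks that contain it.
edge_networks = {
    'e1': {'net_a'},
    'e2': {'net_a', 'net_b'},
    'e3': {'net_b'},
}


def singleton_edges(network, edge_to_networks):
    """Yield edges that belong to the given network and to no other."""
    for edge, owners in edge_to_networks.items():
        if owners == {network}:
            yield edge


def drop_network(network, edge_to_networks):
    """Remove a network, deleting edges that would be left without any network."""
    # Materialize the generator first so the dict is not mutated while iterating.
    for edge in list(singleton_edges(network, edge_to_networks)):
        del edge_to_networks[edge]
    # Shared edges survive but no longer reference the dropped network.
    for owners in edge_to_networks.values():
        owners.discard(network)
```

In this model, dropping net_a deletes e1 (which only net_a used) and keeps e2, which is still part of net_b.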

drop_network_by_id(network_id)

Drops a network by its database identifier

Parameters: network_id (int) – The network’s database identifier

drop_networks()

Drops all networks

get_network_versions(name)

Returns all of the versions of a network with the given name

Parameters: name (str) – The name of the network to query
Return type: set[str]

get_network_by_name_version(name, version)

Loads the network with the given name and version

Parameters:
  • name (str) – The name of the network.
  • version (str) – The version string of the network.
Return type: Optional[Network]

get_graph_by_name_version(name, version)

Loads the network with the given name and version as a BEL graph

Parameters:
  • name (str) – The name of the network.
  • version (str) – The version string of the network.
Return type: Optional[BELGraph]

get_networks_by_name(name)

Gets all networks with the given name. Useful for getting all versions of a given network.

Parameters: name (str) – The name of the network
Return type: list[Network]

get_most_recent_network_by_name(name)

Gets the most recently created network with the given name.

Parameters: name (str) – The name of the network
Return type: Optional[Network]

get_graph_by_most_recent(name)

Gets the most recently created network with the given name as a pybel.BELGraph.

Parameters: name (str) – The name of the network
Return type: Optional[BELGraph]

get_network_by_id(network_id)

Gets a network from the database by its identifier.

Parameters: network_id (int) – The network’s database identifier
Return type: Network

get_graph_by_id(network_id)

Gets a network from the database by its identifier and converts it to a BEL graph

Parameters: network_id (int) – The network’s database identifier
Return type: BELGraph

get_networks_by_ids(network_ids)

Gets a list of networks with the given identifiers. Note: order is not necessarily preserved.

Parameters: network_ids (iter[int]) – The identifiers of networks in the database
Return type: list[Network]
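Because order is not guaranteed, callers that need the results in the order they asked for can re-sort them afterwards. A minimal sketch, assuming each returned object exposes its database identifier as an id attribute (the namedtuple below is a hypothetical stand-in for Network, and in_requested_order is not part of PyBEL):

```python
from collections import namedtuple

# Hypothetical stand-in for the Network model.
Network = namedtuple('Network', ['id', 'name'])


def in_requested_order(network_ids, networks):
    """Reorder results to follow the requested identifiers, skipping missing ones."""
    by_id = {network.id: network for network in networks}
    return [by_id[i] for i in network_ids if i in by_id]


# Results as they might come back from the database, in arbitrary order.
unordered = [Network(3, 'c'), Network(1, 'a'), Network(2, 'b')]
ordered = in_requested_order([2, 3, 1], unordered)
```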
get_graphs_by_ids(network_ids)

Gets a list of networks with the given identifiers and converts to BEL graphs. Note: order is not necessarily preserved.

Parameters: network_ids (iter[int]) – The identifiers of networks in the database
Return type: list[BELGraph]

get_graph_by_ids(network_ids)

Gets a combined BEL graph from a list of network identifiers

Parameters: network_ids (list[int]) – A list of network identifiers
Return type: BELGraph

class pybel.manager.QueryManager(engine, session)

Groups queries over the edge store

Instantiate a manager from an engine and session.

count_nodes()

Counts the number of nodes in the cache

Return type: int

get_node_tuple_by_hash(node_hash)

Looks up a node by the hash and returns the corresponding PyBEL node tuple

Parameters: node_hash (str) – The hash of a PyBEL node tuple from pybel.utils.hash_node()
Return type: Optional[tuple]

get_node_by_tuple(node)

Looks up a node by the PyBEL node tuple

Parameters: node (tuple) – A PyBEL node tuple
Return type: Optional[Node]

query_nodes(bel=None, type=None, namespace=None, name=None)

Builds and runs a query over all nodes in the PyBEL cache.

Parameters:
  • bel (str) – BEL term that describes the biological entity. e.g. p(HGNC:APP)
  • type (str) – Type of the biological entity. e.g. Protein
  • namespace (str) – Namespace keyword that is used in BEL. e.g. HGNC
  • name (str) – Name of the biological entity. e.g. APP
Return type: list[Node]
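Since every filter defaults to None, a natural reading is that only the supplied filters constrain the query. The sketch below builds such a query against an in-memory SQLite table using the standard library. The node table, its rows, and the rule that supplied filters are combined with AND are assumptions for illustration, not PyBEL's schema or confirmed behaviour.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE node (bel TEXT, type TEXT, namespace TEXT, name TEXT)')
connection.executemany(
    'INSERT INTO node VALUES (?, ?, ?, ?)',
    [
        ('p(HGNC:APP)', 'Protein', 'HGNC', 'APP'),
        ('p(HGNC:MAPT)', 'Protein', 'HGNC', 'MAPT'),
        ('a(CHEBI:water)', 'Abundance', 'CHEBI', 'water'),
    ],
)


def query_nodes(bel=None, type=None, namespace=None, name=None):
    """Return BEL terms of nodes matching every filter that is not None."""
    filters = {'bel': bel, 'type': type, 'namespace': namespace, 'name': name}
    active = {column: value for column, value in filters.items() if value is not None}

    sql = 'SELECT bel FROM node'
    if active:
        # Column names come from the fixed keys above; values are bound as parameters.
        sql += ' WHERE ' + ' AND '.join('{} = ?'.format(column) for column in active)
    sql += ' ORDER BY bel'

    return [row[0] for row in connection.execute(sql, list(active.values()))]
```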

count_edges()

Counts the number of edges in the cache

Return type: int

get_edges_with_citation(citation)

Gets the edges with the given citation

Parameters: citation (Citation)
Return type: iter[Edge]

get_edges_with_citations(citations)

Gets the edges with the given citations

Parameters: citations (iter[Citation])
Return type: list[Edge]

search_edges_with_evidence(evidence)

Searches edges with the given evidence

Parameters: evidence (str) – A search string to match against evidence text. The percent symbol (%) can be used as a wildcard.
Return type: list[Edge]
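The percent wildcard suggests SQL LIKE semantics. Assuming that is how the search is carried out, the behaviour can be previewed with the standard library's sqlite3 module. The evidence table and its rows are invented for the example, and search_evidence is a hypothetical helper.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE evidence (id INTEGER PRIMARY KEY, text TEXT)')
connection.executemany(
    'INSERT INTO evidence (text) VALUES (?)',
    [
        ('APP cleavage produces amyloid beta.',),
        ('Tau phosphorylation increases with age.',),
        ('Amyloid beta aggregates into plaques.',),
    ],
)


def search_evidence(pattern):
    """Return evidence texts matching a LIKE pattern, where % matches any run of characters."""
    rows = connection.execute(
        'SELECT text FROM evidence WHERE text LIKE ? ORDER BY id',
        (pattern,),
    )
    return [row[0] for row in rows]


# SQLite's LIKE ignores ASCII case by default; other backends may differ.
matches = search_evidence('%amyloid beta%')
```

Without a wildcard the pattern must match the whole text, so search_evidence('amyloid') finds nothing in this table.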
search_edges_with_bel(bel)

Searches edges with given BEL

Parameters: bel (str) – A BEL string to use as a search
Return type: list[Edge]

get_edges_with_annotation(annotation, value)

Parameters:
  • annotation (str)
  • value (str)
Return type: list[Edge]

query_edges(bel=None, source_function=None, source=None, target_function=None, target=None, relation=None)

Builds and runs a query over all edges in the PyBEL cache.

Parameters:
  • bel (str) – BEL statement that represents the desired edge.
  • source_function (str) – Filter source nodes with the given BEL function
  • source (str or Node) – BEL term of source node e.g. p(HGNC:APP) or Node object.
  • target_function (str) – Filter target nodes with the given BEL function
  • target (str or Node) – BEL term of target node e.g. p(HGNC:APP) or Node object.
  • relation (str) – The relation that should be present between source and target node.
Return type: list[Edge]

query_citations(type=None, reference=None, name=None, author=None, date=None, evidence_text=None)

Builds and runs a query over all citations in the PyBEL cache.

Parameters:
  • type (str) – Type of the citation. e.g. PubMed
  • reference (str) – The identifier used for the citation. e.g. PubMed_ID
  • name (str) – Title of the citation.
  • author (str or list[str]) – The name, or a list of names, of authors who participated in the citation.
  • date (str or datetime.date) – Publishing date of the citation.
  • evidence_text (str)
Return type: list[Citation]

query_edges_by_pubmed_identifiers(pubmed_identifiers)

Gets all edges annotated to the given documents

Parameters: pubmed_identifiers (list[str]) – A list of PubMed document identifiers
Return type: list[Edge]

query_induction(nodes)

Gets all edges between any of the given nodes

Parameters: nodes (list[Node]) – A list of nodes (length > 2)
Return type: list[Edge]

query_neighbors(nodes)

Gets all edges incident to any of the given nodes

Parameters: nodes (list[Node]) – A list of nodes
Return type: list[Edge]
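The difference between query_induction and query_neighbors can be shown on a toy edge list. The (source, target) tuples and the two helper functions below are hypothetical stand-ins, not PyBEL objects or its implementation.

```python
# Hypothetical edges as (source, target) pairs.
edges = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]


def induction(nodes, edge_list):
    """Edges whose source and target are both in the node set."""
    nodes = set(nodes)
    return [(u, v) for u, v in edge_list if u in nodes and v in nodes]


def neighbors(nodes, edge_list):
    """Edges with at least one endpoint in the node set."""
    nodes = set(nodes)
    return [(u, v) for u, v in edge_list if u in nodes or v in nodes]


induced = induction(['A', 'B', 'C'], edges)
incident = neighbors(['C'], edges)
```

Induction only keeps edges that stay inside the node set, so it never returns more edges than the neighbor query for the same nodes.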