Manager

Manager API

The BaseManager takes care of building and maintaining the connection to the database via SQLAlchemy.

class pybel.manager.BaseManager(engine, session)[source]

A wrapper around a SQLAlchemy engine and session.

Instantiate a manager from an engine and session.

base

alias of sqlalchemy.ext.declarative.api.Base

create_all(checkfirst=True)[source]

Create the PyBEL cache’s database and tables.

Parameters

checkfirst (bool) – Check if the database exists before trying to re-make it

Return type

None

drop_all(checkfirst=True)[source]

Drop all data, tables, and databases for the PyBEL cache.

Parameters

checkfirst (bool) – Check if the database exists before trying to drop it

Return type

None

bind()[source]

Bind the metadata to the engine and session.

Return type

None

The Manager collates multiple groups of functions for interacting with the database. For the sake of code clarity, they are separated across multiple classes that are documented below.

class pybel.manager.Manager(connection=None, engine=None, session=None, **kwargs)[source]

Bases: pybel.manager.cache_manager._Manager

A manager for the PyBEL database.

Create a connection to the database and a persistent session using SQLAlchemy.

A custom default can be set as an environment variable with the name pybel.constants.PYBEL_CONNECTION, using an RFC-1738 string. For example, a MySQL string can be given with the following form:

mysql+pymysql://<username>:<password>@<host>/<dbname>?charset=utf8[&<options>]

A SQLite connection string can be given in the form:

sqlite:///~/Desktop/cache.db

Further options and examples can be found in the SQLAlchemy documentation on engine configuration.
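To make the pieces of such a connection string concrete, the following sketch splits the MySQL form shown above into its components using only the standard library. The user name, password, host, and database name are placeholders invented for this example, and the code does not reflect how PyBEL or SQLAlchemy parse the string internally.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder credentials, host, and database name (assumptions for illustration).
connection = "mysql+pymysql://alice:secret@localhost/pybel?charset=utf8"

parts = urlsplit(connection)
# The scheme has the form <dialect>+<driver>.
dialect, _, driver = parts.scheme.partition("+")

summary = {
    "dialect": dialect,
    "driver": driver,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "database": parts.path.lstrip("/"),
    "options": parse_qs(parts.query),
}
print(summary)
```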

Parameters
  • connection (Optional[str]) – An RFC-1738 database connection string. If None, tries to load from the environment variable PYBEL_CONNECTION then from the config file ~/.config/pybel/config.json whose value for PYBEL_CONNECTION defaults to pybel.constants.DEFAULT_CACHE_LOCATION.

  • engine – Optional engine to use. Must be specified with a session and no connection.

  • session – Optional session to use. Must be specified with an engine and no connection.

  • echo (bool) – Turn on echoing of SQL statements

  • autoflush (Optional[bool]) – Defaults to True if not specified in kwargs or configuration.

  • autocommit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.

  • expire_on_commit (Optional[bool]) – Defaults to False if not specified in kwargs or configuration.

  • scopefunc – Scope function to pass to sqlalchemy.orm.scoped_session()

From the Flask-SQLAlchemy documentation:

An extra key 'scopefunc' can be set on the options dict to specify a custom scope function. If it’s not provided, Flask’s app context stack identity is used. This will ensure that sessions are created and removed with the request/response cycle, and should be fine in most cases.
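The lookup order described for the connection parameter (explicit argument, then environment variable, then config file, then default) can be sketched as follows. This is a minimal illustration, not PyBEL's actual implementation: the function name resolve_connection, its environ and config_path parameters, and the FALLBACK_CONNECTION value (a stand-in for pybel.constants.DEFAULT_CACHE_LOCATION) are all assumptions made for this example.

```python
import json
import os
from pathlib import Path

ENV_KEY = "PYBEL_CONNECTION"
# Hypothetical stand-in for pybel.constants.DEFAULT_CACHE_LOCATION.
FALLBACK_CONNECTION = "sqlite:///pybel_cache.db"


def resolve_connection(connection=None, environ=None, config_path=None):
    """Return the first connection string found, following the documented order."""
    if connection is not None:
        return connection  # 1. explicit argument

    environ = os.environ if environ is None else environ
    if environ.get(ENV_KEY):
        return environ[ENV_KEY]  # 2. environment variable

    if config_path is not None and Path(config_path).is_file():
        with open(config_path) as file:
            config = json.load(file)
        if config.get(ENV_KEY):
            return config[ENV_KEY]  # 3. JSON config file

    return FALLBACK_CONNECTION  # 4. built-in default
```

Passing environ as a plain dictionary keeps the sketch easy to try without touching the real environment.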

Allowed Usages:

Instantiation with connection string as positional argument

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(my_connection)

Instantiation with connection string as positional argument with keyword arguments

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(my_connection, echo=True)

Instantiation with connection string as keyword argument

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(connection=my_connection)

Instantiation with connection string as keyword argument with keyword arguments

>>> my_connection = 'sqlite:///~/Desktop/cache.db'
>>> manager = Manager(connection=my_connection, echo=True)

Instantiation with user-supplied engine and session objects as keyword arguments

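>>> from sqlalchemy import create_engine, orm
>>> my_engine = create_engine('sqlite://')  # in-memory database; see the SQLAlchemy documentation
>>> my_session = orm.scoped_session(orm.sessionmaker(bind=my_engine))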
>>> manager = Manager(engine=my_engine, session=my_session)

Manager Components

class pybel.manager.NetworkManager(engine, session)[source]

Groups functions for inserting and querying networks in the database’s network store.

Instantiate a manager from an engine and session.

count_networks()[source]

Count the networks in the database.

Return type

int

list_networks()[source]

List all networks in the database.

Return type

List[Network]

list_recent_networks()[source]

List the most recently created version of each network (by name).

Return type

List[Network]
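The idea of keeping only the most recently created version of each network name can be sketched in plain Python. The NetworkRecord class and most_recent_by_name function below are hypothetical stand-ins made up for illustration. They are not part of PyBEL, which runs this selection as a database query.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class NetworkRecord:
    """Hypothetical stand-in for a row in the network store."""

    name: str
    version: str
    created: datetime


def most_recent_by_name(records):
    """Keep, for each name, the record with the latest creation time."""
    latest = {}
    for record in records:
        current = latest.get(record.name)
        if current is None or record.created > current.created:
            latest[record.name] = record
    return list(latest.values())


records = [
    NetworkRecord("example", "1.0.0", datetime(2018, 1, 1)),
    NetworkRecord("example", "1.1.0", datetime(2018, 6, 1)),
    NetworkRecord("other", "0.1.0", datetime(2017, 3, 15)),
]
recent = most_recent_by_name(records)
```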

has_name_version(name, version)[source]

Check if there exists a network with the name/version combination in the database.

Return type

bool

drop_networks()[source]

Drop all networks.

Return type

None

drop_network_by_id(network_id)[source]

Drop a network by its database identifier.

Return type

None

drop_network(network)[source]

Drop a network, while also cleaning up any edges that are no longer part of any network.

Return type

None

query_singleton_edges_from_network(network)[source]

Return a query selecting all edge ids that only belong to the given network.

Return type

Query
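An edge is a singleton with respect to a network when no other network contains it, which is what makes it safe to delete when that network is dropped. The sketch below shows the set logic on plain dictionaries. The singleton_edges function and the network_edges mapping are assumptions for illustration, whereas PyBEL expresses this as a SQLAlchemy query.

```python
def singleton_edges(network_edges, network_id):
    """Return the edge ids that belong to ``network_id`` and to no other network."""
    shared = set()
    for other_id, edge_ids in network_edges.items():
        if other_id != network_id:
            shared |= edge_ids
    return network_edges[network_id] - shared


# Hypothetical mapping of network identifier to the edge identifiers it contains.
network_edges = {
    1: {10, 11, 12},
    2: {12, 13},
}
only_in_first = singleton_edges(network_edges, 1)
```

Here edge 12 is shared by both networks, so it would survive dropping either one.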

get_network_versions(name)[source]

Return all of the versions of a network with the given name.

Return type

Set[str]

get_network_by_name_version(name, version)[source]

Load the network with the given name and version if it exists.

Return type

Optional[Network]

get_graph_by_name_version(name, version)[source]

Load the BEL graph with the given name and version if it exists.

Return type

Optional[BELGraph]

get_networks_by_name(name)[source]

Get all networks with the given name. Useful for getting all versions of a given network.

Return type

List[Network]

get_most_recent_network_by_name(name)[source]

Get the most recently created network with the given name.

Return type

Optional[Network]

get_graph_by_most_recent(name)[source]

Get the most recently created network with the given name as a pybel.BELGraph.

Return type

Optional[BELGraph]

get_network_by_id(network_id)[source]

Get a network from the database by its identifier.

Return type

Network

get_graph_by_id(network_id)[source]

Get a network from the database by its identifier and convert it to a BEL graph.

Return type

BELGraph

get_networks_by_ids(network_ids)[source]

Get a list of networks with the given identifiers.

Note: order is not necessarily preserved.

Return type

List[Network]

get_graphs_by_ids(network_ids)[source]

Get a list of networks with the given identifiers and convert them to BEL graphs.

Return type

List[BELGraph]

get_graph_by_ids(network_ids)[source]

Get a combined BEL graph from a list of network identifiers.

Return type

BELGraph

class pybel.manager.QueryManager(engine, session)[source]

An extension to the Manager to make queries over the database.

Instantiate a manager from an engine and session.

count_nodes()[source]

Count the number of nodes in the database.

Return type

int

query_nodes(bel=None, type=None, namespace=None, name=None)[source]

Query nodes in the database.

Parameters
  • bel (Optional[str]) – BEL term that describes the biological entity. e.g. p(HGNC:APP)

  • type (Optional[str]) – Type of the biological entity. e.g. Protein

  • namespace (Optional[str]) – Namespace keyword that is used in BEL. e.g. HGNC

  • name (Optional[str]) – Name of the biological entity. e.g. APP

Return type

List[Node]

count_edges()[source]

Count the number of edges in the database.

Return type

int

get_edges_with_citation(citation)[source]

Get the edges with the given citation.

Return type

List[Edge]

get_edges_with_citations(citations)[source]

Get edges with one of the given citations.

Return type

List[Edge]

search_edges_with_evidence(evidence)[source]

Search edges with the given evidence.

Parameters

evidence (str) – A string to search for in the evidence text. The percent symbol (%) can be used as a wildcard.

Return type

List[Edge]
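The percent wildcard suggests SQL LIKE matching. Assuming that is the case, the sketch below shows how such patterns behave, using the standard library's sqlite3 module and a made-up evidence table. The table layout, the sample sentences, and the search_evidence helper are illustrative only and do not reflect PyBEL's schema.

```python
import sqlite3

# In-memory database with a hypothetical, simplified evidence table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE evidence (id INTEGER PRIMARY KEY, text TEXT)")
connection.executemany(
    "INSERT INTO evidence (text) VALUES (?)",
    [
        ("APP cleavage increases amyloid beta levels",),
        ("Tau phosphorylation is increased in neurons",),
        ("Amyloid beta aggregates form plaques",),
    ],
)


def search_evidence(pattern):
    """Return evidence texts matching a LIKE pattern, where % matches any run of characters."""
    rows = connection.execute(
        "SELECT text FROM evidence WHERE text LIKE ? ORDER BY id", (pattern,)
    )
    return [text for (text,) in rows]


ends_with_plaques = search_evidence("%plaques")
no_wildcard = search_evidence("plaques")
```

Without a wildcard the pattern has to match the whole text, so searching for a bare word usually returns nothing.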

search_edges_with_bel(bel)[source]

Search edges with given BEL.

Parameters

bel (str) – A BEL string to use as the search query

Return type

List[Edge]

get_edges_with_annotation(annotation, value)[source]

Search edges with the given annotation/value pair.

Return type

List[Edge]

query_edges(bel=None, source_function=None, source=None, target_function=None, target=None, relation=None)[source]

Return a query over the edges in the database.

The query is not executed until it is consumed, so you should usually call list() or .all() on the result to get the edges.

Parameters
  • bel (Optional[str]) – BEL statement that represents the desired edge.

  • source_function (Optional[str]) – Filter source nodes with the given BEL function

  • source (Union[None, str, Node]) – BEL term of source node e.g. p(HGNC:APP) or Node object.

  • target_function (Optional[str]) – Filter target nodes with the given BEL function

  • target (Union[None, str, Node]) – BEL term of target node e.g. p(HGNC:APP) or Node object.

  • relation (Optional[str]) – The relation that should be present between source and target node.

query_citations(db=None, db_id=None, name=None, author=None, date=None, evidence_text=None)[source]

Query citations in the database.

Parameters
  • db (Optional[str]) – Type of the citation. e.g. PubMed

  • db_id (Optional[str]) – The identifier used for the citation. e.g. PubMed_ID

  • name (Optional[str]) – Title of the citation.

  • author (Union[None, str, List[str]]) – The name, or a list of names, of authors who participated in the citation.

  • date (Union[None, str, date]) – Publishing date of the citation.

  • evidence_text (Optional[str]) –

Return type

List[Citation]

query_edges_by_pubmed_identifiers(pubmed_identifiers)[source]

Get all edges annotated to the documents identified by the given PubMed identifiers.

Return type

List[Edge]

query_induction(nodes)[source]

Get all edges between any of the given nodes (at least two nodes must be given).

Return type

List[Edge]

query_neighbors(nodes)[source]

Get all edges incident to any of the given nodes.

Return type

List[Edge]
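The difference between query_induction and query_neighbors can be illustrated on plain tuples. The sketch assumes that "between" means both endpoints are among the given nodes and that "incident" means at least one endpoint is. The induced_edges and neighbor_edges functions are hypothetical helpers, not PyBEL code.

```python
def induced_edges(edges, nodes):
    """Edges whose source and target are both in ``nodes``."""
    nodes = set(nodes)
    return [(source, target) for source, target in edges if source in nodes and target in nodes]


def neighbor_edges(edges, nodes):
    """Edges with at least one endpoint in ``nodes``."""
    nodes = set(nodes)
    return [(source, target) for source, target in edges if source in nodes or target in nodes]


# A small chain A -> B -> C -> D -> E, with node names made up for the example.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
selected = ["A", "B", "C"]

induced = induced_edges(edges, selected)
neighbors = neighbor_edges(edges, selected)
```

The induced result stays inside the selected nodes, while the neighbor result also picks up the edge from C to D that leaves the selection.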