Selection
This module contains functions to help select data from networks.
-
pybel_tools.selection.group_nodes_by_annotation(graph, annotation='Subgraph')
Groups the nodes occurring in edges by the given annotation.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- annotation (str) – An annotation to use to group edges
Returns: dict of sets of BELGraph nodes
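The grouping described above can be sketched in plain Python. This is not the library's implementation: the edge list of (source, target, data) triples and the 'annotations' key are stand-ins assumed here for a BEL graph, and group_nodes_by_annotation_sketch is a hypothetical name.

```python
from collections import defaultdict


def group_nodes_by_annotation_sketch(edges, annotation='Subgraph'):
    """Group the endpoints of annotated edges by their annotation value.

    `edges` is an assumed stand-in for a BEL graph: an iterable of
    (source, target, data) triples whose data dict may hold an
    'annotations' mapping.
    """
    result = defaultdict(set)
    for source, target, data in edges:
        annotations = data.get('annotations') or {}
        if annotation not in annotations:
            continue  # edges without the annotation contribute no nodes
        value = annotations[annotation]
        result[value].add(source)
        result[value].add(target)
    return dict(result)


edges = [
    ('a', 'b', {'annotations': {'Subgraph': 'Apoptosis'}}),
    ('b', 'c', {'annotations': {'Subgraph': 'Inflammation'}}),
    ('c', 'd', {}),
]
groups = group_nodes_by_annotation_sketch(edges)
```

A node that sits on edges with different annotation values, like 'b' here, appears in more than one group.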
-
pybel_tools.selection.average_node_annotation(graph, key, annotation='Subgraph', aggregator=None)
Groups the graph into subgraphs and assigns each subgraph a score based on the average of all node values for the given node key.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- key (str) – The key in the node data dictionary representing the experimental data
- annotation (str) – A BEL annotation to use to group nodes
- aggregator (lambda) – A function from list of values -> aggregate value. Defaults to taking the average of a list of floats.
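A minimal sketch of the scoring step, assuming the nodes have already been grouped by annotation value and that node data is a plain dict of dicts. The names score_groups_sketch and average are hypothetical, and skipping nodes that lack the key is a choice made for this sketch, not documented library behaviour.

```python
def average(values):
    """Default aggregator for this sketch: arithmetic mean of the values."""
    values = list(values)
    return sum(values) / len(values)


def score_groups_sketch(node_groups, node_data, key, aggregator=None):
    """Score each group by aggregating the values stored under `key`.

    Nodes without `key` are skipped, and groups with no values get no score
    (both are assumptions of this sketch).
    """
    aggregator = aggregator or average
    scores = {}
    for value, nodes in node_groups.items():
        observed = [node_data[n][key] for n in nodes if key in node_data.get(n, {})]
        if observed:
            scores[value] = aggregator(observed)
    return scores


node_groups = {'Apoptosis': {'a', 'b'}, 'Inflammation': {'c'}}
node_data = {'a': {'weight': 1.0}, 'b': {'weight': 3.0}, 'c': {}}

mean_scores = score_groups_sketch(node_groups, node_data, 'weight')
max_scores = score_groups_sketch(node_groups, node_data, 'weight', aggregator=max)
```

Passing a different aggregator, such as max, changes only how each group's list of values is reduced.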
-
pybel_tools.selection.group_nodes_by_annotation_filtered(graph, node_filters=None, annotation='Subgraph')
Groups the nodes occurring in edges by the given annotation, with a node filter applied.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- node_filters (types.FunctionType or iter[types.FunctionType]) – A predicate or list of predicates (graph, node) -> bool
- annotation – The annotation to use for grouping
Returns: A dictionary of {annotation value: set of nodes}
-
pybel_tools.selection.get_subgraph_by_induction(graph, nodes)
Induce a sub-graph over the given nodes or return None if none of the nodes are in the given graph.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nodes (iter[tuple]) – A list of BEL nodes in the graph
Return type: Optional[pybel.BELGraph]
-
pybel_tools.selection.get_subgraph_by_node_filter(graph, node_filters)
Induces a graph on the nodes that pass all filters.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- node_filters (types.FunctionType or iter[types.FunctionType]) – A node filter or list/tuple of node filters
Returns: A subgraph induced over the nodes passing the given filters
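Node filters are predicates of the form (graph, node) -> bool, and a node is kept only if every filter accepts it. The sketch below illustrates that contract with a plain adjacency dict standing in for a BEL graph. The helper names and both example predicates are hypothetical.

```python
def passes_all_filters_sketch(graph, node, node_filters):
    """Return True if `node` passes a single predicate or every predicate in a list."""
    if callable(node_filters):
        node_filters = [node_filters]
    return all(node_filter(graph, node) for node_filter in node_filters)


def nodes_passing_filters_sketch(graph, node_filters):
    """Collect the nodes that pass all filters; `graph` is an adjacency dict here."""
    return {node for node in graph if passes_all_filters_sketch(graph, node, node_filters)}


def is_protein(graph, node):
    """Hypothetical predicate: the first tuple element is assumed to be the function."""
    return node[0] == 'Protein'


def has_successor(graph, node):
    """Hypothetical predicate: the node has at least one outgoing edge."""
    return len(graph[node]) > 0


# Toy graph keyed by (function, namespace, name) tuples
graph = {
    ('Protein', 'HGNC', 'CD33'): [('Protein', 'HGNC', 'PTPN6')],
    ('Protein', 'HGNC', 'PTPN6'): [],
    ('Gene', 'HGNC', 'CD33'): [],
}

selected = nodes_passing_filters_sketch(graph, [is_protein, has_successor])
proteins = nodes_passing_filters_sketch(graph, is_protein)
```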
-
pybel_tools.selection.get_subgraph_by_neighborhood(graph, nodes)
Get a BEL graph around the neighborhoods of the given nodes. Returns None if no nodes are in the graph.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nodes (iter[tuple]) – An iterable of BEL nodes
Returns: A BEL graph induced around the neighborhoods of the given nodes
Return type: Optional[pybel.BELGraph]
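As a rough illustration of neighborhood selection, the sketch below keeps every edge that touches one of the seed nodes and returns None when no seed is present. It works on a plain list of (source, target) pairs, which is an assumption of this example and not the library's data model. neighborhood_edges_sketch is a hypothetical name.

```python
def neighborhood_edges_sketch(edges, nodes):
    """Return the edges incident to any seed node, or None if no seed is in the graph."""
    seeds = set(nodes)
    graph_nodes = {node for edge in edges for node in edge}
    if not seeds & graph_nodes:
        return None
    # Keep both incoming and outgoing edges of each seed node
    return [(u, v) for u, v in edges if u in seeds or v in seeds]


edges = [('a', 'b'), ('b', 'c'), ('c', 'd')]
around_b = neighborhood_edges_sketch(edges, ['b'])
missing = neighborhood_edges_sketch(edges, ['z'])
```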
-
pybel_tools.selection.get_subgraph_by_second_neighbors(graph, nodes, filter_pathologies=False)
Get a graph around the neighborhoods of the given nodes and expand to the neighborhood of those nodes. Returns None if none of the nodes are in the graph.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nodes (iter[tuple]) – An iterable of BEL nodes
- filter_pathologies (bool) – Should expansion take place around pathologies?
Returns: A BEL graph induced around the neighborhoods of the given nodes
Return type: Optional[pybel.BELGraph]
-
pybel_tools.selection.get_subgraph_by_all_shortest_paths(graph, nodes, weight=None, remove_pathologies=False)
Induce a subgraph over the nodes in the pairwise shortest paths between all of the nodes in the given list.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nodes (iter[tuple]) – A set of nodes over which to calculate shortest paths
- weight (str) – Edge data key corresponding to the edge weight. If None, performs unweighted search
- remove_pathologies (bool) – Should the pathology nodes be deleted before getting shortest paths?
Returns: A BEL graph induced over the nodes appearing in the shortest paths between the given nodes
Return type: Optional[pybel.BELGraph]
-
pybel_tools.selection.get_subgraph_by_annotation_value(graph, annotation, values)
Induce a sub-graph over all edges whose annotations match the given key and value.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- annotation (str) – The annotation to group by
- values (str or iter[str]) – The value(s) for the annotation
Returns: A subgraph of the original BEL graph
-
pybel_tools.selection.get_subgraph_by_annotations(graph, annotations, or_=None)
Induce a sub-graph given an annotations filter.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- annotations (dict[str,iter[str]]) – Annotation filters (match all with pybel.utils.subdict_matches())
- or_ (bool) – If True, any annotation should be present; if False, all annotations should be present in the edge. Defaults to True.
Returns: A subgraph of the original BEL graph
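The difference between the "any" and "all" modes can be sketched as follows. This example assumes each edge stores one string value per annotation key. The function name edge_matches_sketch and its match_any parameter are hypothetical, and the library's actual matching logic may differ.

```python
def edge_matches_sketch(edge_annotations, query, match_any=True):
    """Check an edge's annotations against a {key: allowed values} query.

    With match_any=True, one matching key is enough; with match_any=False,
    every key in the query must match.
    """
    checks = [
        edge_annotations.get(key) in set(values)
        for key, values in query.items()
    ]
    return any(checks) if match_any else all(checks)


edge_annotations = {'Subgraph': 'Apoptosis', 'Species': '9606'}
query = {'Subgraph': ['Apoptosis'], 'Species': ['10090']}

matches_any = edge_matches_sketch(edge_annotations, query, match_any=True)
matches_all = edge_matches_sketch(edge_annotations, query, match_any=False)
```

The edge matches on 'Subgraph' but not on 'Species', so it is accepted only in the "any" mode.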
-
pybel_tools.selection.get_subgraph_by_pubmed(graph, pubmed_identifiers)
Induce a sub-graph over the edges retrieved from the given PubMed identifier(s).
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- pubmed_identifiers (str or list[str]) – A PubMed identifier or list of PubMed identifiers
-
Induce a sub-graph over the edges retrieved from publications by the given author(s).
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- authors (str or list[str]) – An author or list of authors
-
pybel_tools.selection.get_subgraph_by_node_search(graph, query)
Gets a subgraph induced over all nodes matching the query string.
Thinly wraps search_node_names() and get_subgraph_by_induction().
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- query (str or iter[str]) – A query string or iterable of query strings for node names
Returns: A subgraph induced over the original BEL graph
-
pybel_tools.selection.get_causal_subgraph(graph)
Builds a new subgraph induced over all edges that are causal.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
Returns: A subgraph of the original BEL graph
Return type: pybel.BELGraph
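A minimal sketch of filtering for causal edges. It assumes the causal relations are the four BEL relations increases, directlyIncreases, decreases, and directlyDecreases, and that each edge stores its relation under a 'relation' key. Both are assumptions of this example, as are the name causal_edges_sketch and the edge-list representation.

```python
# Assumed set of causal BEL relations for this sketch
CAUSAL_RELATIONS = {
    'increases',
    'directlyIncreases',
    'decreases',
    'directlyDecreases',
}


def causal_edges_sketch(edges):
    """Keep only the (source, target, data) triples whose relation is causal."""
    return [
        (source, target, data)
        for source, target, data in edges
        if data.get('relation') in CAUSAL_RELATIONS
    ]


edges = [
    ('a', 'b', {'relation': 'increases'}),
    ('b', 'c', {'relation': 'association'}),
    ('c', 'd', {'relation': 'directlyDecreases'}),
]
causal = causal_edges_sketch(edges)
```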
-
pybel_tools.selection.get_subgraph(graph, seed_method=None, seed_data=None, expand_nodes=None, remove_nodes=None)
Run a pipeline query on graph with multiple sub-graph filters and expanders.
Order of Operations:
- Seeding by given function name and data
- Add nodes
- Remove nodes
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- seed_method (str) – The name of the get_subgraph_by_* function to use
- seed_data – The argument to pass to the get_subgraph function
- expand_nodes (list[tuple]) – Add the neighborhoods around all of these nodes
- remove_nodes (list[tuple]) – Remove these nodes and all of their in/out edges
Return type: Optional[pybel.BELGraph]
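The order of operations (seed, then expand, then remove) can be illustrated with a small sketch over a plain edge list. Unlike the documented function, which takes the name of a seeding function, this sketch takes a callable directly. That callable, the name run_pipeline_sketch, and the edge-list representation are all assumptions made for illustration.

```python
def run_pipeline_sketch(edges, seed_function=None, seed_data=None,
                        expand_nodes=None, remove_nodes=None):
    """Apply seeding, expansion, and removal in that order to a list of edges."""
    # 1. Seeding: start from the whole graph unless a seed function is given
    result = list(edges) if seed_function is None else seed_function(edges, seed_data)
    if result is None:
        return None

    # 2. Add nodes: pull in every edge touching an expansion node
    for node in expand_nodes or ():
        for edge in edges:
            if node in edge and edge not in result:
                result.append(edge)

    # 3. Remove nodes: drop the nodes together with their in/out edges
    removed = set(remove_nodes or ())
    return [(u, v) for u, v in result if u not in removed and v not in removed]


def seed_by_induction(edges, nodes):
    """Hypothetical seed function: keep edges whose endpoints are both in `nodes`."""
    return [(u, v) for u, v in edges if u in nodes and v in nodes]


edges = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'e')]
subgraph_edges = run_pipeline_sketch(
    edges,
    seed_function=seed_by_induction,
    seed_data={'a', 'b'},
    expand_nodes=['c'],
    remove_nodes=['d'],
)
```

Because removal runs last, the edge ('c', 'd') added during expansion is dropped again when 'd' is removed.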
-
pybel_tools.selection.get_multi_causal_upstream(graph, nbunch)
Get the union of all the 2-level deep causal upstream subgraphs from the nbunch.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nbunch (tuple or list[tuple]) – A BEL node or list of BEL nodes
Returns: A subgraph of the original BEL graph
-
pybel_tools.selection.get_multi_causal_downstream(graph, nbunch)
Get the union of all of the 2-level deep causal downstream subgraphs from the nbunch.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nbunch (tuple or list[tuple]) – A BEL node or list of BEL nodes
Returns: A subgraph of the original BEL graph
-
pybel_tools.selection.get_random_subgraph(graph, number_edges=None, number_seed_edges=None, seed=None, invert_degrees=None)
Generate a random subgraph based on weighted random walks from random seed edges.
Parameters:
- number_edges (Optional[int]) – Maximum number of edges. Defaults to pybel_tools.constants.SAMPLE_RANDOM_EDGE_COUNT (250).
- number_seed_edges (Optional[int]) – Number of nodes to start with (which likely results in different components in large graphs). Defaults to SAMPLE_RANDOM_EDGE_SEED_COUNT (5).
- seed (Optional[int]) – A seed for the random state
- invert_degrees (Optional[bool]) – Should the degrees be inverted? Defaults to true.
-
pybel_tools.selection.get_leaves_by_type(graph, func=None, prune_threshold=1)
Returns an iterable over all nodes in graph (in-place) with only a connection to one node. Useful for gene and RNA. Allows for optional filter by function type.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- func (str) – If set, filters by the node's function from pybel.constants like pybel.constants.GENE, pybel.constants.RNA, pybel.constants.PROTEIN, or pybel.constants.BIOPROCESS
- prune_threshold (int) – Removes nodes with less than or equal to this number of connections. Defaults to 1
Returns: An iterable over nodes with only a connection to one node
Return type: iter[tuple]
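A rough sketch of the leaf selection, assuming "connections" means distinct neighbours regardless of edge direction and that a node's function is the first element of its tuple. Neither assumption is confirmed by this page. The function strings other than 'Protein' are illustrative, and leaves_by_type_sketch is a hypothetical name.

```python
from collections import defaultdict


def leaves_by_type_sketch(edges, func=None, prune_threshold=1):
    """Yield nodes with at most `prune_threshold` distinct neighbours.

    If `func` is given, only nodes whose first tuple element equals it are yielded.
    """
    neighbors = defaultdict(set)
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    for node, adjacent in neighbors.items():
        if len(adjacent) <= prune_threshold and (func is None or node[0] == func):
            yield node


gene = ('Gene', 'HGNC', 'CD33')
rna = ('RNA', 'HGNC', 'CD33')
protein = ('Protein', 'HGNC', 'CD33')
process = ('BiologicalProcess', 'GOBP', 'apoptotic process')

edges = [(gene, rna), (rna, protein), (protein, process)]
all_leaves = list(leaves_by_type_sketch(edges))
gene_leaves = list(leaves_by_type_sketch(edges, func='Gene'))
```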
-
pybel_tools.selection.get_nodes_in_all_shortest_paths(graph, nodes, weight=None, remove_pathologies=False)
Get a set of nodes in all shortest paths between the given nodes.
Thinly wraps networkx.all_shortest_paths().
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- nodes (iter[tuple]) – The list of nodes to use to find all shortest paths
- weight (Optional[str]) – Edge data key corresponding to the edge weight. If None, uses unweighted search.
- remove_pathologies (bool) – Should pathology nodes be removed first?
Returns: A set of nodes appearing in the shortest paths between nodes in the BEL graph
Note: This can be trivially parallelized using networkx.single_source_shortest_path()
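To show what "nodes in all shortest paths" means, here is a dependency-free sketch of the unweighted case. It uses a breadth-first search over an adjacency dict in place of networkx.all_shortest_paths() and visits every ordered pair of distinct input nodes. The function names are hypothetical, and the pairing strategy is an assumption of this example.

```python
from collections import deque
from itertools import permutations


def all_shortest_paths_sketch(adjacency, source, target):
    """Yield every shortest directed path from source to target (unweighted)."""
    # BFS that records, for each node, all predecessors lying on a shortest path
    distance = {source: 0}
    predecessors = {source: []}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                predecessors[neighbor] = [current]
                queue.append(neighbor)
            elif distance[neighbor] == distance[current] + 1:
                predecessors[neighbor].append(current)
    if target not in distance:
        return
    # Walk backwards from the target along the recorded predecessors
    stack = [[target]]
    while stack:
        partial = stack.pop()
        if partial[-1] == source:
            yield partial[::-1]
            continue
        for previous in predecessors[partial[-1]]:
            stack.append(partial + [previous])


def nodes_in_all_shortest_paths_sketch(adjacency, nodes):
    """Collect every node on a shortest path between any ordered pair of `nodes`."""
    result = set()
    for source, target in permutations(nodes, 2):
        for path in all_shortest_paths_sketch(adjacency, source, target):
            result.update(path)
    return result


adjacency = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': ['e'], 'e': []}
paths = sorted(all_shortest_paths_sketch(adjacency, 'a', 'd'))
on_paths = nodes_in_all_shortest_paths_sketch(adjacency, ['a', 'd'])
```

Both 'b' and 'c' are included because the two routes from 'a' to 'd' are equally short.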
-
pybel_tools.selection.get_shortest_directed_path_between_subgraphs(graph, a, b)
Calculate the shortest path that occurs between two disconnected subgraphs A and B going through nodes in the source graph.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- a (pybel.BELGraph) – A subgraph of graph, disjoint from b
- b (pybel.BELGraph) – A subgraph of graph, disjoint from a
Returns: A list of the shortest paths between the two subgraphs
-
pybel_tools.selection.get_shortest_undirected_path_between_subgraphs(graph, a, b)
Get the shortest path between two disconnected subgraphs A and B, disregarding directionality of edges in graph.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- a (pybel.BELGraph) – A subgraph of graph, disjoint from b
- b (pybel.BELGraph) – A subgraph of graph, disjoint from a
Returns: A list of the shortest paths between the two subgraphs
-
pybel_tools.selection.search_node_names(graph, query)
Search for nodes containing the given string(s).
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- query (str or iter[str]) – The search query
Returns: An iterator over nodes whose names match the search query
Return type: iter
Example:
>>> from pybel.examples import sialic_acid_graph
>>> from pybel_tools.selection import search_node_names
>>> list(search_node_names(sialic_acid_graph, 'CD33'))
[('Protein', 'HGNC', 'CD33'), ('Protein', 'HGNC', 'CD33', ('pmod', ('bel', 'Ph')))]
-
pybel_tools.selection.search_node_namespace_names(graph, query, namespace)
Search for nodes with the given namespace(s) whose names contain the given string(s).
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- query (str or iter[str]) – The search query
- namespace (str or iter[str]) – The namespace(s) to filter
Returns: An iterator over nodes whose names match the search query
Return type: iter
-
pybel_tools.selection.search_node_hgnc_names(graph, query)
Search for nodes with the HGNC namespace whose names contain the given string(s).
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- query (str or iter[str]) – The search query
Returns: An iterator over nodes whose names match the search query
Return type: iter
-
pybel_tools.selection.convert_path_to_metapath(graph, nodes)
Converts a list of nodes to their corresponding functions.
Parameters:
- nodes (list[tuple]) – A list of BEL node tuples
Return type: list[str]
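A minimal sketch of the conversion, assuming a node's function is the first element of its tuple, as the node tuples printed by the search_node_names example on this page suggest. The library may instead look the function up in the graph's node data. The name path_to_metapath_sketch and the 'Gene' and 'RNA' strings are illustrative.

```python
def path_to_metapath_sketch(nodes):
    """Map each node tuple to its function, assumed to be the first element."""
    return [node[0] for node in nodes]


path = [
    ('Gene', 'HGNC', 'CD33'),
    ('RNA', 'HGNC', 'CD33'),
    ('Protein', 'HGNC', 'CD33'),
]
metapath = path_to_metapath_sketch(path)
```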
-
pybel_tools.selection.get_walks_exhaustive(graph, node, length)
Gets all walks under a given length starting at a given node.
Parameters:
- graph (networkx.Graph) – A graph
- node – Starting node
- length (int) – The length of walks to get
Returns: A list of paths
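Exhaustive walk enumeration can be sketched recursively. This version returns walks of exactly `length` steps over an adjacency dict and allows nodes to repeat. Whether the library also returns shorter walks or excludes revisited nodes is not stated here, so treat both as assumptions of this sketch. walks_exhaustive_sketch is a hypothetical name.

```python
def walks_exhaustive_sketch(adjacency, node, length):
    """Return all walks of exactly `length` steps starting at `node`."""
    if length == 0:
        return [(node,)]
    return [
        (node,) + walk
        for neighbor in adjacency.get(node, ())
        for walk in walks_exhaustive_sketch(adjacency, neighbor, length - 1)
    ]


adjacency = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a']}
walks = walks_exhaustive_sketch(adjacency, 'a', 2)
```

The number of walks can grow exponentially with `length`, so this approach suits only small graphs or short walks.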
-
pybel_tools.selection.match_simple_metapath(graph, node, simple_metapath)
Matches a simple metapath starting at the given node.
Parameters:
- graph (pybel.BELGraph) – A BEL graph
- node (tuple) – A BEL node
- simple_metapath (list[str]) – A list of BEL Functions
Returns: An iterable over paths from the node matching the metapath
Return type: iter[tuple]
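Metapath matching can be sketched as a recursive search that follows only those neighbours whose function equals the next entry of the metapath. The adjacency dict, the convention that a node's function is its first tuple element, and the name match_simple_metapath_sketch are assumptions of this example.

```python
def match_simple_metapath_sketch(adjacency, node, simple_metapath):
    """Yield paths from `node` whose successive nodes have the listed functions."""
    if not simple_metapath:
        yield (node,)
        return
    for neighbor in adjacency.get(node, ()):
        if neighbor[0] == simple_metapath[0]:
            for path in match_simple_metapath_sketch(adjacency, neighbor, simple_metapath[1:]):
                yield (node,) + path


gene = ('Gene', 'HGNC', 'CD33')
rna = ('RNA', 'HGNC', 'CD33')
protein = ('Protein', 'HGNC', 'CD33')
adjacency = {gene: [rna], rna: [protein], protein: []}

matched = list(match_simple_metapath_sketch(adjacency, gene, ['RNA', 'Protein']))
unmatched = list(match_simple_metapath_sketch(adjacency, gene, ['Protein']))
```

The metapath lists the functions of the nodes after the start node, so the start node itself is not checked.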