Selection

This module contains functions to help select data from networks.

pybel_tools.selection.group_nodes_by_annotation(graph, annotation='Subgraph')[source]

Group the nodes occurring in edges by the given annotation.

Return type

Mapping[str, Set[BaseEntity]]

pybel_tools.selection.average_node_annotation(graph, key, annotation='Subgraph', aggregator=None)[source]

Group the graph into subgraphs and assign each subgraph a score based on the average of all node values for the given node key.

Parameters
  • graph (pybel.BELGraph) – A BEL graph

  • key (str) – The key in the node data dictionary representing the experimental data

  • annotation (str) – A BEL annotation to use to group nodes

  • aggregator (lambda) – A function from list of values -> aggregate value. Defaults to taking the average of a list of floats.

Return type

Mapping[str, ~X]
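
The grouping-and-scoring idea can be sketched without PyBEL. This is a minimal illustration under stated assumptions, not the library's implementation: average_by_annotation, node_values, and node_groups are hypothetical names, with plain dictionaries standing in for the node data and the annotation groups.

```python
from statistics import mean


def average_by_annotation(node_values, node_groups, aggregator=None):
    """Score each annotation group by aggregating the values of its nodes.

    node_values: hypothetical mapping of node -> numeric value
                 (stands in for the node data dictionary entry under `key`).
    node_groups: hypothetical mapping of annotation value -> set of nodes.
    """
    if aggregator is None:
        aggregator = mean  # default: arithmetic mean of the values
    scores = {}
    for group, nodes in node_groups.items():
        values = [node_values[n] for n in nodes if n in node_values]
        if values:  # groups without any measured node are skipped
            scores[group] = aggregator(values)
    return scores


node_values = {"A": 1.0, "B": 3.0, "C": 10.0}
node_groups = {"Apoptosis": {"A", "B"}, "Inflammation": {"C", "D"}}
scores = average_by_annotation(node_values, node_groups)
max_scores = average_by_annotation(node_values, node_groups, aggregator=max)
```

Passing a different aggregator (here max) swaps the scoring rule without touching the grouping step.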

pybel_tools.selection.group_nodes_by_annotation_filtered(graph, node_predicates=None, annotation='Subgraph')[source]

Group the nodes occurring in edges by the given annotation, with a node filter applied.

Parameters
  • graph (BELGraph) – A BEL graph

  • node_predicates (Union[Callable[[BELGraph, BaseEntity], bool], Iterable[Callable[[BELGraph, BaseEntity], bool]], None]) – A predicate or list of predicates (graph, node) -> bool

  • annotation (str) – The annotation to use for grouping

Return type

Mapping[str, Set[BaseEntity]]

Returns

A dictionary of {annotation value: set of nodes}
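
As a rough sketch of how grouping with a node filter could work, the following uses plain edge tuples instead of a BELGraph. The edge layout, the function name group_nodes_filtered, and the single-argument predicates are assumptions made for illustration; the documented predicates take (graph, node).

```python
from collections import defaultdict


def group_nodes_filtered(edges, node_predicates=None, annotation="Subgraph"):
    """Group edge endpoints by annotation value, keeping nodes that pass all predicates.

    edges: iterable of (u, v, data), where data["annotations"] maps an
           annotation name to an iterable of values (hypothetical layout).
    """
    # Accept None, a single predicate, or an iterable of predicates.
    if node_predicates is None:
        predicates = []
    elif callable(node_predicates):
        predicates = [node_predicates]
    else:
        predicates = list(node_predicates)

    groups = defaultdict(set)
    for u, v, data in edges:
        for value in data.get("annotations", {}).get(annotation, ()):
            for node in (u, v):
                if all(predicate(node) for predicate in predicates):
                    groups[value].add(node)
    return dict(groups)


edges = [
    ("p(HGNC:AKT1)", "bp(GO:apoptosis)", {"annotations": {"Subgraph": ["Apoptosis"]}}),
    ("p(HGNC:TNF)", "p(HGNC:AKT1)", {"annotations": {"Subgraph": ["Apoptosis", "Inflammation"]}}),
    ("p(HGNC:IL6)", "p(HGNC:TNF)", {}),  # no annotations, so it contributes nothing
]


def is_protein(node):
    return node.startswith("p(")


groups = group_nodes_filtered(edges, is_protein)
```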

pybel_tools.selection.get_subgraph_by_node_filter(graph, node_predicates)[source]

Induce a sub-graph on the nodes that pass the given predicate(s).

Return type

BELGraph

pybel_tools.selection.get_causal_subgraph(graph)[source]

Build a new sub-graph induced over the causal edges.

Return type

BELGraph
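
A minimal sketch of inducing over causal edges, using plain edge tuples. It assumes that "causal" means the four BEL relations listed in CAUSAL_RELATIONS and that each edge stores its relation under a "relation" key; both are assumptions for illustration, not a statement of what the library checks.

```python
# Assumed set of causal BEL relations (illustrative, not taken from the library).
CAUSAL_RELATIONS = {"increases", "directlyIncreases", "decreases", "directlyDecreases"}


def causal_edges(edges):
    """Keep only the (u, v, data) edges whose relation is causal."""
    return [
        (u, v, data)
        for u, v, data in edges
        if data.get("relation") in CAUSAL_RELATIONS
    ]


def nodes_of(edges):
    """Collect the endpoints of the given edges."""
    nodes = set()
    for u, v, _ in edges:
        nodes.update((u, v))
    return nodes


edges = [
    ("A", "B", {"relation": "increases"}),
    ("B", "C", {"relation": "association"}),
    ("C", "D", {"relation": "directlyDecreases"}),
]
kept = causal_edges(edges)
kept_nodes = nodes_of(kept)
```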

Get a sub-graph induced over all nodes matching the query string.

Parameters
  • graph (BELGraph) – A BEL Graph

  • query (Union[str, Iterable[str]]) – A query string or iterable of query strings for node names

Thinly wraps search_node_names() and get_subgraph_by_induction().

Return type

BELGraph

pybel_tools.selection.get_largest_component(graph)[source]

Get the giant component of a graph.

Return type

BELGraph
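
For illustration, the giant component can be found with a breadth-first search over an undirected view of the edges. This sketch assumes the giant component is the largest weakly connected component; largest_weak_component is a hypothetical helper that works on plain (u, v) pairs and returns only the node set rather than a BELGraph.

```python
from collections import deque


def largest_weak_component(edges):
    """Return the node set of the largest weakly connected component."""
    # Build an undirected adjacency map, ignoring edge direction.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


giant = largest_weak_component([("A", "B"), ("C", "B"), ("D", "E")])
```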

pybel_tools.selection.get_leaves_by_type(graph, func=None, prune_threshold=1)[source]

Return an iterable over all nodes in graph (in-place) that are connected to only one other node. Useful for gene and RNA nodes. Allows for an optional filter by function type.

Parameters
Returns

An iterable over nodes with only a connection to one node

Return type

iter[tuple]

pybel_tools.selection.get_nodes_in_all_shortest_paths(graph, nodes, weight=None, remove_pathologies=False)[source]

Get a set of nodes in all shortest paths between the given nodes.

Thinly wraps networkx.all_shortest_paths().

Parameters
  • graph (pybel.BELGraph) – A BEL graph

  • nodes (iter[tuple]) – The list of nodes to use to find all shortest paths

  • weight (Optional[str]) – Edge data key corresponding to the edge weight. If None, uses unweighted search.

  • remove_pathologies (bool) – Should pathology nodes be removed first?

Returns

A set of nodes appearing in the shortest paths between nodes in the BEL graph

Return type

set[tuple]

Note

This can be trivially parallelized using networkx.single_source_shortest_path()
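
To make the idea concrete, here is a pure-Python sketch that collects every node lying on any shortest path between each ordered pair of query nodes. It assumes a directed, unweighted graph given as an adjacency mapping and does not call networkx; all_shortest_paths and nodes_in_all_shortest_paths are illustrative helpers, not the library's code.

```python
from collections import deque
from itertools import permutations


def all_shortest_paths(adjacency, source, target):
    """Yield every shortest path from source to target using BFS."""
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                preds[neighbor] = [node]
                queue.append(neighbor)
            elif dist[neighbor] == dist[node] + 1:
                preds[neighbor].append(node)  # another equally short route
    if target not in dist:
        return  # unreachable: yield nothing
    # Walk predecessor links back from the target to enumerate the paths.
    stack = [[target]]
    while stack:
        path = stack.pop()
        if path[-1] == source:
            yield path[::-1]
            continue
        for pred in preds[path[-1]]:
            stack.append(path + [pred])


def nodes_in_all_shortest_paths(adjacency, nodes):
    """Union of the nodes on all shortest paths between each ordered pair."""
    result = set()
    for source, target in permutations(nodes, 2):
        for path in all_shortest_paths(adjacency, source, target):
            result.update(path)
    return result


adjacency = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
found = nodes_in_all_shortest_paths(adjacency, ["A", "D"])
```

Both A-B-D and A-C-D are shortest, so B and C are both included; E is not on any shortest path between the query nodes.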

pybel_tools.selection.get_shortest_directed_path_between_subgraphs(graph, a, b)[source]

Calculate the shortest path between two disconnected subgraphs A and B that goes through nodes in the source graph.

Parameters
Returns

A list of the shortest paths between the two subgraphs

Return type

list

pybel_tools.selection.get_shortest_undirected_path_between_subgraphs(graph, a, b)[source]

Get the shortest path between two disconnected subgraphs A and B, disregarding the directionality of edges in graph.

Parameters
Returns

A list of the shortest paths between the two subgraphs

Return type

list

pybel_tools.selection.search_node_names(graph, query)[source]

Search for nodes containing a given string(s).

Parameters
Returns

An iterator over nodes whose names match the search query

Return type

iter

Example:

>>> from pybel.examples import sialic_acid_graph
>>> from pybel_tools.selection import search_node_names
>>> list(search_node_names(sialic_acid_graph, 'CD33'))
[('Protein', 'HGNC', 'CD33'), ('Protein', 'HGNC', 'CD33', ('pmod', ('bel', 'Ph')))]

pybel_tools.selection.search_node_namespace_names(graph, query, namespace)[source]

Search for nodes with the given namespace(s) whose names contain the given string(s).

Parameters
  • graph (pybel.BELGraph) – A BEL graph

  • query (str or iter[str]) – The search query

  • namespace (str or iter[str]) – The namespace(s) to filter

Returns

An iterator over nodes whose names match the search query

Return type

iter
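
A small sketch of a namespace-restricted name search over node tuples shaped like those in the search_node_names() example, that is (function, namespace, name, ...). Case-insensitive substring matching is an assumption here, and search_namespace_names is a hypothetical stand-in that takes a list of node tuples rather than a graph.

```python
def search_namespace_names(nodes, query, namespace):
    """Yield node tuples in the given namespace(s) whose name contains a query string."""
    # Accept a single string or an iterable of strings for both arguments.
    queries = [query] if isinstance(query, str) else list(query)
    namespaces = {namespace} if isinstance(namespace, str) else set(namespace)
    queries = [q.lower() for q in queries]

    for node in nodes:
        if len(node) < 3 or not isinstance(node[2], str):
            continue  # skip tuples without a (function, namespace, name) prefix
        node_namespace, name = node[1], node[2]
        if node_namespace in namespaces and any(q in name.lower() for q in queries):
            yield node


nodes = [
    ("Protein", "HGNC", "CD33"),
    ("Protein", "HGNC", "CD33", ("pmod", ("bel", "Ph"))),
    ("Abundance", "CHEBI", "sialic acid"),
    ("Protein", "HGNC", "PTPN6"),
]
hits = list(search_namespace_names(nodes, "cd33", "HGNC"))
```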

pybel_tools.selection.search_node_hgnc_names(graph, query)[source]

Search for nodes in the HGNC namespace whose names contain the given string(s).

Parameters
Returns

An iterator over nodes whose names match the search query

Return type

iter

pybel_tools.selection.convert_path_to_metapath(graph, nodes)[source]

Converts a list of nodes to their corresponding functions

Parameters

nodes (list[tuple]) – A list of BEL node tuples

Return type

list[str]
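
As an illustration, a metapath is simply the sequence of node functions along a path. The sketch below assumes a hypothetical node_functions mapping from node to function name in place of the graph's node data.

```python
def path_to_metapath(node_functions, nodes):
    """Map each node in the path to its function, preserving order."""
    return [node_functions[node] for node in nodes]


node_functions = {"g1": "Gene", "r1": "RNA", "p1": "Protein"}
metapath = path_to_metapath(node_functions, ["g1", "r1", "p1"])
```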

pybel_tools.selection.get_walks_exhaustive[source]

Gets all walks under a given length starting at a given node

Parameters
  • graph (networkx.Graph) – A graph

  • node – Starting node

  • length (int) – The length of walks to get

Returns

A list of paths

Return type

list[tuple]
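
A recursive sketch of exhaustive walk enumeration over a plain adjacency mapping. It assumes that length means exactly that many edges and that a walk may not revisit a node; both are interpretations made for this example, and walks_exhaustive is an illustrative helper rather than the library function.

```python
def walks_exhaustive(adjacency, node, length):
    """Return all node tuples reachable from node in exactly `length` steps."""
    if length == 0:
        return [(node,)]
    walks = []
    for neighbor in adjacency.get(node, ()):
        for tail in walks_exhaustive(adjacency, neighbor, length - 1):
            if node not in tail:  # do not revisit the current node
                walks.append((node,) + tail)
    return walks


adjacency = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
walks = walks_exhaustive(adjacency, "A", 2)
```

The candidate A-C-A is dropped because it returns to the start node, leaving only A-B-C.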

pybel_tools.selection.match_simple_metapath(graph, node, simple_metapath)[source]

Matches a simple metapath starting at the given node

Parameters
Returns

An iterable over paths from the node matching the metapath

Return type

iter[tuple]
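
One way to picture metapath matching is a depth-first expansion that follows only neighbors whose function equals the next entry of the metapath. This sketch assumes the metapath lists the functions of the nodes after the start node; adjacency, functions, and match_metapath are hypothetical stand-ins for the graph and the library function.

```python
def match_metapath(adjacency, functions, node, metapath):
    """Yield paths from node whose successive node functions match the metapath."""
    if not metapath:
        yield (node,)
        return
    for neighbor in adjacency.get(node, ()):
        if functions[neighbor] != metapath[0]:
            continue  # neighbor has the wrong function for this step
        for path in match_metapath(adjacency, functions, neighbor, metapath[1:]):
            if node not in path:  # avoid cycles back through the current node
                yield (node,) + path


adjacency = {"g1": ["r1"], "r1": ["p1", "p2"], "p1": [], "p2": []}
functions = {"g1": "Gene", "r1": "RNA", "p1": "Protein", "p2": "Protein"}
paths = list(match_metapath(adjacency, functions, "g1", ["RNA", "Protein"]))
```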