Utilities

The utilities documented here are used throughout PyBEL:

General Utilities

pybel.utils.expand_dict(flat_dict, sep='_')[source]

Expands a flattened dictionary back into a nested dictionary

Parameters:
  • flat_dict (dict) – a nested dictionary that has been flattened so the keys are composite
  • sep (str) – the separator between concatenated keys
Return type:

dict

pybel.utils.flatten_dict(d, parent_key='', sep='_')[source]

Flattens a nested dictionary.

Parameters:
  • d (dict or MutableMapping) – A nested dictionary
  • parent_key (str) – The parent’s key. It is used internally by the recursion, so don’t set it yourself.
  • sep (str) – The separator used between dictionary levels
Return type:

dict
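The relationship between flatten_dict() and expand_dict() can be illustrated with a minimal, self-contained sketch. The functions below are illustrative re-implementations written from the descriptions above, not PyBEL's own code, and the `_sketch` names and the sample dictionary are assumptions.

```python
from collections.abc import MutableMapping


def flatten_dict_sketch(d, parent_key='', sep='_'):
    """Flatten a nested mapping by joining nested keys with ``sep``."""
    items = {}
    for key, value in d.items():
        new_key = parent_key + sep + key if parent_key else key
        if isinstance(value, MutableMapping):
            # Recurse, passing down the composite key built so far
            items.update(flatten_dict_sketch(value, new_key, sep=sep))
        else:
            items[new_key] = value
    return items


def expand_dict_sketch(flat_dict, sep='_'):
    """Rebuild a nested dictionary from composite keys."""
    result = {}
    for composite_key, value in flat_dict.items():
        *parents, leaf = composite_key.split(sep)
        level = result
        for part in parents:
            # Create intermediate dictionaries on demand
            level = level.setdefault(part, {})
        level[leaf] = value
    return result


nested = {'citation': {'type': 'PubMed', 'reference': '12345'}, 'evidence': 'text'}
flat = flatten_dict_sketch(nested)
# flat == {'citation_type': 'PubMed', 'citation_reference': '12345', 'evidence': 'text'}
restored = expand_dict_sketch(flat)
# restored == nested
```

In this sketch the round trip only works when no original key contains the separator, since a key such as 'my_key' would be split into two levels on expansion.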

pybel.utils.flatten_graph_data(graph)[source]

Returns a new graph with flattened edge data dictionaries.

Parameters: graph (nx.MultiDiGraph) – A graph with nested edge data dictionaries
Returns: A graph with flattened edge data dictionaries
Return type: nx.MultiDiGraph

pybel.utils.list2tuple(l)[source]

Recursively converts a nested list to a nested tuple

Return type: tuple
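A minimal sketch of the recursive conversion, written from the description above rather than taken from PyBEL (the `list2tuple_sketch` name and the sample input are assumptions):

```python
def list2tuple_sketch(l):
    """Recursively convert nested lists to nested tuples."""
    if isinstance(l, list):
        return tuple(list2tuple_sketch(item) for item in l)
    # Anything that is not a list is returned unchanged
    return l


converted = list2tuple_sketch(['Protein', ['HGNC', 'AKT1'], 3])
# converted == ('Protein', ('HGNC', 'AKT1'), 3)
```

Unlike the nested list, the resulting nested tuple is hashable, so it can be used as a dictionary key or a graph node.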
pybel.utils.get_version()[source]

Gets the current PyBEL version

Returns: The current PyBEL version
Return type: str

pybel.utils.tokenize_version(version_string)[source]

Tokenizes a version string to a tuple. Truncates qualifiers like -dev.

Parameters: version_string (str) – A version string
Returns: A tuple representing the version string
Return type: tuple

>>> tokenize_version('0.1.2-dev')
(0, 1, 2)
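One way the behavior shown in the example could be implemented is sketched below. This is an assumption about the approach, not PyBEL's source; in particular, it assumes the qualifier always follows a hyphen and that every remaining component is an integer.

```python
def tokenize_version_sketch(version_string):
    """Turn a string such as '0.1.2-dev' into a tuple of integers."""
    # Drop everything from the first hyphen onward, for example "-dev"
    release = version_string.split('-')[0]
    return tuple(int(part) for part in release.split('.'))


version = tokenize_version_sketch('0.1.2-dev')
# version == (0, 1, 2)
```

Tuples of integers compare element by element, so (0, 10, 0) > (0, 9, 1) holds, whereas comparing the raw strings '0.10.0' and '0.9.1' would give the opposite order.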
pybel.utils.citation_dict_to_tuple(citation)[source]

Convert the d[CITATION] entry in an edge data dictionary to a tuple

Parameters: citation (dict) – A PyBEL citation data dictionary
Return type: tuple[str]

pybel.utils.flatten_citation(citation)[source]

Flattens a citation dict, from the d[CITATION] entry in an edge data dictionary

Parameters: citation (dict[str,str]) – A PyBEL citation data dictionary
Return type: str

pybel.utils.ensure_quotes(s)[source]

Quote a string that isn’t solely alphanumeric

Return type: str
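A minimal sketch of the quoting rule, assuming that "solely alphanumeric" corresponds to Python's str.isalnum() and that double quotes are used (both are assumptions; the `ensure_quotes_sketch` name is hypothetical):

```python
def ensure_quotes_sketch(s):
    """Wrap ``s`` in double quotes unless it is purely alphanumeric."""
    return s if s.isalnum() else '"{}"'.format(s)


plain = ensure_quotes_sketch('AKT1')           # 'AKT1'
quoted = ensure_quotes_sketch('amyloid beta')  # '"amyloid beta"'
```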
pybel.utils.valid_date(s)[source]

Checks that a string represents a valid date in ISO 8601 format YYYY-MM-DD

Return type: bool
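A check of this kind can be sketched with the standard library. The version below is an illustration that assumes datetime.strptime() is an acceptable validator; it is not necessarily how PyBEL performs the check.

```python
from datetime import datetime


def valid_date_sketch(s):
    """Return True if ``s`` parses as a YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(s, '%Y-%m-%d')
    except ValueError:
        return False
    return True


ok = valid_date_sketch('2017-02-28')        # True
impossible = valid_date_sketch('2017-02-30')  # False, February has no 30th
wrong_order = valid_date_sketch('28/02/2017')  # False
```

Note that strptime() also accepts fields that are not zero-padded, such as '2017-2-8', so a stricter ISO 8601 check would need an additional pattern test.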
pybel.utils.valid_date_version(s)[source]

Checks that the string is a valid date version string

Return type: bool

pybel.utils.parse_datetime(s)[source]

Tries to parse a datetime object from a standard datetime format or date format

Parameters: s (str) – A string representing a date or datetime
Returns: A parsed date object
Return type: datetime.date
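The "try one format, then fall back to another" approach can be sketched as follows. The list of candidate formats and the error message are hypothetical, chosen only for illustration; the formats PyBEL actually accepts are not listed here.

```python
from datetime import datetime

# Hypothetical candidate formats, tried in order
CANDIDATE_FORMATS = ('%Y-%m-%dT%H:%M:%S', '%Y-%m-%d')


def parse_datetime_sketch(s):
    """Return a datetime.date parsed from the first format that matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(s, fmt).date()
        except ValueError:
            continue  # Try the next format
    raise ValueError('Unrecognized date format: {}'.format(s))


from_date = parse_datetime_sketch('2017-02-28')
from_datetime = parse_datetime_sketch('2017-02-28T12:30:00')
```

In this sketch both calls return the same datetime.date, because the time of day is discarded.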
pybel.utils.hash_node(node_tuple)[source]

Converts a PyBEL node tuple to a hash

Parameters: node_tuple (tuple) – A BEL node
Returns: A hashed version of the node tuple using hashlib.sha512() hash of the binary pickle dump
Return type: str
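The scheme described in the return value can be sketched with the standard library. The use of hexdigest() to obtain the string and the sample node tuple are assumptions made for illustration.

```python
import hashlib
import pickle


def hash_node_sketch(node_tuple):
    """SHA-512 hex digest of the binary pickle dump of a node tuple."""
    return hashlib.sha512(pickle.dumps(node_tuple)).hexdigest()


node = ('Protein', 'HGNC', 'AKT1')
digest = hash_node_sketch(node)
# len(digest) == 128, because SHA-512 yields 64 bytes written as hex
```

Because the bytes produced by pickle depend on the pickle protocol in use, digests computed this way are not guaranteed to agree across Python versions.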
pybel.utils.hash_edge(u, v, data)[source]

Converts an edge tuple to a hash

Parameters:
  • u (tuple) – The source BEL node
  • v (tuple) – The target BEL node
  • data (dict) – The edge’s data dictionary
Returns:

A hashed version of the edge tuple, using an md5 hash of the binary pickle dump of u, v, and the JSON dump of data

Return type:

str
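A minimal sketch of the described edge hash follows. Sorting the JSON keys and returning a hex digest are assumptions, as are the sample nodes and edge data.

```python
import hashlib
import json
import pickle


def hash_edge_sketch(u, v, data):
    """md5 hex digest of the pickled (u, v, JSON dump of data) triple."""
    # Assumption: keys are sorted so that dictionary ordering does not matter
    data_json = json.dumps(data, sort_keys=True)
    return hashlib.md5(pickle.dumps((u, v, data_json))).hexdigest()


u = ('Protein', 'HGNC', 'AKT1')
v = ('Protein', 'HGNC', 'MTOR')
edge_hash = hash_edge_sketch(u, v, {'relation': 'increases', 'evidence': 'text'})
same_hash = hash_edge_sketch(u, v, {'evidence': 'text', 'relation': 'increases'})
# edge_hash == same_hash, since the JSON dump is identical for both orderings
```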

pybel.utils.subdict_matches(target, query, partial_match=True)[source]

Checks if all the keys in the query dict are in the target dict, and that their values match

  1. Checks that all keys in the query dict are in the target dict
  2. Matches the values of the keys in the query dict
    1. If the value is a string, then it must match exactly
    2. If the value is a set/list/tuple, then the target value must match any one of them
    3. If the value is a dict, then recursively check if that subdict matches
Parameters:
  • target (dict) – The dictionary to search
  • query (dict) – A query dict with keys to match
  • partial_match (bool) – Should the query values be used as partial or exact matches? Defaults to True.
Returns:

True if all keys in query are in target and their values match

Return type:

bool
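The matching rules listed above can be sketched as follows. This is an illustration written from the description, not PyBEL's implementation; in particular, treating partial_match=False as a plain equality test on nested dictionaries is an assumption.

```python
def subdict_matches_sketch(target, query, partial_match=True):
    """Check that every key in ``query`` is in ``target`` with a matching value."""
    for key, expected in query.items():
        if key not in target:
            return False
        actual = target[key]
        if isinstance(expected, str):
            # Rule 1: strings must match exactly
            if actual != expected:
                return False
        elif isinstance(expected, (set, list, tuple)):
            # Rule 2: the target value must be one of the alternatives
            if actual not in expected:
                return False
        elif isinstance(expected, dict):
            if partial_match:
                # Rule 3: recurse into the nested dictionary
                if not isinstance(actual, dict):
                    return False
                if not subdict_matches_sketch(actual, expected, partial_match):
                    return False
            elif actual != expected:
                # Assumption: an exact match requires the whole subdict to be equal
                return False
        elif actual != expected:
            return False
    return True


edge_data = {
    'relation': 'increases',
    'citation': {'type': 'PubMed', 'reference': '12345'},
}
any_of = subdict_matches_sketch(edge_data, {'relation': ['increases', 'decreases']})
partial = subdict_matches_sketch(edge_data, {'citation': {'type': 'PubMed'}})
exact = subdict_matches_sketch(edge_data, {'citation': {'type': 'PubMed'}}, partial_match=False)
# any_of is True, partial is True, exact is False
```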

pybel.utils.hash_dump(data)[source]

Hashes an arbitrary JSON dictionary by dumping it in sorted order, encoding it in UTF-8, then hashing the bytes

Parameters: data (dict or list or tuple) – An arbitrary JSON-serializable object
Return type: str
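The three steps named in the description (sorted dump, UTF-8 encoding, hashing) can be sketched as below. The description does not name the hash algorithm, so the use of SHA-512 here is an assumption, as is the `hash_dump_sketch` name.

```python
import hashlib
import json


def hash_dump_sketch(data):
    """Hash a JSON-serializable object independently of dictionary key order."""
    # Step 1: dump with sorted keys. Step 2: encode as UTF-8 bytes.
    encoded = json.dumps(data, sort_keys=True).encode('utf-8')
    # Step 3: hash the bytes (SHA-512 is an assumed choice)
    return hashlib.sha512(encoded).hexdigest()


first = hash_dump_sketch({'type': 'PubMed', 'reference': '12345'})
second = hash_dump_sketch({'reference': '12345', 'type': 'PubMed'})
# first == second, because the keys are sorted before hashing
```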
pybel.utils.hash_citation(type, reference)[source]

Creates a hash for a type/reference pair of a citation

Parameters:
  • type (str) – The corresponding citation type
  • reference (str) – The citation reference
Return type:

str

pybel.utils.hash_evidence(text, type, reference)[source]

Creates a hash for an evidence and its citation

Parameters:
  • text (str) – The evidence text
  • type (str) – The corresponding citation type
  • reference (str) – The citation reference
Return type:

str

IO Utilities

pybel.io.line_utils.parse_lines(graph, lines, manager=None, allow_nested=False, citation_clearing=True, **kwargs)[source]

Parses an iterable of lines into this graph. Delegates to parse_document(), parse_definitions(), and parse_statements().

Parameters:
  • graph (BELGraph) – A BEL graph
  • lines (iter[str]) – An iterable over lines of BEL script
  • manager (None or str or Manager) – An RFC-1738 database connection string, a pre-built Manager, or None for default connection
  • allow_nested (bool) – If true, turns off nested statement failures
  • citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations? Delegated to pybel.parser.ControlParser

Warning

These options allow concessions for parsing BEL that is either WRONG or UNSCIENTIFIC. Use them at risk to reproducibility and validity of your results.

Parameters:
  • allow_naked_names (bool) – If true, turns off naked namespace failures
  • allow_unqualified_translocations (bool) – If true, allow translocations without TO and FROM clauses.
  • no_identifier_validation (bool) – If true, turns off namespace validation

Parser Utilities

pybel.parser.utils.is_int(s)[source]

Determines if an object can be cast to an int

Parameters: s – Any object
Returns: True if the argument can be cast to an int
Return type: bool
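A check like this is commonly written by attempting the cast and catching the failure. The sketch below assumes that both ValueError and TypeError should count as "cannot be cast"; it is not taken from PyBEL's source.

```python
def is_int_sketch(s):
    """Return True if int(s) succeeds."""
    try:
        int(s)
    except (TypeError, ValueError):
        return False
    return True


results = [is_int_sketch(x) for x in ('42', '4.2', None, 4.2)]
# results == [True, False, False, True]
```

The last case shows a subtlety of this approach: the float 4.2 can be cast (it truncates to 4), while the string '4.2' cannot.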
pybel.parser.utils.nest(*content)[source]

Defines a delimited list by enumerating each element of the list

pybel.parser.utils.one_of_tags(tags, canonical_tag, name=None)[source]

This is a convenience method for defining the tags usable in the BelParser. For example, statements like g(HGNC:SNCA) can be expressed also as geneAbundance(HGNC:SNCA). The language must define multiple different tags that get normalized to the same thing.

Parameters:
  • tags (list[str]) – a list of strings that are the tags for a function. For example, ['g', 'geneAbundance'] for the abundance of a gene
  • canonical_tag (str) – the preferred tag name. Does not have to be one of the tags. For example, 'GeneAbundance' (note capitalization) is used for the abundance of a gene
  • name (str) – this is the key under which the value for this tag is put in the PyParsing framework.
Return type:

pyparsing.ParserElement
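The normalization idea can be shown without PyParsing. The sketch below is a plain-Python analogy with a hypothetical helper name; the real function returns a parser element rather than a callable.

```python
def build_tag_normalizer(tags, canonical_tag):
    """Return a function that maps any accepted tag to the canonical tag."""
    accepted = set(tags)

    def normalize(tag):
        if tag not in accepted:
            raise ValueError('unrecognized tag: {}'.format(tag))
        return canonical_tag

    return normalize


normalize_gene = build_tag_normalizer(['g', 'geneAbundance'], 'GeneAbundance')
short_form = normalize_gene('g')             # 'GeneAbundance'
long_form = normalize_gene('geneAbundance')  # 'GeneAbundance'
```

Both spellings yield the same canonical value, which is why g(HGNC:SNCA) and geneAbundance(HGNC:SNCA) can be treated as the same statement after parsing.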

pybel.parser.utils.triple(subject, relation, obj)[source]

Builds a simple triple in PyParsing that has a subject-relation-object format

Canonicalization Utilities

This module helps handle node data dictionaries

pybel.tokens.hash_node_dict(node_dict)[source]

Hashes a PyBEL node data dictionary

Parameters: node_dict (dict) – A PyBEL node data dictionary
Return type: str

pybel.tokens.node_to_tuple(tokens)[source]

Given tokens from either PyParsing, or following the PyBEL node data dictionary model, create a PyBEL node tuple.

Parameters: tokens (ParseResults or dict) – Either a PyParsing ParseResults object or a PyBEL node data dictionary
Return type: tuple

pybel.tokens.sort_dict_list(tokens)[source]

Sorts a list of PyBEL data dictionaries to their canonical ordering