Parsers

This page is for users who want to squeeze the most bizarre possibilities out of PyBEL. Most users will not need this reference.

PyBEL makes extensive use of the PyParsing module. The code is organized into different modules that reflect the different facets of the BEL language. These parsers support BEL 2.0 and have some backwards compatibility for rewriting BEL v1.0 statements as BEL v2.0. Biologists and bioinformaticians using this software will likely never need to read this page, but a developer seeking to extend the language will be interested to see the inner workings of these parsers.

See: https://github.com/OpenBEL/language/blob/master/version_2.0/MIGRATE_BEL1_BEL2.md

BEL Parser

class pybel.parser.parse_bel.BELParser(graph, namespace_to_term_to_encoding=None, namespace_to_pattern=None, annotation_to_term=None, annotation_to_pattern=None, annotation_to_local=None, allow_naked_names=False, disallow_nested=False, disallow_unqualified_translocations=False, citation_clearing=True, skip_validation=False, autostreamline=True, required_annotations=None)[source]

Build a parser backed by a given dictionary of namespaces.

Build a BEL parser.

Parameters
  • graph (pybel.BELGraph) – The BEL Graph to use to store the network

  • namespace_to_term_to_encoding (Optional[Mapping[str, Mapping[Tuple[Optional[str], str], str]]]) – A dictionary of {namespace: {(identifier, name): encoding}}. Delegated to pybel.parser.parse_identifier.IdentifierParser

  • namespace_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {namespace: regular expression strings}. Delegated to pybel.parser.parse_identifier.IdentifierParser

  • annotation_to_term (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of values}. Delegated to pybel.parser.ControlParser

  • annotation_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {annotation: regular expression strings}. Delegated to pybel.parser.ControlParser

  • annotation_to_local (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of values}. Delegated to pybel.parser.ControlParser

  • allow_naked_names (bool) – If true, turn off naked namespace failures. Delegated to pybel.parser.parse_identifier.IdentifierParser

  • disallow_nested (bool) – If true, turn on nested statement failures. Delegated to pybel.parser.parse_identifier.IdentifierParser

  • disallow_unqualified_translocations (bool) – If true, fail on translocations without TO and FROM clauses.

  • citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations? Delegated to pybel.parser.ControlParser

  • autostreamline (bool) – Should the parser be streamlined on instantiation?

  • required_annotations (Optional[List[str]]) – Optional list of required annotations
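
The mapping arguments above can be hard to picture from the type hints alone. The following is a minimal sketch of values that match those hints. The keywords (HGNC, DBSNP, Species, Dose, Confidence), the identifiers, and the encoding string "GRP" are illustrative assumptions, not values taken from PyBEL.

```python
import re
from typing import Mapping, Optional, Set, Tuple

# {namespace: {(identifier, name): encoding}}; the identifier may be None.
namespace_to_term_to_encoding: Mapping[str, Mapping[Tuple[Optional[str], str], str]] = {
    "HGNC": {
        (None, "AKT1"): "GRP",    # hypothetical entry without an identifier
        ("6407", "KRAS"): "GRP",  # hypothetical entry with an identifier
    },
}

# {namespace: compiled regular expression}
namespace_to_pattern: Mapping[str, re.Pattern] = {
    "DBSNP": re.compile(r"rs[0-9]+"),
}

# {annotation: set of allowed values}
annotation_to_term: Mapping[str, Set[str]] = {
    "Species": {"9606", "10090"},
}

# {annotation: compiled regular expression}
annotation_to_pattern: Mapping[str, re.Pattern] = {
    "Dose": re.compile(r"[0-9]+ mg"),
}

# {annotation: set of allowed values}, as defined with LIST in a BEL script
annotation_to_local: Mapping[str, Set[str]] = {
    "Confidence": {"High", "Medium", "Low"},
}
```

Dictionaries of this shape would be passed as the keyword arguments of the same names.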

pmod = None

2.2.1

location = None

2.2.4

gmod = None

PyBEL BEL Specification variant

fusion = None

2.6.1

general_abundance = None

2.1.1

gene = None

2.1.4

mirna = None

2.1.5

protein = None

2.1.6

rna = None

2.1.7

complex_singleton = None

2.1.2

composite_abundance = None

2.1.3

molecular_activity = None

2.4.1

biological_process = None

2.3.1

pathology = None

2.3.2

activity = None

2.3.3

translocation = None

2.5.1

degradation = None

2.5.2

reactants = None

2.5.3

rate_limit = None

3.1.5

subprocess_of = None

3.4.6

transcribed = None

3.3.2

translated = None

3.3.3

has_member = None

3.4.1

abundance_list = None

3.4.2

get_annotations()[source]

Get the current annotations in this parser.

Return type

Dict

clear()[source]

Clear the graph and all control parser data (current citation, annotations, and statement group).

handle_nested_relation(line, position, tokens)[source]

Handle nested statements.

If self.disallow_nested is True, raises a NestedRelationWarning.

Raises

NestedRelationWarning

check_function_semantics(line, position, tokens)[source]

Raise an exception if the function used on the tokens is wrong.

Raises

InvalidFunctionSemantic

Return type

ParseResults

handle_term(_, __, tokens)[source]

Handle BEL terms (the subject and object of BEL relations).

Return type

ParseResults

handle_has_members(_, __, tokens)[source]

Handle list relations like p(X) hasMembers list(p(Y), p(Z), ...).

Return type

ParseResults

handle_has_components(_, __, tokens)[source]

Handle list relations like p(X) hasComponents list(p(Y), p(Z), ...).

Return type

ParseResults

handle_unqualified_relation(_, __, tokens)[source]

Handle unqualified relations.

Return type

ParseResults

handle_inverse_unqualified_relation(_, __, tokens)[source]

Handle unqualified relations that should be added in the reverse direction.

Return type

ParseResults

handle_label_relation(line, position, tokens)[source]

Handle statements like p(X) label "Label for X".

Raises

RelabelWarning

Return type

ParseResults

ensure_node(tokens)[source]

Turn parsed tokens into a canonical node name and make sure it's in the graph.

Return type

BaseEntity

handle_translocation_illegal(line, position, tokens)[source]

Handle a malformed translocation.

Return type

None

pybel.io.line_utils.parse_lines(graph, lines, manager=None, disallow_nested=False, citation_clearing=True, use_tqdm=False, tqdm_kwargs=None, no_identifier_validation=False, disallow_unqualified_translocations=False, allow_redefinition=False, allow_definition_failures=False, allow_naked_names=False, required_annotations=None, upgrade_urls=False)[source]

Parse an iterable of lines into this graph.

Delegates to parse_document(), parse_definitions(), and parse_statements().

Parameters
  • graph (BELGraph) – A BEL graph

  • lines (Iterable[str]) – An iterable over lines of BEL script

  • manager (Optional[Manager]) – A PyBEL database manager

  • disallow_nested (bool) – If true, turns on nested statement failures

  • citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations? Delegated to pybel.parser.ControlParser

  • use_tqdm (bool) – Use tqdm to show a progress bar?

  • tqdm_kwargs (Optional[Mapping[str, Any]]) – Keywords to pass to tqdm

  • disallow_unqualified_translocations (bool) – If true, fail on translocations without TO and FROM clauses.

  • required_annotations (Optional[List[str]]) – Annotations that are required for all statements

  • upgrade_urls (bool) – Automatically upgrade old namespace URLs. Defaults to false.

Warning

These options allow concessions for parsing BEL that is either WRONG or UNSCIENTIFIC. Use them at risk to reproducibility and validity of your results.

Parameters
  • no_identifier_validation (bool) – If true, turns off namespace validation

  • allow_naked_names (bool) – If true, turns off naked namespace failures

  • allow_redefinition (bool) – If true, doesn’t fail on second definition of same name or annotation

  • allow_definition_failures (bool) – If true, allows parsing to continue if a terminology file download/parse fails

Return type

None
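
To make the three-way delegation easier to follow, here is a simplified, standard-library-only sketch of how the lines of a BEL script could be sorted into document, definition, and statement groups before each group is parsed. The helper name split_bel_lines and the prefix rules are assumptions for illustration; PyBEL's own splitting logic may differ.

```python
from typing import Iterable, List, Tuple


def split_bel_lines(lines: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Sort BEL script lines into document, definition, and statement groups."""
    document: List[str] = []
    definitions: List[str] = []
    statements: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("SET DOCUMENT"):
            document.append(line)
        elif line.startswith("DEFINE "):
            definitions.append(line)
        else:
            # Control statements (SET/UNSET) and BEL relations land here.
            statements.append(line)
    return document, definitions, statements


# Hypothetical script content, including a placeholder URL.
script = [
    'SET DOCUMENT Name = "Example"',
    'DEFINE NAMESPACE HGNC AS URL "https://example.org/hgnc.belns"',
    'DEFINE ANNOTATION Confidence AS LIST {"High", "Low"}',
    "",
    'SET Citation = {"PubMed", "12345"}',
    "p(HGNC:AKT1) increases p(HGNC:MTOR)",
]
document, definitions, statements = split_bel_lines(script)
```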

Metadata Parser

class pybel.parser.parse_metadata.MetadataParser(manager, namespace_to_term_to_encoding=None, namespace_to_pattern=None, annotation_to_term=None, annotation_to_pattern=None, annotation_to_local=None, default_namespace=None, allow_redefinition=False, skip_validation=False, upgrade_urls=False)[source]

A parser for the document and definitions section of a BEL document.

See also

BEL 1.0 Specification for the DEFINE keyword

Build a metadata parser.

Parameters
  • manager – A cache manager

  • namespace_to_term_to_encoding (Optional[Mapping[str, Mapping[Tuple[Optional[str], str], str]]]) – An enumerated namespace mapping from {namespace keyword: {(identifier, name): encoding}}

  • namespace_to_pattern (Optional[Mapping[str, Pattern]]) – A regular expression namespace mapping from {namespace keyword: regex string}

  • annotation_to_term (Optional[Mapping[str, Set[str]]]) – Enumerated annotation mapping from {annotation keyword: set of valid values}

  • annotation_to_pattern (Optional[Mapping[str, Pattern]]) – Regular expression annotation mapping from {annotation keyword: regex string}

  • default_namespace (Optional[Set[str]]) – A set of strings that can be used without a namespace

  • skip_validation (bool) – If true, don’t download and cache namespaces/annotations

manager = None

This metadata parser’s internal definition cache manager

namespace_to_term_to_encoding = None

A dictionary of cached {namespace keyword: {(identifier, name): encoding}}

uncachable_namespaces = None

A set of namespace URLs that can’t be cached

namespace_to_pattern = None

A dictionary of {namespace keyword: regular expression string}

default_namespace = None

A set of names that can be used without a namespace

annotation_to_term = None

A dictionary of cached {annotation keyword: set of values}

annotation_to_pattern = None

A dictionary of {annotation keyword: regular expression string}

annotation_to_local = None

A dictionary of cached {annotation keyword: set of values}

document_metadata = None

A dictionary containing the document metadata

namespace_url_dict = None

A dictionary from {namespace keyword: BEL namespace URL}

annotation_url_dict = None

A dictionary from {annotation keyword: BEL annotation URL}

handle_document(line, position, tokens)[source]

Handle statements like SET DOCUMENT X = "Y".

Raises

InvalidMetadataException or VersionFormatWarning

Return type

ParseResults

raise_for_redefined_namespace(line, position, namespace)[source]

Raise an exception if a namespace is already defined.

Raises

RedefinedNamespaceError

Return type

None

handle_namespace_url(line, position, tokens)[source]

Handle statements like DEFINE NAMESPACE X AS URL "Y".

Raises

RedefinedNamespaceError or pybel.resources.exc.ResourceError

Return type

ParseResults

handle_namespace_pattern(line, position, tokens)[source]

Handle statements like DEFINE NAMESPACE X AS PATTERN "Y".

Raises

RedefinedNamespaceError

Return type

ParseResults

raise_for_redefined_annotation(line, position, annotation)[source]

Raise an exception if the given annotation is already defined.

Raises

RedefinedAnnotationError

Return type

None

handle_annotations_url(line, position, tokens)[source]

Handle statements like DEFINE ANNOTATION X AS URL "Y".

Raises

RedefinedAnnotationError

Return type

ParseResults

handle_annotation_list(line, position, tokens)[source]

Handle statements like DEFINE ANNOTATION X AS LIST {"Y","Z", ...}.

Raises

RedefinedAnnotationError

Return type

ParseResults

handle_annotation_pattern(line, position, tokens)[source]

Handle statements like DEFINE ANNOTATION X AS PATTERN "Y".

Raises

RedefinedAnnotationError

Return type

ParseResults
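
The definition handlers above share a pattern: parse the keyword and its payload, refuse a second definition of the same keyword, then store the result. A rough sketch of that pattern for the LIST form, using only the standard library, might look like the following. The regular expressions, the RedefinedAnnotation class, and the function names are hypothetical stand-ins; PyBEL itself builds these grammars with PyParsing.

```python
import re
from typing import Dict, Set, Tuple

DEFINE_LIST = re.compile(
    r'DEFINE\s+ANNOTATION\s+(?P<keyword>\w+)\s+AS\s+LIST\s*\{(?P<values>.*)\}'
)
QUOTED_VALUE = re.compile(r'"([^"]*)"')


class RedefinedAnnotation(ValueError):
    """Hypothetical stand-in for PyBEL's RedefinedAnnotationError."""


def parse_annotation_list(line: str) -> Tuple[str, Set[str]]:
    """Extract the keyword and the set of quoted values from a LIST definition."""
    match = DEFINE_LIST.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an annotation list definition: {line!r}")
    values = set(QUOTED_VALUE.findall(match.group("values")))
    return match.group("keyword"), values


def define_local_annotation(annotation_to_local: Dict[str, Set[str]], line: str) -> None:
    """Store a LIST definition, refusing to overwrite an existing keyword."""
    keyword, values = parse_annotation_list(line)
    if keyword in annotation_to_local:
        raise RedefinedAnnotation(keyword)
    annotation_to_local[keyword] = values
```

Calling define_local_annotation twice with the same keyword raises the stand-in error, which mirrors the role of raise_for_redefined_annotation.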

has_enumerated_annotation(annotation)[source]

Check if this annotation is defined by an enumeration.

Return type

bool

has_regex_annotation(annotation)[source]

Check if this annotation is defined by a regular expression.

Return type

bool

has_local_annotation(annotation)[source]

Check if this annotation is defined locally.

Return type

bool

has_annotation(annotation)[source]

Check if this annotation is defined.

Return type

bool

has_enumerated_namespace(namespace)[source]

Check if this namespace is defined by an enumeration.

Return type

bool

has_regex_namespace(namespace)[source]

Check if this namespace is defined by a regular expression.

Return type

bool

has_namespace(namespace)[source]

Check if this namespace is defined.

Return type

bool

raise_for_version(line, position, version)[source]

Check that a version string is valid for BEL documents.

This means it’s either in the YYYYMMDD or semantic version format.

Parameters
  • line (str) – The line being parsed

  • position (int) – The position in the line being parsed

  • version (str) – A version string

Raises

VersionFormatWarning

Return type

None
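
As an illustration of the two accepted formats, the check could be sketched with two regular expressions as below. The patterns and the function name is_valid_version are assumptions; the expressions PyBEL actually uses may be stricter or looser.

```python
import re

# YYYYMMDD with a plausible month (01-12) and day (01-31).
DATE_VERSION = re.compile(r"\d{4}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])")
# MAJOR.MINOR.PATCH with an optional pre-release suffix such as "-alpha".
SEMANTIC_VERSION = re.compile(r"\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?")


def is_valid_version(version: str) -> bool:
    """Return True if the string is a date version or a semantic version."""
    return bool(
        DATE_VERSION.fullmatch(version) or SEMANTIC_VERSION.fullmatch(version)
    )
```

Under these patterns, "20180101" and "1.0.0" are accepted, while "v1" and "20181301" are rejected.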

Control Parser

class pybel.parser.parse_control.ControlParser(annotation_to_term=None, annotation_to_pattern=None, annotation_to_local=None, citation_clearing=True, required_annotations=None)[source]

A parser for BEL control statements.

See also

BEL 1.0 specification on control records

Initialize the control statement parser.

Parameters
  • annotation_to_term (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of valid values} defined with URL for parsing

  • annotation_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {annotation: regular expression string}

  • annotation_to_local (Optional[Mapping[str, Set[str]]]) – A dictionary of {annotation: set of valid values} for parsing defined with LIST

  • citation_clearing (bool) – Should SET Citation statements clear evidence and all annotations?

  • required_annotations (Optional[List[str]]) – Annotations that are required

property citation_is_set

Check if the citation is set.

Return type

bool

has_enumerated_annotation(annotation)[source]

Check if the annotation is defined as an enumeration.

Return type

bool

has_regex_annotation(annotation)[source]

Check if the annotation is defined as a regular expression.

Return type

bool

has_local_annotation(annotation)[source]

Check if the annotation is defined locally.

Return type

bool

has_annotation(annotation)[source]

Check if the annotation is defined.

Return type

bool

raise_for_undefined_annotation(line, position, annotation)[source]

Raise an exception if the annotation is not defined.

Raises

UndefinedAnnotationWarning

Return type

None

raise_for_invalid_annotation_value(line, position, key, value)[source]

Raise an exception if the value is not valid for the given annotation.

Raises

IllegalAnnotationValueWarning or MissingAnnotationRegexWarning

Return type

None
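
The three annotation dictionaries imply three ways a value can be checked: membership in an enumerated set, membership in a locally defined set, or a match against a regular expression. A minimal sketch of that decision follows. The function name, the lookup order, and the use of fullmatch are assumptions rather than PyBEL's exact behavior.

```python
import re
from typing import Mapping, Set


def is_valid_annotation_value(
    key: str,
    value: str,
    annotation_to_term: Mapping[str, Set[str]],
    annotation_to_pattern: Mapping[str, re.Pattern],
    annotation_to_local: Mapping[str, Set[str]],
) -> bool:
    """Check a value against whichever definition exists for the annotation key."""
    if key in annotation_to_term:
        return value in annotation_to_term[key]
    if key in annotation_to_local:
        return value in annotation_to_local[key]
    if key in annotation_to_pattern:
        return annotation_to_pattern[key].fullmatch(value) is not None
    # The key is not defined in any dictionary.
    raise KeyError(key)
```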

raise_for_missing_citation(line, position)[source]

Raise an exception if there is no citation present in the parser.

Raises

MissingCitationException

Return type

None

handle_annotation_key(line, position, tokens)[source]

Handle an annotation key before parsing to validate that it is defined either as an enumeration or as a regular expression.

Raises

MissingCitationException or UndefinedAnnotationWarning

Return type

ParseResults

handle_set_statement_group(_, __, tokens)[source]

Handle a SET STATEMENT_GROUP = "X" statement.

Return type

ParseResults

handle_set_citation(line, position, tokens)[source]

Handle a SET Citation = {"X", "Y", "Z", ...} statement.

Return type

ParseResults

handle_set_evidence(_, __, tokens)[source]

Handle a SET Evidence = "" statement.

Return type

ParseResults

handle_set_command(line, position, tokens)[source]

Handle a SET X = "Y" statement.

Return type

ParseResults

handle_set_command_list(line, position, tokens)[source]

Handle a SET X = {"Y", "Z", ...} statement.

Return type

ParseResults

handle_unset_statement_group(line, position, tokens)[source]

Unset the statement group, or raise an exception if it is not set.

Raises

MissingAnnotationKeyWarning

Return type

ParseResults

handle_unset_citation(line, position, tokens)[source]

Unset the citation, or raise an exception if it is not set.

Raises

MissingAnnotationKeyWarning

Return type

ParseResults

handle_unset_evidence(line, position, tokens)[source]

Unset the evidence, or raise an exception if it is not already set.

The value for tokens[EVIDENCE] corresponds to which alternate of SupportingText or Evidence was used in the BEL script.

Raises

MissingAnnotationKeyWarning

Return type

ParseResults

validate_unset_command(line, position, annotation)[source]

Raise an exception when trying to UNSET X if X is not already set.

Raises

MissingAnnotationKeyWarning

Return type

None

handle_unset_command(line, position, tokens)[source]

Handle an UNSET X statement, or raise an exception if X is not already set.

Raises

MissingAnnotationKeyWarning

Return type

ParseResults

handle_unset_list(line, position, tokens)[source]

Handle UNSET {A, B, ...}, or raise an exception if any of them is not present.

Consider that all unsets are in peril if just one of them is wrong!

Raises

MissingAnnotationKeyWarning

Return type

ParseResults

handle_unset_all(_, __, tokens)[source]

Handle an UNSET_ALL statement.

Return type

ParseResults

get_annotations()[source]

Get the current annotations.

Return type

Dict

get_citation()[source]

Get the citation dictionary.

Return type

Mapping[str, str]

get_missing_required_annotations()[source]

Return missing required annotations.

Return type

List[str]

clear_citation()[source]

Clear the citation and, if citation clearing is enabled, also clear the evidence and annotations.

Return type

None

clear()[source]

Clear the statement_group, citation, evidence, and annotations.

Return type

None
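
The interaction between citation_clearing, required_annotations, and the clearing methods is easier to see with a toy model. The class below is a hedged sketch of the state a control parser keeps, not PyBEL's implementation; the class name ControlState, the set_citation helper, and the citation dictionary keys are hypothetical.

```python
from typing import Dict, List, Optional


class ControlState:
    """Toy model of the state tracked while reading control statements."""

    def __init__(self, citation_clearing: bool = True,
                 required_annotations: Optional[List[str]] = None) -> None:
        self.citation_clearing = citation_clearing
        self.required_annotations = list(required_annotations or [])
        self.statement_group: Optional[str] = None
        self.citation: Dict[str, str] = {}
        self.evidence: Optional[str] = None
        self.annotations: Dict[str, str] = {}

    def clear_citation(self) -> None:
        """Clear the citation; also clear evidence and annotations if enabled."""
        self.citation = {}
        if self.citation_clearing:
            self.evidence = None
            self.annotations.clear()

    def set_citation(self, citation: Dict[str, str]) -> None:
        """Replace the citation, applying the clearing rule first."""
        self.clear_citation()
        self.citation = dict(citation)

    def get_missing_required_annotations(self) -> List[str]:
        """List the required annotations that are not currently set."""
        return [key for key in self.required_annotations
                if key not in self.annotations]

    def clear(self) -> None:
        """Reset the statement group, citation, evidence, and annotations."""
        self.statement_group = None
        self.citation = {}
        self.evidence = None
        self.annotations.clear()
```

With citation_clearing left at its default, setting a new citation drops the previous evidence and annotations, so any required annotation has to be set again before the next statement.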

Concept Parser

class pybel.parser.parse_concept.ConceptParser(namespace_to_term_to_encoding=None, namespace_to_pattern=None, default_namespace=None, allow_naked_names=False)[source]

A parser for concepts in the form of namespace:name or namespace:identifier!name.

Can be made more lenient when given a default namespace or enabling the use of naked names.

Initialize the concept parser.

Parameters
  • namespace_to_term_to_encoding (Optional[Mapping[str, Mapping[Tuple[Optional[str], str], str]]]) – A dictionary of {namespace: {(identifier, name): encoding}}

  • namespace_to_pattern (Optional[Mapping[str, Pattern]]) – A dictionary of {namespace: regular expression string} to compile

  • default_namespace (Optional[Set[str]]) – A set of strings that can be used without a namespace

  • allow_naked_names (bool) – If true, turn off naked namespace failures
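
To illustrate the two concept forms named above, namespace:name and namespace:identifier!name, here is a small regular-expression sketch. It is an approximation for illustration only: the pattern, the Concept tuple, and parse_concept are assumptions, and the real grammar is built with PyParsing.

```python
import re
from typing import NamedTuple, Optional


class Concept(NamedTuple):
    namespace: str
    identifier: Optional[str]
    name: str


# namespace ":" [identifier "!"] name, where the name may be double-quoted.
CONCEPT = re.compile(
    r'(?P<namespace>\w+):'
    r'(?:(?P<identifier>[^\s!"]+)\s*!\s*)?'
    r'(?P<name>"[^"]*"|[^\s!"]+)'
)


def parse_concept(text: str) -> Concept:
    """Split a qualified concept into namespace, optional identifier, and name."""
    match = CONCEPT.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a qualified concept: {text!r}")
    name = match.group("name").strip('"')
    return Concept(match.group("namespace"), match.group("identifier"), name)
```

A string without a namespace prefix, such as a bare AKT1, fails this pattern; that is the "naked name" case that the default namespace and allow_naked_names options relax.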

has_enumerated_namespace(namespace)[source]

Check that the namespace has been defined by an enumeration.

Return type

bool

has_regex_namespace(namespace)[source]

Check that the namespace has been defined by a regular expression.

Return type

bool

has_namespace(namespace)[source]

Check that the namespace has either been defined by an enumeration or a regular expression.

Return type

bool

has_enumerated_namespace_name(namespace, name)[source]

Check that the namespace is defined by an enumeration and that the name is a member.

Return type

bool

has_regex_namespace_name(namespace, name)[source]

Check that the namespace is defined as a regular expression and the name matches it.

Return type

bool

has_namespace_name(line, position, namespace, name)[source]

Check that the namespace is defined and has the given name.

Return type

bool

raise_for_missing_namespace(line, position, namespace, name)[source]

Raise an exception if the namespace is not defined.

Return type

None

raise_for_missing_name(line, position, namespace, name)[source]

Raise an exception if the namespace is not defined or if it does not validate the given name.

Return type

None

raise_for_missing_default(line, position, name)[source]

Raise an exception if the name does not belong to the default namespace.

Return type

None

handle_identifier_qualified(line, position, tokens)[source]

Handle parsing a qualified identifier.

Return type

ParseResults

handle_namespace_default(line, position, tokens)[source]

Handle parsing an identifier for the default namespace.

Return type

ParseResults

static handle_namespace_lenient(line, position, tokens)[source]

Handle parsing an identifier for names missing a namespace that are outside the default namespace.

Return type

ParseResults

handle_namespace_invalid(line, position, tokens)[source]

Raise an exception when parsing a name missing a namespace.

Return type

None
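
The last three handlers describe a fallback chain for names written without a namespace: accept the name if it belongs to the default namespace, accept it leniently if naked names are allowed, and otherwise fail. A minimal sketch of that chain follows. The function name, the returned labels, and the NakedNameError class are hypothetical and only illustrate the order of the checks.

```python
from typing import Optional, Set


class NakedNameError(ValueError):
    """Hypothetical error for a name that has no namespace and no fallback."""


def resolve_unqualified_name(
    name: str,
    default_namespace: Optional[Set[str]] = None,
    allow_naked_names: bool = False,
) -> str:
    """Decide how a name without a namespace prefix is handled."""
    if default_namespace and name in default_namespace:
        return "default"   # the name is listed in the default namespace
    if allow_naked_names:
        return "lenient"   # accepted only because naked names are allowed
    raise NakedNameError(name)
```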

Sub-Parsers

Parsers for modifications to abundances.