Internal Domain Specific Language

PyBEL implements an internal domain-specific language (DSL).

This enables you to write BEL using Python scripts. Even better, you can programmatically generate BEL using Python. See the Bio2BEL paper and repository for many examples.

Internally, the BEL parser converts BEL script into DSL objects, then adds them to a BEL graph object. When you iterate over a pybel.BELGraph, the nodes are instances of subclasses of pybel.dsl.BaseEntity.

Primitives

class pybel.dsl.Entity(*, namespace, name=None, identifier=None)[source]

Represents a named entity with a namespace and name/identifier.

Create a dictionary representing a reference to an entity.

Parameters
  • namespace (str) – The namespace to which the entity belongs

  • name (Optional[str]) – The name of the entity

  • identifier (Optional[str]) – The identifier of the entity in the namespace
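The idea of "a dictionary representing a reference to an entity" can be sketched without PyBEL. The helper below is hypothetical (make_entity_reference is not part of pybel.dsl), and the rule that at least one of name or identifier must be present is an assumption made for illustration.

```python
from typing import Dict, Optional


def make_entity_reference(
    namespace: str,
    name: Optional[str] = None,
    identifier: Optional[str] = None,
) -> Dict[str, str]:
    """Build a plain dictionary describing a namespaced entity (hypothetical helper)."""
    # Assumption: a reference needs a name or an identifier to point at anything.
    if name is None and identifier is None:
        raise ValueError("either name or identifier must be given")
    reference = {"namespace": namespace}
    # Only store the keys that were actually provided.
    if name is not None:
        reference["name"] = name
    if identifier is not None:
        reference["identifier"] = identifier
    return reference


akt1 = make_entity_reference("HGNC", name="AKT1", identifier="391")
```

Leaving out absent keys, rather than storing None, keeps two references to the same entity comparable by simple dictionary equality.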

class pybel.dsl.BaseEntity[source]

This is the superclass for all BEL terms.

A BEL term has three properties:

  1. It has a type. Subclasses of this class should set the class variable function.

  2. It can be converted to BEL. Because this is an abstract class, every subclass must implement this functionality in as_bel().

  3. It can be hashed, based on the BEL conversion.
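The three properties above can be modeled in a few lines of standard-library Python. This is only an illustrative sketch: SketchEntity and SketchProtein are hypothetical names, and the code is not PyBEL's implementation.

```python
from abc import ABC, abstractmethod


class SketchEntity(ABC):
    """Hypothetical stand-in for a BEL term base class."""

    # 1. Each concrete subclass sets its type in this class variable.
    function: str = ""

    @abstractmethod
    def as_bel(self) -> str:
        """2. Return the BEL string for this term."""

    def __hash__(self) -> int:
        # 3. Hashing is delegated to the BEL string.
        return hash(self.as_bel())

    def __eq__(self, other: object) -> bool:
        return isinstance(other, SketchEntity) and self.as_bel() == other.as_bel()


class SketchProtein(SketchEntity):
    """Hypothetical concrete term with a namespace and a name."""

    function = "Protein"

    def __init__(self, namespace: str, name: str) -> None:
        self.namespace = namespace
        self.name = name

    def as_bel(self) -> str:
        return f"p({self.namespace}:{self.name})"


first = SketchProtein("HGNC", "AKT1")
second = SketchProtein("HGNC", "AKT1")
unique_terms = {first, second}  # both objects collapse into one set member
```

Because equality and hashing both go through as_bel(), two separately constructed objects that describe the same term behave as a single dictionary key or set member.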

class pybel.dsl.BaseAbundance(namespace, name=None, identifier=None, xrefs=None)[source]

The superclass for all named BEL terms.

A named BEL term has:

  1. A type (taken care of by being a subclass of BaseEntity)

  2. A named Entity. Though this doesn’t directly inherit from Entity, it creates one internally using the namespace, identifier, and name. Ideally, both the identifier and name are given. If one is missing, it can be looked up with pybel.grounding.ground().

  3. An optional list of xrefs, corresponding to the whole entity, not just the namespace/name. For example, the BEL term p(HGNC:APP, frag(672_713)) could xref CHEBI:64647.

Build an abundance from a function, namespace, and a name and/or identifier.

Parameters
  • namespace (str) – The name of the namespace

  • name (Optional[str]) – The name of this abundance

  • identifier (Optional[str]) – The database identifier for this abundance

  • xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity

class pybel.dsl.ListAbundance(members)[source]

The superclass for all BEL terms defined by lists, as opposed to by names like in BaseAbundance.

Build a list abundance node.

Parameters

members (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries

Named Entities

class pybel.dsl.Abundance(namespace, name=None, identifier=None, xrefs=None)[source]

Builds an abundance node.

>>> from pybel.dsl import Abundance
>>> Abundance(namespace='CHEBI', name='water')

Build an abundance from a function, namespace, and a name and/or identifier.

Parameters
  • namespace (str) – The name of the namespace

  • name (Optional[str]) – The name of this abundance

  • identifier (Optional[str]) – The database identifier for this abundance

  • xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity

class pybel.dsl.BiologicalProcess(namespace, name=None, identifier=None, xrefs=None)[source]

Builds a biological process node.

>>> from pybel.dsl import BiologicalProcess
>>> BiologicalProcess(namespace='GO', name='apoptosis')

Build an abundance from a function, namespace, and a name and/or identifier.

Parameters
  • namespace (str) – The name of the namespace

  • name (Optional[str]) – The name of this abundance

  • identifier (Optional[str]) – The database identifier for this abundance

  • xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity

class pybel.dsl.Pathology(namespace, name=None, identifier=None, xrefs=None)[source]

Build a pathology node.

>>> from pybel.dsl import Pathology
>>> Pathology(namespace='DO', name='Alzheimer Disease')

Build an abundance from a function, namespace, and a name and/or identifier.

Parameters
  • namespace (str) – The name of the namespace

  • name (Optional[str]) – The name of this abundance

  • identifier (Optional[str]) – The database identifier for this abundance

  • xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity

class pybel.dsl.Population(namespace, name=None, identifier=None, xrefs=None)[source]

Builds a population node.

>>> from pybel.dsl import Population
>>> Population(namespace='uberon', name='blood')

Build an abundance from a function, namespace, and a name and/or identifier.

Parameters
  • namespace (str) – The name of the namespace

  • name (Optional[str]) – The name of this abundance

  • identifier (Optional[str]) – The database identifier for this abundance

  • xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity

Central Dogma

class pybel.dsl.CentralDogma(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]

The base class for “central dogma” abundances (i.e., genes, miRNAs, RNAs, and proteins).

Build a node for a gene, RNA, miRNA, or protein.

Parameters
  • namespace (str) – The name of the database used to identify this entity

  • name (Optional[str]) – The database’s preferred name or label for this entity

  • identifier (Optional[str]) – The database’s identifier for this entity

  • xrefs (Optional[List[Entity]]) – Alternative database cross references

  • variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
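The variants parameter accepts nothing, a single variant, or an iterable of variants. One common way to handle such an argument is to normalize it into a list up front. The sketch below uses a hypothetical SketchVariant placeholder and a hypothetical normalize_variants helper; it does not claim to mirror how PyBEL handles the argument internally.

```python
from typing import Iterable, List, Union


class SketchVariant:
    """Hypothetical placeholder for a variant object."""

    def __init__(self, label: str) -> None:
        self.label = label


def normalize_variants(
    variants: Union[None, SketchVariant, Iterable[SketchVariant]],
) -> List[SketchVariant]:
    """Return a list regardless of which accepted form was passed in."""
    if variants is None:
        return []
    if isinstance(variants, SketchVariant):
        # A single variant is wrapped so callers can always iterate.
        return [variants]
    return list(variants)
```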

class pybel.dsl.Gene(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]

Builds a gene node.

Build a node for a gene, RNA, miRNA, or protein.

Parameters
  • namespace (str) – The name of the database used to identify this entity

  • name (Optional[str]) – The database’s preferred name or label for this entity

  • identifier (Optional[str]) – The database’s identifier for this entity

  • xrefs (Optional[List[Entity]]) – Alternative database cross references

  • variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants

class pybel.dsl.Transcribable(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]

A base class for RNA and micro-RNA to share getting of their corresponding genes.

Build a node for a gene, RNA, miRNA, or protein.

Parameters
  • namespace (str) – The name of the database used to identify this entity

  • name (Optional[str]) – The database’s preferred name or label for this entity

  • identifier (Optional[str]) – The database’s identifier for this entity

  • xrefs (Optional[List[Entity]]) – Alternative database cross references

  • variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants

class pybel.dsl.Rna(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]

Builds an RNA node.

Example: the RNA of the AKT1 protein-coding gene:

>>> from pybel.dsl import Rna
>>> Rna(namespace='HGNC', name='AKT1', identifier='391')

Non-coding RNAs, such as U85, can also be encoded:

>>> from pybel.dsl import Rna
>>> Rna(namespace='SNORNABASE', identifier='SR0000073')

Build a node for a gene, RNA, miRNA, or protein.

Parameters
  • namespace (str) – The name of the database used to identify this entity

  • name (Optional[str]) – The database’s preferred name or label for this entity

  • identifier (Optional[str]) – The database’s identifier for this entity

  • xrefs (Optional[List[Entity]]) – Alternative database cross references

  • variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants

class pybel.dsl.MicroRna(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]

Represents a micro-RNA.

Human miRNAs are listed in HUGO’s MicroRNAs (MIR) gene family.

MIR1-1 from HGNC:

>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='HGNC', name='MIR1-1', identifier='31499')

MIR1-1 from miRBase:

>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='MIRBASE', identifier='MI0000651')

MIR1-1 from Entrez Gene:

>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='ENTREZ', identifier='406904')

Build a node for a gene, RNA, miRNA, or protein.

Parameters
  • namespace (str) – The name of the database used to identify this entity

  • name (Optional[str]) – The database’s preferred name or label for this entity

  • identifier (Optional[str]) – The database’s identifier for this entity

  • xrefs (Optional[List[Entity]]) – Alternative database cross references

  • variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants

class pybel.dsl.Protein(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]

Builds a protein node.

Example: AKT1

>>> from pybel.dsl import Protein
>>> Protein(namespace='HGNC', name='AKT1')

Example: AKT1 with the optional HGNC database identifier

>>> from pybel.dsl import Protein
>>> Protein(namespace='HGNC', name='AKT1', identifier='391')

Example: AKT1 with phosphorylation

>>> from pybel.dsl import Protein, ProteinModification
>>> Protein(namespace='HGNC', name='AKT1', variants=[ProteinModification('Ph', code='Thr', position=308)])

Build a node for a gene, RNA, miRNA, or protein.

Parameters
  • namespace (str) – The name of the database used to identify this entity

  • name (Optional[str]) – The database’s preferred name or label for this entity

  • identifier (Optional[str]) – The database’s identifier for this entity

  • xrefs (Optional[List[Entity]]) – Alternative database cross references

  • variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants

Variants

class pybel.dsl.Variant(kind)[source]

The superclass for variant dictionaries.

Build the variant data dictionary.

Parameters

kind (str) – The kind of variant

class pybel.dsl.ProteinModification(name, code=None, position=None, namespace=None, identifier=None, xrefs=None)[source]

Build a protein modification variant dictionary.

Build a protein modification variant data dictionary.

Parameters
  • name (str) – The name of the modification

  • code (Optional[str]) – The three letter amino acid code for the affected residue. Capital first letter.

  • position (Optional[int]) – The position of the affected residue

  • namespace (Optional[str]) – The namespace to which the name of this modification belongs

  • identifier (Optional[str]) – The identifier of the name of the modification

  • xrefs (Optional[List[Entity]]) – Alternative database xrefs

Either the name or the identifier must be used. If the namespace is omitted, it is assumed that a name is specified from the BEL default namespace.
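To make the namespace rule concrete, the sketch below renders a modification as a BEL-style pmod() string. It is a hypothetical, standard-library-only illustration: sketch_pmod and quote_if_needed are not PyBEL functions, the quoting rule is an assumption, and the exact string PyBEL produces may differ (for example, when an identifier is included).

```python
from typing import Optional


def quote_if_needed(value: str) -> str:
    """Quote a value unless it consists only of letters, digits, and underscores."""
    if value.replace("_", "").isalnum():
        return value
    return f'"{value}"'


def sketch_pmod(
    name: str,
    code: Optional[str] = None,
    position: Optional[int] = None,
    namespace: Optional[str] = None,
) -> str:
    """Render a pmod() string (hypothetical helper)."""
    # No namespace: the name is taken from the BEL default namespace.
    if namespace is None:
        label = quote_if_needed(name)
    else:
        label = f"{namespace}:{quote_if_needed(name)}"
    parts = [label]
    # Assumption: a position is only meaningful once the residue code is known.
    if code is not None:
        parts.append(code)
        if position is not None:
            parts.append(str(position))
    joined = ", ".join(parts)
    return f"pmod({joined})"


default_ns = sketch_pmod("Ph", code="Thr", position=308)
custom_ns = sketch_pmod("protein phosphorylation", namespace="GO", code="Thr", position=308)
```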

Example from BEL default namespace:

>>> from pybel.dsl import ProteinModification
>>> ProteinModification('Ph', code='Thr', position=308)

Example from custom namespace:

>>> from pybel.dsl import ProteinModification
>>> ProteinModification(name='protein phosphorylation', namespace='GO', code='Thr', position=308)

Example from custom namespace additionally qualified with identifier:

>>> from pybel.dsl import ProteinModification
>>> ProteinModification(name='protein phosphorylation', namespace='GO',
...                     identifier='0006468', code='Thr', position=308)

class pybel.dsl.GeneModification(name, namespace=None, identifier=None, xrefs=None)[source]

Build a gene modification variant dictionary.

Build a gene modification variant data dictionary.

Parameters
  • name (str) – The name of the modification

  • namespace (Optional[str]) – The namespace to which the name of this modification belongs

  • identifier (Optional[str]) – The identifier of the name of the modification

  • xrefs (Optional[List[Entity]]) – Alternative database xrefs

Either the name or the identifier must be used. If the namespace is omitted, it is assumed that a name is specified from the BEL default namespace.

Example from BEL default namespace:

>>> from pybel.dsl import GeneModification
>>> GeneModification(name='Me')

Example from custom namespace:

>>> from pybel.dsl import GeneModification
>>> GeneModification(name='DNA methylation', namespace='GO', identifier='0006306')

class pybel.dsl.Hgvs(variant)[source]

Builds an HGVS variant dictionary.

Build an HGVS variant data dictionary.

Parameters

variant (str) – The HGVS variant string

>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])

class pybel.dsl.HgvsReference[source]

Represents the “reference” variant in HGVS.

Build an HGVS variant data dictionary.

Parameters

variant – The HGVS variant string

>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])

class pybel.dsl.HgvsUnspecified[source]

Represents an unspecified variant in HGVS.

Build an HGVS variant data dictionary.

Parameters

variant – The HGVS variant string

>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])

class pybel.dsl.ProteinSubstitution(from_aa, position, to_aa)[source]

A protein substitution variant.

Build an HGVS variant data dictionary for the given protein substitution.

Parameters
  • from_aa (str) – The 3-letter amino acid code of the original residue

  • position (int) – The position of the residue

  • to_aa (str) – The 3-letter amino acid code of the new residue
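Judging from the parallel Hgvs('p.Ala127Tyr') and ProteinSubstitution('Ala', 127, 'Tyr') examples in this section, a substitution presumably corresponds to an HGVS protein string of the form p.<from><position><to>. The helper below is a hypothetical sketch of that mapping, with a simple capitalization check added as an assumption; it is not PyBEL's code.

```python
def sketch_protein_substitution(from_aa: str, position: int, to_aa: str) -> str:
    """Format a protein substitution as an HGVS-style string (hypothetical helper)."""
    for code in (from_aa, to_aa):
        # Assumption: codes are three letters with a capital first letter, e.g. 'Ala'.
        if len(code) != 3 or not (code[0].isupper() and code[1:].islower()):
            raise ValueError(f"not a three-letter amino acid code: {code!r}")
    return f"p.{from_aa}{position}{to_aa}"


hgvs_string = sketch_protein_substitution("Ala", 127, "Tyr")
```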

>>> from pybel.dsl import Protein, ProteinSubstitution
>>> Protein(namespace='HGNC', name='AKT1', variants=[ProteinSubstitution('Ala', 127, 'Tyr')])

class pybel.dsl.Fragment(start=None, stop=None, description=None)[source]

Represent the information about a protein fragment.

Build a protein fragment data dictionary.

Parameters
  • start (Union[None, int, str]) – The starting position

  • stop (Union[None, int, str]) – The stopping position

  • description (Optional[str]) – An optional description

Example of specified fragment:

>>> from pybel.dsl import Protein, Fragment
>>> Protein(name='APP', namespace='HGNC', variants=[Fragment(start=672, stop=713)])

Example of unspecified fragment:

>>> from pybel.dsl import Protein, Fragment
>>> Protein(name='APP', namespace='HGNC', variants=[Fragment()])
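As a rough illustration of what the two cases above stand for, the following sketch renders a fragment as a BEL-style frag() string. It is hypothetical and independent of PyBEL: sketch_fragment is an invented name, quoting the range follows BEL 2.0 convention as an assumption, and using '?' for a missing endpoint is likewise an assumption.

```python
from typing import Optional, Union

Position = Union[None, int, str]


def sketch_fragment(
    start: Position = None,
    stop: Position = None,
    description: Optional[str] = None,
) -> str:
    """Render a frag() string (hypothetical helper)."""
    if start is None and stop is None:
        # Unspecified fragment: nothing is known about the range.
        fragment_range = "?"
    else:
        # Assumption: a single missing endpoint is written as '?'.
        left = "?" if start is None else str(start)
        right = "?" if stop is None else str(stop)
        fragment_range = f"{left}_{right}"
    arguments = [f'"{fragment_range}"']
    if description is not None:
        arguments.append(f'"{description}"')
    return "frag(" + ", ".join(arguments) + ")"


specified = sketch_fragment(start=672, stop=713)
unspecified = sketch_fragment()
```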

Fusions

class pybel.dsl.FusionBase(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]

The superclass for building fusion node data dictionaries.

Build a fusion node.

Parameters
  • partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner

  • partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner

  • range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner

  • range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner

class pybel.dsl.GeneFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]

Builds a gene fusion node.

Example, using fusion ranges with the ‘c’ qualifier:

>>> from pybel.dsl import EnumeratedFusionRange, Gene, GeneFusion
>>> GeneFusion(
...     partner_5p=Gene(namespace='HGNC', name='TMPRSS2'),
...     range_5p=EnumeratedFusionRange('c', 1, 79),
...     partner_3p=Gene(namespace='HGNC', name='ERG'),
...     range_3p=EnumeratedFusionRange('c', 312, 5034),
... )

Example with missing fusion ranges:

>>> from pybel.dsl import GeneFusion, Gene
>>> GeneFusion(
...     partner_5p=Gene(namespace='HGNC', name='TMPRSS2'),
...     partner_3p=Gene(namespace='HGNC', name='ERG'),
... )

Build a fusion node.

Parameters
  • partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner

  • partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner

  • range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner

  • range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner

class pybel.dsl.RnaFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]

Builds an RNA fusion node.

Example, with fusion ranges using the ‘r’ qualifier:

>>> from pybel.dsl import EnumeratedFusionRange, Rna, RnaFusion
>>> RnaFusion(
...     partner_5p=Rna(namespace='HGNC', name='TMPRSS2'),
...     range_5p=EnumeratedFusionRange('r', 1, 79),
...     partner_3p=Rna(namespace='HGNC', name='ERG'),
...     range_3p=EnumeratedFusionRange('r', 312, 5034),
... )

Example with missing fusion ranges:

>>> from pybel.dsl import RnaFusion, Rna
>>> RnaFusion(
...     partner_5p=Rna(namespace='HGNC', name='TMPRSS2'),
...     partner_3p=Rna(namespace='HGNC', name='ERG'),
... )

Build a fusion node.

Parameters
  • partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner

  • partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner

  • range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner

  • range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner

class pybel.dsl.ProteinFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]

Builds a protein fusion node.

Build a fusion node.

Parameters
  • partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner

  • partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner

  • range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner

  • range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner

Fusion Ranges

class pybel.dsl.FusionRangeBase[source]

The superclass for fusion range data dictionaries.

class pybel.dsl.EnumeratedFusionRange(reference, start, stop)[source]

Represents an enumerated fusion range.

Build an enumerated fusion range.

Parameters
  • reference (str) – The reference code

  • start (Union[int, str]) – The start position, either specified by its integer position, or ‘?’

  • stop (Union[int, str]) – The stop position, either specified by its integer position, ‘?’, or ‘*’
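The allowed values for start and stop can be illustrated with a small formatter that produces range strings such as r.1_79. This is a hypothetical sketch (sketch_fusion_range is not a PyBEL function); the reference.start_stop layout follows the BEL fusion notation as an assumption.

```python
from typing import Union


def sketch_fusion_range(
    reference: str,
    start: Union[int, str],
    stop: Union[int, str],
) -> str:
    """Format an enumerated fusion range (hypothetical helper)."""
    # The start is an integer position or the unknown marker '?'.
    if isinstance(start, str) and start != "?":
        raise ValueError("start must be an integer or '?'")
    # The stop additionally allows '*'.
    if isinstance(stop, str) and stop not in {"?", "*"}:
        raise ValueError("stop must be an integer, '?', or '*'")
    return f"{reference}.{start}_{stop}"


rna_range = sketch_fusion_range("r", 1, 79)
open_ended = sketch_fusion_range("c", "?", "*")
```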

Example fully specified RNA fusion range:

>>> from pybel.dsl import EnumeratedFusionRange
>>> EnumeratedFusionRange('r', 1, 79)

class pybel.dsl.MissingFusionRange[source]

Represents a fusion range with no defined start or end.

Build a missing fusion range.

List Abundances

class pybel.dsl.ComplexAbundance(members, namespace=None, name=None, identifier=None, xrefs=None)[source]

Build a complex abundance node with the optional ability to specify a name.

Build a complex list node.

Parameters
  • members (Iterable[BaseAbundance]) – A list of PyBEL node data dictionaries

  • namespace (Optional[str]) – The namespace from which the name originates

  • name (Optional[str]) – The name of the complex

  • identifier (Optional[str]) – The identifier in the namespace in which the name originates

  • xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity if it is named

class pybel.dsl.CompositeAbundance(members)[source]

Build a composite abundance node.

This node is effectively the “AND” inside BEL, which helps represent cases where two things must be true at the same time. For example, in COVID-19, if both the NF-κB complex and the IL6-STAT3 complex are present, then acute respiratory distress syndrome occurs.

>>> from pybel.dsl import CompositeAbundance, ComplexAbundance, Protein, NamedComplexAbundance
>>> CompositeAbundance([
...     NamedComplexAbundance('fplx', 'nfkb'),
...     ComplexAbundance([
...         Protein('hgnc', identifier='6018', name='IL6'),
...         Protein('hgnc', identifier='11364', name='STAT3'),
...     ]),
... ])

Build a list abundance node.

Parameters

members (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries

class pybel.dsl.Reaction(reactants, products)[source]

Build a reaction node.

Parameters
  • reactants (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries representing the reactants

  • products (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries representing the products

>>> from pybel.dsl import Reaction, Protein, Abundance
>>> Reaction([Protein(namespace='HGNC', name='KNG1')], [Abundance(namespace='CHEBI', name='bradykinin')])
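For orientation, a reaction term in BEL wraps its two member lists in reactants() and products(). The sketch below assembles such a string from already-rendered member terms. It is a hypothetical, standard-library-only helper (sketch_reaction is not part of PyBEL), and the member ordering PyBEL uses may differ.

```python
from typing import Iterable


def sketch_reaction(reactants: Iterable[str], products: Iterable[str]) -> str:
    """Assemble a rxn() string from member BEL strings (hypothetical helper)."""
    reactant_list = ", ".join(reactants)
    product_list = ", ".join(products)
    return f"rxn(reactants({reactant_list}), products({product_list}))"


# Member strings are written by hand here; they mirror the KNG1/bradykinin example.
bel = sketch_reaction(["p(HGNC:KNG1)"], ["a(CHEBI:bradykinin)"])
```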

Utilities

The following function is useful for building DSL objects from dictionaries:

pybel.tokens.parse_result_to_dsl(tokens)[source]

Convert a ParseResult to a PyBEL DSL object.

Return type

BaseEntity