Internal Domain Specific Language
PyBEL implements an internal domain-specific language (DSL).
This enables you to write BEL using Python scripts. Even better, you can programmatically generate BEL using Python. See the Bio2BEL paper and repository for many examples.
Internally, the BEL parser converts BEL script into the BEL DSL, then adds it to a BEL graph object. When you iterate through the pybel.BELGraph, the nodes are instances of subclasses of pybel.dsl.BaseEntity.
Primitives
- class pybel.dsl.Entity(*, namespace, name=None, identifier=None)[source]
Represents a named entity with a namespace and name/identifier.
Create a dictionary representing a reference to an entity.
- class pybel.dsl.BaseEntity[source]
This is the superclass for all BEL terms.
A BEL term has three properties:
1. It has a type. Subclasses of this class should set the class variable function.
2. It can be converted to BEL. Note, this is an abstract class, so all subclasses must implement this functionality in as_bel().
3. It can be hashed, based on the BEL conversion.
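The three properties above can be sketched with a small abstract class. This is an illustrative stand-in written with the standard library only, not PyBEL's implementation: the names SketchEntity and SketchProtein are hypothetical, and only the function class variable and the as_bel() method come from the description above.

```python
from abc import ABC, abstractmethod


class SketchEntity(ABC):
    """Hypothetical stand-in for a BEL term (not PyBEL's BaseEntity)."""

    #: Subclasses set this to the BEL function they represent.
    function = None

    @abstractmethod
    def as_bel(self) -> str:
        """Return this term as a BEL string."""

    def __hash__(self) -> int:
        # Hash on the BEL string so equal terms collapse in sets and dicts.
        return hash(self.as_bel())

    def __eq__(self, other) -> bool:
        return isinstance(other, SketchEntity) and self.as_bel() == other.as_bel()


class SketchProtein(SketchEntity):
    """Hypothetical concrete term that sets ``function`` and implements ``as_bel``."""

    function = "p"

    def __init__(self, namespace: str, name: str) -> None:
        self.namespace = namespace
        self.name = name

    def as_bel(self) -> str:
        return f"{self.function}({self.namespace}:{self.name})"


# Two separately built but identical terms count as one node.
nodes = {SketchProtein("HGNC", "AKT1"), SketchProtein("HGNC", "AKT1")}
print(len(nodes))  # prints 1
```

Hashing on the BEL string means that two objects describing the same term behave as the same key, which is what a graph needs to avoid duplicate nodes.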
- class pybel.dsl.BaseAbundance(namespace, name=None, identifier=None, xrefs=None)[source]
The superclass for all named BEL terms.
A named BEL term has:
1. A type (taken care of by being a subclass of BaseEntity).
2. A named Entity. Though this doesn’t directly inherit from Entity, it creates one internally using the namespace, identifier, and name. Ideally, both the identifier and name are given. If one is missing, it can be looked up with pybel.grounding.ground().
3. An optional list of xrefs, corresponding to the whole entity, not just the namespace/name. For example, the BEL term p(HGNC:APP, frag(672_713)) could xref CHEBI:64647.
Build an abundance from a function, namespace, and a name and/or identifier.
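The composition described above (a named term holds a reference object rather than inheriting from it, and keeps its xrefs alongside) might look like the following sketch. It uses only the standard library; SketchReference and SketchNamedTerm are hypothetical names, and the check that at least one of name or identifier is given is an assumption drawn from the "name and/or identifier" wording, not a statement about PyBEL's actual validation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SketchReference:
    """Hypothetical stand-in for Entity: a namespace plus a name and/or identifier."""

    namespace: str
    name: Optional[str] = None
    identifier: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: at least one of name/identifier must be present.
        if self.name is None and self.identifier is None:
            raise ValueError("give a name, an identifier, or both")


class SketchNamedTerm:
    """Hypothetical named term that holds a reference instead of inheriting from it."""

    def __init__(self, namespace, name=None, identifier=None, xrefs=None):
        self.entity = SketchReference(namespace, name, identifier)
        # xrefs describe the whole term, so they are stored beside the reference.
        self.xrefs: List[SketchReference] = list(xrefs or [])


# The APP example from the text; the fragment variant itself is not modeled here.
app_term = SketchNamedTerm(
    namespace="HGNC",
    name="APP",
    xrefs=[SketchReference(namespace="CHEBI", identifier="64647")],
)
print(app_term.entity.name, app_term.xrefs[0].identifier)  # prints: APP 64647
```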
- class pybel.dsl.ListAbundance(members)[source]
The superclass for all BEL terms defined by lists, as opposed to by names like in BaseAbundance.
Build a list abundance node.
- Parameters
members (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries
Named Entities
- class pybel.dsl.Abundance(namespace, name=None, identifier=None, xrefs=None)[source]
Builds an abundance node.
>>> from pybel.dsl import Abundance
>>> Abundance(namespace='CHEBI', name='water')
Build an abundance from a function, namespace, and a name and/or identifier.
- class pybel.dsl.BiologicalProcess(namespace, name=None, identifier=None, xrefs=None)[source]
Builds a biological process node.
>>> from pybel.dsl import BiologicalProcess
>>> BiologicalProcess(namespace='GO', name='apoptosis')
Build an abundance from a function, namespace, and a name and/or identifier.
- class pybel.dsl.Pathology(namespace, name=None, identifier=None, xrefs=None)[source]
Build a pathology node.
>>> from pybel.dsl import Pathology
>>> Pathology(namespace='DO', name='Alzheimer Disease')
Build an abundance from a function, namespace, and a name and/or identifier.
Central Dogma
- class pybel.dsl.CentralDogma(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]
The base class for “central dogma” abundances (i.e., genes, miRNAs, RNAs, and proteins).
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
namespace (str) – The name of the database used to identify this entity
name (Optional[str]) – The database’s preferred name or label for this entity
identifier (Optional[str]) – The database’s identifier for this entity
xrefs (Optional[List[Entity]]) – Alternative database cross references
variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Gene(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]
Builds a gene node.
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
namespace (str) – The name of the database used to identify this entity
name (Optional[str]) – The database’s preferred name or label for this entity
identifier (Optional[str]) – The database’s identifier for this entity
xrefs (Optional[List[Entity]]) – Alternative database cross references
variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Transcribable(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]
A base class for RNA and micro-RNA to share getting of their corresponding genes.
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
namespace (str) – The name of the database used to identify this entity
name (Optional[str]) – The database’s preferred name or label for this entity
identifier (Optional[str]) – The database’s identifier for this entity
xrefs (Optional[List[Entity]]) – Alternative database cross references
variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Rna(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]
Builds an RNA node.
Example: AKT1 protein coding gene’s RNA:
>>> from pybel.dsl import Rna
>>> Rna(namespace='HGNC', name='AKT1', identifier='391')
Non-coding RNAs can also be encoded such as U85:
>>> from pybel.dsl import Rna
>>> Rna(namespace='SNORNABASE', identifier='SR0000073')
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
namespace (str) – The name of the database used to identify this entity
name (Optional[str]) – The database’s preferred name or label for this entity
identifier (Optional[str]) – The database’s identifier for this entity
xrefs (Optional[List[Entity]]) – Alternative database cross references
variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.MicroRna(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]
Represents a micro-RNA.
Human miRNAs are listed in HUGO’s MicroRNAs (MIR) gene family.
MIR1-1 from HGNC:
>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='HGNC', name='MIR1-1', identifier='31499')
MIR1-1 from miRBase:
>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='MIRBASE', identifier='MI0000651')
MIR1-1 from Entrez Gene:
>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='ENTREZ', identifier='406904')
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
namespace (str) – The name of the database used to identify this entity
name (Optional[str]) – The database’s preferred name or label for this entity
identifier (Optional[str]) – The database’s identifier for this entity
xrefs (Optional[List[Entity]]) – Alternative database cross references
variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Protein(namespace, name=None, identifier=None, xrefs=None, variants=None)[source]
Builds a protein node.
Example: AKT1
>>> from pybel.dsl import Protein
>>> Protein(namespace='HGNC', name='AKT1')
Example: AKT1 with optionally included HGNC database identifier
>>> from pybel.dsl import Protein
>>> Protein(namespace='HGNC', name='AKT1', identifier='391')
Example: AKT1 with phosphorylation
>>> from pybel.dsl import Protein, ProteinModification
>>> Protein(namespace='HGNC', name='AKT1', variants=[ProteinModification('Ph', code='Thr', position=308)])
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
namespace (str) – The name of the database used to identify this entity
name (Optional[str]) – The database’s preferred name or label for this entity
identifier (Optional[str]) – The database’s identifier for this entity
xrefs (Optional[List[Entity]]) – Alternative database cross references
variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
Variants
- class pybel.dsl.Variant(kind)[source]
The superclass for variant dictionaries.
Build the variant data dictionary.
- Parameters
kind (str) – The kind of variant
- class pybel.dsl.ProteinModification(name, code=None, position=None, namespace=None, identifier=None, xrefs=None)[source]
Build a protein modification variant dictionary.
Build a protein modification variant data dictionary.
- Parameters
name (str) – The name of the modification
code (Optional[str]) – The three letter amino acid code for the affected residue. Capital first letter.
position (Optional[int]) – The position of the affected residue
namespace (Optional[str]) – The namespace to which the name of this modification belongs
identifier (Optional[str]) – The identifier of the name of the modification
Either the name or the identifier must be used. If the namespace is omitted, it is assumed that a name is specified from the BEL default namespace.
Example from BEL default namespace:
>>> from pybel.dsl import ProteinModification
>>> ProteinModification('Ph', code='Thr', position=308)
Example from custom namespace:
>>> from pybel.dsl import ProteinModification
>>> ProteinModification(name='protein phosphorylation', namespace='GO', code='Thr', position=308)
Example from custom namespace additionally qualified with identifier:
>>> from pybel.dsl import ProteinModification
>>> ProteinModification(name='protein phosphorylation', namespace='GO',
...                     identifier='0006468', code='Thr', position=308)
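To make the default-namespace rule concrete, the sketch below renders the same arguments as a BEL pmod() term. It assumes BEL 2.0 syntax, in which a default-namespace name is written bare and a namespaced name is written as a namespace:name pair. The function sketch_pmod and its quoting rule are hypothetical simplifications, not part of PyBEL.

```python
from typing import Optional


def sketch_pmod(
    name: str,
    code: Optional[str] = None,
    position: Optional[int] = None,
    namespace: Optional[str] = None,
) -> str:
    """Render a protein modification as a BEL 2.0 style pmod() string (illustrative only)."""

    def quote(value: str) -> str:
        # Simplified rule: quote anything that is not purely letters, digits, or underscores.
        return value if value.replace("_", "").isalnum() else f'"{value}"'

    # Without a namespace, the name is taken from the BEL default namespace.
    label = f"{namespace}:{quote(name)}" if namespace else name
    parts = [label]
    if code is not None:
        parts.append(code)
        # A position is only meaningful once the residue is known.
        if position is not None:
            parts.append(str(position))
    return "pmod(" + ", ".join(parts) + ")"


print(sketch_pmod("Ph", code="Thr", position=308))
# prints: pmod(Ph, Thr, 308)
print(sketch_pmod("protein phosphorylation", code="Thr", position=308, namespace="GO"))
# prints: pmod(GO:"protein phosphorylation", Thr, 308)
```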
- class pybel.dsl.GeneModification(name, namespace=None, identifier=None, xrefs=None)[source]
Build a gene modification variant dictionary.
Build a gene modification variant data dictionary.
- Parameters
Either the name or the identifier must be used. If the namespace is omitted, it is assumed that a name is specified from the BEL default namespace.
Example from BEL default namespace:
>>> from pybel.dsl import GeneModification
>>> GeneModification(name='Me')
Example from custom namespace:
>>> from pybel.dsl import GeneModification
>>> GeneModification(name='DNA methylation', namespace='GO', identifier='0006306')
- class pybel.dsl.Hgvs(variant)[source]
Builds an HGVS variant dictionary.
Build an HGVS variant data dictionary.
- Parameters
variant (str) – The HGVS variant string
>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])
- class pybel.dsl.HgvsReference[source]
Represents the “reference” variant in HGVS.
Build an HGVS variant data dictionary.
- Parameters
variant – The HGVS variant string
>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])
- class pybel.dsl.HgvsUnspecified[source]
Represents an unspecified variant in HGVS.
Build an HGVS variant data dictionary.
- Parameters
variant – The HGVS variant string
>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])
- class pybel.dsl.ProteinSubstitution(from_aa, position, to_aa)[source]
A protein substitution variant.
Build an HGVS variant data dictionary for the given protein substitution.
- Parameters
>>> from pybel.dsl import Protein, ProteinSubstitution
>>> Protein(namespace='HGNC', name='AKT1', variants=[ProteinSubstitution('Ala', 127, 'Tyr')])
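The relationship between the three substitution arguments and an HGVS string can be illustrated with a one-line helper. This is a sketch of the HGVS protein substitution pattern (p. followed by the reference residue, the position, and the new residue); the function name is hypothetical and this is not PyBEL's code.

```python
def sketch_substitution_to_hgvs(from_aa: str, position: int, to_aa: str) -> str:
    """Format a protein substitution as an HGVS protein-level string (illustrative only)."""
    # Three-letter residue codes are assumed, as in the examples in this section.
    return f"p.{from_aa}{position}{to_aa}"


print(sketch_substitution_to_hgvs("Ala", 127, "Tyr"))  # prints: p.Ala127Tyr
```

The result matches the string passed to Hgvs in the earlier AKT1 example, so the two examples describe the same variant.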
- class pybel.dsl.Fragment(start=None, stop=None, description=None)[source]
Represent the information about a protein fragment.
Build a protein fragment data dictionary.
- Parameters
Example of specified fragment:
>>> from pybel.dsl import Protein, Fragment
>>> Protein(name='APP', namespace='HGNC', variants=[Fragment(start=672, stop=713)])
Example of unspecified fragment:
>>> from pybel.dsl import Protein, Fragment
>>> Protein(name='APP', namespace='HGNC', variants=[Fragment()])
Fusions
- class pybel.dsl.FusionBase(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]
The superclass for building fusion node data dictionaries.
Build a fusion node.
- Parameters
partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner
partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner
range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner
range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner
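As a rough illustration of how the four fusion parameters fit together, the sketch below renders them as a BEL fus() term. It assumes BEL 2.0 syntax, where each partner is followed by a quoted range and a missing range is written as "?". The function sketch_fusion is hypothetical, and partners and ranges are passed as plain strings rather than PyBEL objects.

```python
from typing import Optional


def sketch_fusion(
    function: str,
    partner_5p: str,
    partner_3p: str,
    range_5p: Optional[str] = None,
    range_3p: Optional[str] = None,
) -> str:
    """Render a fusion as a BEL 2.0 style string (illustrative only)."""
    # A missing range is written as "?" in this sketch.
    r5 = range_5p if range_5p is not None else "?"
    r3 = range_3p if range_3p is not None else "?"
    return f'{function}(fus({partner_5p}, "{r5}", {partner_3p}, "{r3}"))'


print(sketch_fusion("g", "HGNC:TMPRSS2", "HGNC:ERG", "c.1_79", "c.312_5034"))
# prints: g(fus(HGNC:TMPRSS2, "c.1_79", HGNC:ERG, "c.312_5034"))
print(sketch_fusion("g", "HGNC:TMPRSS2", "HGNC:ERG"))
# prints: g(fus(HGNC:TMPRSS2, "?", HGNC:ERG, "?"))
```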
- class pybel.dsl.GeneFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]
Builds a gene fusion node.
Example, using fusion ranges with the ‘c’ qualifier:
>>> from pybel.dsl import EnumeratedFusionRange, Gene, GeneFusion
>>> GeneFusion(
...     partner_5p=Gene(namespace='HGNC', name='TMPRSS2'),
...     range_5p=EnumeratedFusionRange('c', 1, 79),
...     partner_3p=Gene(namespace='HGNC', name='ERG'),
...     range_3p=EnumeratedFusionRange('c', 312, 5034),
... )
Example with missing fusion ranges:
>>> from pybel.dsl import GeneFusion, Gene
>>> GeneFusion(
...     partner_5p=Gene(namespace='HGNC', name='TMPRSS2'),
...     partner_3p=Gene(namespace='HGNC', name='ERG'),
... )
Build a fusion node.
- Parameters
partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner
partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner
range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner
range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner
- class pybel.dsl.RnaFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]
Builds an RNA fusion node.
Example, with fusion ranges using the ‘r’ qualifier:
>>> from pybel.dsl import EnumeratedFusionRange, Rna, RnaFusion
>>> RnaFusion(
...     partner_5p=Rna(namespace='HGNC', name='TMPRSS2'),
...     range_5p=EnumeratedFusionRange('r', 1, 79),
...     partner_3p=Rna(namespace='HGNC', name='ERG'),
...     range_3p=EnumeratedFusionRange('r', 312, 5034),
... )
Example with missing fusion ranges:
>>> from pybel.dsl import RnaFusion, Rna
>>> RnaFusion(
...     partner_5p=Rna(namespace='HGNC', name='TMPRSS2'),
...     partner_3p=Rna(namespace='HGNC', name='ERG'),
... )
Build a fusion node.
- Parameters
partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner
partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner
range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner
range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner
- class pybel.dsl.ProteinFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)[source]
Builds a protein fusion node.
Build a fusion node.
- Parameters
partner_5p (CentralDogma) – A PyBEL node for the 5-prime partner
partner_3p (CentralDogma) – A PyBEL node for the 3-prime partner
range_5p (Optional[FusionRangeBase]) – A fusion range for the 5-prime partner
range_3p (Optional[FusionRangeBase]) – A fusion range for the 3-prime partner
Fusion Ranges
List Abundances
- class pybel.dsl.ComplexAbundance(members, namespace=None, name=None, identifier=None, xrefs=None)[source]
Build a complex abundance node with the optional ability to specify a name.
Build a complex list node.
- Parameters
members (Iterable[BaseAbundance]) – A list of PyBEL node data dictionaries
namespace (Optional[str]) – The namespace from which the name originates
identifier (Optional[str]) – The identifier in the namespace in which the name originates
xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity if it is named
- class pybel.dsl.CompositeAbundance(members)[source]
Build a composite abundance node.
This node is effectively the “AND” inside BEL, which can help represent when two things need to be true at the same time. For example, in COVID-19, if both the NF-κB complex and the IL6-STAT3 complex are present, then acute respiratory distress syndrome happens.
>>> from pybel.dsl import CompositeAbundance, ComplexAbundance, Protein, NamedComplexAbundance
>>> CompositeAbundance([
...     NamedComplexAbundance('fplx', 'nfkb'),
...     ComplexAbundance([
...         Protein('hgnc', identifier='6018', name='IL6'),
...         Protein('hgnc', identifier='11364', name='STAT3'),
...     ]),
... ])
Build a list abundance node.
- Parameters
members (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries
- class pybel.dsl.Reaction(reactants, products, namespace=None, name=None, identifier=None, xrefs=None)[source]
Build a reaction node.
- Parameters
reactants (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries representing the reactants
products (Union[BaseAbundance, Iterable[BaseAbundance]]) – A list of PyBEL node data dictionaries representing the products
namespace (Optional[str]) – The namespace from which the name originates
identifier (Optional[str]) – The identifier in the namespace in which the name originates
xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity if it is named
>>> from pybel.dsl import Reaction, Protein, Abundance
>>> Reaction([Protein(namespace='HGNC', name='KNG1')], [Abundance(namespace='CHEBI', name='bradykinin')])
Utilities
The following functions are useful to build DSL objects from dictionaries: