Internal Domain Specific Language
PyBEL implements an internal domain-specific language (DSL).
This enables you to write BEL using Python scripts. Even better, you can programmatically generate BEL using Python. See the Bio2BEL paper and repository for many examples.
Internally, the BEL parser converts BEL script into the BEL DSL, then adds it to a BEL graph object. When you iterate through the pybel.BELGraph, the nodes are instances of subclasses of pybel.dsl.BaseEntity.
Primitives
- class pybel.dsl.Entity(*, namespace, name=None, identifier=None)
Represents a named entity with a namespace and name/identifier.
Create a dictionary representing a reference to an entity.
- class pybel.dsl.BaseEntity
This is the superclass for all BEL terms.
A BEL term has three properties:
  - It has a type. Subclasses of this class should set the class variable function.
  - It can be converted to BEL. Note that this is an abstract class, so all subclasses must implement this functionality in as_bel().
  - It can be hashed, based on the BEL conversion.
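The three properties above can be illustrated with a small stand-in that uses only the standard library. This is a sketch under stated assumptions, not PyBEL's implementation: SketchEntity and SketchProtein are hypothetical names, and the p(namespace:name) rendering is a simplification of what a real term would emit.

```python
from abc import ABC, abstractmethod


class SketchEntity(ABC):
    """Hypothetical stand-in for a BEL term (not pybel.dsl.BaseEntity itself)."""

    # Property 1: each concrete subclass sets its type here.
    function = None

    @abstractmethod
    def as_bel(self) -> str:
        """Property 2: every subclass must say how it renders to BEL."""

    def __hash__(self) -> int:
        # Property 3: the hash is derived from the BEL rendering.
        return hash(self.as_bel())

    def __eq__(self, other) -> bool:
        # Two terms are equal when they render to the same BEL string.
        return isinstance(other, SketchEntity) and self.as_bel() == other.as_bel()


class SketchProtein(SketchEntity):
    """Hypothetical protein term with a simplified BEL rendering."""

    function = "Protein"

    def __init__(self, namespace: str, name: str) -> None:
        self.namespace = namespace
        self.name = name

    def as_bel(self) -> str:
        return f"p({self.namespace}:{self.name})"


a = SketchProtein("HGNC", "AKT1")
b = SketchProtein("HGNC", "AKT1")
```

Because hashing and equality both go through as_bel(), a and b collapse to a single element when placed in a set, which is the behavior a graph needs in order to treat them as the same node.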
- class pybel.dsl.BaseAbundance(namespace, name=None, identifier=None, xrefs=None)
The superclass for all named BEL terms.
A named BEL term has:
  - A type (taken care of by being a subclass of BaseEntity).
  - A named Entity. Though this class doesn’t directly inherit from Entity, it creates one internally using the namespace, identifier, and name. Ideally, both the identifier and name are given. If one is missing, it can be looked up with pybel.grounding.ground().
  - An optional list of xrefs, corresponding to the whole entity, not just the namespace/name. For example, the BEL term p(HGNC:APP, frag(672_713)) could xref CHEBI:64647.
Build an abundance from a function, namespace, and a name and/or identifier.
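The composition described above (a term that holds an entity reference rather than inheriting from one) might look roughly like the following standard-library sketch. The names make_entity and SketchAbundance are hypothetical, the dictionary keys are assumptions, and raising ValueError when both name and identifier are missing is an assumed behavior rather than something the source states.

```python
from typing import Dict, List, Optional


def make_entity(
    namespace: str,
    name: Optional[str] = None,
    identifier: Optional[str] = None,
) -> Dict[str, str]:
    """Build a plain-dict reference to an entity (hypothetical helper)."""
    if name is None and identifier is None:
        # Assumption: at least one of name/identifier is needed to point at something.
        raise ValueError("either name or identifier must be given")
    entity = {"namespace": namespace}
    if name is not None:
        entity["name"] = name
    if identifier is not None:
        entity["identifier"] = identifier
    return entity


class SketchAbundance:
    """Hypothetical named term that holds an entity instead of inheriting from one."""

    def __init__(self, namespace, name=None, identifier=None, xrefs=None):
        self.entity = make_entity(namespace, name, identifier)
        # xrefs describe the whole term, not only the namespace/name pair.
        self.xrefs: List[Dict[str, str]] = list(xrefs or [])


# The fragment variant from the APP example is omitted to keep the sketch short.
term = SketchAbundance(
    namespace="HGNC",
    name="APP",
    xrefs=[make_entity("CHEBI", identifier="64647")],
)
```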
- class pybel.dsl.ListAbundance(members)
The superclass for all BEL terms defined by lists, as opposed to by names like in BaseAbundance.
Build a list abundance node.
Named Entities
- class pybel.dsl.Abundance(namespace, name=None, identifier=None, xrefs=None)
Builds an abundance node.
>>> from pybel.dsl import Abundance
>>> Abundance(namespace='CHEBI', name='water')
Build an abundance from a function, namespace, and a name and/or identifier.
- class pybel.dsl.BiologicalProcess(namespace, name=None, identifier=None, xrefs=None)
Builds a biological process node.
>>> from pybel.dsl import BiologicalProcess
>>> BiologicalProcess(namespace='GO', name='apoptosis')
Build an abundance from a function, namespace, and a name and/or identifier.
- class pybel.dsl.Pathology(namespace, name=None, identifier=None, xrefs=None)
Builds a pathology node.
>>> from pybel.dsl import Pathology
>>> Pathology(namespace='DO', name='Alzheimer Disease')
Build an abundance from a function, namespace, and a name and/or identifier.
Central Dogma
- class pybel.dsl.CentralDogma(namespace, name=None, identifier=None, xrefs=None, variants=None)
The base class for “central dogma” abundances (i.e., genes, miRNAs, RNAs, and proteins).
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
  - namespace (str) – The name of the database used to identify this entity
  - name (Optional[str]) – The database’s preferred name or label for this entity
  - identifier (Optional[str]) – The database’s identifier for this entity
  - xrefs (Optional[List[Entity]]) – Alternative database cross references
  - variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
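The variants parameter accepts nothing, a single variant, or an iterable of variants. One way such an argument could be normalized into a list is sketched below with the standard library only. SketchVariant and normalize_variants are hypothetical names introduced for illustration, and this is not necessarily how PyBEL handles the argument internally.

```python
from typing import Iterable, List, Union


class SketchVariant:
    """Hypothetical stand-in for a variant; it only stores its kind."""

    def __init__(self, kind: str) -> None:
        self.kind = kind


def normalize_variants(
    variants: Union[None, SketchVariant, Iterable[SketchVariant]],
) -> List[SketchVariant]:
    """Turn None, one variant, or an iterable of variants into a list."""
    if variants is None:
        return []
    if isinstance(variants, SketchVariant):
        # A lone variant is wrapped so callers can always iterate.
        return [variants]
    return list(variants)


pmod = SketchVariant("pmod")
hgvs = SketchVariant("hgvs")
```

Checking for the single-variant case before calling list() matters because the remaining branch assumes its input is iterable.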
- class pybel.dsl.Gene(namespace, name=None, identifier=None, xrefs=None, variants=None)
Builds a gene node.
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
  - namespace (str) – The name of the database used to identify this entity
  - name (Optional[str]) – The database’s preferred name or label for this entity
  - identifier (Optional[str]) – The database’s identifier for this entity
  - xrefs (Optional[List[Entity]]) – Alternative database cross references
  - variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Transcribable(namespace, name=None, identifier=None, xrefs=None, variants=None)
A base class that lets RNA and micro-RNA share the logic for getting their corresponding genes.
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
  - namespace (str) – The name of the database used to identify this entity
  - name (Optional[str]) – The database’s preferred name or label for this entity
  - identifier (Optional[str]) – The database’s identifier for this entity
  - xrefs (Optional[List[Entity]]) – Alternative database cross references
  - variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Rna(namespace, name=None, identifier=None, xrefs=None, variants=None)
Builds an RNA node.
Example: RNA of the AKT1 protein-coding gene:
>>> from pybel.dsl import Rna
>>> Rna(namespace='HGNC', name='AKT1', identifier='391')
Non-coding RNAs, such as U85, can also be encoded:
>>> from pybel.dsl import Rna
>>> Rna(namespace='SNORNABASE', identifier='SR0000073')
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
  - namespace (str) – The name of the database used to identify this entity
  - name (Optional[str]) – The database’s preferred name or label for this entity
  - identifier (Optional[str]) – The database’s identifier for this entity
  - xrefs (Optional[List[Entity]]) – Alternative database cross references
  - variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.MicroRna(namespace, name=None, identifier=None, xrefs=None, variants=None)
Represents a micro-RNA.
Human miRNAs are listed in HUGO’s MicroRNAs (MIR) gene family.
MIR1-1 from HGNC:
>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='HGNC', name='MIR1-1', identifier='31499')
MIR1-1 from miRBase:
>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='MIRBASE', identifier='MI0000651')
MIR1-1 from Entrez Gene:
>>> from pybel.dsl import MicroRna
>>> MicroRna(namespace='ENTREZ', identifier='406904')
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
  - namespace (str) – The name of the database used to identify this entity
  - name (Optional[str]) – The database’s preferred name or label for this entity
  - identifier (Optional[str]) – The database’s identifier for this entity
  - xrefs (Optional[List[Entity]]) – Alternative database cross references
  - variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
- class pybel.dsl.Protein(namespace, name=None, identifier=None, xrefs=None, variants=None)
Builds a protein node.
Example: AKT1
>>> from pybel.dsl import Protein
>>> Protein(namespace='HGNC', name='AKT1')
Example: AKT1 with optionally included HGNC database identifier
>>> from pybel.dsl import Protein
>>> Protein(namespace='HGNC', name='AKT1', identifier='391')
Example: AKT1 with phosphorylation
>>> from pybel.dsl import Protein, ProteinModification
>>> Protein(namespace='HGNC', name='AKT1', variants=[ProteinModification('Ph', code='Thr', position=308)])
Build a node for a gene, RNA, miRNA, or protein.
- Parameters
  - namespace (str) – The name of the database used to identify this entity
  - name (Optional[str]) – The database’s preferred name or label for this entity
  - identifier (Optional[str]) – The database’s identifier for this entity
  - xrefs (Optional[List[Entity]]) – Alternative database cross references
  - variants (Union[None, Variant, Iterable[Variant]]) – An optional variant or list of variants
Variants
- class pybel.dsl.Variant(kind)
The superclass for variant dictionaries.
Build the variant data dictionary.
- Parameters
  - kind (str) – The kind of variant
- class pybel.dsl.ProteinModification(name, code=None, position=None, namespace=None, identifier=None, xrefs=None)
Build a protein modification variant dictionary.
Build a protein modification variant data dictionary.
- Parameters
  - name (str) – The name of the modification
  - code (Optional[str]) – The three-letter amino acid code for the affected residue, with a capital first letter
  - position (Optional[int]) – The position of the affected residue
  - namespace (Optional[str]) – The namespace to which the name of this modification belongs
  - identifier (Optional[str]) – The identifier of the name of the modification
Either the name or the identifier must be used. If the namespace is omitted, it is assumed that a name is specified from the BEL default namespace.
Example from BEL default namespace:
>>> from pybel.dsl import ProteinModification
>>> ProteinModification('Ph', code='Thr', position=308)
Example from custom namespace:
>>> from pybel.dsl import ProteinModification
>>> ProteinModification(name='protein phosphorylation', namespace='GO', code='Thr', position=308)
Example from custom namespace additionally qualified with identifier:
>>> from pybel.dsl import ProteinModification
>>> ProteinModification(name='protein phosphorylation', namespace='GO',
...                     identifier='0006468', code='Thr', position=308)
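The rule stated above (a name or an identifier is required, and an omitted namespace implies a name from the BEL default namespace) can be sketched as a small validation function. Everything here is an assumption made for illustration: sketch_pmod is a hypothetical name, the "bel" label for the default namespace and the dictionary keys are invented, and name is made optional only so that the rule itself can be exercised.

```python
from typing import Any, Dict, Optional

DEFAULT_NAMESPACE = "bel"  # assumed label for the BEL default namespace


def sketch_pmod(
    name: Optional[str] = None,
    *,
    code: Optional[str] = None,
    position: Optional[int] = None,
    namespace: Optional[str] = None,
    identifier: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a protein modification dictionary (hypothetical layout)."""
    if name is None and identifier is None:
        raise ValueError("either the name or the identifier must be used")
    if namespace is None:
        if name is None:
            # Without a namespace, only a default-namespace name makes sense.
            raise ValueError("a name is required when the namespace is omitted")
        namespace = DEFAULT_NAMESPACE

    concept = {"namespace": namespace}
    if name is not None:
        concept["name"] = name
    if identifier is not None:
        concept["identifier"] = identifier

    result: Dict[str, Any] = {"kind": "pmod", "concept": concept}
    if code is not None:
        result["code"] = code
    if position is not None:
        result["position"] = position
    return result


default_pmod = sketch_pmod("Ph", code="Thr", position=308)
custom_pmod = sketch_pmod(
    "protein phosphorylation",
    namespace="GO",
    identifier="0006468",
    code="Thr",
    position=308,
)
```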
- class pybel.dsl.GeneModification(name, namespace=None, identifier=None, xrefs=None)
Build a gene modification variant dictionary.
Build a gene modification variant data dictionary.
- Parameters
Either the name or the identifier must be used. If the namespace is omitted, it is assumed that a name is specified from the BEL default namespace.
Example from BEL default namespace:
>>> from pybel.dsl import GeneModification
>>> GeneModification(name='Me')
Example from custom namespace:
>>> from pybel.dsl import GeneModification
>>> GeneModification(name='DNA methylation', namespace='GO', identifier='0006306')
- class pybel.dsl.Hgvs(variant)
Builds an HGVS variant dictionary.
Build an HGVS variant data dictionary.
- Parameters
  - variant (str) – The HGVS variant string
>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])
- class pybel.dsl.HgvsReference
Represents the “reference” variant in HGVS.
Build an HGVS variant data dictionary.
- Parameters
  - variant – The HGVS variant string
>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])
- class pybel.dsl.HgvsUnspecified
Represents an unspecified variant in HGVS.
Build an HGVS variant data dictionary.
- Parameters
  - variant – The HGVS variant string
>>> from pybel.dsl import Protein, Hgvs
>>> Protein(namespace='HGNC', name='AKT1', variants=[Hgvs('p.Ala127Tyr')])
- class pybel.dsl.ProteinSubstitution(from_aa, position, to_aa)
A protein substitution variant.
Build an HGVS variant data dictionary for the given protein substitution.
- Parameters
>>> from pybel.dsl import Protein, ProteinSubstitution
>>> Protein(namespace='HGNC', name='AKT1', variants=[ProteinSubstitution('Ala', 127, 'Tyr')])
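The substitution above carries the same information as the HGVS string p.Ala127Tyr used in the Hgvs example. A minimal sketch of that correspondence, assuming three-letter amino acid codes and using the hypothetical helper name substitution_to_hgvs, is:

```python
def substitution_to_hgvs(from_aa: str, position: int, to_aa: str) -> str:
    """Format a protein substitution as an HGVS protein-level string."""
    # "p." marks a protein-level change, followed by the reference residue,
    # its position, and the replacement residue.
    return f"p.{from_aa}{position}{to_aa}"


hgvs_string = substitution_to_hgvs("Ala", 127, "Tyr")
print(hgvs_string)  # p.Ala127Tyr
```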
- class pybel.dsl.Fragment(start=None, stop=None, description=None)
Represent the information about a protein fragment.
Build a protein fragment data dictionary.
- Parameters
Example of specified fragment:
>>> from pybel.dsl import Protein, Fragment
>>> Protein(name='APP', namespace='HGNC', variants=[Fragment(start=672, stop=713)])
Example of unspecified fragment:
>>> from pybel.dsl import Protein, Fragment
>>> Protein(name='APP', namespace='HGNC', variants=[Fragment()])
Fusions
- class pybel.dsl.FusionBase(partner_5p, partner_3p, range_5p=None, range_3p=None)
The superclass for building fusion node data dictionaries.
Build a fusion node.
- class pybel.dsl.GeneFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)
Builds a gene fusion node.
Example, using fusion ranges with the ‘c’ qualifier:
>>> from pybel.dsl import EnumeratedFusionRange, Gene, GeneFusion
>>> GeneFusion(
...     partner_5p=Gene(namespace='HGNC', name='TMPRSS2'),
...     range_5p=EnumeratedFusionRange('c', 1, 79),
...     partner_3p=Gene(namespace='HGNC', name='ERG'),
...     range_3p=EnumeratedFusionRange('c', 312, 5034),
... )
Example with missing fusion ranges:
>>> from pybel.dsl import Gene, GeneFusion
>>> GeneFusion(
...     partner_5p=Gene(namespace='HGNC', name='TMPRSS2'),
...     partner_3p=Gene(namespace='HGNC', name='ERG'),
... )
Build a fusion node.
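To show how the two GeneFusion examples differ once written out as BEL, the following sketch renders a fusion with and without ranges. It follows the BEL 2.0 fusion notation, where a missing range is written as "?". The helpers render_range and render_gene_fusion are hypothetical, partners are passed as preformatted strings for brevity, and the exact string PyBEL emits may differ.

```python
from typing import Optional, Tuple

# (qualifier, start, stop), or None when the range is missing
FusionRange = Optional[Tuple[str, int, int]]


def render_range(fusion_range: FusionRange) -> str:
    """Render one fusion range, using "?" when it is missing."""
    if fusion_range is None:
        return '"?"'
    qualifier, start, stop = fusion_range
    return f'"{qualifier}.{start}_{stop}"'


def render_gene_fusion(
    partner_5p: str,
    partner_3p: str,
    range_5p: FusionRange = None,
    range_3p: FusionRange = None,
) -> str:
    """Render a gene fusion: 5' partner, 5' range, 3' partner, 3' range."""
    return (
        f"g(fus({partner_5p}, {render_range(range_5p)}, "
        f"{partner_3p}, {render_range(range_3p)}))"
    )


with_ranges = render_gene_fusion(
    "HGNC:TMPRSS2", "HGNC:ERG", ("c", 1, 79), ("c", 312, 5034)
)
without_ranges = render_gene_fusion("HGNC:TMPRSS2", "HGNC:ERG")
```

Here with_ranges evaluates to g(fus(HGNC:TMPRSS2, "c.1_79", HGNC:ERG, "c.312_5034")) and without_ranges to g(fus(HGNC:TMPRSS2, "?", HGNC:ERG, "?")).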
- class pybel.dsl.RnaFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)
Builds an RNA fusion node.
Example, with fusion ranges using the ‘r’ qualifier:
>>> from pybel.dsl import EnumeratedFusionRange, Rna, RnaFusion
>>> RnaFusion(
...     partner_5p=Rna(namespace='HGNC', name='TMPRSS2'),
...     range_5p=EnumeratedFusionRange('r', 1, 79),
...     partner_3p=Rna(namespace='HGNC', name='ERG'),
...     range_3p=EnumeratedFusionRange('r', 312, 5034),
... )
Example with missing fusion ranges:
>>> from pybel.dsl import Rna, RnaFusion
>>> RnaFusion(
...     partner_5p=Rna(namespace='HGNC', name='TMPRSS2'),
...     partner_3p=Rna(namespace='HGNC', name='ERG'),
... )
Build a fusion node.
- class pybel.dsl.ProteinFusion(partner_5p, partner_3p, range_5p=None, range_3p=None)
Builds a protein fusion node.
Build a fusion node.
Fusion Ranges
List Abundances
- class pybel.dsl.ComplexAbundance(members, namespace=None, name=None, identifier=None, xrefs=None)
Build a complex abundance node with the optional ability to specify a name.
Build a complex list node.
- Parameters
  - members (Iterable[BaseAbundance]) – A list of PyBEL node data dictionaries
  - namespace (Optional[str]) – The namespace from which the name originates
  - identifier (Optional[str]) – The identifier in the namespace in which the name originates
  - xrefs (Optional[List[Entity]]) – Alternate identifiers for the entity if it is named
- class pybel.dsl.CompositeAbundance(members)
Build a composite abundance node.
This node is effectively the “AND” inside BEL, which can help represent when two things need to be true at the same time. For example, in COVID-19, if both the NF-KB complex and the IL6-STAT complex are present, then acute respiratory distress syndrome happens.
>>> from pybel.dsl import CompositeAbundance, ComplexAbundance, Protein, NamedComplexAbundance
>>> CompositeAbundance([
...     NamedComplexAbundance('fplx', 'nfkb'),
...     ComplexAbundance([
...         Protein('hgnc', identifier='6018', name='IL6'),
...         Protein('hgnc', identifier='11364', name='STAT3'),
...     ]),
... ])
Build a list abundance node.