pydna.seqrecord
A subclass of the Biopython SeqRecord class.
Has a number of extra methods and uses
the pydna._pretty_str.pretty_str
class instread of str for a
nicer output in the IPython shell.
- class pydna.seqrecord.SeqRecord(seq, *args, id='id', name='name', description='description', **kwargs)[source]
Bases:
SeqRecord
A subclass of the Biopython SeqRecord class.
Has a number of extra methods and uses the
pydna._pretty_str.pretty_str
class instread of str for a nicer output in the IPython shell.- classmethod from_Bio_SeqRecord(sr: SeqRecord)[source]
Creates a pydnaSeqRecord from a Biopython SeqRecord.
- property locus
Alias for name property.
- property accession
Alias for id property.
- property definition
Alias for description property.
- rc(*args, **kwargs)
Return the reverse complement of the sequence.
- isorf(table=1)[source]
Detect if sequence is an open reading frame (orf) in the 5’-3’.
direction.
Translation tables are numbers according to the NCBI numbering [1].
- Parameters:
table (int) – Sets the translation table, default is 1 (standard code)
- Returns:
True if sequence is an orf, False otherwise.
- Return type:
Examples
>>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord("atgtaa") >>> a.isorf() True >>> b=SeqRecord("atgaaa") >>> b.isorf() False >>> c=SeqRecord("atttaa") >>> c.isorf() False
References
- add_colors_to_features_for_ape()[source]
Assign colors to features.
compatible with the ApE editor.
- add_feature(x=None, y=None, seq=None, type_='misc', strand=1, *args, **kwargs)[source]
Add a feature of type misc to the feature list of the sequence.
Examples
>>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord("atgtaa") >>> a.features [] >>> a.add_feature(2,4) >>> a.features [SeqFeature(SimpleLocation(ExactPosition(2), ExactPosition(4), strand=1), type='misc', qualifiers=...)]
- list_features()[source]
Print ASCII table with all features.
Examples
>>> from pydna.seq import Seq >>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord(Seq("atgtaa")) >>> a.add_feature(2,4) >>> print(a.list_features()) +-----+---------------+-----+-----+-----+-----+------+------+ | Ft# | Label or Note | Dir | Sta | End | Len | type | orf? | +-----+---------------+-----+-----+-----+-----+------+------+ | 0 | L:ft2 | --> | 2 | 4 | 2 | misc | no | +-----+---------------+-----+-----+-----+-----+------+------+
- extract_feature(n)[source]
Extract feature and return a new SeqRecord object.
- Parameters:
n (int) –
extract (Indicates the feature to) –
Examples
>>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord("atgtaa") >>> a.add_feature(2,4) >>> b=a.extract_feature(0) >>> b SeqRecord(seq=Seq('gt'), id='ft2', name='part_name', description='description', dbxrefs=[])
- sorted_features()[source]
Return a list of the features sorted by start position.
Examples
>>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord("atgtaa") >>> a.add_feature(3,4) >>> a.add_feature(2,4) >>> print(a.features) [SeqFeature(SimpleLocation(ExactPosition(3), ExactPosition(4), strand=1), type='misc', qualifiers=...), SeqFeature(SimpleLocation(ExactPosition(2), ExactPosition(4), strand=1), type='misc', qualifiers=...)] >>> print(a.sorted_features()) [SeqFeature(SimpleLocation(ExactPosition(2), ExactPosition(4), strand=1), type='misc', qualifiers=...), SeqFeature(SimpleLocation(ExactPosition(3), ExactPosition(4), strand=1), type='misc', qualifiers=...)]
- seguid()[source]
Return the url safe SEGUID [2] for the sequence.
This checksum is the same as seguid but with base64.urlsafe encoding instead of the normal base 64. This means that the characters + and / are replaced with - and _ so that the checksum can be a part of and URL or a filename.
Examples
>>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord("gattaca") >>> a.seguid() # original seguid is +bKGnebMkia5kNg/gF7IORXMnIU 'lsseguid=tp2jzeCM2e3W4yxtrrx09CMKa_8'
References
[2]
- stamp(now=datefunction, tool='pydna', separator=' ', comment='')[source]
Add seguid checksum to COMMENTS sections
The checksum is stored in object.annotations[“comment”]. This shows in the COMMENTS section of a formatted genbank file.
For blunt linear sequences:
SEGUID <seguid>
For circular sequences:
cSEGUID <seguid>
Fore linear sequences which are not blunt:
lSEGUID <seguid>
Examples
>>> from pydna.seqrecord import SeqRecord >>> a = SeqRecord("aa") >>> a.stamp() 'lsseguid=gBw0Jp907Tg_yX3jNgS4qQWttjU' >>> a.annotations["comment"][:41] 'pydna lsseguid=gBw0Jp907Tg_yX3jNgS4qQWttj'
- lcs(other, *args, limit=25, **kwargs)[source]
Return the longest common substring between the sequence.
and another sequence (other). The other sequence can be a string, Seq, SeqRecord, Dseq or DseqRecord. The method returns a SeqFeature with type “read” as this method is mostly used to map sequence reads to the sequence. This can be changed by passing a type as keyword with some other string value.
Examples
>>> from pydna.seqrecord import SeqRecord >>> a = SeqRecord("GGATCC") >>> a.lcs("GGATCC", limit=6) SeqFeature(SimpleLocation(ExactPosition(0), ExactPosition(6), strand=1), type='read', qualifiers=...) >>> a.lcs("GATC", limit=4) SeqFeature(SimpleLocation(ExactPosition(1), ExactPosition(5), strand=1), type='read', qualifiers=...) >>> a = SeqRecord("CCCCC") >>> a.lcs("GGATCC", limit=6) SeqFeature(None)
- class pydna.seqrecord.ProteinSeqRecord(seq, *args, id='id', name='name', description='description', **kwargs)[source]
Bases:
SeqRecord
- rc(*args, **kwargs)
Return the reverse complement of the sequence.
- isorf(*args, **kwargs)[source]
Detect if sequence is an open reading frame (orf) in the 5’-3’.
direction.
Translation tables are numbers according to the NCBI numbering [3].
- Parameters:
table (int) – Sets the translation table, default is 1 (standard code)
- Returns:
True if sequence is an orf, False otherwise.
- Return type:
Examples
>>> from pydna.seqrecord import SeqRecord >>> a=SeqRecord("atgtaa") >>> a.isorf() True >>> b=SeqRecord("atgaaa") >>> b.isorf() False >>> c=SeqRecord("atttaa") >>> c.isorf() False
References
[3] http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c