This module provides the Dseqrecord class, for handling double stranded
DNA sequences. The Dseqrecord holds sequence information in the form of a pydna.dseq.Dseq
object. The Dseq and Dseqrecord classes are subclasses of Biopython's
Seq and SeqRecord classes, respectively.
The Dseq and Dseqrecord classes support the notion of circular and linear DNA topology.
Dseqrecord is a double stranded version of the Biopython SeqRecord [1] class.
The Dseqrecord object holds a Dseq object describing the sequence.
Additionally, Dseqrecord holds meta information about the sequence in the
form of a list of SeqFeatures, in the same way as the SeqRecord does.
The Dseqrecord can be initialized with a string, Seq, Dseq, SeqRecord
or another Dseqrecord. The sequence information will be stored in a
Dseq object in all cases.
Dseqrecord objects can be read or parsed from sequences in FASTA, EMBL or Genbank formats.
See the pydna.readers and pydna.parsers modules for further information.
There is a short representation associated with the Dseqrecord.
Dseqrecord(-3) represents a linear sequence of length 3,
while Dseqrecord(o7) represents a circular sequence of length 7.
Dseqrecord and Dseq share the same concept of length. This length can be larger
than each strand alone if they are staggered as in the example below.
<-- length -->
GATCCTTT
     AAAGCCTAG
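As a rough illustration of this notion of length, the sketch below computes the span covered by two staggered strands. The function name `duplex_length` and the `offset` convention are assumptions made for this example; they are not part of the pydna API.

```python
def duplex_length(upper: str, lower: str, offset: int) -> int:
    """Return the total span covered by two staggered strands.

    offset is the number of positions by which the lower strand (as drawn)
    is shifted to the right of the upper strand; a negative value shifts it
    to the left. Hypothetical helper, for illustration only.
    """
    start = min(0, offset)
    end = max(len(upper), offset + len(lower))
    return end - start


# Upper strand of 8 nt, lower strand of 9 nt shifted 5 positions to the right.
span = duplex_length("GATCCTTT", "AAAGCCTAG", 5)
print(span)
```

The span is larger than either strand whenever one strand protrudes past the other at either end.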
Parameters:
record (string, Seq, SeqRecord, Dseq or other Dseqrecord object) – This data will be used to form the seq property
circular (bool, optional) – True if the DNA molecule is circular, False if it is linear
linear (bool, optional) – True if the DNA molecule is linear, False if it is circular
This checksum is the same as seguid but with base64.urlsafe
encoding instead of the normal base64. This means that
the characters + and / are replaced with - and _ so that
the checksum can be part of a URL.
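The relationship between the two encodings can be sketched with the standard library alone. This sketch assumes the usual SEGUID definition (SHA-1 of the uppercased sequence, base64 encoded, trailing padding removed); the function names are hypothetical and pydna's own implementation may differ in detail.

```python
import base64
import hashlib


def seguid(seq: str) -> str:
    """SEGUID-style checksum using the normal base64 alphabet."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def urlsafe_seguid(seq: str) -> str:
    """Same digest, but with the URL-safe alphabet (- and _ replace + and /)."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


checksum = urlsafe_seguid("GATTACA")
print(checksum)
```

Because only the alphabet changes, one form can be converted to the other by character substitution without rehashing the sequence.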
Writes the Dseqrecord to a file using the format f, which must
be a format supported by Biopython SeqIO for writing [3]. Default
is “gb” which is short for Genbank. Note that Biopython SeqIO reads
more formats than it writes.
Filename is the path to the file where the sequence is to be
written. The filename is optional; if it is not given, the
locus property (string) is used together with the format.
If obj is the Dseqrecord object, the default file name will be:
<obj.locus>.<f>
Where <f> is “gb” by default. If the filename already exists
AND the sequence it contains is different, a new file name will be
used so that the old file is not lost.
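The overwrite protection described above might look roughly like the following. The helper name `choose_filename` and the `_1`, `_2`, ... suffix scheme are assumptions made for this sketch; the naming scheme pydna actually uses is not stated here.

```python
from pathlib import Path


def choose_filename(directory: Path, locus: str, text: str, f: str = "gb") -> Path:
    """Pick a file name that never overwrites different content.

    Returns <locus>.<f> if that file is absent or already holds the same
    text; otherwise tries <locus>_1.<f>, <locus>_2.<f>, ... (assumed scheme).
    """
    candidate = directory / f"{locus}.{f}"
    counter = 0
    while candidate.exists() and candidate.read_text() != text:
        counter += 1
        candidate = directory / f"{locus}_{counter}.{f}"
    return candidate
```

Comparing content before renaming means that writing the same record twice does not create a second file.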
This method returns a new circular sequence (Dseqrecord object), which has been rotated
in such a way that there is maximum overlap between the sequence and
ref, which may be a string, Biopython Seq, SeqRecord object or
another Dseqrecord object.
One use for this is to rotate a new recombinant plasmid after cloning so
that it starts at the same position as the reference sequence.
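The idea of rotating a circular sequence to line up with a reference can be sketched on plain strings. This is a simplified stand-in, not pydna's algorithm: the hypothetical `sync_rotation` function tries every rotation and keeps the one sharing the longest prefix with the reference, and it ignores the complementary strand.

```python
def sync_rotation(circular: str, ref: str) -> str:
    """Rotate a circular sequence so it shares the longest prefix with ref."""
    seq, target = circular.upper(), ref.upper()

    def shared_prefix(start: int) -> int:
        # Length of the common prefix between this rotation and the reference.
        rotated = seq[start:] + seq[:start]
        count = 0
        for a, b in zip(rotated, target):
            if a != b:
                break
            count += 1
        return count

    # max() keeps the first best rotation, so the input is returned
    # unchanged when no rotation matches better than the original.
    best = max(range(len(seq)), key=shared_prefix) if seq else 0
    return circular[best:] + circular[:best]


rotated = sync_rotation("TTAAGGCC", "GGCCTT")
print(rotated)
```

The brute-force search is quadratic in sequence length, which is acceptable for a demonstration but not for large plasmids.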
Digest a Dseqrecord object with one or more restriction enzymes.
Returns a list of linear Dseqrecords. If there are no cuts, an empty
list is returned.
See also Dseq.cut()
Parameters:
enzymes (enzyme object or iterable of such objects) – A Bio.Restriction.XXX restriction object or iterable of such.
Returns:
Dseqrecord_frags – list of Dseqrecord objects formed by the digestion
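The return convention (a list of linear fragments, or an empty list when nothing is cut) can be illustrated with a string-only sketch. The `cut_linear` function below is hypothetical and heavily simplified: it handles only the top strand of a linear molecule and ignores overhangs, the bottom strand and circular topology.

```python
def cut_linear(seq: str, site: str, cut_offset: int) -> list[str]:
    """Split a linear top-strand string at every occurrence of site.

    cut_offset is the position inside the recognition site after which the
    strand is cut (1 for G^GATCC). Returns an empty list when the site is
    absent, following the convention described above.
    """
    upper, target = seq.upper(), site.upper()
    cuts = []
    pos = upper.find(target)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = upper.find(target, pos + 1)
    if not cuts:
        return []
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


fragments = cut_linear("aaaGGATCCttt", "GGATCC", 1)
print(fragments)
```

Returning an empty list rather than the uncut input lets the caller distinguish "no site found" from "one fragment produced".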