pydna.design
This module contains functions for primer design for various purposes.
:func:primer_design designs primers for a sequence, or a matching primer for an existing primer. Returns an Amplicon object (the same type that the amplify module returns).
:func:assembly_fragments adds tails to primers for a linear assembly through homologous recombination or Gibson assembly.
:func:circular_assembly_fragments adds tails to primers for a circular assembly through homologous recombination or Gibson assembly.
- pydna.design.primer_design(template, fp=None, rp=None, limit=13, target_tm=55.0, tm_func=_tm_default, estimate_function=None, **kwargs)[source]
This function designs a forward primer and a reverse primer for PCR amplification of a given template sequence.
The template argument is a Dseqrecord object or equivalent containing the template sequence.
The optional fp and rp arguments can each hold an existing primer for the sequence (the forward and the reverse primer, respectively). Only one of the two may be specified: if both primers already exist there is nothing to design, so use the pydna.amplify.pcr function instead.
The limit argument is the minimum length of the primer. The default value is 13.
If one of the primers is given, the other primer is designed to match it in terms of Tm. If both primers are designed, they will be designed to match target_tm.
tm_func is a function that takes an ASCII string representing an oligonucleotide as its argument and returns a float. Some useful functions can be found in the pydna.tm module, but a custom-made function can be substituted.
estimate_function is a tm_func-like function used to get a first guess for the primer design, which then serves as the starting point for the final result. This is useful when tm_func is slow to calculate (e.g. it relies on an external API, such as the NEB primer design API). The estimate_function should be faster than tm_func. The default value is None. To use the default tm_func as the estimate function and get the NEB Tm faster, you can do: primer_design(dseqr, target_tm=55, tm_func=tm_neb, estimate_function=tm_default).
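The interplay between limit, target_tm and tm_func described above can be illustrated with a minimal sketch. It is not pydna's algorithm: wallace_tm and shortest_primer are hypothetical names introduced here, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a rough stand-in for the functions in pydna.tm.

```python
def wallace_tm(primer: str) -> float:
    """Wallace rule: 2 degrees per A/T and 4 degrees per G/C (rough estimate)."""
    s = primer.lower()
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    return float(2 * at + 4 * gc)


def shortest_primer(template: str, target_tm: float = 55.0,
                    limit: int = 13, tm_func=wallace_tm) -> str:
    """Hypothetical helper: start at `limit` bases and extend the primer
    one base at a time until tm_func reaches target_tm."""
    length = limit
    while length < len(template) and tm_func(template[:length]) < target_tm:
        length += 1
    return template[:length]


template = "atgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatg"
primer = shortest_primer(template)
print(primer, wallace_tm(primer))  # atgactgctaacccttcct 56.0
```

Because wallace_tm takes a string and returns a float, it has the shape the text requires of tm_func, so it should be usable as primer_design(template, tm_func=wallace_tm). The primers obtained that way will generally differ from the ones designed with the default Tm function.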
The function returns a pydna.amplicon.Amplicon class instance. This object has the object.forward_primer and object.reverse_primer properties which contain the designed primers.
- Parameters:
template (pydna.dseqrecord.Dseqrecord) – a Dseqrecord object. The only required argument.
fp (pydna.primer.Primer, optional) – an existing forward primer for the template.
rp (pydna.primer.Primer, optional) – an existing reverse primer for the template.
target_tm (float, optional) – target tm for the primers, set to 55°C by default.
tm_func (function) – Function used for Tm calculation. This function takes an ASCII string representing an oligonucleotide as its argument and returns a float. Some useful functions can be found in the pydna.tm module, but a custom-made function can be substituted.
- Returns:
result
- Return type:
pydna.amplicon.Amplicon
Examples
>>> from pydna.dseqrecord import Dseqrecord
>>> t=Dseqrecord("atgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatg")
>>> t
Dseqrecord(-64)
>>> from pydna.design import primer_design
>>> ampl = primer_design(t)
>>> ampl
Amplicon(64)
>>> ampl.forward_primer
f64 17-mer:5'-atgactgctaacccttc-3'
>>> ampl.reverse_primer
r64 18-mer:5'-catcgtaagtttcgaacg-3'
>>> print(ampl.figure())
5atgactgctaacccttc...cgttcgaaacttacgatg3
                     ||||||||||||||||||
                    3gcaagctttgaatgctac5
5atgactgctaacccttc3
 |||||||||||||||||
3tactgacgattgggaag...gcaagctttgaatgctac5
>>> pf = "GGATCC" + ampl.forward_primer
>>> pr = "GGATCC" + ampl.reverse_primer
>>> pf
f64 23-mer:5'-GGATCCatgactgct..ttc-3'
>>> pr
r64 24-mer:5'-GGATCCcatcgtaag..acg-3'
>>> from pydna.amplify import pcr
>>> pcr_prod = pcr(pf, pr, t)
>>> print(pcr_prod.figure())
      5atgactgctaacccttc...cgttcgaaacttacgatg3
                           ||||||||||||||||||
                          3gcaagctttgaatgctacCCTAGG5
5GGATCCatgactgctaacccttc3
       |||||||||||||||||
      3tactgacgattgggaag...gcaagctttgaatgctac5
>>> print(pcr_prod.seq)
GGATCCatgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatgGGATCC
>>> from pydna.primer import Primer
>>> pf = Primer("atgactgctaacccttccttggtgttg", id="myprimer")
>>> ampl = primer_design(t, fp = pf)
>>> ampl.forward_primer
myprimer 27-mer:5'-atgactgctaaccct..ttg-3'
>>> ampl.reverse_primer
r64 32-mer:5'-catcgtaagtttcga..atc-3'
- pydna.design.assembly_fragments(f, overlap=35, maxlink=40)[source]
This function returns a list of pydna.amplicon.Amplicon objects whose primers have been modified with tails so that the fragments can be fused in the order they appear in the list by, for example, Gibson assembly or homologous recombination.
Given two linear pydna.amplicon.Amplicon objects a and b, we can modify the reverse primer of a and the forward primer of b with tails to allow fusion by fusion PCR, Gibson assembly or in-vivo homologous recombination. The basic requirements for the primers are the same for all three techniques.
 _________ a _________
/                     \
agcctatcatcttggtctctgca
                  |||||
                 <gacgt
agcct>
|||||
tcggatagtagaaccagagacgt

                           __________ b ________
                          /                     \
                          TTTATATCGCATGACTCTTCTTT
                                            |||||
                                           <AGAAA
                          TTTAT>
                          |||||
                          AAATATAGCGTACTGAGAAGAAA

agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA
\___________________ c ______________________/
Design tailed primers incorporating a part of the next or previous fragment to be assembled.
agcctatcatcttggtctctgca
|||||||||||||||||||||||
                gagacgtAAATATA

|||||||||||||||||||||||
tcggatagtagaaccagagacgt


                       TTTATATCGCATGACTCTTCTTT
                       |||||||||||||||||||||||

                ctctgcaTTTATAT
                       |||||||||||||||||||||||
                       AAATATAGCGTACTGAGAAGAAA
PCR products with flanking sequences are formed in the PCR process.
agcctatcatcttggtctctgcaTTTATAT
||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATA
                \____________/

                   identical
                   sequences
                 ____________
                /            \
                ctctgcaTTTATATCGCATGACTCTTCTTT
                ||||||||||||||||||||||||||||||
                gagacgtAAATATAGCGTACTGAGAAGAAA
The fragments can be fused by any of the techniques mentioned earlier to form c:
agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA
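The tailing steps drawn above can be retraced with plain strings. This is an illustrative sketch using the sequences from the diagrams, not pydna's implementation: reverse_complement and the 7-base window half are assumptions made for the example, and real annealing regions would be longer.

```python
# String-only illustration of the tailed-primer idea; not pydna code.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement that preserves upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


a = "agcctatcatcttggtctctgca"
b = "TTTATATCGCATGACTCTTCTTT"
half = 7  # bases taken from each side of the junction, as in the diagrams

junction = a[-half:] + b[:half]              # ctctgcaTTTATAT
fwd_primer_b = junction                      # tail from a + annealing part on b
rev_primer_a = reverse_complement(junction)  # tail from b + annealing part on a

# PCR products carry the tails, so they share 2 * half identical bases.
product_a = a + b[:half]
product_b = a[-half:] + b
shared = 2 * half
assert product_a[-shared:] == product_b[:shared]

# Fusing over the shared stretch gives c.
fused = product_a + product_b[shared:]
print(fused)  # agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
```

The reverse primer is written 5' to 3' here, so it reads as the reverse of the 3' to 5' strand shown in the diagram.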
The first argument of this function is a list of sequence objects containing Amplicons and other similar objects.
At least every second sequence object needs to be an Amplicon.
This rule exists because if a sequence object that is not a PCR product is to be fused with another fragment, that other fragment needs to be an Amplicon, so that its primer can be modified to include the whole stretch of sequence homology needed for the fusion. See the example below, where a is a non-amplicon (a linear plasmid vector, for instance).
 _________ a _________           __________ b ________
/                     \         /                     \
agcctatcatcttggtctctgca   <-->   TTTATATCGCATGACTCTTCTTT
|||||||||||||||||||||||          |||||||||||||||||||||||
tcggatagtagaaccagagacgt                           <AGAAA
                                 TTTAT>
                                 |||||||||||||||||||||||
                          <-->   AAATATAGCGTACTGAGAAGAAA

agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA
\___________________ c ______________________/
In this case only the forward primer of b is fitted with a tail containing a part of a:
agcctatcatcttggtctctgca
|||||||||||||||||||||||
tcggatagtagaaccagagacgt

                       TTTATATCGCATGACTCTTCTTT
                       |||||||||||||||||||||||
                                        <AGAAA
         tcttggtctctgcaTTTATAT
                       |||||||||||||||||||||||
                       AAATATAGCGTACTGAGAAGAAA
PCR products with flanking sequences are formed in the PCR process.
agcctatcatcttggtctctgcaTTTATAT
||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATA
                \____________/

                   identical
                   sequences
                 ____________
                /            \
                ctctgcaTTTATATCGCATGACTCTTCTTT
                ||||||||||||||||||||||||||||||
                gagacgtAAATATAGCGTACTGAGAAGAAA
The fragments can be fused by for example Gibson assembly:
agcctatcatcttggtctctgcaTTTATAT
||||||||||||||||||||||||||||||
tcggatagtagaacca

                             TCGCATGACTCTTCTTT
                ||||||||||||||||||||||||||||||
                gagacgtAAATATAGCGTACTGAGAAGAAA
to form c:
agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA
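The rule that at least every second object must be an Amplicon amounts to saying that no two neighbouring fragments may both be non-Amplicons. A minimal pre-check could look like the sketch below. It uses booleans as stand-ins for fragments (True meaning Amplicon), and adjacent_non_amplicons is a hypothetical helper, not part of pydna.

```python
def adjacent_non_amplicons(is_amplicon: list) -> list:
    """Return the indices i where fragments i and i+1 are both non-Amplicons.

    Such a junction has no primer that could carry the homology tail.
    """
    return [
        i
        for i in range(len(is_amplicon) - 1)
        if not is_amplicon[i] and not is_amplicon[i + 1]
    ]


# Dseqrecord, Amplicon, Dseqrecord, Amplicon: every junction has a primer.
print(adjacent_non_amplicons([False, True, False, True]))  # []

# Amplicon, Dseqrecord, Dseqrecord, Amplicon: junction 1|2 cannot be tailed.
print(adjacent_non_amplicons([True, False, False, True]))  # [1]
```

With real objects, the boolean list could be built with an isinstance check against pydna.amplicon.Amplicon before calling assembly_fragments.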
The overlap argument controls how many base pairs of overlap are required between adjacent sequence fragments. In the junction between Amplicons, tails about half this length are added to the two primers closest to the junction.
>       <
Amplicon1
         Amplicon2
         >       <

         ⇣

>       <-
Amplicon1
         Amplicon2
        ->       <
In the case of an Amplicon adjacent to a Dseqrecord object, the tail will be twice as long (1*overlap) since the recombining sequence is present entirely on this primer:
Dseqrecd1
         Amplicon1
         >       <

         ⇣

Dseqrecd1
         Amplicon1
       -->       <
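The way overlap is distributed over the primers at each junction can be sketched as follows. This is an illustration of the description above, not pydna's code: tail_lengths is a hypothetical helper, fragments are represented by booleans (True meaning Amplicon), and the rounding of the two halves is an assumption, since the text only says "about half".

```python
def tail_lengths(is_amplicon: list, overlap: int = 35) -> list:
    """For each junction i|i+1 return a tuple
    (tail on reverse primer of i, tail on forward primer of i+1)."""
    tails = []
    for left, right in zip(is_amplicon, is_amplicon[1:]):
        if left and right:
            half = overlap // 2  # assumed rounding, not taken from pydna
            tails.append((overlap - half, half))
        elif left:
            tails.append((overlap, 0))  # whole overlap on the left Amplicon
        elif right:
            tails.append((0, overlap))  # whole overlap on the right Amplicon
        else:
            raise ValueError("two adjacent non-Amplicons cannot be joined")
    return tails


print(tail_lengths([True, True]))         # [(18, 17)]
print(tail_lengths([False, True]))        # [(0, 35)]
print(tail_lengths([True, False, True]))  # [(35, 0), (0, 35)]
```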
Note that if the sequence of DNA fragments starts or stops with an Amplicon, the very first and very last primer will not be modified, i.e. assemblies are always assumed to be linear. There are simple tricks around that for circular assemblies, depicted in the last two examples below.
The maxlink argument controls the cut-off length for sequences that will be synthesized by adding them to primers for the adjacent fragment(s). The argument list may contain short spacers (such as spacers between fusion proteins).
Example 1: Linear assembly of PCR products (pydna.amplicon.Amplicon class objects)

>       <         >       <
Amplicon1         Amplicon3
         Amplicon2         Amplicon4
         >       <         >       <

                 ⇣
   pydna.design.assembly_fragments
                 ⇣

>       <-       ->       <-                 pydna.assembly.Assembly
Amplicon1         Amplicon3
         Amplicon2         Amplicon4     ➤   Amplicon1Amplicon2Amplicon3Amplicon4
        ->       <-       ->       <

Example 2: Linear assembly of alternating Amplicons and other fragments

>       <         >       <
Amplicon1         Amplicon2
         Dseqrecd1         Dseqrecd2

                 ⇣
   pydna.design.assembly_fragments
                 ⇣

>       <--     -->       <--                pydna.assembly.Assembly
Amplicon1         Amplicon2
         Dseqrecd1         Dseqrecd2     ➤   Amplicon1Dseqrecd1Amplicon2Dseqrecd2

Example 3: Linear assembly of alternating Amplicons and other fragments

Dseqrecd1         Dseqrecd2
         Amplicon1         Amplicon2
         >       <       -->       <

                 ⇣
   pydna.design.assembly_fragments
                 ⇣
                                             pydna.assembly.Assembly
Dseqrecd1         Dseqrecd2
         Amplicon1         Amplicon2     ➤   Dseqrecd1Amplicon1Dseqrecd2Amplicon2
       -->       <--     -->       <

Example 4: Circular assembly of alternating Amplicons and other fragments

                 ->       <==
Dseqrecd1         Amplicon2
         Amplicon1         Dseqrecd1
       -->       <-

                 ⇣
   pydna.design.assembly_fragments
                 ⇣
                                             pydna.assembly.Assembly
                 ->       <==
Dseqrecd1         Amplicon2                  -Dseqrecd1Amplicon1Amplicon2-
         Amplicon1                       ➤  |                             |
       -->       <-                          -----------------------------

Example 5: Circular assembly of Amplicons

>       <         >       <
Amplicon1         Amplicon3
         Amplicon2         Amplicon1
         >       <         >       <

                 ⇣
   pydna.design.assembly_fragments
                 ⇣

>       <=       ->       <-
Amplicon1         Amplicon3
         Amplicon2         Amplicon1
        ->       <-       +>       <

                 ⇣
   make new Amplicon using the Amplicon1.template and
   the last fwd primer and the first rev primer.
                 ⇣
                                             pydna.assembly.Assembly
+>       <=       ->       <-
 Amplicon1         Amplicon3                 -Amplicon1Amplicon2Amplicon3-
          Amplicon2                      ➤  |                             |
         ->       <-                         -----------------------------
- Parameters:
f (list of pydna.amplicon.Amplicon and other Dseqrecord-like objects) – list of Amplicon and Dseqrecord objects for which fusion primers should be constructed.
overlap (int, optional) – Length of required overlap between fragments.
maxlink (int, optional) – Maximum length of spacer sequences that may be present in f. These will be included in tails for designed primers.
- Returns:
seqs – [Amplicon1, Amplicon2, ...]
- Return type:
list of pydna.amplicon.Amplicon and other Dseqrecord-like objects
Examples
>>> from pydna.dseqrecord import Dseqrecord
>>> from pydna.design import primer_design
>>> a=primer_design(Dseqrecord("atgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatg"))
>>> b=primer_design(Dseqrecord("ccaaacccaccaggtaccttatgtaagtacttcaagtcgccagaagacttcttggtcaagttgcc"))
>>> c=primer_design(Dseqrecord("tgtactggtgctgaaccttgtatcaagttgggtgttgacgccattgccccaggtggtcgtttcgtt"))
>>> from pydna.design import assembly_fragments
>>> # We would like a circular recombination, so the first sequence has to be repeated
>>> fa1,fb,fc,fa2 = assembly_fragments([a,b,c,a])
>>> # Since all fragments are Amplicons, we need to extract the rp of the 1st and fp of the last fragments.
>>> from pydna.amplify import pcr
>>> fa = pcr(fa2.forward_primer, fa1.reverse_primer, a)
>>> [fa,fb,fc]
[Amplicon(100), Amplicon(101), Amplicon(102)]
>>> fa.name, fb.name, fc.name = "fa fb fc".split()
>>> from pydna.assembly import Assembly
>>> assemblyobj = Assembly([fa,fb,fc])
>>> assemblyobj
Assembly
fragments....: 100bp 101bp 102bp
limit(bp)....: 25
G.nodes......: 6
algorithm....: common_sub_strings
>>> assemblyobj.assemble_linear()
[Contig(-231), Contig(-166), Contig(-36)]
>>> assemblyobj.assemble_circular()[0].seguid()
'cdseguid=85t6tfcvWav0wnXEIb-lkUtrl4s'
>>> (a+b+c).looped().seguid()
'cdseguid=85t6tfcvWav0wnXEIb-lkUtrl4s'
>>> print(assemblyobj.assemble_circular()[0].figure())
 -|fa|36
|     \/
|     /\
|     36|fb|36
|           \/
|           /\
|           36|fc|36
|                 \/
|                 /\
|                 36-
|                   |
 --------------------