pydna.design

This module contains functions for designing primers for various purposes.

:func:primer_design designs primers for a sequence, or a matching primer for an existing primer. It returns an Amplicon object (the same type that the amplify module returns).
:func:assembly_fragments adds tails to primers for a linear assembly through homologous recombination or Gibson assembly.
:func:circular_assembly_fragments adds tails to primers for a circular assembly through homologous recombination or Gibson assembly.

pydna.design.primer_design(template, fp=None, rp=None, limit=13, target_tm=55.0, tm_func=_tm_default, estimate_function=None, **kwargs)[source]

This function designs a forward primer and a reverse primer for PCR amplification of a given template sequence.

The template argument is a Dseqrecord object or equivalent containing the template sequence.

The optional fp and rp arguments can each hold an existing primer for the sequence (the forward and the reverse primer, respectively). Only one of the two may be specified. If both primers already exist there is nothing to design, so use the pydna.amplify.pcr function instead.

The limit argument is the minimum length of the primer. The default value is 13.

If one of the primers is given, the other primer is designed to match it in terms of Tm. If both primers are designed, they are designed for target_tm.

tm_func is a function that takes an ASCII string representing an oligonucleotide as argument and returns a float. Some useful functions can be found in the pydna.tm module, but a custom function can be substituted.
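As a minimal sketch of what a custom tm_func could look like, the function below implements the well-known Wallace rule (2 °C per A/T, 4 °C per G/C). The name wallace_tm is hypothetical and not part of pydna; it is only meant to show the required shape, a string in and a float out.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm estimate using the Wallace rule.

    Hypothetical helper, not part of pydna. It only illustrates the
    signature expected of a tm_func: str in, float out.
    """
    s = seq.lower()
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    # 2 degrees C for every A or T, 4 degrees C for every G or C
    return 2.0 * at + 4.0 * gc


print(wallace_tm("atgactgctaacccttc"))
```

Such a function could then be passed as primer_design(template, tm_func=wallace_tm). The Wallace rule is crude, so primers designed with it will likely differ from those obtained with the default function.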

estimate_function is a tm_func-like function used to get a first guess for the primer design, which then serves as the starting point for the final result. This is useful when tm_func is slow to calculate (e.g. it relies on an external API, such as the NEB primer design API). The estimate_function should be faster than tm_func. The default value is None. To use the default tm_func as the estimate function and get the NEB Tm faster, you can do: primer_design(dseqr, target_tm=55, tm_func=tm_neb, estimate_function=tm_default).
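The two-stage idea can be sketched in plain Python. This is not pydna's actual search, only an illustration under simple assumptions: the primer is a prefix of the template, Tm grows with length, and all names here (pick_primer, fast_tm, slow_tm) are hypothetical. A call counter shows how a cheap estimate reduces the number of calls to the expensive function.

```python
def pick_primer(template, target_tm, limit, tm_func, estimate_function=None):
    """Shortest prefix of template (at least `limit` long) reaching target_tm.

    Illustrative only; not the algorithm used by pydna.
    """
    length = limit
    if estimate_function is not None:
        # Stage 1: the cheap function provides a first guess for the length.
        while length < len(template) and estimate_function(template[:length]) < target_tm:
            length += 1
    # Stage 2: the accurate function corrects the guess in either direction.
    while length < len(template) and tm_func(template[:length]) < target_tm:
        length += 1
    while length > limit and tm_func(template[:length - 1]) >= target_tm:
        length -= 1
    return template[:length]


def fast_tm(seq):
    """Cheap estimate (Wallace rule): 2 per A/T, 4 per G/C."""
    s = seq.lower()
    return 2.0 * (s.count("a") + s.count("t")) + 4.0 * (s.count("g") + s.count("c"))


calls = []  # records every call to the "expensive" function


def slow_tm(seq):
    """Stand-in for an expensive Tm function, e.g. one backed by a web API."""
    calls.append(seq)
    return fast_tm(seq)


template = "atgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatg"

calls.clear()
plain = pick_primer(template, 55.0, 13, slow_tm)
calls_plain = len(calls)

calls.clear()
guided = pick_primer(template, 55.0, 13, slow_tm, estimate_function=fast_tm)
calls_guided = len(calls)

print(plain == guided, calls_plain, calls_guided)
```

Both runs return the same primer, but the guided run only needs the expensive function to confirm the guess rather than to walk every candidate length.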

The function returns a pydna.amplicon.Amplicon class instance. This object has the object.forward_primer and object.reverse_primer properties which contain the designed primers.

Parameters:

template (pydna.dseqrecord.Dseqrecord) – a Dseqrecord object. The only required argument.
fp (pydna.primer.Primer, optional) – an existing forward primer. Give at most one of fp and rp.
rp (pydna.primer.Primer, optional) – an existing reverse primer. Give at most one of fp and rp.
limit (int, optional) – minimum length of the primer, 13 by default.
target_tm (float, optional) – target Tm for the primers, 55 °C by default.
tm_func (function) – function used for Tm calculation. It takes an ASCII string representing an oligonucleotide as argument and returns a float. Some useful functions can be found in the pydna.tm module, but a custom function can be substituted.
estimate_function (function, optional) – faster tm_func-like function used to get a first guess before tm_func refines it, None by default.

Returns:

result

Return type:

Amplicon

Examples

>>> from pydna.dseqrecord import Dseqrecord
>>> t=Dseqrecord("atgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatg")
>>> t
Dseqrecord(-64)
>>> from pydna.design import primer_design
>>> ampl = primer_design(t)
>>> ampl
Amplicon(64)
>>> ampl.forward_primer
f64 17-mer:5'-atgactgctaacccttc-3'
>>> ampl.reverse_primer
r64 18-mer:5'-catcgtaagtttcgaacg-3'
>>> print(ampl.figure())
5atgactgctaacccttc...cgttcgaaacttacgatg3
                     ||||||||||||||||||
                    3gcaagctttgaatgctac5
5atgactgctaacccttc3
 |||||||||||||||||
3tactgacgattgggaag...gcaagctttgaatgctac5
>>> pf = "GGATCC" + ampl.forward_primer
>>> pr = "GGATCC" + ampl.reverse_primer
>>> pf
f64 23-mer:5'-GGATCCatgactgct..ttc-3'
>>> pr
r64 24-mer:5'-GGATCCcatcgtaag..acg-3'
>>> from pydna.amplify import pcr
>>> pcr_prod = pcr(pf, pr, t)
>>> print(pcr_prod.figure())
      5atgactgctaacccttc...cgttcgaaacttacgatg3
                           ||||||||||||||||||
                          3gcaagctttgaatgctacCCTAGG5
5GGATCCatgactgctaacccttc3
       |||||||||||||||||
      3tactgacgattgggaag...gcaagctttgaatgctac5
>>> print(pcr_prod.seq)
GGATCCatgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatgGGATCC
>>> from pydna.primer import Primer
>>> pf = Primer("atgactgctaacccttccttggtgttg", id="myprimer")
>>> ampl = primer_design(t, fp = pf)
>>> ampl.forward_primer
myprimer 27-mer:5'-atgactgctaaccct..ttg-3'
>>> ampl.reverse_primer
r64 32-mer:5'-catcgtaagtttcga..atc-3'

pydna.design.assembly_fragments(f, overlap=35, maxlink=40)[source]

This function returns a list of pydna.amplicon.Amplicon objects whose primers have been modified with tails so that the fragments can be fused in the order they appear in the list by, for example, Gibson assembly or homologous recombination.

Given two linear pydna.amplicon.Amplicon objects a and b, we can modify the reverse primer of a and the forward primer of b with tails to allow fusion by fusion PCR, Gibson assembly or in-vivo homologous recombination. The basic requirements for the primers are the same for all three techniques.

 _________ a _________
/                     \
agcctatcatcttggtctctgca
                  |||||
                 <gacgt
agcct>
|||||
tcggatagtagaaccagagacgt

                        __________ b ________
                       /                     \
                       TTTATATCGCATGACTCTTCTTT
                                         |||||
                                        <AGAAA
                       TTTAT>
                       |||||
                       AAATATAGCGTACTGAGAAGAAA

agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA
\___________________ c ______________________/

Design tailed primers incorporating a part of the next or previous fragment to be assembled.

agcctatcatcttggtctctgca
|||||||||||||||||||||||
                gagacgtAAATATA

|||||||||||||||||||||||
tcggatagtagaaccagagacgt

                       TTTATATCGCATGACTCTTCTTT
                       |||||||||||||||||||||||

                ctctgcaTTTATAT
                       |||||||||||||||||||||||
                       AAATATAGCGTACTGAGAAGAAA

PCR products with flanking sequences are formed in the PCR process.

agcctatcatcttggtctctgcaTTTATAT
||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATA
                \____________/

                   identical
                   sequences
                 ____________
                /            \
                ctctgcaTTTATATCGCATGACTCTTCTTT
                ||||||||||||||||||||||||||||||
                gagacgtAAATATAGCGTACTGAGAAGAAA

The fragments can be fused by any of the techniques mentioned earlier to form c:

agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA

The first argument of this function is a list of sequence objects containing Amplicons and other similar objects.

At least every second sequence object needs to be an Amplicon.

This rule exists because if a sequence object that is not a PCR product is to be fused with another fragment, that other fragment needs to be an Amplicon so that its primer can be modified to include the whole stretch of sequence homology needed for the fusion. See the example below, where a is a non-amplicon (a linear plasmid vector, for instance):

 _________ a _________           __________ b ________
/                     \         /                     \
agcctatcatcttggtctctgca   <-->  TTTATATCGCATGACTCTTCTTT
|||||||||||||||||||||||         |||||||||||||||||||||||
tcggatagtagaaccagagacgt                          <AGAAA
                                TTTAT>
                                |||||||||||||||||||||||
                          <-->  AAATATAGCGTACTGAGAAGAAA

     agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
     ||||||||||||||||||||||||||||||||||||||||||||||
     tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA
     \___________________ c ______________________/

In this case only the forward primer of b is fitted with a tail containing a part of a:

agcctatcatcttggtctctgca
|||||||||||||||||||||||
tcggatagtagaaccagagacgt

                       TTTATATCGCATGACTCTTCTTT
                       |||||||||||||||||||||||
                                        <AGAAA
         tcttggtctctgcaTTTATAT
                       |||||||||||||||||||||||
                       AAATATAGCGTACTGAGAAGAAA

PCR products with flanking sequences are formed in the PCR process.

agcctatcatcttggtctctgcaTTTATAT
||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATA
                \____________/

                   identical
                   sequences
                 ____________
                /            \
                ctctgcaTTTATATCGCATGACTCTTCTTT
                ||||||||||||||||||||||||||||||
                gagacgtAAATATAGCGTACTGAGAAGAAA

The fragments can be fused by, for example, Gibson assembly:

agcctatcatcttggtctctgcaTTTATAT
||||||||||||||||||||||||||||||
tcggatagtagaacca

                             TCGCATGACTCTTCTTT
                ||||||||||||||||||||||||||||||
                gagacgtAAATATAGCGTACTGAGAAGAAA

to form c:

agcctatcatcttggtctctgcaTTTATATCGCATGACTCTTCTTT
||||||||||||||||||||||||||||||||||||||||||||||
tcggatagtagaaccagagacgtAAATATAGCGTACTGAGAAGAAA

The overlap argument controls how many base pairs of overlap are required between adjacent sequence fragments. In the junction between Amplicons, tails about half this length are added to the two primers closest to the junction.

>       <
Amplicon1
         Amplicon2
         >       <

         ⇣

>       <-
Amplicon1
         Amplicon2
        ->       <

In the case of an Amplicon adjacent to a Dseqrecord object, the tail will be twice as long (1*overlap) since the recombining sequence is present entirely on this primer:

Dseqrecd1
         Amplicon1
         >       <

         ⇣

Dseqrecd1
         Amplicon1
       -->       <
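The tail lengths described above can be sketched with plain strings, using the sequences a and b from the diagrams earlier in this section. This is a simplified model, not pydna's implementation: junction_tails is a hypothetical helper, the overlap is assumed to be at least 2, and how an odd overlap is split between the two primers is an assumption.

```python
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (case is preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]


def junction_tails(upstream, downstream, overlap, upstream_is_amplicon=True):
    """Return (reverse-primer tail for upstream, forward-primer tail for downstream).

    Hypothetical helper: if both fragments are PCR products, each primer
    carries about half of the overlap. If the upstream fragment cannot be
    modified, the downstream forward primer carries the whole overlap.
    """
    if upstream_is_amplicon:
        half = overlap // 2
        rev_tail = revcomp(downstream[:overlap - half])
        fwd_tail = upstream[-half:]
    else:
        rev_tail = ""
        fwd_tail = upstream[-overlap:]
    return rev_tail, fwd_tail


a = "agcctatcatcttggtctctgca"
b = "TTTATATCGCATGACTCTTCTTT"

# Two Amplicons: 7 bp tails on each side give a 14 bp shared region.
rev_tail, fwd_tail = junction_tails(a, b, overlap=14)
a_product = a + revcomp(rev_tail)  # top strand of a after PCR with the tailed primer
b_product = fwd_tail + b           # top strand of b after PCR with the tailed primer
shared = fwd_tail + revcomp(rev_tail)
print(rev_tail, fwd_tail, shared)

# a is not an Amplicon: the forward primer of b carries all 14 bp.
_, full_tail = junction_tails(a, b, overlap=14, upstream_is_amplicon=False)
print(full_tail)
```

In the two-Amplicon case the shared region ends a_product and starts b_product, which is what the assembly step relies on.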

Note that if the sequence of DNA fragments starts or stops with an Amplicon, the very first and very last primers will not be modified, i.e. assemblies are always assumed to be linear. There are simple tricks around this for circular assemblies, depicted in the last two examples below.

The maxlink argument controls the cut-off length for sequences that will be synthesized by adding them to the primers for the adjacent fragment(s). The argument list may contain short spacers (such as spacers between fusion proteins).
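The interplay between maxlink and the "every second fragment must be an Amplicon" rule can be sketched as a small check. This is a simplified model rather than pydna code: Fragment and order_is_valid are hypothetical names, and treating every fragment no longer than maxlink as a spacer is an assumption about where exactly the cut-off lies.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical stand-in for an Amplicon or a Dseqrecord."""
    name: str
    length: int
    is_amplicon: bool


def order_is_valid(fragments, maxlink=40):
    """True if every junction between long fragments touches an Amplicon.

    Assumption: fragments no longer than maxlink are spacers that end up
    in primer tails, so they are ignored when checking the rule.
    """
    long_ones = [f for f in fragments if f.length > maxlink]
    return all(x.is_amplicon or y.is_amplicon for x, y in zip(long_ones, long_ones[1:]))


vector = Fragment("vector", 3000, False)
insert = Fragment("insert", 500, True)
spacer = Fragment("spacer", 12, False)
other = Fragment("other", 2000, False)

print(order_is_valid([vector, spacer, insert]))  # spacer can ride on the insert's primer
print(order_is_valid([vector, spacer, other]))   # no primer available to carry the homology
```

The second list fails because neither long fragment has primers that could be extended with the spacer and the homology to its neighbour.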

Example 1: Linear assembly of PCR products (pydna.amplicon.Amplicon class objects)

>       <         >       <
Amplicon1         Amplicon3
         Amplicon2         Amplicon4
         >       <         >       <

                     ⇣
                     pydna.design.assembly_fragments
                     ⇣

>       <-       ->       <-                      pydna.assembly.Assembly
Amplicon1         Amplicon3
         Amplicon2         Amplicon4     ➤  Amplicon1Amplicon2Amplicon3Amplicon4
        ->       <-       ->       <

Example 2: Linear assembly of alternating Amplicons and other fragments

>       <         >       <
Amplicon1         Amplicon2
         Dseqrecd1         Dseqrecd2

                     ⇣
                     pydna.design.assembly_fragments
                     ⇣

>       <--     -->       <--                     pydna.assembly.Assembly
Amplicon1         Amplicon2
         Dseqrecd1         Dseqrecd2     ➤  Amplicon1Dseqrecd1Amplicon2Dseqrecd2

Example 3: Linear assembly of alternating Amplicons and other fragments

Dseqrecd1         Dseqrecd2
         Amplicon1         Amplicon2
         >       <         >       <

                     ⇣
             pydna.design.assembly_fragments
                     ⇣
                                                  pydna.assembly.Assembly
Dseqrecd1         Dseqrecd2
         Amplicon1         Amplicon2     ➤  Dseqrecd1Amplicon1Dseqrecd2Amplicon2
       -->       <--     -->       <

Example 4: Circular assembly of alternating Amplicons and other fragments

                 ->       <==
Dseqrecd1         Amplicon2
         Amplicon1         Dseqrecd1
       -->       <-
                     ⇣
                     pydna.design.assembly_fragments
                     ⇣
                                                   pydna.assembly.Assembly
                 ->       <==
Dseqrecd1         Amplicon2                    -Dseqrecd1Amplicon1Amplicon2-
         Amplicon1                       ➤    |                             |
       -->       <-                            -----------------------------

Example 5: Circular assembly of Amplicons

>       <         >       <
Amplicon1         Amplicon3
         Amplicon2         Amplicon1
         >       <         >       <

                     ⇣
                     pydna.design.assembly_fragments
                     ⇣

>       <=       ->       <-
Amplicon1         Amplicon3
         Amplicon2         Amplicon1
        ->       <-       +>       <

                     ⇣
             make new Amplicon using the Amplicon1.template and
             the last fwd primer and the first rev primer.
                     ⇣
                                                   pydna.assembly.Assembly
+>       <=       ->       <-
 Amplicon1         Amplicon3                  -Amplicon1Amplicon2Amplicon3-
          Amplicon2                      ➤   |                             |
         ->       <-                          -----------------------------

Parameters:

f (list of pydna.amplicon.Amplicon and other Dseqrecord-like objects) – list of Amplicon and Dseqrecord objects for which fusion primers should be constructed.
overlap (int, optional) – Length of required overlap between fragments.
maxlink (int, optional) – Maximum length of spacer sequences that may be present in f. These will be included in tails for designed primers.

Returns:

seqs –

[Amplicon1,
 Amplicon2, ...]

Return type:

list of pydna.amplicon.Amplicon and other Dseqrecord-like objects

Examples

>>> from pydna.dseqrecord import Dseqrecord
>>> from pydna.design import primer_design
>>> a=primer_design(Dseqrecord("atgactgctaacccttccttggtgttgaacaagatcgacgacatttcgttcgaaacttacgatg"))
>>> b=primer_design(Dseqrecord("ccaaacccaccaggtaccttatgtaagtacttcaagtcgccagaagacttcttggtcaagttgcc"))
>>> c=primer_design(Dseqrecord("tgtactggtgctgaaccttgtatcaagttgggtgttgacgccattgccccaggtggtcgtttcgtt"))
>>> from pydna.design import assembly_fragments
>>> # We would like a circular recombination, so the first sequence has to be repeated
>>> fa1,fb,fc,fa2 = assembly_fragments([a,b,c,a])
>>> # Since all fragments are Amplicons, we need to extract the rp of the 1st and fp of the last fragments.
>>> from pydna.amplify import pcr
>>> fa = pcr(fa2.forward_primer, fa1.reverse_primer, a)
>>> [fa,fb,fc]
[Amplicon(100), Amplicon(101), Amplicon(102)]
>>> fa.name, fb.name, fc.name = "fa fb fc".split()
>>> from pydna.assembly import Assembly
>>> assemblyobj = Assembly([fa,fb,fc])
>>> assemblyobj
Assembly
fragments....: 100bp 101bp 102bp
limit(bp)....: 25
G.nodes......: 6
algorithm....: common_sub_strings
>>> assemblyobj.assemble_linear()
[Contig(-231), Contig(-166), Contig(-36)]
>>> assemblyobj.assemble_circular()[0].seguid()
'cdseguid=85t6tfcvWav0wnXEIb-lkUtrl4s'
>>> (a+b+c).looped().seguid()
'cdseguid=85t6tfcvWav0wnXEIb-lkUtrl4s'
>>> print(assemblyobj.assemble_circular()[0].figure())
 -|fa|36
|     \/
|     /\
|     36|fb|36
|           \/
|           /\
|           36|fc|36
|                 \/
|                 /\
|                 36-
|                    |
 --------------------

pydna.design.circular_assembly_fragments(f, overlap=35, maxlink=40)[source]