pydna.common_sub_strings

This module is based on the Py-rstr-max package, written by Romain Brixtel (rbrixtel_at_gmail_dot_com) (https://brixtel.users.greyc.fr) and available from https://code.google.com/p/py-rstr-max and https://github.com/gip0/py-rstr-max. The original code was covered by an MIT licence.

pydna.common_sub_strings.common_sub_strings(stringx: str, stringy: str, limit: int = 25) → List[Tuple[int, int, int]]

Finds all common substrings between stringx and stringy, and returns them sorted by length.

This function is case sensitive.

Parameters:
  • stringx (str) –

  • stringy (str) –

  • limit (int, optional) –

Returns:

[(startx1, starty1, length1),(startx2, starty2, length2), …]

startx1 = start position in x where substring 1 starts
starty1 = start position in y where substring 1 starts
length1 = length of substring 1

Return type:

list of tuple
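The shape of the return value can be illustrated with a brute-force sketch. This is not pydna's implementation (the module is based on Py-rstr-max); `naive_common_substrings` is a hypothetical stand-in that assumes `limit` is an inclusive minimum length, reports only matches that cannot be extended in either direction, and orders the result longest first.

```python
from typing import List, Tuple


def naive_common_substrings(x: str, y: str, limit: int = 25) -> List[Tuple[int, int, int]]:
    """Hypothetical O(len(x) * len(y)) stand-in, not the pydna algorithm."""
    matches = []
    # prev[j] is the length of the common run ending at x[i-2] and y[j-1]
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                cur[j] = prev[j - 1] + 1
                # Record the run only where it cannot be extended to the right
                at_end = i == len(x) or j == len(y)
                if (at_end or x[i] != y[j]) and cur[j] >= limit:
                    matches.append((i - cur[j], j - cur[j], cur[j]))
        prev = cur
    # Assumed ordering: longest first, ties broken by position
    matches.sort(key=lambda m: (-m[2], m[0], m[1]))
    return matches


result = naive_common_substrings("agctatgtatcttgcatcgta", "gcatcgtagtctatttgcttac", limit=8)
print(result)
```

Each tuple reads as (start in stringx, start in stringy, length), so `stringx[startx:startx + length]` and `stringy[starty:starty + length]` are the same text.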

pydna.common_sub_strings.terminal_overlap(stringx: str, stringy: str, limit: int = 15) → List[Tuple[int, int, int]]

Finds the flanking common substrings between stringx and stringy that are at least limit characters long. The result contains only substrings that start or end at the ends of stringx and stringy.

This function is case sensitive.

Returns a list of tuples describing the substrings. The list is sorted from longest to shortest.

Parameters:
  • stringx (str) –

  • stringy (str) –

  • limit (int, optional) –

Returns:

[(startx1,starty1,length1),(startx2,starty2,length2), …]

startx1 = start position in x where substring 1 starts
starty1 = start position in y where substring 1 starts
length1 = length of substring 1

Return type:

list of tuple
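The "flanking" condition can be expressed as a filter over (startx, starty, length) tuples. The sketch below is one reading of that condition, not the library's code: it assumes a match counts as terminal when it starts at the beginning of one string and runs to the end of the other. The helper name `is_terminal` is hypothetical.

```python
from typing import Tuple


def is_terminal(match: Tuple[int, int, int], len_x: int, len_y: int) -> bool:
    """Assumed test: the match joins the end of one string to the start of the other."""
    startx, starty, length = match
    # End of stringx overlaps the start of stringy
    x_tail_y_head = starty == 0 and startx + length == len_x
    # End of stringy overlaps the start of stringx
    y_tail_x_head = startx == 0 and starty + length == len_y
    return x_tail_y_head or y_tail_x_head


# An 8-character match at the end of a 21-character stringx and the start of stringy
print(is_terminal((13, 0, 8), len_x=21, len_y=22))
# An internal match touches no end of either string
print(is_terminal((5, 3, 8), len_x=21, len_y=22))
```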

Examples

>>> from pydna.common_sub_strings import terminal_overlap
>>> terminal_overlap("agctatgtatcttgcatcgta", "gcatcgtagtctatttgcttac", limit=8)
[(13, 0, 8)]
             <-- 8 ->
<---- 13 --->
agctatgtatcttgcatcgta                    stringx
             gcatcgtagtctatttgcttac      stringy
             0
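To read the result back as sequence text, each tuple can be used to slice both input strings. The helper below, `overlap_text`, is a hypothetical name and not part of pydna; it is a minimal sketch that takes the tuples as plain data, so it runs without the library installed.

```python
from typing import List, Tuple


def overlap_text(stringx: str, stringy: str, matches: List[Tuple[int, int, int]]) -> List[str]:
    """Return the shared text for each (startx, starty, length) tuple."""
    texts = []
    for startx, starty, length in matches:
        in_x = stringx[startx:startx + length]
        in_y = stringy[starty:starty + length]
        # Both slices should be identical if the tuple describes a real match
        if in_x != in_y:
            raise ValueError(f"tuple {(startx, starty, length)} does not describe a common substring")
        texts.append(in_x)
    return texts


# The tuple is the result shown in the example above
print(overlap_text("agctatgtatcttgcatcgta", "gcatcgtagtctatttgcttac", [(13, 0, 8)]))
# ['gcatcgta']
```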