SimilarityMatrix#

class pytrimal.SimilarityMatrix(scoring_matrices.ScoringMatrix)#

A similarity matrix for biological sequence characters.

Changed in version 0.8.0: Inherit from the ScoringMatrix class.

__init__(matrix, alphabet='ARNDCQEGHILKMFPSTWYVBZX*', name=None)#

Create a new similarity matrix from the given alphabet and data.

Parameters:
  • matrix (ArrayLike) – The similarity matrix, as a square matrix indexed by the alphabet characters.

  • alphabet (str) – The alphabet used for indexing the rows and columns of the similarity matrix.

  • name (str or None) – The name of the scoring matrix, if any.

Example

Create a new similarity matrix using the HOXD70 scores by Chiaromonte, Yap and Miller (PMID:11928468):

>>> matrix = SimilarityMatrix(
...     [[  91, -114,  -31, -123],
...      [-114,  100, -125,  -31],
...      [ -31, -125,  100, -114],
...      [-123,  -31, -114,   91]],
...     alphabet="ATCG",
...     name="HOXD70",
... )
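
The "square matrix indexed by the alphabet characters" convention can be illustrated without the library at all. The following is a plain-Python sketch that reuses the HOXD70 rows above; `lookup` and `index` are hypothetical names introduced for illustration only, and this is not how the class stores or retrieves scores internally:

```python
# Illustration only: `lookup` and `index` are hypothetical helpers,
# not part of the SimilarityMatrix API.
alphabet = "ATCG"
rows = [
    [  91, -114,  -31, -123],
    [-114,  100, -125,  -31],
    [ -31, -125,  100, -114],
    [-123,  -31, -114,   91],
]

# The data must be square: one row and one column per alphabet character.
assert len(rows) == len(alphabet)
assert all(len(row) == len(alphabet) for row in rows)

# Map each character to its row/column position in the matrix.
index = {char: i for i, char in enumerate(alphabet)}

def lookup(a, b):
    """Return the score stored at row `a`, column `b`."""
    return rows[index[a]][index[b]]

print(lookup("A", "A"))  # 91
print(lookup("A", "T"))  # -114
print(lookup("T", "A"))  # -114, this particular matrix is symmetric
```

The order of the characters in `alphabet` therefore has to match the order of the rows and columns in the data.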

Create a new similarity matrix using one of the matrices from the Bio.Align.substitution_matrices module:

>>> import Bio.Align.substitution_matrices
>>> jones = Bio.Align.substitution_matrices.load('JONES')
>>> matrix = SimilarityMatrix(jones, jones.alphabet, 'JONES')

Added in version 0.1.2.

aa()#

Create a default amino-acid similarity matrix (BLOSUM62).

distance()#

Return the distance between two sequence characters, a and b.

Example

>>> mx = SimilarityMatrix.nt(degenerated=True)
>>> mx.distance('A', 'A')
0.0
>>> mx.distance('A', 'T')
1.5184...

Raises:
  • ValueError – When a or b is an invalid character or a character that was not defined in the matrix alphabet.

  • TypeError – When a or b is a string containing more than one character.

from_name()#

Load a built-in scoring matrix by name.

This library comes with built-in matrices, including the PAM, BLOSUM, VTML and BENNER matrix series. See the Matrices page of the documentation for a comprehensive list.

Parameters:
  • name (str) – The name of the scoring matrix.

Raises:
  • ValueError – When no scoring matrix with the given name can be found in the embedded matrix data.

Example

>>> blosum62 = ScoringMatrix.from_name("BLOSUM62")

Note

The ScoringMatrix.BUILTIN_MATRICES frozenset contains the names of every available matrix, which can be useful for checking allowed matrix names:

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> arg = parser.add_argument(
...     "--matrix",
...     choices=ScoringMatrix.BUILTIN_MATRICES,
...     default="BLOSUM62"
... )
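
What such a parser does with valid and invalid names can be sketched in a self-contained way. In the block below, `BUILTIN_MATRICES` is a two-element stand-in defined locally (an assumption made so that the example runs without the library), not the real frozenset:

```python
import argparse

# Stand-in for the real frozenset of matrix names (assumption: the two
# names here are chosen for illustration only).
BUILTIN_MATRICES = frozenset({"BLOSUM62", "PAM250"})

parser = argparse.ArgumentParser()
parser.add_argument(
    "--matrix",
    # A frozenset has no defined order; sorting keeps the help text stable.
    choices=sorted(BUILTIN_MATRICES),
    default="BLOSUM62",
)

# A name listed in `choices` is accepted unchanged.
args = parser.parse_args(["--matrix", "PAM250"])
print(args.matrix)

# Without the flag, the default value is used.
default_args = parser.parse_args([])
print(default_args.matrix)

# An unknown name makes argparse print an error and exit with status 2.
try:
    parser.parse_args(["--matrix", "NOT_A_MATRIX"])
except SystemExit as exc:
    rejected = True
    exit_code = exc.code
else:
    rejected = False
    exit_code = None
print(rejected, exit_code)
```

Validating the name at argument-parsing time means an invalid value is reported to the user before any matrix is loaded.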

nt()#

Create a default nucleotide similarity matrix.

Parameters:
  • degenerated (bool) – Set to True to create a similarity matrix for degenerate nucleotides.

similarity()#

Return the similarity between two sequence characters, a and b.

Example

>>> mx = SimilarityMatrix.nt()
>>> mx.similarity('A', 'A')
1.0
>>> mx.similarity('A', 'T')
0.0

Raises:
  • ValueError – When a or b is an invalid character or a character that was not defined in the matrix alphabet.

  • TypeError – When a or b is a string containing more than one character.

Added in version 0.1.2.