Multiple Sequence Alignment
Alignments
Alignment
- class pytrimal.Alignment
A multiple sequence alignment.
- __init__(names, sequences)
Create a new alignment with the given names and sequences.
- Parameters
Examples
Create a new alignment with a list of sequences and a list of names:
>>> alignment = Alignment(
...     names=[b"Sp8", b"Sp10", b"Sp26"],
...     sequences=[
...         "-----GLGKVIV-YGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII",
...         "-------DPAVL-FVIMLGTIT-KFS--SEWFFAWLGLEINMMVII",
...         "AAAAAAAAALLTYLGLFLGTDYENFA--AAAANAWLGLEINMMAQI",
...     ]
... )
There should be as many sequences as there are names, otherwise a ValueError will be raised:
>>> Alignment(
...     names=[b"Sp8", b"Sp10", b"Sp26"],
...     sequences=["GLQIHMMGII", "GLEINMMVII"]
... )
Traceback (most recent call last):
  ...
ValueError: `Alignment` given 3 names but 2 sequences
Sequence characters will be checked, and an error will be raised if they are not one of the characters from a biological alphabet:
>>> Alignment(
...     names=[b"Sp8", b"Sp10"],
...     sequences=["GLQIHMMGII", "GLEINMM123"]
... )
Traceback (most recent call last):
  ...
ValueError: The sequence "Sp10" has an unknown (49) character
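Both checks above can be approximated in plain Python. The sketch below is not pytrimal's implementation: `validate` and `error_message` are hypothetical helpers, and the accepted alphabet (ASCII letters plus the gap character `-`) is an assumption made for illustration. It suggests that the number in the second error message is the code point of the first offending character, since `ord("1")` is 49.

```python
import string

# Assumption: ASCII letters plus the gap character; the real alphabet may differ.
ALLOWED = frozenset(string.ascii_letters + "-")


def validate(names, sequences, allowed=ALLOWED):
    """Hypothetical stand-in for the checks described above."""
    # Check 1: one name per sequence.
    if len(names) != len(sequences):
        raise ValueError(
            f"`Alignment` given {len(names)} names but {len(sequences)} sequences"
        )
    # Check 2: every character must belong to the allowed alphabet.
    for name, sequence in zip(names, sequences):
        for char in sequence:
            if char not in allowed:
                raise ValueError(
                    f'The sequence "{name.decode()}" has an unknown '
                    f"({ord(char)}) character"
                )


def error_message(names, sequences):
    """Return the ValueError message raised by validate, or None if valid."""
    try:
        validate(names, sequences)
    except ValueError as err:
        return str(err)
    return None


count_error = error_message([b"Sp8", b"Sp10", b"Sp26"], ["GLQIHMMGII", "GLEINMMVII"])
char_error = error_message([b"Sp8", b"Sp10"], ["GLQIHMMGII", "GLEINMM123"])
print(count_error)
print(char_error)
```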
- copy()
Create a copy of this alignment.
- load(path)
Load a multiple sequence alignment from a file.
- Parameters
path (str, bytes or os.PathLike) – The path to the file containing the serialized alignment to load.
- Returns
Alignment – The deserialized alignment.
Example
>>> msa = Alignment.load("example.001.AA.clw")
>>> msa.names
[b'Sp8', b'Sp10', b'Sp26', b'Sp6', b'Sp17', b'Sp33']
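The three accepted path types can all name the same file. As a minimal sketch using only the standard library (how pytrimal itself normalizes the argument is not documented here and is not assumed), `os.fsdecode` turns each form into the same `str`:

```python
import os
from pathlib import PurePosixPath

# The same file name expressed as str, bytes, and an os.PathLike object.
candidates = [
    "example.001.AA.clw",
    b"example.001.AA.clw",
    PurePosixPath("example.001.AA.clw"),
]

# os.fsdecode accepts all three forms and always returns a str.
normalized = [os.fsdecode(path) for path in candidates]
print(normalized)
```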
- residues
The residues in the alignment.
- Type
- sequences
The sequences in the alignment.
- Type
Trimmed Alignment
- class pytrimal.TrimmedAlignment(Alignment)
A multiple sequence alignment that has been trimmed.
Internally, the trimming process only produces a mask of sequences and a mask of residues. This class exposes the filtered sequences and residues.
- __init__(names, sequences, sequences_mask=None, residues_mask=None)
Create a new alignment with the given names, sequences and masks.
- Parameters
names (Sequence of bytes) – The names of the sequences in the alignment.
sequences (Sequence of str) – The actual sequences in the alignment.
sequences_mask (Sequence of bool) – A mask for which sequences to keep in the trimmed alignment. If given, must be as long as the sequences and names lists.
residues_mask (Sequence of bool) – A mask for which residues to keep in the trimmed alignment. If given, must be as long as every element in the sequences argument.
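To make the effect of the two masks concrete, here is a pure-Python sketch of the filtering they describe. It is not pytrimal's implementation: `apply_masks` is a hypothetical helper, the sequences are shortened toy rows, and it assumes that `True` means "keep".

```python
from itertools import compress


def apply_masks(names, sequences, sequences_mask=None, residues_mask=None):
    """Hypothetical helper: filter rows and columns with boolean masks."""
    if len(names) != len(sequences):
        raise ValueError("names and sequences must have the same length")
    if sequences_mask is None:
        sequences_mask = [True] * len(sequences)
    if len(sequences_mask) != len(sequences):
        raise ValueError("sequences_mask must be as long as sequences")

    # Row filtering: keep the names and sequences whose mask entry is True.
    kept_names = list(compress(names, sequences_mask))
    kept_rows = list(compress(sequences, sequences_mask))

    # Column filtering: keep the residues whose mask entry is True.
    if residues_mask is not None:
        if any(len(row) != len(residues_mask) for row in sequences):
            raise ValueError("residues_mask must be as long as every sequence")
        kept_rows = ["".join(compress(row, residues_mask)) for row in kept_rows]

    return kept_names, kept_rows


# Toy data: drop the third sequence and the first (mostly gap) column.
names = [b"Sp8", b"Sp10", b"Sp26"]
sequences = ["-GLQIHMM", "-GLEINMM", "AGLEINMM"]
kept_names, kept_rows = apply_masks(
    names,
    sequences,
    sequences_mask=[True, True, False],
    residues_mask=[False] + [True] * 7,
)
print(kept_names, kept_rows)
```

Keeping the masks separate from the data, as the class description says the trimming process does, means the original rows and columns stay available alongside the filtered view.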
- sequences
The sequences in the alignment.
- Type
Attributes
AlignmentSequences
- class pytrimal.AlignmentSequences
A read-only view over the sequences of an alignment.
Objects from this class are created in the sequences property of Alignment objects. Use it to access the string data of individual rows from the alignment:
>>> msa = Alignment.load("example.001.AA.clw")
>>> len(msa.sequences)
6
>>> msa.sequences[0]
'-----GLGKVIV-YGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII'
>>> sum(letter == '-' for seq in msa.sequences for letter in seq)
43
AlignmentResidues¶
- class pytrimal.AlignmentResidues
A read-only view over the residues of an alignment.
Objects from this class are created in the residues property of Alignment objects. Use it to access the string data of individual columns from the alignment:
>>> msa = Alignment.load("example.001.AA.clw")
>>> len(msa.residues)
46
>>> msa.residues[0]
'--A---'
>>> msa.residues[-1]
'IIIIFL'
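The relationship between rows and columns can be illustrated without pytrimal: a column is the string formed by taking the same position from every row. The sketch below uses three short toy rows and plain `zip`; it is only an illustration of the idea, not how `AlignmentResidues` is implemented.

```python
# Toy rows of equal length (one string per sequence).
rows = ["GLQIHMMGII", "GLEINMMVII", "GLEINMMAQI"]

# Transposing the rows yields one string per column.
columns = ["".join(column) for column in zip(*rows)]

print(len(columns))   # number of columns equals the row length
print(columns[0])     # first column
print(columns[-1])    # last column
```

Each column has one character per row, which is why the six-sequence example above yields six-character strings such as '--A---'.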