Trimmer
Base Trimmer
- class pytrimal.BaseTrimmer
A sequence alignment trimmer.
All subclasses provide the same trim method, and are configured through their constructor.
- __init__(*, backend='detect')
Create a new base trimmer.
- Keyword Arguments:
backend (str, optional) – The SIMD extension backend to use to accelerate computation of pairwise similarity statistics. If None is given, use the original code from trimAl.
New in version 0.2.0: The backend keyword argument.
- trim(alignment, matrix=None)
Trim the provided alignment.
- Parameters:
alignment (Alignment) – A multiple sequence alignment to trim.
matrix (SimilarityMatrix, optional) – An alternative similarity matrix to use for computing the similarity statistic. If None, a default matrix will be used based on the type of the alignment.
- Returns:
TrimmedAlignment – The trimmed alignment.
Hint
This method is re-entrant, and can be called safely across different threads. Most of the computations will be done after releasing the GIL.
Changed in version 0.1.2: Added the matrix optional argument.
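Since trim is documented as re-entrant and releases the GIL for most of its work, a single trimmer can in principle be shared by a pool of threads. The sketch below assumes only that the object passed in exposes the trim(alignment) method described above; the helper name trim_all and the max_workers default are illustrative and not part of pytrimal.

```python
from concurrent.futures import ThreadPoolExecutor


def trim_all(trimmer, alignments, max_workers=4):
    """Trim several alignments concurrently with one shared trimmer.

    `trimmer` is any object exposing `trim(alignment)`. Results are
    returned in the same order as `alignments`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of its input.
        return list(pool.map(trimmer.trim, alignments))
```

With pytrimal installed, the trimmer argument would be an instance of one of the trimmer classes in this section and alignments an iterable of Alignment objects.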
Automatic Trimmer
- class pytrimal.AutomaticTrimmer(BaseTrimmer)
A sequence alignment trimmer with automatic parameter detection.
trimAl provides several heuristic methods for automated trimming of multiple sequence alignments:
strict: A statistical method that combines gaps and similarity statistics to clean the alignment.
strictplus: A statistical method that combines gaps and similarity statistics, optimized for Neighbour-Joining tree reconstruction.
gappyout: A statistical method that uses only the gap statistic to clean the alignment.
automated1: A meta-method that chooses between strict and gappyout, optimized for Maximum Likelihood phylogenetic tree reconstruction.
nogaps: A naive method that removes every column containing at least one gap.
noallgaps: A naive method that removes every column containing only gaps.
noduplicateseqs: A naive method that removes sequences that are equal on the alignment, keeping the latest occurrence.
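The three naive methods are simple enough to illustrate in plain Python. The following is a minimal sketch of the rules as worded above, not trimAl's implementation: it assumes sequences are equal-length strings, that only "-" counts as a gap, and all function names are hypothetical.

```python
def column_mask_nogaps(sequences, gap="-"):
    """True for columns that contain no gap at all (the nogaps rule)."""
    return [gap not in column for column in zip(*sequences)]


def column_mask_noallgaps(sequences, gap="-"):
    """True for columns with at least one residue (the noallgaps rule)."""
    return [any(c != gap for c in column) for column in zip(*sequences)]


def apply_mask(sequences, mask):
    """Keep only the columns whose mask entry is True."""
    return ["".join(c for c, keep in zip(seq, mask) if keep) for seq in sequences]


def drop_duplicates_keep_last(names, sequences):
    """Drop identical sequences, keeping the latest occurrence of each."""
    # Later indices overwrite earlier ones, so each sequence maps to its last index.
    last_index = {seq: i for i, seq in enumerate(sequences)}
    kept = sorted(last_index.values())
    return [names[i] for i in kept], [sequences[i] for i in kept]
```

For the rows "AC-T-", "A--T-" and "ACGT-", the nogaps mask keeps only the first and fourth columns, while the noallgaps mask drops only the last column.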
Hint
A Python frozenset containing all valid automatic trimming methods can be obtained with the AutomaticTrimmer.METHODS attribute. This can be useful for listing or validating methods beforehand, e.g. to build a CLI with argparse.
New in version 0.4.0: The AutomaticTrimmer.METHODS class attribute.
New in version 0.5.0: Support for pickle protocol.
- __init__(method='strict', *, backend='detect')
Create a new automatic alignment trimmer using the given method.
- Parameters:
method (str) – The automatic alignment trimming method. See the documentation for AutomaticTrimmer for a list of supported values.
- Keyword Arguments:
backend (str, optional) – The SIMD extension backend to use to accelerate computation of pairwise similarity statistics. If None is given, use the original code from trimAl.
- Raises:
ValueError – When method is not one of the automatic alignment trimming methods supported by trimAl.
New in version 0.2.0: The backend keyword argument.
New in version 0.4.0: The noduplicateseqs method.
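One way to avoid the ValueError is to validate the method name before the trimmer is constructed, using AutomaticTrimmer.METHODS as the list of choices for argparse. The sketch below is an illustration under assumptions: build_parser and the --method option are hypothetical names, and the hard-coded fallback set only exists so that the snippet still runs when pytrimal is not installed.

```python
import argparse

try:
    from pytrimal import AutomaticTrimmer
    METHODS = AutomaticTrimmer.METHODS
except (ImportError, AttributeError):
    # Fallback for illustration only: the method names listed in this section.
    METHODS = frozenset({
        "strict", "strictplus", "gappyout", "automated1",
        "nogaps", "noallgaps", "noduplicateseqs",
    })


def build_parser():
    """Build a parser whose --method option only accepts known methods."""
    parser = argparse.ArgumentParser(description="Trim an alignment.")
    # argparse rejects any value outside `choices` before our code runs.
    parser.add_argument("--method", choices=sorted(METHODS), default="strict")
    return parser
```

The parsed value can then be passed as the method argument of the constructor; an unknown name is rejected by argparse with a usage message instead of reaching the trimmer.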
- trim(alignment, matrix=None)
Trim the provided alignment.
- Parameters:
alignment (Alignment) – A multiple sequence alignment to trim.
matrix (SimilarityMatrix, optional) – An alternative similarity matrix to use for computing the similarity statistic. If None, a default matrix will be used based on the type of the alignment.
- Returns:
TrimmedAlignment – The trimmed alignment.
Hint
This method is re-entrant, and can be called safely across different threads. Most of the computations will be done after releasing the GIL.
Changed in version 0.1.2: Added the matrix optional argument.
Manual Trimmer
- class pytrimal.ManualTrimmer(BaseTrimmer)
A sequence alignment trimmer with manually defined thresholds.
Manual trimming allows the user to specify independent thresholds for two different statistics:
Gap threshold: Remove columns where the gap ratio (or the absolute gap count) is higher than the provided threshold.
Similarity threshold: Remove columns with a similarity ratio lower than the provided threshold.
In addition, the trimming can be restricted so that at least a configurable fraction of the original alignment is retained, in order to avoid stripping an alignment of distant sequences by aggressive trimming.
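To make the gap criterion concrete, here is a minimal plain-Python sketch of a per-column filter. It follows the wording of the gap_threshold keyword argument (the minimum fraction of non-gap characters needed to keep a column) and is not trimAl's implementation: it assumes equal-length strings, treats only "-" as a gap, and ignores windows, the similarity statistic and the conservation percentage. The function name is hypothetical.

```python
def keep_by_gap_threshold(sequences, gap_threshold, gap="-"):
    """Return a per-column mask, True when the fraction of non-gap
    characters in the column is at least `gap_threshold`."""
    n = len(sequences)
    mask = []
    for column in zip(*sequences):
        non_gap = sum(1 for c in column if c != gap)
        mask.append(non_gap / n >= gap_threshold)
    return mask
```

For the rows "AC-T", "A--T", "ACG-" and "A-GT", the non-gap fractions per column are 1.0, 0.5, 0.5 and 0.75, so a threshold of 0.75 keeps only the first and last columns.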
- __init__(*, gap_threshold=None, gap_absolute_threshold=None, similarity_threshold=None, conservation_percentage=None, window=None, gap_window=None, similarity_window=None, backend='detect')
Create a new manual alignment trimmer with the given parameters.
- Keyword Arguments:
gap_threshold (float, optional) – The minimum fraction of non-gap characters that must be present in a column to keep the column.
gap_absolute_threshold (int, optional) – The absolute number of gaps allowed on a column to keep it in the alignment. Incompatible with gap_threshold.
similarity_threshold (float, optional) – The minimum average similarity required.
conservation_percentage (float, optional) – The minimum percentage of positions in the original alignment to conserve.
window (int, optional) – The size of the half-window to use when computing statistics for an alignment.
gap_window (int, optional) – The size of the half-window to use when computing the gap statistic for an alignment. Incompatible with window.
similarity_window (int, optional) – The size of the half-window to use when computing the similarity statistic for an alignment. Incompatible with window.
backend (str, optional) – The SIMD extension backend to use to accelerate computation of pairwise similarity statistics. If None is given, use the original code from trimAl.
New in version 0.2.0: The backend keyword argument.
New in version 0.2.2: The keyword arguments for controlling the half-window sizes.
Changed in version 0.4.0: Removed consistency_threshold and consistency_window.
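Several of these keyword arguments are documented as mutually incompatible. The text does not say how the constructor reacts when they are combined, so the sketch below only shows how a caller might check a set of options up front; find_option_conflicts is a hypothetical helper, not part of pytrimal, and it covers only the pairs named above.

```python
def find_option_conflicts(*, gap_threshold=None, gap_absolute_threshold=None,
                          window=None, gap_window=None, similarity_window=None):
    """Return a list of messages for option pairs documented as incompatible."""
    conflicts = []
    if gap_threshold is not None and gap_absolute_threshold is not None:
        conflicts.append("gap_threshold is incompatible with gap_absolute_threshold")
    if window is not None and gap_window is not None:
        conflicts.append("window is incompatible with gap_window")
    if window is not None and similarity_window is not None:
        conflicts.append("window is incompatible with similarity_window")
    return conflicts
```

An empty list means the combination is consistent with the documented constraints, for example a gap_threshold together with separate gap_window and similarity_window values.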
- trim(alignment, matrix=None)
Trim the provided alignment.
- Parameters:
alignment (Alignment) – A multiple sequence alignment to trim.
matrix (SimilarityMatrix, optional) – An alternative similarity matrix to use for computing the similarity statistic. If None, a default matrix will be used based on the type of the alignment.
- Returns:
TrimmedAlignment – The trimmed alignment.
Hint
This method is re-entrant, and can be called safely across different threads. Most of the computations will be done after releasing the GIL.
Changed in version 0.1.2: Added the matrix optional argument.