Trimmer
Base Trimmer
- class pytrimal.BaseTrimmer(unicode backend=u'detect', *)
A sequence alignment trimmer.
All subclasses provide the same trim method, and are configured through their constructor.
- __getstate__(self)
- __init__(*, backend='detect')
Create a new base trimmer.
- Keyword Arguments
backend (str, optional) – The SIMD extension backend to use to accelerate computation of pairwise similarity statistics. If None is given, use the original code from trimAl.
New in version 0.2.0: The backend keyword argument.
- __reduce_cython__(self)
- __setstate__(self, dict state)
- __setstate_cython__(self, __pyx_state)
- trim(self, Alignment alignment, SimilarityMatrix matrix=None) -> TrimmedAlignment
Trim the provided alignment.
- Parameters
alignment (Alignment) – A multiple sequence alignment to trim.
matrix (SimilarityMatrix, optional) – An alternative similarity matrix to use for computing the similarity statistic. If None, a default matrix will be used based on the type of the alignment.
- Returns
TrimmedAlignment – The trimmed alignment.
Hint
This method is re-entrant, and can be called safely across different threads. Most of the computations will be done after releasing the GIL.
Changed in version 0.1.2: Added the optional matrix argument.
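Since trim is documented as re-entrant and releases the GIL for most of its work, a single trimmer can be shared by several worker threads. The following is a minimal sketch, not part of pytrimal: trim_all and max_workers are hypothetical names, and the helper assumes only that the object passed in exposes a trim(alignment) method.

```python
from concurrent.futures import ThreadPoolExecutor


def trim_all(trimmer, alignments, max_workers=4):
    """Trim several alignments concurrently with one shared trimmer.

    `trim_all` is a hypothetical helper, not part of pytrimal. It only
    assumes that `trimmer.trim(alignment)` is safe to call from several
    threads at once, as the hint above states.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(trimmer.trim, alignments))
```

With pytrimal installed, this would presumably be called as trim_all(AutomaticTrimmer(), alignments), where alignments is a list of Alignment objects.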
Automatic Trimmer
- class pytrimal.AutomaticTrimmer(BaseTrimmer)
AutomaticTrimmer(unicode method=u'strict', unicode backend=u'detect', *)
A sequence alignment trimmer with automatic parameter detection.
trimAl provides several heuristic methods for automated trimming of multiple sequence alignments:
strict: A statistical method that combines gaps and similarity statistics to clean the alignment.
strictplus: A statistical method that combines gaps and similarity statistics, optimized for Neighbour-Joining tree reconstruction.
gappyout: A statistical method that uses only the gap statistic to clean the alignment.
automated1: A meta-method that chooses between strict and gappyout, optimized for Maximum Likelihood phylogenetic tree reconstruction.
nogaps: A naive method that removes every column containing at least one gap.
noallgaps: A naive method that removes every column containing only gaps.
noduplicateseqs: A naive method that removes sequences that are equal on the alignment, keeping the latest occurrence.
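The two naive gap methods are simple enough to illustrate in plain Python. This is only a sketch of the column rules described above, not pytrimal or trimAl code. It assumes the alignment is a list of equal-length strings with "-" as the gap character, and the function names are chosen for illustration.

```python
def nogaps(sequences, gap="-"):
    """Drop every column that contains at least one gap."""
    keep = [i for i, col in enumerate(zip(*sequences)) if gap not in col]
    return ["".join(seq[i] for i in keep) for seq in sequences]


def noallgaps(sequences, gap="-"):
    """Drop every column that consists only of gaps."""
    keep = [
        i for i, col in enumerate(zip(*sequences))
        if any(c != gap for c in col)
    ]
    return ["".join(seq[i] for i in keep) for seq in sequences]


toy = ["AC-G-", "A--G-", "ACTG-"]
print(nogaps(toy))     # ['AG', 'AG', 'AG']
print(noallgaps(toy))  # ['AC-G', 'A--G', 'ACTG']
```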
Hint
A Python frozenset containing all valid automatic trimming methods can be obtained with the AutomaticTrimmer.METHODS attribute. This can be useful for listing or validating methods beforehand, e.g. to build a CLI with argparse.
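As a sketch of the CLI use case, the set of methods can be passed to argparse as the allowed choices. The parser layout and the --method option are assumptions made for illustration. The fallback set is copied from the method list above so that the example still runs when pytrimal is not installed.

```python
import argparse

try:
    # Documented class attribute (new in version 0.4.0).
    from pytrimal import AutomaticTrimmer
    METHODS = AutomaticTrimmer.METHODS
except ImportError:
    # Fallback copied from the documented method list; an assumption
    # used only so the sketch runs without pytrimal.
    METHODS = frozenset({
        "strict", "strictplus", "gappyout", "automated1",
        "nogaps", "noallgaps", "noduplicateseqs",
    })


def build_parser():
    """Build a parser whose --method option only accepts known methods."""
    parser = argparse.ArgumentParser(description="Trim an alignment.")
    parser.add_argument(
        "--method",
        choices=sorted(METHODS),
        default="strict",
        help="automatic trimming method",
    )
    return parser


args = build_parser().parse_args(["--method", "gappyout"])
print(args.method)  # gappyout
```

An unknown method name makes argparse exit with a usage error before any trimmer is constructed, so the ValueError raised by the constructor is never reached.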
New in version 0.4.0: The AutomaticTrimmer.METHODS class attribute.
New in version 0.5.0: Support for pickle protocol.
- __getstate__(self)
- __init__(method='strict', *, backend='detect')
Create a new automatic alignment trimmer using the given method.
- Parameters
method (str) – The automatic alignment trimming method. See the documentation for AutomaticTrimmer for a list of supported values.
- Keyword Arguments
backend (str, optional) – The SIMD extension backend to use to accelerate computation of pairwise similarity statistics. If None is given, use the original code from trimAl.
- Raises
ValueError – When method is not one of the automatic alignment trimming methods supported by trimAl.
New in version 0.2.0: The backend keyword argument.
New in version 0.4.0: The noduplicateseqs method.
- __reduce_cython__(self)
- __setstate__(self, dict state)
- __setstate_cython__(self, __pyx_state)
- trim(self, Alignment alignment, SimilarityMatrix matrix=None) -> TrimmedAlignment
Trim the provided alignment.
- Parameters
alignment (Alignment) – A multiple sequence alignment to trim.
matrix (SimilarityMatrix, optional) – An alternative similarity matrix to use for computing the similarity statistic. If None, a default matrix will be used based on the type of the alignment.
- Returns
TrimmedAlignment – The trimmed alignment.
Hint
This method is re-entrant, and can be called safely across different threads. Most of the computations will be done after releasing the GIL.
Changed in version 0.1.2: Added the optional matrix argument.
Manual Trimmer
- class pytrimal.ManualTrimmer(BaseTrimmer)
ManualTrimmer(gap_threshold=None, *, gap_absolute_threshold=None, similarity_threshold=None, conservation_percentage=None, window=None, gap_window=None, similarity_window=None, unicode backend=u'detect')
A sequence alignment trimmer with manually defined thresholds.
Manual trimming allows the user to specify independent thresholds for two different statistics:
Gap threshold: Remove columns where the gap ratio (or the absolute gap count) is higher than the provided threshold.
Similarity threshold: Remove columns with a similarity ratio lower than the provided threshold.
In addition, the trimming can be restricted so that at least a configurable fraction of the original alignment is retained, in order to avoid stripping an alignment of distant sequences by aggressive trimming.
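The gap rule can be illustrated with a short plain-Python sketch. It is not pytrimal or trimAl code. The function names and the max_gap_ratio parameter are hypothetical, and the alignment is assumed to be a list of equal-length strings with "-" as the gap character. Note that the gap_threshold keyword of the constructor is documented as the minimum fraction of non-gap characters, i.e. the complement of the ratio used here.

```python
def gap_ratios(sequences, gap="-"):
    """Return the fraction of gap characters in each column."""
    n = len(sequences)
    return [col.count(gap) / n for col in zip(*sequences)]


def drop_gappy_columns(sequences, max_gap_ratio, gap="-"):
    """Remove columns whose gap ratio is higher than `max_gap_ratio`."""
    ratios = gap_ratios(sequences, gap)
    keep = [i for i, ratio in enumerate(ratios) if ratio <= max_gap_ratio]
    return ["".join(seq[i] for i in keep) for seq in sequences]


toy = ["AC-G-", "A--G-", "ACTG-", "ACTGA"]
print(gap_ratios(toy))               # [0.0, 0.25, 0.5, 0.0, 0.75]
print(drop_gappy_columns(toy, 0.5))  # ['AC-G', 'A--G', 'ACTG', 'ACTG']
```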
- __getstate__(self)
- __init__(*, gap_threshold=None, gap_absolute_threshold=None, similarity_threshold=None, conservation_percentage=None, window=None, gap_window=None, similarity_window=None, backend='detect')
Create a new manual alignment trimmer with the given parameters.
- Keyword Arguments
gap_threshold (float, optional) – The minimum fraction of non-gap characters that must be present in a column to keep the column.
gap_absolute_threshold (int, optional) – The absolute number of gaps allowed on a column to keep it in the alignment. Incompatible with gap_threshold.
similarity_threshold (float, optional) – The minimum average similarity required.
conservation_percentage (float, optional) – The minimum percentage of positions in the original alignment to conserve.
window (int, optional) – The size of the half-window to use when computing statistics for an alignment.
gap_window (int, optional) – The size of the half-window to use when computing the gap statistic for an alignment. Incompatible with window.
similarity_window (int, optional) – The size of the half-window to use when computing the similarity statistic for an alignment. Incompatible with window.
backend (str, optional) – The SIMD extension backend to use to accelerate computation of pairwise similarity statistics. If None is given, use the original code from trimAl.
New in version 0.2.0: The backend keyword argument.
New in version 0.2.2: The keyword arguments for controlling the half-window sizes.
Changed in version 0.4.0: Removed consistency_threshold and consistency_window.
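To make the half-window parameters concrete: a half-window of w means that the statistic of column i is averaged over columns i - w to i + w, that is 2w + 1 columns. The sketch below illustrates this with a hypothetical window_average helper. It is not the trimAl implementation, and it simply truncates the window at the alignment edges, which is an assumption and may differ from what trimAl does there.

```python
def window_average(values, half_window):
    """Average each value with its neighbours within `half_window` columns.

    Hypothetical helper: the window is truncated at both ends of the
    list, which is an assumption about edge handling.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


print(window_average([0.0, 0.75, 0.0, 0.0], 1))  # [0.375, 0.25, 0.25, 0.0]
```

A larger half-window spreads an isolated high or low column score over its neighbours, so single-column outliers have less influence on which columns are kept.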
- __setstate__(self, dict state)
- trim(self, Alignment alignment, SimilarityMatrix matrix=None) -> TrimmedAlignment
Trim the provided alignment.
- Parameters
alignment (Alignment) – A multiple sequence alignment to trim.
matrix (SimilarityMatrix, optional) – An alternative similarity matrix to use for computing the similarity statistic. If None, a default matrix will be used based on the type of the alignment.
- Returns
TrimmedAlignment – The trimmed alignment.
Hint
This method is re-entrant, and can be called safely across different threads. Most of the computations will be done after releasing the GIL.
Changed in version 0.1.2: Added the optional matrix argument.