Basic examples from the trimAl documentation
This notebook reproduces the basic examples from the trimAl manual using the pytrimal API instead of the command line.
[1]:
import pytrimal
pytrimal.__version__
[1]:
'0.3.0'
For this example, we will use one of the example alignments from the trimAl source repository. We use Alignment.load to load the alignment from a filename; note that os.PathLike objects are supported as well.
[2]:
import pathlib
ali = pytrimal.Alignment.load(pathlib.Path("data").joinpath("example.001.AA.clw"))
Let’s see how the original alignment looks before trimming:
[3]:
for name, seq in zip(ali.names, ali.sequences):
    print(name.decode().ljust(10), seq)
Sp8        -----GLGKVIV-YGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII
Sp10       -------DPAVL-FVIMLGTIT-KFS--SEWFFAWLGLEINMMVII
Sp26       AAAAAAAAALLTYLGLFLGTDYENFA--AAAANAWLGLEINMMAQI
Sp6        -----ASGAILT-LGIYLFTLCAVIS--VSWYLAWLGLEINMMAII
Sp17       --FAYTAPDLL-LIGFLLKTVA-TFG--DTWFQLWQGLDLNKMPVF
Sp33       -------PTILNIAGLHMETDI-NFS--LAWFQAWGGLEINKQAIL
Example 1
Remove all positions in the alignment with gaps in 10% or more of the sequences, unless this leaves less than 60% of the original alignment. In that case, keep the best 60% of positions (those with the fewest gaps). Equivalent to:
$ trimal -in data/example.001.AA.clw -gt 0.9 -cons 60
[4]:
trimmer = pytrimal.ManualTrimmer(gap_threshold=0.9, conservation_percentage=60)
trimmed = trimmer.trim(ali)
for name, seq in zip(trimmed.names, trimmed.sequences):
    print(name.decode().ljust(10), seq)
Sp8        GKVIYGIVLGTKSQFSVVWLFPWNGLQIHMMGII
Sp10       DPAVFVIMLGTITKFSSEWFFAWLGLEINMMVII
Sp26       AALLLGLFLGTDYNFAAAAANAWLGLEINMMAQI
Sp6        GAILLGIYLFTLCVISVSWYLAWLGLEINMMAII
Sp17       PDLLIGFLLKTVATFGDTWFQLWQGLDLNKMPVF
Sp33       PTILAGLHMETDINFSLAWFQAWGGLEINKQAIL
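To make the gap threshold rule in this example more concrete, here is a minimal pure-Python sketch that does not use pytrimal. It is a simplified reading of the rule, not trimAl's actual implementation. The function names `gap_free_fraction` and `trim_by_gap_threshold` are hypothetical, and the fallback that tops up the result with the best-scoring columns is an assumption about how the conservation percentage behaves. On this alignment the fallback is never triggered, and the sketch keeps the same 34 columns as the output above.

```python
import math

# Example alignment, copied from the untrimmed output shown earlier.
ALIGNMENT = {
    "Sp8":  "-----GLGKVIV-YGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII",
    "Sp10": "-------DPAVL-FVIMLGTIT-KFS--SEWFFAWLGLEINMMVII",
    "Sp26": "AAAAAAAAALLTYLGLFLGTDYENFA--AAAANAWLGLEINMMAQI",
    "Sp6":  "-----ASGAILT-LGIYLFTLCAVIS--VSWYLAWLGLEINMMAII",
    "Sp17": "--FAYTAPDLL-LIGFLLKTVA-TFG--DTWFQLWQGLDLNKMPVF",
    "Sp33": "-------PTILNIAGLHMETDI-NFS--LAWFQAWGGLEINKQAIL",
}


def gap_free_fraction(sequences):
    """Return, for every column, the fraction of sequences without a gap."""
    n = len(sequences)
    return [sum(res != "-" for res in column) / n for column in zip(*sequences)]


def trim_by_gap_threshold(sequences, gap_threshold=0.9, conservation_percentage=60):
    """Keep columns whose gap-free fraction reaches the threshold (hypothetical helper)."""
    scores = gap_free_fraction(sequences)
    keep = {i for i, score in enumerate(scores) if score >= gap_threshold}
    # Assumed fallback: if too few columns survive, add back the
    # best-scoring remaining columns until the minimum is reached.
    minimum = math.ceil(len(scores) * conservation_percentage / 100)
    if len(keep) < minimum:
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        for i in ranked:
            if len(keep) >= minimum:
                break
            keep.add(i)
    columns = sorted(keep)
    return ["".join(seq[i] for i in columns) for seq in sequences]


trimmed_rows = trim_by_gap_threshold(list(ALIGNMENT.values()))
for name, row in zip(ALIGNMENT, trimmed_rows):
    print(name.ljust(10), row)
```

With six sequences, a column with a single gap has a gap-free fraction of 5/6 (about 0.83), so a threshold of 0.9 only keeps columns without any gap.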
Example 2
Same as Example 1, but the gap score is averaged over a window starting 3 positions before and ending 3 positions after each column.
[5]:
trimmer = pytrimal.ManualTrimmer(gap_threshold=0.9, conservation_percentage=60, window=3)
trimmed = trimmer.trim(ali)
for name, seq in zip(trimmed.names, trimmed.sequences):
    print(name.decode().ljust(10), seq)
Sp8        V-YGIVLGTKSDQLFPWNGLQIHMMGII
Sp10       L-FVIMLGTIT-KFFAWLGLEINMMVII
Sp26       TYLGLFLGTDYENANAWLGLEINMMAQI
Sp6        T-LGIYLFTLCAVYLAWLGLEINMMAII
Sp17       -LIGFLLKTVA-TFQLWQGLDLNKMPVF
Sp33       NIAGLHMETDI-NFQAWGGLEINKQAIL
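The effect of the window can be illustrated with a small pure-Python sketch of a moving average over per-column scores. This is only an illustration of the idea. The function name `windowed_average` is hypothetical, and truncating the window at the alignment edges is an assumption. trimAl's own edge handling and rounding may differ, so this sketch is not expected to reproduce the output above exactly.

```python
def windowed_average(scores, half_window=3):
    """Average each score with its neighbours within +/- half_window columns.

    Assumption: near the edges, the window is simply truncated.
    """
    averaged = []
    for i in range(len(scores)):
        lo = max(0, i - half_window)
        hi = min(len(scores), i + half_window + 1)
        window = scores[lo:hi]
        averaged.append(sum(window) / len(window))
    return averaged


# Toy gap-free fractions for seven columns, with one fully gapped column.
scores = [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0]
smoothed = windowed_average(scores, half_window=1)
print([round(s, 2) for s in smoothed])
# prints [1.0, 1.0, 0.67, 0.67, 0.67, 1.0, 1.0]
```

Averaging pulls the score of a gappy column up and the scores of its neighbours down. This is consistent with the output above, where some columns containing gaps are retained while gap-free columns next to gap-rich regions are dropped.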
Example 3
Use an automatic method to decide optimal thresholds, based on the gap scores of the input alignment. Equivalent to:
$ trimal -in data/example.001.AA.clw -gappyout
[6]:
trimmer = pytrimal.AutomaticTrimmer(method="gappyout")
trimmed = trimmer.trim(ali)
for name, seq in zip(trimmed.names, trimmed.sequences):
    print(name.decode().ljust(10), seq)
Sp8        GKVIVYGIVLGTKSQFSVVWLFPWNGLQIHMMGII
Sp10       DPAVLFVIMLGTITKFSSEWFFAWLGLEINMMVII
Sp26       AALLTLGLFLGTDYNFAAAAANAWLGLEINMMAQI
Sp6        GAILTLGIYLFTLCVISVSWYLAWLGLEINMMAII
Sp17       PDLL-IGFLLKTVATFGDTWFQLWQGLDLNKMPVF
Sp33       PTILNAGLHMETDINFSLAWFQAWGGLEINKQAIL
Example 4
Use automatic methods to decide optimal thresholds, based on the combination of gap and similarity scores. Equivalent to:
$ trimal -in data/example.001.AA.clw -strictplus
[7]:
trimmer = pytrimal.AutomaticTrimmer(method="strictplus")
trimmed = trimmer.trim(ali)
for name, seq in zip(trimmed.names, trimmed.sequences):
    print(name.decode().ljust(10), seq)
Sp8        GIVLVWLFPWNGLQIHMMGII
Sp10       VIMLEWFFAWLGLEINMMVII
Sp26       GLFLAAANAWLGLEINMMAQI
Sp6        GIYLSWYLAWLGLEINMMAII
Sp17       GFLLTWFQLWQGLDLNKMPVF
Sp33       GLHMAWFQAWGGLEINKQAIL
Example 5
Use a heuristic to decide the optimal method for trimming the alignment. Equivalent to:
$ trimal -in data/example.001.AA.clw -automated1
[8]:
trimmer = pytrimal.AutomaticTrimmer(method="automated1")
trimmed = trimmer.trim(ali)
for name, seq in zip(trimmed.names, trimmed.sequences):
    print(name.decode().ljust(10), seq)
Sp8        VWLFPWNGLQIHMMGII
Sp10       EWFFAWLGLEINMMVII
Sp26       AAANAWLGLEINMMAQI
Sp6        SWYLAWLGLEINMMAII
Sp17       TWFQLWQGLDLNKMPVF
Sp33       AWFQAWGGLEINKQAIL
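To compare how aggressively each configuration trims this alignment, one can count the columns that remain. The sketch below does not call pytrimal. It reuses the Sp8 rows printed in the examples above, and the helper name `retained` and the dictionary labels are hypothetical.

```python
# Sp8 row of the untrimmed alignment, copied from the first output above.
ORIGINAL_ROW = "-----GLGKVIV-YGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII"

# Sp8 row of each trimmed alignment, copied from Examples 1 to 5.
SP8_ROWS = {
    "gap_threshold=0.9, cons=60": "GKVIYGIVLGTKSQFSVVWLFPWNGLQIHMMGII",
    "same, with window=3": "V-YGIVLGTKSDQLFPWNGLQIHMMGII",
    "gappyout": "GKVIVYGIVLGTKSQFSVVWLFPWNGLQIHMMGII",
    "strictplus": "GIVLVWLFPWNGLQIHMMGII",
    "automated1": "VWLFPWNGLQIHMMGII",
}


def retained(row, original=ORIGINAL_ROW):
    """Return the number and percentage of columns kept after trimming."""
    return len(row), 100 * len(row) / len(original)


for label, row in SP8_ROWS.items():
    kept, percent = retained(row)
    print(f"{label:<28}{kept:>3} columns ({percent:.0f}%)")
```

On this small alignment, gappyout retains the most columns and automated1 the fewest. Other alignments may rank the methods differently.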