{ "cells": [ { "cell_type": "markdown", "id": "805a9bde-7fde-4eeb-abc1-bc15b8aceb7f", "metadata": {}, "source": [ "# Basic examples from the trimAl documentation\n", "\n", "This example shows how to run the basic methods shown in the trimAl manual, but using the `pytrimal` API." ] }, { "cell_type": "code", "execution_count": null, "id": "a1f5f0b9-e3e3-413b-8ae6-486069e0705f", "metadata": {}, "outputs": [], "source": [ "import pytrimal\n", "pytrimal.__version__" ] }, { "cell_type": "markdown", "id": "c1e55561-0213-47e1-addb-5c4252051850", "metadata": {}, "source": [ "For this example, we will use one of the example alignments from the trimAl source repository. We use `Alignment.load` to load the alignment from a filename; note that `os.PathLike` objects are supported as well." ] }, { "cell_type": "code", "execution_count": null, "id": "7f8d4a1c-eaa0-4d29-85e4-9ec852cb4df2", "metadata": {}, "outputs": [], "source": [ "import pathlib\n", "ali = pytrimal.Alignment.load(pathlib.Path(\"data\").joinpath(\"example.001.AA.clw\"))" ] }, { "cell_type": "markdown", "id": "9cb77186-8f4b-4f7b-a60f-72ddf325828e", "metadata": {}, "source": [ "Let's see how the original alignment looks before trimming:" ] }, { "cell_type": "code", "execution_count": null, "id": "cb4bdda8-0206-4ca9-ba92-dccf1003393b", "metadata": {}, "outputs": [], "source": [ "for name, seq in zip(ali.names, ali.sequences):\n", " print(name.decode().ljust(10), seq)" ] }, { "cell_type": "markdown", "id": "293dddfc-ac4b-444d-9bd5-2b670f5f8e3f", "metadata": {}, "source": [ "## Example 1\n", "\n", "Remove all positions in the alignment with gaps in 10% or more of the sequences, unless this leaves less than 60% of original alignment. In such case, print the 60% best (with less gaps) positions. Equivalent to:\n", "```\n", "$ trimal -in data/example.001.AA.clw -gt 0.9 -cons 60\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "86e2b0cc-6cad-48a2-a615-a37ecd48a13e", "metadata": {}, "outputs": [], "source": [ "trimmer = pytrimal.ManualTrimmer(gap_threshold=0.9, conservation_percentage=60)\n", "trimmed = trimmer.trim(ali)\n", "\n", "for name, seq in zip(trimmed.names, trimmed.sequences):\n", " print(name.decode().ljust(10), seq)" ] }, { "cell_type": "markdown", "id": "b40760e3-b73a-41d9-9a92-0cc824b09e73", "metadata": {}, "source": [ "## Example 2\n", "\n", "Same as *Example 1*, but the gap score is averaged over a window starting 3 positions before and ending 3 positions after each column." ] }, { "cell_type": "code", "execution_count": null, "id": "872d8097-f54c-469b-9a5c-dba02da590b1", "metadata": {}, "outputs": [], "source": [ "trimmer = pytrimal.ManualTrimmer(gap_threshold=0.9, conservation_percentage=60, window=3)\n", "trimmed = trimmer.trim(ali)\n", "\n", "for name, seq in zip(trimmed.names, trimmed.sequences):\n", " print(name.decode().ljust(10), seq)" ] }, { "cell_type": "markdown", "id": "d7b66c60-54d8-4ab2-ac19-dfd4dfab7a90", "metadata": {}, "source": [ "## Example 3\n", "\n", "Use an automatic method to decide optimal thresholds, based on the gap scores the input alignment. Equivalent to:\n", "```\n", "$ trimal -in data/example.001.AA.clw -gappyout\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "00e658aa-9280-437d-a024-853bdaa3fafa", "metadata": {}, "outputs": [], "source": [ "trimmer = pytrimal.AutomaticTrimmer(method=\"gappyout\")\n", "trimmed = trimmer.trim(ali)\n", "\n", "for name, seq in zip(trimmed.names, trimmed.sequences):\n", " print(name.decode().ljust(10), seq)" ] }, { "cell_type": "markdown", "id": "01efceb6-8eec-4548-9b70-7178a7670537", "metadata": {}, "source": [ "## Example 4\n", "\n", "Use automatic methods to decide optimal thresholds, based on the combination of gap and similarity scores. Equivalent to:\n", "```\n", "$ trimal -in data/example.001.AA.clw -strictplus\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "2580fe78-64a6-43f5-808a-fd3220804c69", "metadata": {}, "outputs": [], "source": [ "trimmer = pytrimal.AutomaticTrimmer(method=\"strictplus\")\n", "trimmed = trimmer.trim(ali)\n", "\n", "for name, seq in zip(trimmed.names, trimmed.sequences):\n", " print(name.decode().ljust(10), seq)" ] }, { "cell_type": "markdown", "id": "664c6392-bb2b-4b21-ae4d-655a97f0d65a", "metadata": {}, "source": [ "## Example 5\n", "\n", "Use an heuristic to decide the optimal method for trimming the alignment. Equivalent to:\n", "```\n", "$ trimal -in data/example.001.AA.clw -automated1\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "38499837-b1ef-45a0-844d-be36adae2f47", "metadata": {}, "outputs": [], "source": [ "trimmer = pytrimal.AutomaticTrimmer(method=\"automated1\")\n", "trimmed = trimmer.trim(ali)\n", "\n", "for name, seq in zip(trimmed.names, trimmed.sequences):\n", " print(name.decode().ljust(10), seq)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 5 }