How to run a design with Teolenn
Requirements
To run Teolenn you need:
- A computer that can execute Teolenn (see installation requirements for more information).
- Teolenn, downloaded and installed.
- The sequence of the genome you want to use for the design.
  - This genome must be in one file in FASTA format.
  - The header of each chromosome or scaffold must be simple (as short as possible, without punctuation marks or other symbols).
>chromosome_1
TATATAAAAACCTTTACTACTTTTACTATTATTATTACCTTATTATATAGTTATAATTAACTTCCTTTTA
GCACTACTATTAATAAATAATAAATATAATATACTACTAATTACTATAAATAAATTTAGTAAAAAGGTAA
TTCTAAAACTAGTTAAAAAAACTAATATAGCCTTAAAAATAGCTAATAAGCTAGTAGCAAGACTTTTAAA
...
- The sequence of the masked genome, if you want to use the complexity measurement.
  - This genome must be in one file in FASTA format.
  - The header of each chromosome or scaffold must be simple (as short as possible, without punctuation marks or other symbols).
  - The header name of each chromosome must be identical in the genome file and the masked genome file.
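The header constraints above can be checked before launching a design. The following is a minimal sketch in Python and is not part of Teolenn: the function names `read_fasta_headers` and `check_headers` are hypothetical, and "simple" is interpreted here as letters, digits and underscores only (as in the `chromosome_1` example). That interpretation is an assumption, since no exact rule is given.

```python
import re

# Assumption: a "simple" header contains only letters, digits and underscores.
SIMPLE_HEADER = re.compile(r"[A-Za-z0-9_]+")


def read_fasta_headers(lines):
    """Return the header names (without '>') found in FASTA-formatted lines."""
    return [line[1:].strip() for line in lines if line.startswith(">")]


def check_headers(genome_lines, masked_lines):
    """Return a list of problems found in the headers of the two files."""
    genome = read_fasta_headers(genome_lines)
    masked = read_fasta_headers(masked_lines)
    problems = [
        f"header is not simple: {header!r}"
        for header in genome + masked
        if not SIMPLE_HEADER.fullmatch(header)
    ]
    # The genome and the masked genome must use the same header names.
    if sorted(genome) != sorted(masked):
        problems.append("genome and masked genome headers differ")
    return problems
```

Both arguments can be any iterable of lines, so two open file objects (one for the genome file, one for the masked genome file) can be passed directly. An empty list means that no problem was found.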
The design file
The design file is the core element of a design using Teolenn. It
contains the filters to apply to generated sequences, the measurements
to compute, the filters on measurements and the selection parameters.
The design file is an XML file (see the
Wikipedia article for
more information about XML). This file allows Teolenn to be a very flexible
design tool. As the design file is the central part of Teolenn, you must
carefully read the design file section of the
documentation.
You can also use the Trichoderma
reesei design file as a model for your design.
The genome sequence and the masked genome sequence of Trichoderma reesei
are available on the website of the JGI. Keep in mind that the
parameters in this design file are specific to Trichoderma reesei and
need to be adapted for your own design.
The Teolenn process
There are 5 steps in the Teolenn process to select probes:
- Generate all oligonucleotide sequences for the genome and the
masked genome (creates one file per chromosome, with the
extensions ".oligo" and ".masked").
- Filter the oligonucleotide sequences (creates one file per chromosome, with the
extensions ".oligo.filtered" and ".masked.filtered").
- Compute measurements (creates the oligo.mes file).
- Filter measurements (creates the filtered.mes and filtered.stat files).
- Compute the selection of the oligonucleotides (creates the select.mes file). In this final step:
  - For each measurement and oligonucleotide, Teolenn computes a score from the measurement value and,
in some cases, one or more statistical parameters (defined in the parameter values of the measurements in the
select section of the design file). A score is a float value between 0 and 1, whereas a measurement
value can have any type and value (negative, string...).
  - A weight is applied to each measurement score to get a global score for the oligonucleotide.
  - The oligonucleotide with the best global score is chosen in each selection window.
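The final step can be illustrated with a short Python sketch. This is not Teolenn's implementation: the `Oligo` record, the measurement names, the weights, the window size and the weighted-average formula for the global score are all assumptions chosen for illustration. Only the overall flow (per-measurement scores between 0 and 1, weighting, best oligonucleotide per window) comes from the description above.

```python
from collections import namedtuple

# Hypothetical record: start position on the chromosome and per-measurement
# scores, each already normalised to the range [0, 1].
Oligo = namedtuple("Oligo", ["start", "scores"])


def global_score(scores, weights):
    """Weighted average of the measurement scores (assumed formula)."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def select_best_per_window(oligos, weights, window_size):
    """Keep the oligonucleotide with the highest global score in each window."""
    best = {}  # window index -> (global score, oligo)
    for oligo in oligos:
        window = oligo.start // window_size
        score = global_score(oligo.scores, weights)
        if window not in best or score > best[window][0]:
            best[window] = (score, oligo)
    # Return the selected oligonucleotides in window order.
    return [best[window][1] for window in sorted(best)]
```

For example, with hypothetical weights `{"tm": 3, "complexity": 1}` and a window size of 100, two candidates starting at positions 0 and 10 compete for the same window, and only the one with the higher weighted score is kept.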
Because the last step of the process needs statistical data,
Teolenn must be launched twice: once to get the statistical
data and once to get the selected oligonucleotides. You can skip
the first 4 steps in the second run using the skip attribute
in the design file (see the design file section
of the documentation for more information).