Quickstart Guide

Installation

To install Eoulsan, go to the Eoulsan installation page and follow the detailed procedure.

Sample files

This section provides some sample files to test Eoulsan. These files were produced during a mouse RNA-Seq experiment.

Create a design file

Copy the reads, genome and annotation files into an empty directory, then create a design file with the following command:

$ eoulsan.sh createdesign *.fq.bz2 genome.fasta.bz2 annotation.gff.bz2

You can now edit the design file to add further information. Note that Eoulsan automatically handles compressed files.
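
The preparation step above can be sketched as a small shell helper. This is only an illustration: the function name stage_inputs and the paths in the usage comment are hypothetical and not part of Eoulsan. Only the createdesign command itself comes from this guide.

```shell
# Hypothetical helper (not part of Eoulsan): copy input files into a
# working directory, stopping at the first file that does not exist.
stage_inputs() {
    workdir="$1"
    shift
    mkdir -p "$workdir" || return 1
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
        cp "$f" "$workdir/" || return 1
    done
}

# Example usage (placeholder paths, adapt to your data):
#   stage_inputs ~/eoulsan-test /data/*.fq.bz2 /data/genome.fasta.bz2 /data/annotation.gff.bz2
#   cd ~/eoulsan-test
#   eoulsan.sh createdesign *.fq.bz2 genome.fasta.bz2 annotation.gff.bz2
```

Failing early on a missing file avoids creating a design file that silently lacks a sample.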

Create a workflow file

To create a workflow file, the best solution is to reuse an existing workflow file (see sample section) and adapt it to your needs.

The workflow file contains the list of steps that Eoulsan will execute. Each step has parameters and is tied to a module to execute. Some modules are only available in local mode (such as differential analysis) or only in distributed mode; see the modules page for more details. For each step you can change, add or remove parameters. Parameters are specific to a module; consult the documentation of the built-in steps for the list of parameters available for each step.

Finally, the workflow file has a global section that overrides the values of the Eoulsan configuration file. This section is useful, for example, to set the path of the temporary directory to use.

Launch Eoulsan in local mode

Once your design file and workflow file are ready, you can launch the Eoulsan analysis with the following command:

$ eoulsan.sh exec workflow-local.xml design.txt

Warning: To perform the normalization and differential analysis steps of this workflow, this demo requires Docker. If you want to run this demo without Docker, you must install R (or an Rserve server) and the related packages (see the differential analysis step for more information) and change the execution.mode parameter of the normalization and diffana steps.
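
Before launching the demo, it can help to check which of the tools mentioned in the warning are present on your machine. The sketch below is not an Eoulsan feature: the helper name have_cmd is hypothetical, and the check only looks for the docker and R executables on the PATH. It does not verify that the Docker daemon is running or that the required R packages are installed.

```shell
# Hypothetical helper: succeed if the named command is found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
    echo "docker found: the demo workflow may be usable as is"
elif have_cmd R; then
    echo "R found: install the related packages and change execution.mode"
else
    echo "neither docker nor R found: normalization and diffana steps cannot run"
fi
```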

Once launched, and before starting the analysis, Eoulsan checks whether:

  • The design file and workflow file are valid
  • All the modules related to the steps exist
  • The workflow of all the steps is valid
  • The order of the steps is correct (a step cannot use data generated after its end)
  • The input files are valid (fasta and annotation files)
  • A genome mapper index is needed (if so, a step to generate this index is automatically added)

If successful, you will obtain a new directory with a name like eoulsan-20110310-101235 that contains log files about the analysis. Result files are stored in the user's current directory.
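
After several runs, the working directory holds one log directory per run. The sketch below locates the most recent one, assuming that all log directories follow the eoulsan-YYYYMMDD-HHMMSS pattern shown above, so that sorting by name is the same as sorting by date. The function name latest_eoulsan_log_dir is hypothetical.

```shell
# Hypothetical helper: print the most recent Eoulsan log directory found in
# the given directory (default: current directory). The shell expands the
# pattern in sorted order, so the last match carries the latest timestamp.
latest_eoulsan_log_dir() {
    base="${1:-.}"
    latest=""
    for d in "$base"/eoulsan-[0-9]*-[0-9]*/; do
        [ -d "$d" ] || continue
        latest="${d%/}"
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Example usage:
#   ls "$(latest_eoulsan_log_dir)"
```

The function returns a non-zero status when no log directory is found, which makes it easy to use in a script condition.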

Launch Eoulsan in local Hadoop cluster mode

First, you must have a configured Hadoop cluster (see Hadoop configuration). You can then launch the Eoulsan analysis with the following command:

$ eoulsan.sh hadoopexec workflow-hadoop.xml design.txt hdfs://master.example.com/test

When a step can be distributed on the Hadoop cluster, the required input files are automatically copied to the HDFS filesystem before the step is launched.

Launch Eoulsan in cluster mode

The cluster mode works like the local mode; you just need to configure beforehand which cluster scheduler to use (see the cluster configuration page). You can then launch an Eoulsan analysis on a cluster with the following command:

$ eoulsan.sh clusterexec workflow-local.xml design.txt

Step tasks will be automatically submitted to the scheduler of your cluster. The outputs of this mode are the same as in local mode.