User documentation for EP3
The different parameters explained
- files
- These are the FASTA formatted files you want predictions for.
- genome size
- Indicate whether the genome of the species the sequence is from is above or below 2 Gbp. This is important for the calculation of the threshold.
Starting the software
The easiest way to start the application is by pressing the launch button below. For stand-alone versions of the program, please visit the downloads page.
Using the software
The easiest way for a first-time user is to use the graphical user interface. This is the program launched by the button above.
If you don't have data, you may want to try one of the sample datasets we provide. Download one of them and extract it with your favourite zip tool. Once you have downloaded and extracted the file, you can use EP3 to make promoter predictions for the sequence.
- Add the file to the list of files that should be processed. This is done by pressing the 'Add files' button, navigating to your file, selecting it and pressing the 'Open' button in the dialog window.
- To start the prediction algorithm, press the 'launch' button.
- The predictions are stored in the same directory as the data file.
Interpreting the prediction file
The prediction file is located in the same directory as the data file. You need to have write permissions in this directory.
The prediction file starts with some information about the settings that were used to make the file. The rest of the file has one prediction on each line, with the position of the prediction in the first column and the value for that prediction in the second column. This value can be considered a measure of the certainty of the prediction: the further away it is from the threshold, the more confident the prediction.
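If you want to post-process the predictions in a script, the two-column layout described above can be read with a few lines of Python. This is only a sketch and not part of EP3. It assumes that the columns are separated by whitespace, that positions are integers, and that every line which does not fit this pattern belongs to the settings information at the top of the file. The function name parse_predictions and the sample content are made up for illustration.

```python
from typing import List, Tuple


def parse_predictions(text: str) -> List[Tuple[int, float]]:
    """Return (position, value) pairs from the text of a prediction file.

    Assumption: columns are separated by whitespace, and any line that
    does not start with an integer followed by a number is part of the
    settings information and is skipped.
    """
    predictions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # empty or single-field line, not a prediction
        try:
            position = int(fields[0])
            value = float(fields[1])
        except ValueError:
            continue  # settings line, not a prediction
        predictions.append((position, value))
    return predictions


if __name__ == "__main__":
    # Hypothetical file content, invented for illustration only.
    sample = "settings: example\n1200\t0.35\n5400\t-0.10\n"
    for position, value in parse_predictions(sample):
        print(position, value)
```

To use it on a real file, read the file content first, for example with open(path).read(), and pass the text to parse_predictions.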
Starting the stand-alone application
The tool can be started with the command 'java -jar ep3-1.10.jar' from the command line.
If you don't need the graphical interface, but want to run the program on a few sequence files, you can use the following syntax:
java -jar ep3-1.10.jar [options] <sequence file 1> <sequence file 2> ...
The available options can be found by issuing the command java -jar ep3-1.10.jar --help
You can process as many sequences as you like at once. Note, however, that each file should contain only one FASTA entry.
Example 1:
java -jar ep3-1.10.jar sequencefile.fa
Example 2:
java -jar ep3-1.10.jar -a sequencefile.fa sequence.fa genome.fasta
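Because each input file should contain only one FASTA entry, a file with several entries has to be split before it is passed to EP3. The following Python sketch shows one way to do that. It is not part of EP3; the function name split_fasta is made up for illustration, and the sketch assumes that every entry starts with a header line beginning with >. Comment lines starting with a semi-colon and any text before the first header are dropped.

```python
from typing import List


def split_fasta(text: str) -> List[str]:
    """Split multi-entry FASTA text into one string per entry."""
    entries: List[List[str]] = []
    for line in text.splitlines():
        if line.startswith(">"):
            entries.append([line])      # a header line opens a new entry
        elif line.startswith(";"):
            continue                    # drop comment lines
        elif entries and line.strip():
            entries[-1].append(line)    # sequence line of the current entry
    return ["\n".join(entry) + "\n" for entry in entries]


if __name__ == "__main__":
    # Hypothetical two-entry input, invented for illustration only.
    sample = ">first\nACGTACGT\n>second\nTTGACCAA\n"
    for number, entry in enumerate(split_fasta(sample), start=1):
        print(f"entry {number}:")
        print(entry, end="")
```

Each returned string can be written to its own file, and those files can then be listed on the command line as in the examples above.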
FASTA formatted files
Some information regarding the input files for EP3.
- Each file should contain only a single sequence.
- The first line of the file should be the header line of the FASTA file, i.e. a line that starts with >.
- The file should have only one header line and should not contain comment lines (lines that start with a semi-colon).
- The input file should (more or less) follow the convention that there should not be more than 80 characters on a single line. EP3 has no problems with up to a few million characters on a single line, but do not try to put the sequence of a whole genome on a single line.
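The rules above can be checked before a file is handed to EP3. The Python sketch below is one possible check and is not part of EP3; the function name check_fasta and the default limit of 80 characters are assumptions taken from the list above. Since EP3 tolerates long lines, the line-length message should be read as a hint rather than an error.

```python
from typing import List


def check_fasta(text: str, max_line_length: int = 80) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    lines = text.splitlines()

    # Rule: the first line must be a header line starting with '>'.
    if not lines or not lines[0].startswith(">"):
        problems.append("first line is not a header line starting with '>'")

    # Rule: only one header line, so only one sequence per file.
    header_count = sum(1 for line in lines if line.startswith(">"))
    if header_count > 1:
        problems.append(f"{header_count} header lines found, expected one")

    # Rule: no comment lines (lines starting with a semi-colon).
    if any(line.startswith(";") for line in lines):
        problems.append("comment lines starting with ';' are present")

    # Convention: lines after the first should not exceed the limit.
    longest = max((len(line) for line in lines[1:]), default=0)
    if longest > max_line_length:
        problems.append(f"longest sequence line has {longest} characters")

    return problems


if __name__ == "__main__":
    # Hypothetical input, invented for illustration only.
    for problem in check_fasta(";comment\n>a\nACGT\n>b\nGGCC\n"):
        print(problem)
```

An empty result only means that none of these checks failed; it does not guarantee that the sequence itself is valid.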