The SpliceMachine server recognizes GT-AG splice sites in Arabidopsis thaliana and human DNA sequences. SpliceMachine classifiers are induced with a linear Support Vector Machine on high-dimensional local context representations of actual and pseudo splice sites (see the article). Users can compile data sets for other organisms using the software that can be downloaded here. That software only supports the induction of SpliceMachine splice site classification models. We ask that you submit the resulting models to us so that we can make them available on this website. The software for annotating sequences locally can be obtained on request.
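To make the notion of "actual and pseudo splice sites" concrete, the following is a minimal sketch of how candidate GT and AG sites and their local context could be enumerated on the forward strand. It is not the SpliceMachine feature extraction: the function name candidate_sites, the flank parameter, and the 0-based indexing are assumptions made for illustration only. The reported index is that of the G in the dinucleotide, matching the position convention of the server output.

```python
def candidate_sites(sequence, flank=3):
    """Return (g_index, kind, context) for every GT and AG on the forward strand.

    Hypothetical helper: `flank` and the 0-based `g_index` are illustrative
    assumptions, not SpliceMachine parameters.
    """
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair == "GT":
            kind, g_index = "donor", i         # G is the first base of the intron
        elif pair == "AG":
            kind, g_index = "acceptor", i + 1  # G is the last base of the intron
        else:
            continue
        # Local context: the dinucleotide plus up to `flank` bases on each side.
        context = seq[max(0, i - flank):i + 2 + flank]
        sites.append((g_index, kind, context))
    return sites


if __name__ == "__main__":
    # Toy sequence, not real data.
    for site in candidate_sites("CCGTAAAGCC", flank=2):
        print(site)
```

Every candidate found this way is either an actual splice site or a pseudo site; a classifier is then needed to tell the two apart from the context.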


Sequences should be submitted in FASTA format. Each sequence in FASTA format begins with a single-line description, followed by lines of DNA sequence data. The description line must begin with a greater-than (">") symbol in the first column. An example description line is given below.

>AB000263 |acc=AB000263|descr=Homo sapiens mRNA 
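Before submitting, it can help to check that a file follows the FASTA layout described above. The sketch below is one possible way to do that in Python; the function name parse_fasta is hypothetical, and the sequence used in the usage example is a toy placeholder, not the actual AB000263 sequence.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) pairs."""
    records = []
    description = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(">"):
            # '>' in the first column starts a new record.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        else:
            if description is None:
                raise ValueError("sequence data found before a '>' description line")
            chunks.append(line.strip().upper())
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Placeholder sequence for illustration only.
    example = ">AB000263 |acc=AB000263|descr=Homo sapiens mRNA\nACGTGGTA\nAGCC\n"
    for description, sequence in parse_fasta(example):
        print(description, len(sequence))
```

Concatenating the sequence lines of each record gives one string per sequence.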

The server generates a file called splice_machine_annotation that contains the recognized splice sites. Each recognized site is on one line:

position	{donor,donor_rev,acceptor,acceptor_rev}	score

where position is the sequence position of the site. Positions mark the boundaries of the intron, i.e. the position of the G in 'GT' for a donor and in 'AG' for an acceptor. The second field is the type of the site; donor_rev and acceptor_rev denote a donor and an acceptor on the reverse strand, respectively. The third field is the score given by the site classification model.
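The output file can be read back with a few lines of Python. The sketch below assumes that the three fields are separated by whitespace (tabs in the format line above), that position is an integer, and that score is a decimal number; these details are assumptions, as are the names SiteRecognition, parse_annotation, and is_reverse_strand. The values in the usage example are made up.

```python
from typing import NamedTuple

# Site types listed in the output format description.
SITE_TYPES = {"donor", "donor_rev", "acceptor", "acceptor_rev"}


class SiteRecognition(NamedTuple):
    position: int
    site_type: str
    score: float


def parse_annotation(text):
    """Parse the contents of a splice_machine_annotation file."""
    sites = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        position, site_type, score = fields
        if site_type not in SITE_TYPES:
            raise ValueError(f"unknown site type: {site_type!r}")
        sites.append(SiteRecognition(int(position), site_type, float(score)))
    return sites


def is_reverse_strand(site):
    """True for donor_rev and acceptor_rev sites."""
    return site.site_type.endswith("_rev")


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = "120\tdonor\t1.25\n340\tacceptor_rev\t-0.5\n"
    for site in parse_annotation(example):
        print(site, "reverse" if is_reverse_strand(site) else "forward")
```

With the sites in this form, they can be filtered by strand or by a score threshold of the user's choosing.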

Offline Installation

SpliceMachine is freely available for academic use and can be licensed for commercial use. Please contact us for more information. Send mail to

