Supplementary material to: i-ADHoRe 2.0: An improved tool to detect degenerated genomic homology using genomic profiles

Cedric Simillion, Koen Janssens, Lieven Sterck and Yves Van de Peer

Corresponding author:

Abstract

i-ADHoRe is a software tool that combines gene content and gene order information of homologous genomic segments into profiles to detect highly degenerated homology relations within and between genomes. The new version offers, besides a significant increase in performance, several optimizations to the algorithm, most importantly to the profile alignment routine. As a result, the annotations of multiple genomes, or parts thereof, can be fed simultaneously into the program, after which it will report all regions of homology, both within and between genomes.

Availability: The i-ADHoRe 2.0 package contains the C++ source code for the main program as well as various Perl scripts and a fully documented Perl API to facilitate post-processing. The software runs on any Linux or UNIX-based platform. The package is freely available for academic users and can be downloaded from bioinformatics.psb.ugent.be

Simillion, C., Janssens, K., Sterck, L., Van de Peer, Y. (2007) i-ADHoRe 2.0: An improved tool to detect degenerated genomic homology using genomic profiles. Bioinformatics 24, 127-8.

Supplementary Data

To compare the performance of the two alignment methods, i-ADHoRe was run twice on the same dataset, once using the original Needleman-Wunsch (NW) based alignment algorithm and once using the new Greedy-Graph (GG) based alignment method. The output alignments of the two runs were then compared. Because the detection of a higher-level multiplicon depends directly on the alignment of the profile used to detect it, most output multiplicons differ slightly between the two runs. Therefore, only multiplicons that were identical between both runs were considered, where two multiplicons were regarded as identical if the first and last genes of each segment were the same. Note that during the construction of profiles, the i-ADHoRe algorithm sometimes inverts part of a segment. Depending on the alignment method used, different breakpoints can be chosen for these inversions, which can lead to a different number of genes in alignments that are otherwise considered identical.
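The pairing criterion described above can be sketched as follows. This is only an illustration, not code from the i-ADHoRe package: the data model (a multiplicon as a list of segments, each segment a list of gene identifiers), the function names `endpoint_key` and `comparable_pairs`, and the toy gene names are all assumptions made for the example.

```python
def endpoint_key(multiplicon):
    """Order-independent key: the (first gene, last gene) pair of every segment.

    Assumption: a multiplicon is a list of segments and each segment is a
    non-empty list of gene identifiers in alignment order.
    """
    return tuple(sorted((segment[0], segment[-1]) for segment in multiplicon))


def comparable_pairs(run_nw, run_gg):
    """Return (NW multiplicon, GG multiplicon) pairs whose segment endpoints match."""
    gg_by_key = {endpoint_key(m): m for m in run_gg}
    pairs = []
    for multiplicon in run_nw:
        key = endpoint_key(multiplicon)
        if key in gg_by_key:
            pairs.append((multiplicon, gg_by_key[key]))
    return pairs


# Toy data (hypothetical gene names).
run_nw = [
    [["a1", "a2", "a3"], ["b1", "b2"]],   # same endpoints in both runs
    [["c1", "c2"], ["d1", "d2"]],         # endpoint d2 differs from the GG run
]
run_gg = [
    [["a1", "a3"], ["b1", "b2"]],         # fewer genes, but identical endpoints
    [["c1", "c2"], ["d1", "d3"]],
]

pairs = comparable_pairs(run_nw, run_gg)
print(len(pairs))  # 1
```

In the toy data the first multiplicon is paired even though its first segment holds a different number of genes in the two runs, which mirrors the remark that alignments considered identical can still differ in gene count.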

The quality of an alignment was assessed by counting the number of misaligned genes. A gene was considered misaligned if, in a given profile, a homolog was present on a different segment but positioned in a different column than the reference gene. Conversely, a gene was considered aligned if a homolog was positioned in the same column of the alignment. The number of alignable genes is the sum of the numbers of misaligned and aligned genes. The fewer misaligned genes there are relative to the number of alignable genes, the better the quality of the alignment.
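A minimal sketch of this quality measure is given below. It is not taken from the i-ADHoRe source: the representation of an alignment as equal-length rows with `None` for gaps, the `family` mapping from gene to homology group, and the function name `count_alignment` are assumptions. The sketch also assumes that a gene with at least one homolog in its own column counts as aligned even if other homologs sit elsewhere, a case the definition above does not settle.

```python
def count_alignment(rows, family):
    """Count aligned and misaligned genes in one profile alignment.

    rows   : list of segments, each a list of gene ids or None (gap),
             all of the same length (assumed representation).
    family : dict mapping gene id -> homology family id (assumed input).
    """
    aligned = misaligned = 0
    for i, row in enumerate(rows):
        for col, gene in enumerate(row):
            if gene is None:
                continue
            # Columns in which homologs of this gene occur on other segments.
            homolog_cols = [
                c
                for j, other in enumerate(rows) if j != i
                for c, g in enumerate(other)
                if g is not None and family[g] == family[gene]
            ]
            if not homolog_cols:
                continue              # no homolog elsewhere: not alignable
            if col in homolog_cols:
                aligned += 1          # a homolog shares this column
            else:
                misaligned += 1       # homologs exist, but in other columns
    return aligned, misaligned


# Toy profile with two segments (hypothetical gene and family names).
rows = [
    ["a1", "a2", None, "a3", "a4"],
    ["b1", None, "b2", "b3", None],
]
family = {"a1": "F1", "b1": "F1", "a2": "F2", "b2": "F2",
          "a3": "F3", "b3": "F3", "a4": "F4"}

aligned, misaligned = count_alignment(rows, family)
alignable = aligned + misaligned
print(aligned, misaligned, alignable)  # 4 2 6
```

Here `a2` and `b2` are homologs placed in different columns, so both count as misaligned, while `a4` has no homolog on the other segment and is excluded from the alignable genes.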

Table 1 shows the comparison of both alignment methods for the 21 pairs of identical alignments between both runs. In 11 cases the alignment created with the GG method had a smaller fraction of misaligned genes, whereas in only one case did the NW method give a better result. For 9 cases no difference was observed, and for 6 of these this was because the number of misaligned genes was zero with both methods. These data show clearly that the GG alignment method outperforms the original NW method.

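| Pair | NW #alignable | NW #misaligned | NW fraction misaligned | GG #alignable | GG #misaligned | GG fraction misaligned | Best performing |
|---|---|---|---|---|---|---|---|
| 1 | 47 | 4 | 8.51% | 47 | 1 | 2.13% | GG |
| 2 | 28 | 7 | 25.00% | 30 | 9 | 30.00% | NW |
| 3 | 26 | 0 | 0.00% | 26 | 0 | 0.00% | - |
| 4 | 20 | 3 | 15.00% | 22 | 1 | 4.55% | GG |
| 5 | 53 | 9 | 16.98% | 53 | 9 | 16.98% | - |
| 6 | 26 | 0 | 0.00% | 26 | 0 | 0.00% | - |
| 7 | 27 | 2 | 7.41% | 25 | 0 | 0.00% | GG |
| 8 | 59 | 12 | 20.34% | 61 | 12 | 19.67% | GG |
| 9 | 49 | 8 | 16.33% | 49 | 7 | 14.29% | GG |
| 10 | 24 | 0 | 0.00% | 24 | 0 | 0.00% | - |
| 11 | 20 | 2 | 10.00% | 20 | 0 | 0.00% | GG |
| 12 | 38 | 5 | 13.16% | 38 | 4 | 10.53% | GG |
| 13 | 27 | 9 | 33.33% | 27 | 1 | 3.70% | GG |
| 14 | 19 | 0 | 0.00% | 19 | 0 | 0.00% | - |
| 15 | 32 | 4 | 12.50% | 32 | 4 | 12.50% | - |
| 16 | 18 | 0 | 0.00% | 18 | 0 | 0.00% | - |
| 17 | 38 | 2 | 5.26% | 38 | 0 | 0.00% | GG |
| 18 | 23 | 1 | 4.35% | 29 | 1 | 3.45% | GG |
| 19 | 27 | 0 | 0.00% | 27 | 0 | 0.00% | - |
| 20 | 56 | 9 | 16.07% | 56 | 9 | 16.07% | - |
| 21 | 42 | 6 | 14.29% | 44 | 5 | 11.36% | GG |
| Total | 699 | 83 | 11.87% | 711 | 63 | 8.86% | GG |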

Table 1: Comparison of the Needleman-Wunsch (NW) and Greedy Graph-based (GG) alignment methods.
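The totals and the win/tie counts reported for Table 1 can be rechecked with a few lines of code. The sketch below only re-enters the counts from the table; the variable names are chosen for this example, and fractions are compared by cross-multiplication to avoid rounding effects.

```python
# (NW alignable, NW misaligned, GG alignable, GG misaligned) for pairs 1..21,
# copied from Table 1.
table = [
    (47, 4, 47, 1), (28, 7, 30, 9), (26, 0, 26, 0), (20, 3, 22, 1),
    (53, 9, 53, 9), (26, 0, 26, 0), (27, 2, 25, 0), (59, 12, 61, 12),
    (49, 8, 49, 7), (24, 0, 24, 0), (20, 2, 20, 0), (38, 5, 38, 4),
    (27, 9, 27, 1), (19, 0, 19, 0), (32, 4, 32, 4), (18, 0, 18, 0),
    (38, 2, 38, 0), (23, 1, 29, 1), (27, 0, 27, 0), (56, 9, 56, 9),
    (42, 6, 44, 5),
]

gg_wins = nw_wins = ties = zero_ties = 0
for nw_al, nw_mis, gg_al, gg_mis in table:
    # Compare nw_mis/nw_al with gg_mis/gg_al without floating point.
    nw_score, gg_score = nw_mis * gg_al, gg_mis * nw_al
    if gg_score < nw_score:
        gg_wins += 1
    elif nw_score < gg_score:
        nw_wins += 1
    else:
        ties += 1
        if nw_mis == 0 and gg_mis == 0:
            zero_ties += 1

# Column totals and overall fractions of misaligned genes.
nw_al_total, nw_mis_total, gg_al_total, gg_mis_total = (
    sum(column) for column in zip(*table)
)
nw_fraction = round(100 * nw_mis_total / nw_al_total, 2)
gg_fraction = round(100 * gg_mis_total / gg_al_total, 2)

print(gg_wins, nw_wins, ties, zero_ties)          # 11 1 9 6
print(nw_al_total, nw_mis_total, nw_fraction)     # 699 83 11.87
print(gg_al_total, gg_mis_total, gg_fraction)     # 711 63 8.86
```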


