Clustering and Transformed Distance
At each stage in the process, two terminal nodes (sequences or groups
of sequences) are replaced by one new node until only two nodes remain.
This approach is simple, but it assumes that the evolutionary
rate is the same in all branches of the tree. As a result, in trees
inferred by clustering, the distance from the root to each terminal node
is the same.
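
The merging loop described above can be sketched as follows. This is a minimal illustration, not TREECON's implementation: it assumes average linkage (UPGMA-style) as the clustering rule, which the text does not specify, and the names `upgma`, `labels`, and `matrix` are hypothetical. The loop continues one step past the "two nodes remain" point to join the last two nodes at the root.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Average-linkage clustering sketch (assumed rule, UPGMA-style).

    labels: list of sequence names
    matrix: symmetric list-of-lists of pairwise distances
    Returns (tree, root_height), where tree is a nested tuple.
    """
    n = len(labels)
    # Active terminal nodes: id -> (subtree, number of sequences)
    clusters = {i: (labels[i], 1) for i in range(n)}
    # Distances between active nodes, keyed by (smaller id, larger id)
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(n), 2)}
    next_id = n
    height = 0.0
    while len(clusters) > 1:
        # Pick the closest pair of terminal nodes
        i, j = min(combinations(sorted(clusters), 2), key=lambda p: dist[p])
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        # Equal-rate assumption: the new node sits halfway between i and j
        height = dist[(i, j)] / 2.0
        # Distance from the new node to each remaining node is the
        # size-weighted average of the distances from i and from j
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # Replace the two merged nodes by one new terminal node
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _), = clusters.values()
    return tree, height
```

Because every new node is placed at half the distance between the two nodes it replaces, all sequences end up at the same distance from the root, which is the equal-rate property noted above.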
If the rate of substitution varies among evolutionary lineages,
the transformed distance method (Klotz et al., 1979; Li, 1981)
may give an improved tree topology. In this method, the additive
distance matrix is transformed into an ultrametric matrix and then clustering
is used to infer the tree. In TREECON, the transformed distances
are calculated as (Nei, 1987):
$$d'_{AB} = \frac{d_{AB} - d_{AR} - d_{BR}}{2} + \bar{d}_R$$

where $d_{AB}$ is the distance between sequences A and B; $d_{AR}$ is the distance between sequence A and a reference or outgroup sequence R; $d_{BR}$ is the distance between sequence B and the reference sequence; and $\bar{d}_R$ is the average of $d_{IR}$ over all sequences I. The term $\bar{d}_R$ is introduced to make $d'_{AB}$ positive (in fact, any other positive value can be used).
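
As a rough illustration, the transformation can be computed as below. The sketch assumes the usual form given by Nei (1987), d'AB = (dAB - dAR - dBR)/2 + mean(dIR), and takes the mean over all non-reference sequences. The function name `transformed_distances` and its parameters are hypothetical and are not part of TREECON.

```python
def transformed_distances(matrix, ref):
    """Transformed distance sketch (assumed form, after Nei, 1987).

    matrix: symmetric list-of-lists of pairwise distances
    ref:    index of the reference (outgroup) sequence in matrix
    Returns the transformed matrix for the non-reference sequences,
    in their original order.
    """
    others = [i for i in range(len(matrix)) if i != ref]
    # Average distance from the reference to all other sequences
    mean_ref = sum(matrix[i][ref] for i in others) / len(others)
    result = []
    for a in others:
        row = []
        for b in others:
            if a == b:
                row.append(0.0)
            else:
                # d'AB = (dAB - dAR - dBR) / 2 + mean(dIR)
                row.append(
                    (matrix[a][b] - matrix[a][ref] - matrix[b][ref]) / 2.0
                    + mean_ref
                )
        result.append(row)
    return result
```

For example, with sequences A, B, C and reference R (index 3) and the additive matrix `[[0, 6, 3, 6], [6, 0, 7, 10], [3, 7, 0, 5], [6, 10, 5, 0]]`, the untransformed distances place A closest to C, whereas the transformed distances place A closest to B. The transformed matrix can then be passed to any clustering routine.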
If you choose to construct evolutionary trees by the transformed distance method, the program requires you to select a reference organism. After the original distance values have been converted to transformed distance values, you can choose which clustering method to use to build the tree. Note, however, that the transformed distance method gives only a tree topology and does not provide estimates of branch lengths (Nei, 1987).
When a tree topology is inferred with clustering or transformed distance, the rooting procedure should be skipped because these methods infer rooted tree topologies.
This new OTU is added to the tree while the replaced OTUs and their respective branches are removed from the tree. This process converts the newly added OTU into a terminal node on a tree of reduced size. At each stage in the process, two terminal OTUs are replaced by one new OTU. The process is complete when only two OTUs remain, separated by a single branch. A worked-out example can be found in Swofford et al. (1996). Besides the fact that the neighbor-joining method is very effective in recovering the correct tree topology (Saitou and Nei, 1987; Saitou and Imanishi, 1989; Nei, 1991), its main advantage is its speed.