Speaker
Description
Since Zuckerkandl and Pauling proposed the molecular clock in the 1960s it became common to trace mutations in genomes to infer their evolutionary history \cite{Zuckerkandl1965}.
UPGMA, also known as average linkage, is a hierarchical clustering algorithm, which was one of the first used to infer ultrametric phylogenetic trees \cite{Sokal1958}. Trees can inferred efficiently in $O(n^2)$ time, where $n$ is the number of tips \cite{Murtagh1985}. This allows to estimate large trees and made the method also popular for clustering. Here we present a tree arrangement algorithm to improve UPGMA trees under the implicitly used minimum evolution criterion. Our algorithm allows to compute all $2(n-1)$ Nearest Neighbor Interchange (NNI) moves in $O(n^2)$ time. We further extend our algorithm to allow the estimation of tip-dated phylogenies assuming a strict molecular clock.
An implementation of the algorithms will be part of the R package phangorn \cite{Schliep2011}.
Bibliography
@article{Murtagh1985,
title={Multidimensional clustering algorithms},
author={Murtagh, Fionn},
journal={Compstat lectures},
year={1985}
}
@Article{Schliep2011,
title = "phangorn: Phylogenetic analysis in {R}",
author = "Klaus Peter Schliep",
journal = "Bioinformatics",
year = "2011",
volume = "27",
number = "4",
pages = "592--593",
doi = {10.1093/bioinformatics/btq706}
}
@article{Sokal1958,
author = {Sokal, R. and Michener, C.},
title= {A statistical method for evaluating systematic relationships},
journal = {University of Kansas Science Bulletin},
volume = {38},
pages = {1409--1438},
year = {1958}
}
@article{Zuckerkandl1965,
title={Molecules as documents of evolutionary history},
author={Zuckerkandl, Emile and Pauling, Linus},
journal={Journal of theoretical biology},
volume={8},
number={2},
pages={357--366},
year={1965}
}