Magnification factors for the SOM and GTM algorithms


Author(s) : Christopher K. I. Williams, Christopher M. Bishop
Publisher : N/A
Publication Date : 1997
ISSN : N/A
Abstract : Magnification factors specify the extent to which the area of a small patch of the latent (or 'feature') space of a topographic mapping is magnified on projection to the data space, and are of considerable interest in both neuro-biological and data analysis contexts. Previous attempts to consider magnification factors for the self-organizing map (SOM) algorithm have been hindered because the mapping is only defined at discrete points (given by the reference vectors). In this paper we consider the batch version of SOM, for which a continuous mapping can be defined, as well as the Generative Topographic Mapping (GTM) algorithm of Bishop et al. [2] which has been introduced as a probabilistic formulation of the SOM. We show how the techniques of differential geometry can be used to determine magnification factors as continuous functions of the latent space coordinates. The results are illustrated here using a problem involving the identification of crab species from morphological data.

1 The Batch SOM Algorithm

We begin by reviewing the batch form of the SOM [4] and showing how it leads to a continuous mapping from latent space to data space. The batch SOM algorithm involves a set of $K$ reference vectors $\{y_i\}$ defined in the data space, in which each vector $y_i$ is associated with a node $i$ on a regular lattice in a (typically) two-dimensional latent space (often called a 'feature' space). We denote the coordinate system in latent space by $x$, so that the $i$th node is at position $x_i$. The algorithm begins by initializing the reference vectors using, for example, principal component analysis. At each cycle the corresponding 'winning node' $j(n)$ is identified for every data vector $t_n$, corresponding to the reference vector $y_i$ having the smallest Euclidean distance $\|y_i - t_n\|$,