Publication:Svensén 1998 Generative Topographic Mapping
From MachineLearning.
Revision as of 23:02, 23 October 2009
Svensén, Johan Fredrik Markus. GTM: The Generative Topographic Mapping: Ph.D. Dissertation. Aston University, 1998. 108 pp.
BibTeX:
@phdthesis{SvensenThesis,
  author = "Svens\'en, Johan Fredrik Markus",
  title = "GTM: The Generative Topographic Mapping: Ph.D Dissertation",
  school = "Aston University",
  pages = "108",
  url = "http://pca.narod.ru/SvensenThesis.pdf",
  year = "1998",
  language = russian
}
Summary
Ph.D. dissertation by J.F.M. Svensén, "GTM: The Generative Topographic Mapping". It constructs non-linear latent variable models for modelling continuous, intrinsically low-dimensional probability distributions embedded in high-dimensional spaces. This is a new form of non-linear principal component analysis that differs substantially from Kohonen maps. An important application of the method is the visualization of high-dimensional data.
Abstract
This thesis describes the Generative Topographic Mapping (GTM) - a non-linear latent variable model, intended for modelling continuous, intrinsically low-dimensional probability distributions, embedded in high-dimensional spaces. It can be seen as a non-linear form of principal component analysis or factor analysis. It also provides a principled alternative to the self-organizing map, a widely established neural network model for unsupervised learning - resolving many of its associated theoretical problems.
An important potential application of the GTM is visualization of high-dimensional data. Since the GTM is non-linear, the relationship between data and its visual representation may be far from trivial, but a better understanding of this relationship can be gained by computing the so-called magnification factor. In essence, the magnification factor relates the distances between data points, as they appear when visualized, to the actual distances between those data points.
There are two principal limitations of the basic GTM model. The computational effort required will grow exponentially with the intrinsic dimensionality of the density model. However, if the intended application is visualization, this will typically not be a problem. The other limitation is the inherent structure of the GTM, which makes it most suitable for modelling moderately curved probability distributions of approximately rectangular shape. When the target distribution is very different to that, the aim of maintaining an "interpretable" structure suitable for visualizing data, may come in conflict with the aim of providing a good density model.
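The magnification factor mentioned in the abstract can be illustrated with a small sketch. The abstract gives no formula, so the code below assumes one common definition: the local area magnification sqrt(det(JᵀJ)), where J is the Jacobian of the mapping from the 2-D latent (visualization) space to the data space. The mapping `toy_mapping` is a hypothetical paraboloid chosen for illustration, not a trained GTM, and the Jacobian is approximated by central finite differences.

```python
import math

def toy_mapping(x1, x2):
    """Hypothetical smooth map from a 2-D latent space to a 3-D data space.

    Stands in for a trained GTM mapping; chosen only because its
    magnification has a simple closed form: sqrt(1 + 4*x1**2 + 4*x2**2).
    """
    return (x1, x2, x1 * x1 + x2 * x2)

def magnification_factor(f, x1, x2, h=1e-5):
    """Local area magnification sqrt(det(J^T J)) of f at latent point (x1, x2).

    Assumed definition (not stated in the abstract). The two Jacobian
    columns are estimated with central finite differences of step h.
    """
    plus1, minus1 = f(x1 + h, x2), f(x1 - h, x2)
    plus2, minus2 = f(x1, x2 + h), f(x1, x2 - h)
    j1 = [(p - m) / (2 * h) for p, m in zip(plus1, minus1)]  # d f / d x1
    j2 = [(p - m) / (2 * h) for p, m in zip(plus2, minus2)]  # d f / d x2

    # Entries of the 2x2 matrix J^T J
    g11 = sum(u * u for u in j1)
    g12 = sum(u * v for u, v in zip(j1, j2))
    g22 = sum(v * v for v in j2)
    return math.sqrt(g11 * g22 - g12 * g12)

if __name__ == "__main__":
    for point in [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]:
        print(point, magnification_factor(toy_mapping, *point))
```

A value of 1 means a small latent-space patch keeps its area in data space; larger values mean the patch is stretched, so points that look close in the visualization are further apart in the data.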
The fact that the GTM is a probabilistic model means that results from probability theory and statistics can be used to address problems such as model complexity. Furthermore, this framework provides solid ground for extending the GTM to wider contexts than that of this thesis.