NeuroCOLT Technical Report NC-TR-97-030
Batch Classifications with Discrete Finite Mixtures
Petri Kontkanen, Petri Myllymaki, Tom Silander and Henry Tirri
University of Helsinki, Finland
Abstract
In this paper we study batch classification problems where multiple
predictions can be made simultaneously, instead of performing the
classifications independently one at a time. For the predictions we
use the model family of discrete finite mixtures. Because these models
introduce a latent variable, they implicitly assume missing data that
must be estimated before models can be constructed from sample
data. The main contribution of this paper is to demonstrate how the
standard EM algorithm can be modified for estimating both the missing
latent variable data, and the batch classification data at the same
time, thus allowing us to use the same algorithm both for constructing
the models from training data and for making predictions. In our framework
the amount of data available for making predictions is greater than
with the traditional approach, as the algorithm can also exploit the
information available in the query vectors. In the empirical part
of the paper, we compare the results of the batch classification
approach with those of standard (independent) predictions on public
domain classification data sets.
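
The idea described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a mixture in which attributes are conditionally independent given the latent component, and it treats the unknown class values of the query vectors as just another kind of missing data inside EM. The names `MISSING`, `responsibilities`, `em_batch` and `predict`, the add-one smoothing, and the random initialisation are all hypothetical choices made for this illustration.

```python
import math
import random

MISSING = -1  # hypothetical marker for an unknown value (e.g. a query vector's class)


def responsibilities(row, weights, theta):
    """E-step for one row: P(component k | observed attributes of the row)."""
    logs = []
    for k, w in enumerate(weights):
        lp = math.log(w)
        for j, v in enumerate(row):
            if v != MISSING:  # missing attributes are simply marginalised out
                lp += math.log(theta[k][j][v])
        logs.append(lp)
    top = max(logs)
    exps = [math.exp(lp - top) for lp in logs]
    total = sum(exps)
    return [e / total for e in exps]


def em_batch(data, n_values, n_components, n_iter=50, seed=0):
    """Fit a discrete finite mixture by EM on training rows and query rows together.

    data: list of rows of ints; n_values[j] is the number of values of attribute j.
    Returns (weights, theta) with theta[k][j][v] = P(x_j = v | component k).
    """
    rng = random.Random(seed)
    K, M, N = n_components, len(n_values), len(data)
    weights = [1.0 / K] * K
    theta = []
    for _ in range(K):  # random initial distributions, normalised per attribute
        comp = []
        for j in range(M):
            raw = [rng.random() + 0.5 for _ in range(n_values[j])]
            comp.append([r / sum(raw) for r in raw])
        theta.append(comp)

    for _ in range(n_iter):
        resp = [responsibilities(row, weights, theta) for row in data]  # E-step
        for k in range(K):  # M-step with add-one smoothing (an assumption)
            weights[k] = (sum(r[k] for r in resp) + 1.0) / (N + K)
            for j in range(M):
                counts = [1.0] * n_values[j]
                for row, r in zip(data, resp):
                    if row[j] == MISSING:
                        # expected count of a missing value under the old parameters
                        for v in range(n_values[j]):
                            counts[v] += r[k] * theta[k][j][v]
                    else:
                        counts[row[j]] += r[k]
                theta[k][j] = [c / sum(counts) for c in counts]
    return weights, theta


def predict(row, j, weights, theta):
    """Predictive distribution of attribute j given the observed part of the row."""
    r = responsibilities(row, weights, theta)
    return [sum(r[k] * theta[k][j][v] for k in range(len(weights)))
            for v in range(len(theta[0][j]))]


# Toy batch: the last column is the class; the final two rows are query vectors.
data = [[0, 0, 0]] * 6 + [[1, 1, 1]] * 6 + [[0, 0, MISSING], [1, 1, MISSING]]
weights, theta = em_batch(data, n_values=[2, 2, 2], n_components=2)
class_dist = predict(data[-1], 2, weights, theta)
```

Because the query rows take part in every E-step and M-step, their observed attributes influence the fitted component parameters, which is the sense in which the batch setting has more data available than independent one-at-a-time prediction.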