|
NeuroCOLT
Technical Report NC-TR-96-033
Learning to Compress
Ergodic Sources
Jonathan
Baxter
Royal Holloway, University of London and
London School of Economics
UK
John
Shawe-Taylor
Royal Holloway, University of London
UK
Abstract
We present an adaptive coding technique which is shown to achieve
the optimal coding in the limit as the size of the text grows, while
the data structures associated with the code only grow linearly with
the text. The approach relies on Huffman codes which are generated
relative to the context in which a particular character occurs. The
Huffman codes themselves are inferred from the data that has already
been seen. A key part of the paper involves showing that the loss
per character incurred by the learning process tends to zero as the
size of the text tends to infinity.
Download Compressed
Postscript
|