NeuroCOLT

Neural Networks and Computational Learning Theory

 


NeuroCOLT workshop on Applications of Learning to Text and Images
Cumberland Lodge, Windsor, 30 April - 2 May 2001


Professor Andrew Blake FREng
Microsoft Research,



Probabilistic Inference and Learning in Computer Vision

Automatic speech recognisers are now commonplace and set to make a major commercial impact, for example in voice-driven email handling and telephone/web-based query-handling services. The core of their success lies in the underlying principle, established by pioneers at IBM and Bell Laboratories in the 1970s, that effective pattern recognition algorithms cannot be programmed by hand; instead they have to be learned. Happily, there is a powerful class of learnable, probabilistic models for which tractable recognition algorithms (based largely on dynamic programming) are available.
 

Over the last five to ten years, serious progress has been made in importing the methods of probabilistic pattern recognition and learning into vision. This tutorial series will use some important recent papers to initiate a review and discussion of pervasive methods in learning and inference. These are tools that are essential for researchers wanting to take a lead in some of the most exciting developments in vision at the moment.

Two important recent papers to initiate discussion:

Learning low-level vision, WT Freeman and EC Pasztor, Proc. ICCV99

This paper proposes a persuasive general approach to inference in image arrays. The standard application is restoration of degraded images, including super-resolution. This is a classic Bayesian piece of work, the latest in an honourable succession that began with "intrinsic images" (Barrow and Tenenbaum 1978), and moved on to regularisation (Poggio et al. 1983), via Markov random fields (MRF) and Gibbs sampling (Geman and Geman 1984), and probabilistic graphical models (Pearl 1988). It will be a good springboard to introduce MRFs, message passing algorithms, and the striking new trend towards "non-parametric" or "exemplar"-based learning. It's certainly bracing stuff - where's the catch?
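To give a feel for what "message passing" in an MRF means, here is a minimal sketch on a one-dimensional chain, where min-sum message passing gives the exact MAP labelling. It is not Freeman and Pasztor's method, which works on two-dimensional image grids with learned patch candidates; the function name `min_sum_chain`, the binary labels and the cost values below are all assumptions chosen purely for illustration.

```python
def min_sum_chain(unary, pairwise):
    """MAP labelling of a chain MRF by min-sum message passing.

    unary[i][k]    : cost of giving node i the label k (data term)
    pairwise[j][k] : cost of neighbouring nodes taking labels j and k (prior)
    """
    n, K = len(unary), len(unary[0])
    # msg[i][k]: cheapest total cost of nodes 0..i-1, given node i has label k
    msg = [[0.0] * K for _ in range(n)]
    back = [[0] * K for _ in range(n)]
    for i in range(1, n):
        for k in range(K):
            costs = [msg[i - 1][j] + unary[i - 1][j] + pairwise[j][k]
                     for j in range(K)]
            back[i][k] = min(range(K), key=costs.__getitem__)
            msg[i][k] = costs[back[i][k]]
    # Pick the best label at the last node, then trace the pointers back
    last = [msg[n - 1][k] + unary[n - 1][k] for k in range(K)]
    labels = [0] * n
    labels[n - 1] = min(range(K), key=last.__getitem__)
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i][labels[i]]
    return labels


# Illustrative data: a noisy binary "scan line" (values are made up)
obs = [0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
# Data term: cost 1 for disagreeing with the observation
unary = [[0.0 if k == o else 1.0 for k in (0, 1)] for o in obs]
# Smoothness prior: cost 1.5 whenever neighbouring labels differ
pairwise = [[0.0, 1.5], [1.5, 0.0]]
labels = min_sum_chain(unary, pairwise)
print(labels)
```

On a chain the messages only need one forward sweep, so the result is exact. On an image grid the graph has loops, and the same kind of update has to be iterated with no such guarantee in general, which is one place to start looking for the catch.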

Learning graphical models of images, videos and their spatial transformations, BJ Frey and N Jojic, Proc. UAI 2000

Frey and Jojic have put together an exciting story that uses "latent variable modelling", second nature in the probabilistic inference (NIPS) community, to explain and analyse images and image sequences. The exciting part is that, apparently, all you have to do is describe how an image is constructed, and you automatically get an analysis of the image. The trick is, you just take the description and push it through the "Expectation Maximisation" sausage machine. It seems almost miraculous, in the same way that declarative programming (PROLOG) seems miraculous, that the analytical machinery is generated for you automatically. Is there a catch here, or is this what we should all be doing?
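A toy version of the "describe the construction, then turn the EM handle" idea can be sketched as follows. This is an illustrative assumption, not Frey and Jojic's model: each observation is taken to be a single unknown 1-D template, cyclically shifted by a hidden amount, plus Gaussian noise of fixed, known standard deviation `sigma`. All function names and numbers are hypothetical. The E-step infers a posterior over the hidden shift of each observation; the M-step re-estimates the template from the observations after undoing those shifts.

```python
import math
import random


def shift(v, s):
    """Cyclically shift list v to the right by s positions."""
    s %= len(v)
    return v[-s:] + v[:-s] if s else list(v)


def e_step(data, mu, sigma):
    """Posterior over the latent shift of each observation (uniform prior)."""
    posts = []
    for x in data:
        logp = []
        for s in range(len(mu)):
            sq = sum((a - b) ** 2 for a, b in zip(x, shift(mu, s)))
            logp.append(-sq / (2 * sigma ** 2))
        m = max(logp)  # subtract the max for numerical stability
        w = [math.exp(l - m) for l in logp]
        z = sum(w)
        posts.append([wi / z for wi in w])
    return posts


def m_step(data, posts):
    """Template = posterior-weighted average of the un-shifted observations."""
    d = len(data[0])
    mu = [0.0] * d
    for x, post in zip(data, posts):
        for s, r in enumerate(post):
            aligned = shift(x, -s)
            for i in range(d):
                mu[i] += r * aligned[i]
    return [m / len(data) for m in mu]


def fit(data, sigma=0.2, iters=10):
    mu = list(data[0])  # start at one observation to break the shift symmetry
    for _ in range(iters):
        mu = m_step(data, e_step(data, mu, sigma))
    return mu, e_step(data, mu, sigma)


# Synthetic data generated from the assumed model (values are made up)
random.seed(0)
template = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
true_shifts = [random.randrange(len(template)) for _ in range(40)]
data = [[v + random.gauss(0, 0.1) for v in shift(template, s)]
        for s in true_shifts]
mu, posts = fit(data)
map_shifts = [max(range(len(p)), key=p.__getitem__) for p in posts]
print([round(m, 2) for m in mu])
```

Note that the template can only be recovered up to an overall cyclic shift, and that a flat initial template would leave EM stuck at the mean of the data. Small as it is, the example already hints at the sort of catch the question above is asking about.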