Next: Video Representation
Up: Introduction
Previous: Unsupervised approach
How do we select the right feature set for event comparison?
We must compromise between two conflicting goals.
On the one hand, we would like features to be as
descriptive as possible: measuring the kinematics and dynamics of an
object's movements is very useful for event comparison. On the other
hand, we also want
feature extraction to be extremely robust across many hours of video.
Descriptive
features are hard to extract: object detection and tracking often fail
in an unconstrained environment. Basic image features based on spatial/motion histograms of objects are
simple and reliable to compute [2,5,15]. The drawback of
these methods is that the important feature signal may be obscured by
noise. Event similarity computed
naively could then be overestimated, making unusual events appear similar
to common ones. This over-dependence on the feature set has been
a general weakness of most unsupervised approaches [13].
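To make the notion of a simple histogram feature concrete, the following is a minimal sketch and not the method of [2,5,15]. It assumes grayscale frames stored as nested lists of integers, and the function name `motion_histogram`, the grid size, and the change threshold are all hypothetical choices made for illustration. It counts the pixels that changed between two frames in each cell of a coarse spatial grid and normalizes the counts.

```python
def motion_histogram(prev_frame, curr_frame, grid=(2, 2), threshold=10):
    """Count changed pixels per grid cell and normalize the counts to sum to 1.

    Illustrative assumption: frames are equally sized lists of rows of
    integer intensities; grid and threshold are arbitrary example values.
    """
    rows, cols = len(curr_frame), len(curr_frame[0])
    grid_rows, grid_cols = grid
    counts = [0] * (grid_rows * grid_cols)
    for r in range(rows):
        for c in range(cols):
            # A pixel counts as moving if its intensity changed by more than threshold.
            if abs(curr_frame[r][c] - prev_frame[r][c]) > threshold:
                cell_r = r * grid_rows // rows
                cell_c = c * grid_cols // cols
                counts[cell_r * grid_cols + cell_c] += 1
    total = sum(counts)
    # With no motion, return an all-zero histogram instead of dividing by zero.
    return [n / total for n in counts] if total else [0.0] * len(counts)


prev_frame = [[0] * 4 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[0][0] = 255  # change in the top-left grid cell
curr_frame[3][3] = 255  # change in the bottom-right grid cell
hist = motion_histogram(prev_frame, curr_frame)
print(hist)  # → [0.5, 0.0, 0.0, 0.5]
```

Such a feature needs no detection or tracking, which is why it is cheap and reliable, but every changed pixel contributes equally, so background motion and sensor noise enter the histogram alongside the motion that matters.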
The situation is vastly improved if we can extract the important feature signal
from a large set of simple features.
This problem resembles unusual event detection
itself, as important signals are "hard to detect" but
"easy to verify".
In fact, unusual event
detection and important feature selection are two interlocked problems.
We propose a correspondence function to measure this
interdependence, thereby detecting unusual events and important
features simultaneously. We show that, for an important subfamily
of correspondence functions, an efficient computational solution
exists via co-embedding.
The paper is organized as follows. In Section 2 we
describe our video representation. In Sections
3 and 4 we describe our algorithm for unusual
event detection.
In Section 5 we present our experimental results, and we conclude in Section 6.
Mirko Visontai
2004-05-13