Class-specific object detectors that are trained off-line are designed to detect any instance of the class in a given image or video sequence. In object tracking, however, one seeks the location and scale of a target object, which is one specific instance of the class. Hence, the target needs to be separated not only from the background but also from other instances in the video sequence. We address this problem by adapting a class-specific object detector to the target, making it more instance-specific. To this end, we learn off-line a codebook for the object class that models the spatial distribution and appearance of object parts. For tracking, the codebook is coupled with a particle filter. The probabilistic votes for the object cast by the codebook entries are used to model the likelihood, while the posterior probability of the location and scale of the target is used to learn on-line the probability that each part in the codebook belongs to the target.
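The coupling of a vote-based likelihood with a particle filter can be illustrated with a minimal sketch. It is not the implementation used in this work: the function name `particle_filter_step`, the random-walk motion model, the multinomial resampling, and the callable `vote_likelihood` (standing in for the accumulated codebook votes at a location and scale) are all assumptions made for illustration.

```python
import numpy as np


def particle_filter_step(particles, weights, vote_likelihood, rng,
                         motion_std=(4.0, 4.0, 0.02)):
    """One resample/predict/update cycle over (x, y, scale) particles.

    particles       : (N, 3) array of hypotheses (x, y, scale)
    weights         : (N,) normalized weights from the previous frame
    vote_likelihood : hypothetical callable (x, y, scale) -> non-negative
                      score, standing in for the codebook votes
    rng             : numpy.random.Generator
    motion_std      : assumed random-walk noise per dimension
    """
    n = len(particles)

    # Resample: draw particles in proportion to their previous weights.
    idx = rng.choice(n, size=n, p=weights)
    particles = particles[idx]

    # Predict: diffuse the hypotheses with a simple random walk (assumption).
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)

    # Update: weight each hypothesis by the vote-based likelihood.
    lik = np.array([vote_likelihood(x, y, s) for x, y, s in particles])
    weights = lik + 1e-12  # guard against an all-zero likelihood
    weights = weights / weights.sum()
    return particles, weights
```

A point estimate of the target state can then be read off as the weighted mean `weights @ particles`, although a multi-modal posterior is usually better summarized by its modes than by a single mean.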
After updating the particles, the multi-modal posterior distribution is approximated. The weights of the particles are indicated by color (yellow: high, red: low). The target is marked by a blue dot.
Based on the posterior, the voting space is clustered (blue: foreground, red: background, green: uncertain).
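One way to picture this clustering and the resulting on-line update is sketched below. The distance-based labelling rule, the `radius` parameter, the smoothed ratio estimate, and all function names are hypothetical choices for illustration, not the procedure from the paper.

```python
import numpy as np

FOREGROUND, BACKGROUND, UNCERTAIN = 0, 1, 2


def label_votes(vote_xy, target_xy, other_modes_xy, radius):
    """Label each vote by the posterior mode it falls near (assumed rule).

    Votes within `radius` of the target mode are foreground, votes within
    `radius` of any other mode are background, the rest stay uncertain.
    """
    vote_xy = np.asarray(vote_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    other_modes_xy = np.asarray(other_modes_xy, dtype=float).reshape(-1, 2)

    labels = np.full(len(vote_xy), UNCERTAIN)
    if len(other_modes_xy):
        diff = vote_xy[:, None, :] - other_modes_xy[None, :, :]
        d_other = np.linalg.norm(diff, axis=2).min(axis=1)
        labels[d_other <= radius] = BACKGROUND
    # Foreground is assigned last so the target wins where regions overlap.
    d_target = np.linalg.norm(vote_xy - target_xy, axis=1)
    labels[d_target <= radius] = FOREGROUND
    return labels


def update_entry_probabilities(entry_ids, labels, n_entries,
                               prior=0.5, strength=1.0):
    """Smoothed estimate of p(target | codebook entry) from labelled votes.

    Uncertain votes are ignored; entries without evidence keep `prior`.
    """
    entry_ids = np.asarray(entry_ids)
    fg = np.bincount(entry_ids[labels == FOREGROUND], minlength=n_entries)
    bg = np.bincount(entry_ids[labels == BACKGROUND], minlength=n_entries)
    return (fg + prior * strength) / (fg + bg + strength)
```

Entries that mostly vote for the target mode end up with a probability near one, entries that mostly vote for other instances drift toward zero, and entries that only produce uncertain votes stay at the prior.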
From top to bottom: Standard test sequences: i-Lids hard; David Indoor; Girl; Occluded Face; Occluded Face2. YouTube test sequences: Klaus Kinski 1971; David Attenborough Night; Desert; Elephant Seal; Top Gear; Top Gear Ariel Atom; Monty Python Military; Florian Silbereisen. The last five sequences are from public street parades.
Short video: 5 min, ~20 MB (AVI)
Full video: 12 min, ~60 MB (AVI)
Gall J., Yao A., Razavi N., van Gool L., and Lempitsky V., Hough Forests for Object Detection, Tracking, and Action Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 11, pp. 2188-2202, 2011. ©IEEE