Learning Appearance Transfer for Person Re-Identification

Tamar Avraham and Michael Lindenbaum, invited book chapter in S. Gong, M. Cristani, S. Yan, C.C. Loy (Eds.), Person Re-Identification, Springer, January 2014.

We review methods that model the transfer a person's appearance undergoes when passing between two cameras with non-overlapping fields of view. Whereas many recent studies deal with re-identifying a person at any new location and search for universal signatures and metrics, here we focus on solutions for the natural setup of surveillance systems in which the cameras are specific and stationary, solutions which exploit the limited transfer domain associated with a specific camera pair. We compare the performance of explicit transfer modeling, implicit transfer modeling, and camera-invariant methods. Although explicit transfer modeling is advantageous over implicit transfer modeling when the inter-camera training data is poor, implicit camera transfer, which can model multi-valued mappings and better utilizes negative training data, is advantageous when a larger training set is available. While camera-invariant methods have the advantage of not relying on specific inter-camera training data, they are outperformed by both camera-transfer approaches when sufficient training data is available. We therefore conclude that camera-specific information is very informative for improving re-identification in sites with static non-overlapping cameras and that it should still be considered even with the improvement of camera-invariant methods.

Transitive Re-identification

Yulia Brand, Tamar Avraham, and Michael Lindenbaum, BMVC (British Machine Vision Conference) 2013, extended abstract, presentation

Person re-identification accuracy can be significantly improved given a training set that demonstrates the changes in appearance associated with the two non-overlapping cameras involved. Here we test whether this advantage can be maintained when directly annotated training sets are not available for all camera pairs at the site. Given a training set capturing correspondences between cameras A and B and a different training set capturing correspondences between cameras B and C, the Transitive Re-IDentification algorithm (TRID) suggested here provides a classifier for (A,C) appearance pairs. The proposed method is based on statistical modeling and uses a marginalization process for the inference. This approach significantly reduces the annotation effort inherent in a learning system, which goes down from O(N^2) to O(N) for a site containing N cameras. Moreover, when adding camera (N+1), only one inter-camera training set is required for establishing all correspondences. In our experiments we found that the method is effective and more accurate than the competing camera-invariant approach.
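The marginalization idea can be illustrated with a minimal sketch. This is not the TRID algorithm itself, whose statistical model is not given in the abstract. It assumes two already-learned pairwise models, here called p_same_ab and p_same_bc (hypothetical names), and approximates the marginal over the unobserved camera-B appearance by averaging over a set of sample B appearances. The product form and the toy Gaussian-like stand-in model are also assumptions made only for illustration.

```python
import numpy as np

def transitive_score(a, c, b_samples, p_same_ab, p_same_bc):
    """Score an (A, C) appearance pair without any direct (A, C) training data.

    The unobserved camera-B appearance is marginalized out by averaging
    the product of the two pairwise scores over sample B appearances.
    """
    scores = [p_same_ab(a, b) * p_same_bc(b, c) for b in b_samples]
    return float(np.mean(scores))

# Hypothetical stand-in for a learned pairwise model: similarity decays
# with squared distance between the two appearance descriptors.
def toy_p_same(x, y):
    return float(np.exp(-np.sum((x - y) ** 2)))

# One-dimensional toy descriptors; real descriptors would be feature vectors.
b_samples = [np.array([v]) for v in np.linspace(-1.0, 4.0, 11)]
a = np.array([0.0])
match_score = transitive_score(a, np.array([0.2]), b_samples, toy_p_same, toy_p_same)
other_score = transitive_score(a, np.array([3.0]), b_samples, toy_p_same, toy_p_same)
print(match_score > other_score)
```

On the annotation count: a site with N = 10 cameras has 10 * 9 / 2 = 45 camera pairs, whereas a set of training sets that merely connects all cameras (for example, a chain) needs only 9.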




To download the source code of all the experiments described in the paper click here. See the Readme file.



Learning Implicit Transfer for Person Re-identification

Tamar Avraham, Ilya Gurvich, Michael Lindenbaum, and Shaul Markovitch, 1st International Workshop on Re-Identification (Re-Id 2012), in conjunction with ECCV 2012, LNCS 7583, pp. 381-390.


The re-identification problem has received increasing attention in recent years, especially due to its important role in surveillance systems. It is desirable that computer vision systems be able to keep track of people after they have left the field of view of one camera and entered the field of view of the next, even when these fields of view do not overlap. We propose a novel approach for pedestrian re-identification. Previous re-identification methods use one of three approaches: invariant features; designing metrics that aim to bring instances of shared identities close to one another and instances of different identities far from one another; or learning a transformation from the appearance in one domain to the other. Our implicit approach models camera transfer by a binary relation R = {(x; y) | x and y describe the same person seen from cameras A and B respectively}. This solution implies that the camera transfer function is a multi-valued mapping and not a single-valued transformation, and does not assume the existence of a metric with desirable properties. We present an algorithm (ICT) that follows this approach and achieves new state-of-the-art performance.
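One way to picture the relation R is as a labeled set of appearance pairs on which a binary classifier could be trained. The sketch below only builds such a set; it is an illustration under stated assumptions, not the ICT implementation. The function name build_pair_dataset, the use of concatenated descriptors, and the random sampling of negatives are all assumptions, and the choice of classifier is left open.

```python
import numpy as np

def build_pair_dataset(feats_a, feats_b, negatives_per_person=2, seed=0):
    """Build labeled (x, y) pairs approximating membership in the relation R.

    feats_a[i] and feats_b[i] are descriptors of person i as seen from
    cameras A and B. Label 1 means the pair is in R, label 0 means it is not.
    """
    rng = np.random.default_rng(seed)
    n = len(feats_a)
    pairs, labels = [], []
    for i in range(n):
        # Positive example: the same person seen from both cameras.
        pairs.append(np.concatenate([feats_a[i], feats_b[i]]))
        labels.append(1)
        # Negative examples: person i in camera A paired with others in B.
        others = [j for j in range(n) if j != i]
        k = min(negatives_per_person, len(others))
        for j in rng.choice(others, size=k, replace=False):
            pairs.append(np.concatenate([feats_a[i], feats_b[j]]))
            labels.append(0)
    return np.array(pairs), np.array(labels)

# Toy data: 4 people, 3-dimensional descriptors per camera.
feats_a = np.arange(12.0).reshape(4, 3)
feats_b = feats_a + 0.5
X, y = build_pair_dataset(feats_a, feats_b)
print(X.shape, int(y.sum()))
```

Because each example is a pair rather than a mapped point, one camera-A appearance may be paired positively with several camera-B appearances, which is how a multi-valued mapping can be represented. Every mismatched pair also serves as explicit negative training data.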





To download the source code of all the experiments described in the paper click here. See the Readme file.