The module retrieves similar images based on their overall colour distribution, distinguishing colours that form sizeable homogeneous areas from colours that are scattered across the image.
The module is good for retrieval of images based on a known query image where the retrieved images are required to have a similar mix of colours. It is not suitable for retrieval of images based on a query image which is only part of a complete image.
Property | Rating |
---|---|
Module Speed | Fast |
Module Accuracy | Medium |
CCV stands for Colour Coherence Vector. A coherent region is a contiguous region of a single colour whose area is larger than some threshold. This module retrieves images which have similar distributions of coherent colours.
A histogram of 64 bins (4x4x4, that is, 4 levels per colour channel) is generated for the coherent colours and another for the incoherent colours, and the two are matched separately. As for HistogramRGB, 64 bins has been chosen as a trade-off between accuracy and speed. The total size of the feature vector for a CCV is therefore 2 x 64 = 128 integers (512 bytes per feature, at 4 bytes per integer).
Coherence and incoherence are arbitrarily defined as greater than and less than 5% of the total image area, respectively. This means that if a pixel is part of a region which covers less than 5% of the total image area, it is added to the incoherent histogram within the CCV. If the region covers more than 5% of the total image area, the pixel is added to the coherent histogram within the CCV.
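The quantisation and the feature size described above can be sketched as follows. This is only an illustration: the function name `bin_index`, the ordering of the bins and the use of 8-bit RGB input are assumptions, not the module's documented layout.

```python
def bin_index(r, g, b, levels=4):
    """Map an 8-bit RGB triple to one of levels**3 bins (64 when levels=4).

    The red-major ordering used here is an assumption for illustration.
    """
    step = 256 // levels  # 64 intensity values fall into each level
    return (r // step) * levels * levels + (g // step) * levels + (b // step)


NUM_BINS = 4 ** 3                    # 4x4x4 = 64 bins per histogram
FEATURE_LENGTH = 2 * NUM_BINS        # coherent + incoherent = 128 integers
FEATURE_BYTES = FEATURE_LENGTH * 4   # 512 bytes, assuming 4-byte integers
```

Under these assumptions, black maps to bin 0 and white to bin 63.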
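The 5% rule can be sketched as a connected-region search over an image whose pixels have already been quantised to bin indices. This is a minimal sketch rather than the module's implementation. The name `compute_ccv` is hypothetical, 4-connectivity is assumed (pixels that touch only diagonally belong to different regions), and a region covering exactly 5% is counted as incoherent, a case the description above leaves open.

```python
from collections import deque


def compute_ccv(bins, num_bins=64, threshold=0.05):
    """Return (coherent, incoherent) histograms for a 2D grid of bin indices."""
    height, width = len(bins), len(bins[0])
    min_size = threshold * height * width
    coherent = [0] * num_bins
    incoherent = [0] * num_bins
    seen = [[False] * width for _ in range(height)]

    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0]:
                continue
            # Flood-fill the region of identical bins that contains (y0, x0).
            colour = bins[y0][x0]
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and bins[ny][nx] == colour):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Every pixel of the region goes to one histogram or the other.
            target = coherent if size > min_size else incoherent
            target[colour] += size
    return coherent, incoherent


# Usage: a 10x10 image of bin 0 with three isolated pixels of bin 63.
image = [[0] * 10 for _ in range(10)]
for y, x in ((2, 2), (5, 5), (7, 3)):
    image[y][x] = 63
coherent, incoherent = compute_ccv(image)
```

In this usage example the large background region is counted in the coherent histogram, while the three isolated pixels are counted in the incoherent histogram.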
Here is an example:
Image | Incoherent | Coherent |
---|---|---|
The chessboard image is 50% black and 50% white, arranged into 64 squares, of which 32 are white and 32 are black. Each square constitutes a separate region covering 1/64th of the total image area, which is approximately 1.6%. The squares are therefore considered incoherent and are counted in the incoherent histogram, leaving the coherent histogram empty. The second image contains the same amount of black and white (50% of each), but its two regions are coherent and are therefore counted in the coherent histogram, leaving the incoherent histogram empty.
When the coherent histograms are matched against each other, and likewise the incoherent histograms, these two images are a large distance apart, as they should be. A standard histogram function would instead give a distance of zero, implying the images were identical.
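The comparison above can be checked with a small sketch. The distance measure the module uses is not stated here, so a simple L1 (sum of absolute differences) distance is assumed. The helper names, the 64-pixel image size and the choice of bin 0 for black and bin 63 for white are likewise illustrative.

```python
def l1(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ccv_distance(ccv_a, ccv_b):
    """Match coherent against coherent and incoherent against incoherent."""
    (coh_a, inc_a), (coh_b, inc_b) = ccv_a, ccv_b
    return l1(coh_a, coh_b) + l1(inc_a, inc_b)


def plain_histogram(ccv):
    """Collapse a CCV into an ordinary colour histogram."""
    coh, inc = ccv
    return [c + i for c, i in zip(coh, inc)]


def black_and_white(black, white, num_bins=64):
    """Histogram with pixel counts in bin 0 (black) and bin 63 (white)."""
    hist = [0] * num_bins
    hist[0], hist[63] = black, white
    return hist


empty = [0] * 64
# Each CCV is a (coherent, incoherent) pair for a 64-pixel image.
chessboard = (empty, black_and_white(32, 32))   # all pixels incoherent
two_halves = (black_and_white(32, 32), empty)   # all pixels coherent

ccv_dist = ccv_distance(chessboard, two_halves)
hist_dist = l1(plain_histogram(chessboard), plain_histogram(two_halves))
```

With these inputs `ccv_dist` is 128, the largest L1 distance possible for two 64-pixel images, while `hist_dist` is 0.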
The following is an example query giving what would be considered good results.
Note: The matching is based only upon the overall colour distribution of the image, distinguishing colours that form sizeable homogeneous areas from colours that are scattered across the image.
Query | 1 | 2 | 3 |
4 | 5 | 6 |
In the above example, the query image is contained within the database. It is therefore expected to be found in first place, as it is. This is a good test of the integrity of the algorithm. Also, all the retrieved results contain areas of contiguous colour similar to those of the query image. Note that the contiguous white background affects the match, because it forms a coherent region within the vector.
The CCV is a whole-image query algorithm and cannot be used to find sub-images, because a sub-image is likely to have a different colour distribution from its parent. A sub-image of a database image may be used as the query, but the parent from which it was derived is unlikely to be found. The query below uses a sub-image of an image in the database as the query. Notice that the results all have a similar colour distribution (white/yellow), but the image from which the query was taken is not among them.
Query | 1 | 2 | 3 |
4 | 5 | 6 |
Used on its own, the CCV algorithm should not be expected to find specific instances of objects (e.g. chairs or pots). However, used together with a metadata search that locates similar types of object, this algorithm could locate those of a similar colour.