Contest on Graph Embedding for Pattern RecognitionHosted by the 20th International Conference on Pattern Recognition |
|
home
description organizers datasets important dates entry submission |
Contest descriptionIn Pattern Recognition, statistical and structural methods have been traditionally considered as two rather separate worlds. However, many attempts at reducing the gap between these two approaches have been done. The idea inspiring these attempts is that of preserving the advantages of an expressive structural representation (such as a graph), while using most of the powerful, vector-based algorithms from Statistical Pattern Recognition.A possible approach to this issue has been recently suggested by Graph Embedding. The latter is a methodology aimed at representing a whole graph (possibly with attributes attached to its nodes and edges) as a point in a suitable vectorial space. Of course the relevant property is that of preserving the similarity of the graphs: the more two graphs are considered similar, the closer the corresponding points in the space should be. Graph embedding, in this sense, is a real bridge joining the two worlds: once the object at hand has been described in terms of graphs, and the latter represented in the vectorial space, all the problems of matching, learning and clustering can be performed using classical Statistical Pattern Recognition algorithms. The proposed contest aims at evaluating the effectiveness of the embedding process in the contest of a classical Pattern Recognition application.
Contest data and performance evaluationThe contest will be performed on three databases of graphs obtained from different sets of 2D images. Each database will be partitioned into a training set, available to the participants, and a test set, that will be disclosed only at the moment of the performance evaluation.Participants will have to submit a command line application (all the most popular operating systems will be supported) that takes an input file containing a set of graphs and produces an output file. Two kinds of output formats will be supported:
Each database will be divided into classes on the basis of the contents of the images used to generate the graphs. For performance evaluation, the vectors produced by applying each algorithm to the test sets will be grouped into clusters according to the corresponding class, and then a cluster validity index will be computed. To this aim the euclidean distance between the vectors will be used (for the implicit graph embedding, the euclidean distance can be easily computed from the scalar products). The cluster validity index will be chosen so as to be independent of the absolute scale of the vectors. The cluster validity index will be averaged over the three databases; then the algorithms will be ranked according to the values assumed by a performance function (known at the contest time) which depends on the cluster validity index and the execution time (so as to take into account both the classification results and the computational cost, although this latter will have a lower weight than the former). Participants will receive, together with the graphs composing the training sets and a detailed description of the input and output file formats, a program that computes the actual performance index on the training set, in order to check and tune their algorithm.
|