Learned features versus engineered features for multimedia indexing

Mateusz Budnik; Efrain-Leonardo Gutierrez-Gomez; Bahjat Safadi; Denis Pellerin; Georges Quénot

doi:10.1007/s11042-016-4240-2

Journal article. Multimedia Tools and Applications. Year: 2016

Mateusz Budnik

Role: Author

Laboratoire d'Informatique de Grenoble

Efrain-Leonardo Gutierrez-Gomez

Role: Author

Laboratoire d'Informatique de Grenoble

Bahjat Safadi

Role: Author

Laboratoire d'Informatique de Grenoble

Denis Pellerin

Role: Author
PersonId : 20254
IdHAL : denis-pellerin
ORCID : 0000-0002-3792-1706
IdRef : 060908998

GIPSA - Architecture, Géométrie, Perception, Images, Gestes

Georges Quénot

Role: Author
PersonId : 3114
IdHAL : georges-quenot
ORCID : 0000-0003-2117-247X
IdRef : 034104518

Laboratoire d'Informatique de Grenoble

Abstract

In this paper, we compare "traditional" engineered (hand-crafted) features (or descriptors) and learned features for content-based indexing of image or video documents. Learned (or semantic) features are obtained by training classifiers on a source collection containing samples annotated with concepts. These classifiers are applied to the samples of a destination collection, and the classification scores for each sample are gathered into a vector that becomes a feature for it. These feature vectors are then used to train another classifier for the destination concepts on the destination collection. If the classifiers used on the source collection are Deep Convolutional Neural Networks (DCNNs), the intermediate values corresponding to the outputs of the hidden layers can also be used as new feature vectors. We made an extensive comparison of the performance of such features with traditional engineered ones, as well as with combinations of them. The comparison was made in the context of the TRECVid semantic indexing task. Our results confirm those obtained for still images: features learned from other training data generally outperform engineered features for concept recognition. Additionally, we found that directly training KNN and SVM classifiers on these features performs significantly better than partially retraining the DCNN to adapt it to the new data. We also found that, even though the learned features performed better than the engineered ones, fusing both performs even better, indicating that engineered features are still useful, at least in the considered case. Finally, the combination of DCNN features with KNN and SVM classifiers was applied to the VOC 2012 object classification task, where it currently obtains the best performance, with a MAP of 85.4%.
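The semantic-feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the source classifiers here are hypothetical logistic scorers with made-up weights (the paper uses trained classifiers such as DCNNs), the concept names and the toy destination data are invented, and only a KNN classifier is shown rather than the KNN and SVM combination.

```python
import math

# Hypothetical source-concept classifiers, each a (weights, bias) pair for a
# logistic scorer over a raw descriptor. They stand in for classifiers trained
# on a source collection; all values are made up for illustration.
SOURCE_CLASSIFIERS = {
    "outdoor": ([1.0, -1.0], 0.0),
    "vehicle": ([-1.0, 1.0], 0.0),
    "person": ([0.5, 0.5], -1.0),
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def semantic_feature(raw):
    """Gather the source classifiers' scores for one sample into a vector."""
    return [
        sigmoid(sum(w * x for w, x in zip(weights, raw)) + bias)
        for weights, bias in SOURCE_CLASSIFIERS.values()
    ]


def knn_score(query, train_features, train_labels, k=3):
    """Score a sample as the fraction of its k nearest neighbours labeled positive."""
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(query, train_features[i]),
    )
    nearest = order[:k]
    return sum(train_labels[i] for i in nearest) / len(nearest)


# Toy destination collection: raw descriptors and binary labels for a single
# destination concept (assumed data, not taken from TRECVid or VOC).
dest_raw = [[2.0, 0.0], [1.5, 0.2], [1.8, 0.1], [0.0, 2.0], [0.2, 1.5], [0.1, 1.8]]
dest_labels = [1, 1, 1, 0, 0, 0]

# Step 1: apply the source classifiers to every destination sample.
dest_features = [semantic_feature(raw) for raw in dest_raw]

# Step 2: use the resulting vectors to classify new destination samples.
score_pos = knn_score(semantic_feature([1.7, 0.0]), dest_features, dest_labels)
score_neg = knn_score(semantic_feature([0.0, 1.7]), dest_features, dest_labels)
print(score_pos, score_neg)
```

The destination classifier never sees the raw descriptors; it works only on the vector of source-concept scores, which is what makes the feature "learned" from the source collection.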

Keywords

Semantic indexing; Engineered features; Learned features

Domains

Computer Science [cs]; Information Retrieval [cs.IR]; Multimedia [cs.MM]; Neural Networks [cs.NE]

Main file

mtap16a-c.pdf (185.73 KB)

Origin: Files produced by the author(s)

Contributor: Georges Quénot

https://inria.hal.science/hal-01479240

Submitted on: Tuesday, February 28, 2017, 17:01:37

Last modified on: Thursday, April 4, 2024, 21:32:13

Long-term archiving on: Monday, May 29, 2017, 15:57:45

Dates and versions

hal-01479240 , version 1 (28-02-2017)

Identifiers

HAL Id : hal-01479240 , version 1
DOI : 10.1007/s11042-016-4240-2

Cite

Mateusz Budnik, Efrain-Leonardo Gutierrez-Gomez, Bahjat Safadi, Denis Pellerin, Georges Quénot. Learned features versus engineered features for multimedia indexing. Multimedia Tools and Applications, 2016, 76 (9), pp.11941-11958. ⟨10.1007/s11042-016-4240-2⟩. ⟨hal-01479240⟩

Collections

UGA CNRS GIPSA GIPSA-DIS LIG GIPSA-AGPIG LIG_TDCGE_MRIM POLYTECH-GRENOBLE ANR LIG_SIDCH

318 Views

304 Downloads
