Combining image descriptors to effectively retrieve events from visual lifelogs
Doherty AR., Conaire CO., Blighe M., Smeaton AF., O'Connor NE.
The SenseCam is a wearable camera that passively captures approximately 3,000 images per day, which equates to almost one million images per year. It is used to create a personal visual record of the wearer's life and generates information which can be helpful as a human memory aid. For such a large amount of visual information to be of any use, it is accepted that it should be structured into "events", of which there are about 8,000 in a wearer's average year. Having automatically segmented SenseCam images into events, it is then useful for users to locate other events similar to a given event, e.g. "What other times was I walking in the park?", "Show me other events when I was in a restaurant". On two datasets of 240k and 1.8M images, containing topics with a variety of information needs, we evaluate the fusion of MPEG-7, SIFT, and SURF content-based retrieval techniques to address this event search issue. We have found that our proposed fusion approach of MPEG-7 and SURF offers an improvement over using MPEG-7, SURF, or SIFT individually, and we have also shown that how a lifelog event is modeled has a large effect on retrieval performance. Copyright 2008 ACM.
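The abstract describes combining retrieval results from several image descriptors without detailing the fusion step. The sketch below illustrates one common way such descriptor fusion can be done: each descriptor (e.g. MPEG-7, SURF) produces a per-event similarity score for a query event, the scores are normalised, and a weighted sum ranks the candidate events. The min-max normalisation, equal weights, and function names here are illustrative assumptions, not the paper's exact configuration.

```python
def minmax_normalise(scores):
    """Rescale an {event_id: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {e: 0.0 for e in scores}
    return {e: (s - lo) / (hi - lo) for e, s in scores.items()}


def fuse_scores(mpeg7_scores, surf_scores, w_mpeg7=0.5, w_surf=0.5):
    """Combine two descriptors' similarity scores into one ranked event list.

    Normalises each score list independently, then applies a weighted
    sum (CombSUM-style) and sorts events by the fused score.
    """
    a = minmax_normalise(mpeg7_scores)
    b = minmax_normalise(surf_scores)
    events = set(a) | set(b)
    fused = {e: w_mpeg7 * a.get(e, 0.0) + w_surf * b.get(e, 0.0) for e in events}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical similarity scores of three lifelog events against a query event.
    mpeg7 = {"event_12": 0.82, "event_47": 0.35, "event_90": 0.10}
    surf = {"event_12": 0.60, "event_47": 0.75, "event_90": 0.05}
    print(fuse_scores(mpeg7, surf))
```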