Video Object Instance Dataset

The Video-Object-Instance (NTU-VOI) dataset from NTU’s ROSE Lab is provided for evaluating object instance search and localization in large-scale videos. It consists of 146 ground truth video clips with bounding box annotations of the object instances in each frame. The total download size of the videos is ~222MB.

Distractor videos used in the experiments from Stanford I2V are not included, but can be accessed here: http://blackhole1.stanford.edu/vidsearch/dataset/stanfordi2v.html.

Content and Format

Besides a readme file, the dataset includes three folders:

 

./queries/: 10 query object instances, each in .jpg format
 
./gt_videos/: 146 ground truth videos in .mp4 format, resolution 800x400. Each video is named after the ground truth object it contains, followed by a number, e.g. ferrari1.mp4. If a video contains multiple objects, its name concatenates the individual object names with underscores, e.g. kittyb3_kittyg3.mp4
 
./gt/: 151 ground truth annotation files. Each .gt file is named after its corresponding video file. If a video contains more than one object instance, it has multiple .gt files, one per object, e.g. kittyb3_kittyg3.mp4 is associated with both kittyb3.gt and kittyg3.gt
 
Format of a .gt file:
1st row: duration of the object instance (in number of frames)
2nd row onwards: frm_idx top left bottom right (frm_idx starts from 0)
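
The naming rule and the .gt layout described above can be read with a few lines of Python. This is only a minimal sketch and not part of the dataset: the function names gt_names_for_video and parse_gt are hypothetical, the fields on each row are assumed to be separated by whitespace, the coordinates are assumed to be numeric pixel values, and object names are assumed to contain no underscores themselves.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Box fields are kept in the order the .gt file lists them.
Box = Tuple[int, int, int, int]  # (top, left, bottom, right)


def gt_names_for_video(video_name: str) -> List[str]:
    """Return the .gt file names expected for a video file name.

    Assumption: object names never contain an underscore, so splitting
    the video's base name on "_" yields one part per annotated object.
    """
    stem = Path(video_name).stem
    return [f"{part}.gt" for part in stem.split("_")]


def parse_gt(text: str) -> Tuple[int, Dict[int, Box]]:
    """Parse the contents of one .gt file.

    Row 1 holds the duration in frames; every later row holds
    "frm_idx top left bottom right" with frm_idx starting from 0.
    Assumption: fields are whitespace-separated numbers.
    """
    rows = [ln.strip() for ln in text.splitlines() if ln.strip()]
    duration = int(rows[0])
    boxes: Dict[int, Box] = {}
    for row in rows[1:]:
        frm_idx, top, left, bottom, right = (int(float(t)) for t in row.split())
        boxes[frm_idx] = (top, left, bottom, right)
    return duration, boxes
```

For a real file, the text can be loaded with Path("gt/kittyb3.gt").read_text() and passed to parse_gt. Keeping the boxes in a dictionary keyed by frame index makes it easy to look up the annotation for a particular frame without assuming that every frame of the video has a row.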
   

 

Related Publications

Please cite the following papers should you make use of the dataset:
@ARTICLE{Meng:TMM16,
author={Jingjing Meng and Junsong Yuan and Jiong Yang and Gang Wang and Yap-Peng Tan},
journal={Multimedia, IEEE Transactions on},
title={Object Instance Search in Videos via Spatio-Temporal Trajectory Discovery},
year={2016},
note={(to appear)},
}

@INPROCEEDINGS{Meng:ICIP15,
author={Jingjing Meng and Junsong Yuan and Yap-Peng Tan and Gang Wang},
booktitle={Image Processing (ICIP), 2015 22nd IEEE International Conference on},
title={Fast object instance search in videos from one example},
year={2015}
}
 

Important Notice

The following 5 clips were downloaded from the internet and may be subject to copyright. We do not own the copyright of these videos and provide them only for non-commercial research purposes:
           maggi8.mp4, maggi14.mp4, maggi15.mp4, starbucks12.mp4, starbucks14.mp4

 

Acknowledgements

This research was carried out at the Rapid-Rich Object Search (ROSE) Lab at the Nanyang Technological University, Singapore. The ROSE Lab is supported by the National Research Foundation, Singapore, under its Interactive Digital Media (IDM) Strategic Research Programme. This work is supported in part by Singapore Ministry of Education Academic Research Fund (AcRF) Tier 1 grant M4011272.040.

 

Usage for Academic Research

Terms & Conditions of Use
The datasets are released for academic research only, and are free to researchers from educational or research institutes for non-commercial purposes.

The use of this dataset is governed by the following terms and conditions:
• Without the express permission of the ROSE Lab, any of the following will be considered illegal: redistribution, derivation or generation of a new dataset from this dataset, and commercial usage of any of these datasets in any way or form, either partially or in their entirety.
• For the sake of privacy, images of all subjects in any of these datasets may be used only for demonstration in academic publications and presentations.
• All users of these datasets agree to indemnify, defend and hold harmless, the ROSE Lab and its officers, employees, and agents, individually and collectively, from any and all losses, expenses, and damages.

If interested, researchers can register for an account, submit the request form and accept the Release Agreement. We will validate your request and grant approval for downloading the datasets.