NTU CCTV-Fights Dataset

The CCTV-Fights dataset contains 1,000 videos of real-world fights, recorded by CCTV or mobile cameras. We also provide frame-level annotations for each fight segment in the videos, marking its exact start and end points.
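
To illustrate what a frame-level annotation amounts to, the sketch below turns a list of fight segments into one binary label per frame. It is only a sketch under stated assumptions: the annotation file format is not described on this page, so the (start, end) tuples in seconds, the frame rate, and the function name segments_to_frame_labels are all hypothetical.

```python
def segments_to_frame_labels(segments, num_frames, fps):
    """Return one 0/1 label per frame; 1 marks frames inside a fight segment.

    segments: list of (start_sec, end_sec) tuples (assumed format, not the
    dataset's actual annotation schema).
    """
    labels = [0] * num_frames
    for start_sec, end_sec in segments:
        # Convert times to frame indices and clamp them to the video length.
        first = max(0, int(round(start_sec * fps)))
        last = min(num_frames, int(round(end_sec * fps)))
        for i in range(first, last):
            labels[i] = 1
    return labels


# Hypothetical example: a 10-second clip at 5 fps with two fight segments.
labels = segments_to_frame_labels([(1.0, 3.0), (6.0, 8.0)], num_frames=50, fps=5)
print(sum(labels), "of", len(labels), "frames are labelled as fight")
```

Here the end point is treated as exclusive, which is a design choice of this sketch rather than something stated by the dataset.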

The videos were collected from YouTube by searching with keywords such as: CCTV Fight, Mugging, Violence, Surveillance, Physical violence, etc. The fights cover a diverse range of actions and attributes, for example: punching, kicking, pushing, wrestling, with two persons or more, etc. We discarded videos that did not come directly from a CCTV recording (e.g., footage made with a mobile camera recording a screen), as well as videos with heavy special effects (e.g., shaded borders, slow motion).

The dataset includes 280 CCTV videos containing different types of fights, ranging from 5 seconds to 12 minutes, with an average length of 2 minutes. It also contains 720 videos of real fights from other sources (hereinafter referred to as Non-CCTV), mainly from mobile cameras, with a few from car cameras (dash-cams) and drones or helicopters. These videos are shorter, ranging from 3 seconds to 7 minutes with an average length of 45 seconds, but some still contain multiple fight instances and can help a model generalize better.

The table below presents a summary of the dataset statistics.

            Videos   Duration (hours)   Fight instances   Average instances per video
All         1,000    17.68              2,414             2.41
CCTV        280      8.54               747               2.67
Non-CCTV    720      9.13               1,667             2.32
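
As a quick consistency check on the table, the sketch below recomputes the average number of fight instances per video and the mean video duration from the reported totals. The figures are copied from the table; the dictionary layout and function names are only illustrative.

```python
# Figures copied from the summary table.
stats = {
    "All": {"videos": 1000, "hours": 17.68, "instances": 2414},
    "CCTV": {"videos": 280, "hours": 8.54, "instances": 747},
    "Non-CCTV": {"videos": 720, "hours": 9.13, "instances": 1667},
}


def instances_per_video(row):
    """Average number of fight instances per video, rounded to 2 decimals."""
    return round(row["instances"] / row["videos"], 2)


def mean_duration_seconds(row):
    """Mean video duration in seconds, derived from the total hours."""
    return row["hours"] * 3600 / row["videos"]


averages = {name: instances_per_video(row) for name, row in stats.items()}

for name, row in stats.items():
    print(f"{name}: {averages[name]} instances/video, "
          f"{mean_duration_seconds(row):.0f} s mean duration")
```

The two subsets add up to the totals for videos and fight instances. The summed durations differ from the reported total by 0.01 hours, which is consistent with rounding in the table.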

The overall size of the dataset is 7.2 GB.

How to obtain the dataset:
Interested researchers can register for an account, submit the request form, and accept the Release Agreement. We will then validate the request and grant approval to download the dataset.

Usage for Academic Research

Terms & Conditions of Use
The datasets are released for academic research only, and are free to researchers from educational or research institutes for non-commercial purposes.

The use of this dataset is governed by the following terms and conditions:
• Without the express permission of the ROSE Lab, any of the following will be considered illegal: redistribution, derivation or generation of a new dataset from this dataset, and commercial usage of any of these datasets in any way or form, whether in part or in their entirety.
• For the sake of privacy, images of all subjects in any of these datasets are only allowed for demonstration in academic publications and presentations.
• All users of these datasets agree to indemnify, defend and hold harmless, the ROSE Lab and its officers, employees, and agents, individually and collectively, from any and all losses, expenses, and damages.

Related Publications
All publications using the NTU CCTV-Fights dataset should include the following acknowledgment: “(Portions of) the research in this paper used the NTU CCTV-Fights Dataset made available by the ROSE Lab at the Nanyang Technological University, Singapore.”

Furthermore, these publications should cite the following reference:
Mauricio Perez, Alex C. Kot, Anderson Rocha, “Detection of Real-world Fights in Surveillance Videos”, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019