Wide Surveillance Images from Different Heights: Dataset Description

Vision and Robotics Lab., Wakayama University, Japan. April 1, 2023

Dataset Overview

This dataset consists of annotated images of people captured by a wide-angle ceiling camera. Each scene shows multiple people among furniture such as desks and chairs, as well as luggage. The crane truck used to change the height of the wide-angle camera is also visible in the images. The annotations give, for each person, a tilted bounding box and a posture label (sit/stand).

A unique feature of this dataset is that it was captured from three different heights (3 m, 4 m, and 5 m). To the best of our knowledge, no similar dataset currently exists.

Varying the camera height matters for person detection from a wide-angle ceiling camera because detection accuracy can change significantly when the images used for inference differ from the images used to train the detector.

Conditions of use:

Non-profit organizations may use this dataset unconditionally for research or implementation that receives no financial support from commercial firms, after submitting a written application. For all other uses, please contact twada@ieee.org for permission.

Application for use:

At this time, we do not have an automated application system, so please send us an email with the following information.

----------------------------------------------------

To: twada@ieee.org

Subject: wsi-dh dataset download request

Purpose of Use:

Applicant Name:

Applicant Organization:

Applicant Address:

Applicant E-mail address:

----------------------------------------------------

Acknowledgements:

We thank Giken Truststem Co., Ltd. for providing this data, and Ms. Sugisaki and Mr. Miura, second-year students at Wakayama University, for their help in annotating the data.

Toshikazu Wada

Detailed information follows.

There are three types of stored data: original video data, image data, and annotation data.
The original video data is stored under the directory "Video".

There are 6 videos in total (2 scenes x 3 heights).
Scene names: "store in shopping center" (sc) and "food court" (fc).
Installation heights: 3 m, 4 m, and 5 m (for example, the height name is "3000" for 3 m).
Each video is named <scene name>_<height name> and the file format is .mp4.
The videos are recorded at 1920x1080 and 30 fps.
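
The naming rule above can be sketched in a few lines of Python. This is only an illustration: the constant and function names are hypothetical, and the height names "4000" and "5000" are taken from the directory listing in this document.

```python
from itertools import product

# Scene abbreviations and height names as described in this document.
SCENES = ("fc", "sc")                      # food court, store in shopping center
HEIGHT_NAMES = ("3000", "4000", "5000")    # 3 m, 4 m, 5 m


def video_names():
    """Return every <scene name>_<height name> combination."""
    return [f"{scene}_{height}" for scene, height in product(SCENES, HEIGHT_NAMES)]


def video_file_names():
    """Return the .mp4 file names expected under the "Video" directory."""
    return [name + ".mp4" for name in video_names()]
```

The same names are reused for the per-video directories, so `video_names()` can also drive a loop over the image and annotation directories.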

The data divided into images and annotations are stored separately in directories with video names, such as "fc_3000".
The image data is stored under a directory named "Images" directly under each directory.

The image data are JPEG files obtained by sampling each video at a fixed frame interval (10, 15, or 20 frames depending on the video; see the table below) and resizing each sampled frame to 640x640. The frame number is included in the file name.

The annotation data is stored under the directory "Annotation" directly under each video-name directory, in two subdirectories: "Json" and "Text". "Json" contains a JSON file, and "Text" contains text files that correspond one-to-one with the image files. If the image is named "video_frame_number.jpg", the text file is named "video_frame_number.txt". The information under "Json" and the information under "Text" are essentially the same.
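
As a rough sketch of the one-to-one correspondence, the following Python function derives the text annotation path from an image path. It assumes only the layout described above (Images and Annotation/Text as siblings under the video-name directory); the function name and the example file name are hypothetical, and the exact form of the frame number in real file names is not specified here.

```python
from pathlib import PurePosixPath


def annotation_path_for(image_path):
    """Map <video dir>/Images/<stem>.jpg to <video dir>/Annotation/Text/<stem>.txt."""
    image = PurePosixPath(image_path)
    video_dir = image.parent.parent  # e.g. dataset/fc_3000
    # Only the file stem is reused, so the frame-number format does not matter.
    return video_dir / "Annotation" / "Text" / (image.stem + ".txt")


# Hypothetical file name, used for illustration only.
example = annotation_path_for("dataset/fc_3000/Images/fc_3000_150.jpg")
```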

Directory Structure

dataset
├── Video
│   ├── fc_3000.mp4
│   ├── fc_4000.mp4
│   ├── fc_5000.mp4
│   ├── sc_3000.mp4
│   ├── sc_4000.mp4
│   └── sc_5000.mp4
├── fc_3000
│   ├── Annotation
│   └── Images
├── fc_4000
│   ├── Annotation
│   └── Images
├── fc_5000
│   ├── Annotation
│   └── Images
├── sc_3000
│   ├── Annotation
│   └── Images
├── sc_4000
│   ├── Annotation
│   └── Images
└── sc_5000
    ├── Annotation
    └── Images

*See the table below for details on each video and image data.

Annotation file description

Each bounding box is represented by 14 values [cx, cy, w, h, angle, lux, luy, rux, ruy, rbx, rby, lbx, lby, class]

cx, cy: Center coordinates of the bounding box, with the upper left corner of the image as (0,0)
w, h: Width and height of the bounding box
angle: Clockwise rotation angle (in degrees) from the upward vertical axis, range -180 to 180
lux, luy: Upper left corner coordinates of the bounding box
rux, ruy: Upper right corner coordinates of the bounding box
rbx, rby: Lower right corner coordinates of the bounding box
lbx, lby: Lower left corner coordinates of the bounding box
class: "stand person" or "sit person"
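
Because the four corners are stored alongside cx, cy, w, h, and angle, they can be recomputed as a consistency check. The sketch below shows one plausible reading of the angle convention: image coordinates with y pointing down, and the box rotated clockwise (as seen on screen) about its center. This reading is an assumption, not something the dataset description confirms, so it should be verified against the stored corner values before being relied on. The function name is hypothetical.

```python
import math


def rotated_corners(cx, cy, w, h, angle_deg):
    """Return [(lux, luy), (rux, ruy), (rbx, rby), (lbx, lby)] for a tilted box.

    Assumes y grows downward and angle_deg is a clockwise rotation on screen.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    # Corner offsets of the unrotated box: upper left, upper right,
    # lower right, lower left.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # With y pointing down, this matrix turns points clockwise on screen.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]
```

With angle 0 the result is the ordinary axis-aligned box. With angle 90, the corner labeled "upper left" moves to the upper right of the rotated box, which is what a clockwise turn should produce under this assumption.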

The order of attributes in the text format: cx cy w h angle lux luy rux ruy rbx rby lbx lby class
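
A minimal Python sketch for reading one bounding box in this order is shown below. It assumes that each box occupies one line, that values are separated by whitespace, and that the class is written as text that may itself contain a space (such as "stand person"). None of these details are confirmed by this description, and the names used are hypothetical.

```python
FIELDS = ("cx", "cy", "w", "h", "angle",
          "lux", "luy", "rux", "ruy", "rbx", "rby", "lbx", "lby")


def parse_annotation_line(line):
    """Split one annotation line into 13 numeric fields and a class label."""
    tokens = line.split()
    if len(tokens) < len(FIELDS) + 1:
        raise ValueError(f"expected at least 14 values, got {len(tokens)}")
    numbers = [float(token) for token in tokens[:len(FIELDS)]]
    # Everything after the 13 numbers is treated as the class label,
    # so a label containing a space is kept in one piece.
    label = " ".join(tokens[len(FIELDS):])
    return dict(zip(FIELDS, numbers)), label


def parse_annotation_text(text):
    """Parse every non-empty line of a text annotation file."""
    return [parse_annotation_line(line) for line in text.splitlines() if line.strip()]
```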
JSON format: The annotations for each video are stored in a single JSON file in a format that roughly conforms to the MS COCO format.
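
Since the JSON files only roughly follow MS COCO, the sketch below should be read as a starting point rather than a description of the actual files. It assumes the standard COCO top-level keys "images" and "annotations" and the standard fields "id", "file_name", and "image_id". Whether this dataset uses exactly these keys is an assumption to check against a real file, and the function names are hypothetical.

```python
import json
from collections import defaultdict


def group_annotations_by_image(coco):
    """Map each image file name to the list of its annotation records."""
    # Standard MS COCO keys; their presence in this dataset is an assumption.
    file_names = {image["id"]: image["file_name"] for image in coco["images"]}
    grouped = defaultdict(list)
    for annotation in coco["annotations"]:
        grouped[file_names[annotation["image_id"]]].append(annotation)
    return dict(grouped)


def load_grouped(json_path):
    """Load one per-video JSON file and group its annotations by image."""
    with open(json_path, encoding="utf-8") as handle:
        return group_annotations_by_image(json.load(handle))
```

Images without any annotation do not appear in the returned dictionary, so a lookup with `.get(file_name, [])` is the safer way to query it.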

Video name	Total frames	Annotated frames (frame interval)	Video resolution (FPS)	Image resolution
fc_3000	18097	1200 (15)	1920×1080 (30)	640×640
fc_4000	15262	1500 (10)	1920×1080 (30)	640×640
fc_5000	18420	1700 (10)	1920×1080 (30)	640×640
sc_3000	15101	1000 (20 for frames 1-10000; 10 from frame 10001)	1920×1080 (30)	640×640
sc_4000	13770	1000 (10)	1920×1080 (30)	640×640
sc_5000	14369	1000 (10)	1920×1080 (30)	640×640
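
The table values allow a rough estimate of how much video time the annotated frames span: annotated frames x frame interval / FPS. The sketch below applies this to the videos with a single frame interval. It assumes the sampled frames are evenly spaced and contiguous, which this description does not state. sc_3000 is left out because the table does not say how its 1000 frames are split between the two intervals. The names are hypothetical.

```python
FPS = 30

# (annotated frames, frame interval) copied from the table above.
ANNOTATED = {
    "fc_3000": (1200, 15),
    "fc_4000": (1500, 10),
    "fc_5000": (1700, 10),
    "sc_4000": (1000, 10),
    "sc_5000": (1000, 10),
}


def covered_seconds(annotated_frames, interval, fps=FPS):
    """Approximate duration spanned by evenly sampled frames, in seconds."""
    return annotated_frames * interval / fps


spans = {name: covered_seconds(*entry) for name, entry in ANNOTATED.items()}
```

Under these assumptions fc_3000 spans about 600 seconds and fc_4000 about 500 seconds of video.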
