mmdetection: Data

傅志诚

2023-12-01

Dataset formats

mmdetection provides the following dataset classes:

CustomDataset
XMLDataset
CocoDataset
VOCDataset
CityscapesDataset
WIDERFaceDataset

Data preparation (place the datasets under data/ in the repository root):

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012
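
Before training, it can help to confirm that the layout above is actually in place. The following is a minimal sketch using only the standard library; the helper name missing_dirs and the list EXPECTED_DIRS are illustrative and not part of mmdetection.

```python
from pathlib import Path

# Sub-directories taken from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "data/coco/annotations",
    "data/coco/train2017",
    "data/coco/val2017",
    "data/coco/test2017",
    "data/cityscapes/annotations",
    "data/cityscapes/leftImg8bit/train",
    "data/cityscapes/leftImg8bit/val",
    "data/cityscapes/gtFine/train",
    "data/cityscapes/gtFine/val",
    "data/VOCdevkit/VOC2007",
    "data/VOCdevkit/VOC2012",
]


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from the mmdetection repository root.
    for d in missing_dirs("."):
        print("missing:", d)
```

Only the datasets you actually train on need to be present, so trim the list as needed.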

1. VOCDataset

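In the VOC configuration, ann_file and img_prefix are given as lists so that the VOC2007 and VOC2012 directories are combined into one training set, and the result is wrapped in RepeatDataset; times is the number of times the dataset is repeated. See the mmdetection GitHub repository for details.
Configuration in the config file: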

dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/'  # dataset root, matching the directory tree above
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=3, 
        dataset=dict(
            type=dataset_type,
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt',
                data_root + 'VOC2012/ImageSets/Main/trainval.txt'
            ],
            img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
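
To make the effect of times concrete, here is a minimal sketch of the idea behind a repeating wrapper. RepeatedView is a hypothetical stand-in written for illustration, not mmdetection's own class: it reports a length that is times larger and wraps indices back onto the original samples.

```python
class RepeatedView:
    """Hypothetical stand-in that mimics the idea behind RepeatDataset."""

    def __init__(self, dataset, times):
        self.dataset = dataset
        self.times = times
        self._ori_len = len(dataset)

    def __getitem__(self, idx):
        # Wrap the index so every pass maps back onto the original samples.
        return self.dataset[idx % self._ori_len]

    def __len__(self):
        # One "epoch" over the wrapper visits each sample `times` times.
        return self.times * self._ori_len


samples = ["img_a", "img_b"]
repeated = RepeatedView(samples, times=3)
print(len(repeated))  # 6
print(repeated[5])    # img_b
```

With times=3, one epoch over the wrapped dataset covers the underlying data three times, which reduces per-epoch overhead on small datasets.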

VOC dataset format
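
As a rough illustration of the VOC annotation format, the sketch below parses a small hand-written annotation in the PASCAL VOC XML style using the standard library. The values are made up, and parse_voc is a hypothetical helper, not the parser used by VOCDataset.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal annotation in the PASCAL VOC XML style (made-up values).
VOC_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Parse one VOC-style annotation into a plain dict."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.find("name").text,
            "difficult": int(obj.find("difficult").text),
            # Boxes are stored as corner coordinates: xmin, ymin, xmax, ymax.
            "bbox": [int(box.find(k).text)
                     for k in ("xmin", "ymin", "xmax", "ymax")],
        })
    return {
        "filename": root.find("filename").text,
        "width": int(size.find("width").text),
        "height": int(size.find("height").text),
        "objects": objects,
    }


info = parse_voc(VOC_XML)
print(info["filename"], len(info["objects"]))
```

Each image has one such XML file under Annotations/, and the ImageSets/Main/*.txt files referenced in the config list the image ids that belong to each split.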

2. CocoDataset

Configuration in the config file:

dataset_type = 'CocoDataset'
data_root = 'data/coco/'  # dataset root, matching the directory tree above
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline))

COCO dataset format

{
    "info"              : info, 
    "images"            : [image], 
    "annotations"       : [annotation], 
    "licenses"          : [license],
}

info{
    "year"              : int, 
    "version"           : str, 
    "description"       : str, 
    "contributor"       : str, 
    "url"               : str, 
    "date_created"      : datetime,
}

image{
    "id"                : int,
    "width"             : int, 
    "height"            : int, 
    "file_name"         : str, 
    "license"           : int, 
    "flickr_url"        : str, 
    "coco_url"          : str, 
    "date_captured"     : datetime,
}

license{
    "id"                : int, 
    "name"              : str, 
    "url"               : str,
}

Object Detection

annotation{
    "id": int, 
    "image_id": int, 
    "category_id": int, 
    "segmentation": RLE or [polygon], 
    "area": float, 
    "bbox": [x,y,width,height], 
    "iscrowd": 0 or 1,
}

categories[{
    "id": int, 
    "name": str, 
    "supercategory": str,
}]
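
To see how these fields fit together, the sketch below builds a tiny hand-written COCO-style file in memory, converts each [x, y, width, height] box to corner form, and groups the boxes by image. The data and the helper names (xywh_to_xyxy, boxes_per_image) are illustrative assumptions; skipping crowd regions is just one possible choice for this example.

```python
import json
from collections import defaultdict

# A tiny hand-written file following the structure above (made-up values).
coco_text = json.dumps({
    "images": [{"id": 1, "width": 640, "height": 480, "file_name": "a.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 2,
         "bbox": [100, 50, 200, 100], "area": 20000.0, "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 30, 40], "area": 1200.0, "iscrowd": 1},
    ],
    "categories": [{"id": 1, "name": "a", "supercategory": "x"},
                   {"id": 2, "name": "b", "supercategory": "x"}],
})


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def boxes_per_image(coco):
    """Group non-crowd boxes by image id, with category names attached."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        if ann["iscrowd"]:
            continue  # crowd regions are skipped in this sketch
        grouped[ann["image_id"]].append(
            (names[ann["category_id"]], xywh_to_xyxy(ann["bbox"])))
    return dict(grouped)


print(boxes_per_image(json.loads(coco_text)))
# {1: [('b', [100, 50, 300, 150])]}
```

Note that annotations refer to images and categories by id, so the ids must be consistent across the three lists.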

3. CityscapesDataset

dataset_type = 'CityscapesDataset'
data_root = 'data/cityscapes/'  # dataset root, matching the directory tree above
data = dict(
    imgs_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=8,
        dataset=dict(
            type=dataset_type,
            ann_file=data_root +
            'annotations/instancesonly_filtered_gtFine_train.json',
            img_prefix=data_root + 'leftImg8bit/train/',
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        ann_file=data_root +
        'annotations/instancesonly_filtered_gtFine_val.json',
        img_prefix=data_root + 'leftImg8bit/val/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root +
        'annotations/instancesonly_filtered_gtFine_test.json',
        img_prefix=data_root + 'leftImg8bit/test/',
        pipeline=test_pipeline))

Using your own dataset

Option 1: Convert to an existing dataset format

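Convert your data to one of the dataset formats above, such as PASCAL VOC, COCO, or Cityscapes.
Place the data under data/, following the structure shown in the data preparation section.
Set num_classes in the config file to the number of classes + 1 (the extra one accounts for the background class).
Edit CLASSES in the corresponding dataset file under mmdet/datasets/ so that it lists the classes of your dataset.
Edit the class names for the corresponding dataset format in mmdet/core/evaluation/class_names.py so that they match the classes of your dataset.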
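
The conversion step depends on how your own labels are stored. As a minimal sketch, assume each image is described by a simple record with corner-format boxes; the record layout, the class names, and the to_coco helper below are all hypothetical. The result is a COCO-style detection dict that can be written out with json.dump.

```python
import json

CLASSES = ("a", "b", "c")  # hypothetical class names

# Hypothetical records: (file name, width, height, [(class, x1, y1, x2, y2), ...])
records = [
    ("a.jpg", 1280, 720, [("a", 10, 20, 110, 220), ("c", 300, 300, 400, 350)]),
    ("b.jpg", 640, 480, []),
]


def to_coco(records, classes):
    """Build a COCO-style detection dict from simple box records."""
    cat_ids = {name: i + 1 for i, name in enumerate(classes)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n, "supercategory": "none"}
                       for n, i in cat_ids.items()],
    }
    ann_id = 1
    for img_id, (file_name, width, height, boxes) in enumerate(records, start=1):
        coco["images"].append({"id": img_id, "file_name": file_name,
                               "width": width, "height": height})
        for name, x1, y1, x2, y2 in boxes:
            w, h = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": ann_id, "image_id": img_id, "category_id": cat_ids[name],
                "bbox": [x1, y1, w, h],  # COCO stores [x, y, width, height]
                "area": float(w * h),
                "segmentation": [], "iscrowd": 0,
            })
            ann_id += 1
    return coco


coco = to_coco(records, CLASSES)
num_classes = len(CLASSES) + 1  # value for the config, following the rule above
print(json.dumps(coco["annotations"][0]))
```

The area here is simply the box area, which is a reasonable fallback when no segmentation masks are available.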

Option 2: Add your own dataset format

Write a class for your own data format that inherits from an existing dataset class, such as CocoDataset or VOCDataset.

Create the file mmdet/datasets/my_dataset.py with the following content:

from .coco import CocoDataset
from .registry import DATASETS


@DATASETS.register_module
class MyDataset(CocoDataset):

    CLASSES = ('a', 'b', 'c', 'd', 'e')

Add the following to mmdet/datasets/__init__.py:

from .my_dataset import MyDataset

MyDataset can then be used in the config file in the same way as CocoDataset.

You can also create a class that inherits from CustomDataset and override load_annotations(self, ann_file) and get_ann_info(self, idx) to get functionality similar to CocoDataset and VOCDataset.

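The annotations of a CustomDataset are a list of dicts, one dict per image. Each dict has three fields that are needed for testing, filename (relative path), width, and height, plus ann, which is used for training. ann contains at least two fields, bboxes and labels, both of which are numpy arrays. Some datasets also mark boxes as crowd, difficult, or ignored; bboxes_ignore and labels_ignore can be used to cover these.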

Example of the annotation format:

[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray, float32> (n, 4),
            'labels': <np.ndarray, int64> (n, ),
            'bboxes_ignore': <np.ndarray, float32> (k, 4),
            'labels_ignore': <np.ndarray, int64> (k, ) (optional field)
        }
    },
    ...
]
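
As a minimal sketch of producing one entry in this format, the helper below packs a list of boxes into the required numpy arrays. The function name to_custom_entry, the input tuple layout, and the class names are hypothetical. The label numbering (starting at 1, with 0 left for background) is an assumption that follows the num_classes = classes + 1 convention above; check it against the mmdetection version you use.

```python
import numpy as np

CLASSES = ("a", "b", "c")  # hypothetical class names


def to_custom_entry(filename, width, height, boxes):
    """Build one entry; boxes is a list of (class_name, x1, y1, x2, y2, ignore)."""
    kept = [b for b in boxes if not b[5]]
    ignored = [b for b in boxes if b[5]]

    def pack(items):
        # reshape keeps the (n, 4) shape even when the list is empty
        bboxes = np.array([b[1:5] for b in items],
                          dtype=np.float32).reshape(-1, 4)
        # Assumption: labels start at 1, with 0 left for background.
        labels = np.array([CLASSES.index(b[0]) + 1 for b in items],
                          dtype=np.int64)
        return bboxes, labels

    bboxes, labels = pack(kept)
    bboxes_ignore, labels_ignore = pack(ignored)
    return {
        "filename": filename, "width": width, "height": height,
        "ann": {"bboxes": bboxes, "labels": labels,
                "bboxes_ignore": bboxes_ignore,
                "labels_ignore": labels_ignore},
    }


entry = to_custom_entry("a.jpg", 1280, 720, [
    ("a", 10, 20, 110, 220, False),
    ("c", 300, 300, 400, 350, True),
])
print(entry["ann"]["bboxes"].shape, entry["ann"]["labels"])
```

A list of such entries is the kind of value load_annotations would return in a CustomDataset subclass, with get_ann_info returning the ann dict for a given index.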