Object Detection with MMDetection
OpenMMLab is an open-source computer vision AI algorithm framework, released in October 2018 by MMLab, the joint laboratory of The Chinese University of Hong Kong and SenseTime. It contains algorithm frameworks for many computer vision directions. This article introduces the MMDetection library, running on an Ubuntu 18.04 server. It uses mmdet 2.x as the example; some modules, classes, and functions may have changed in the latest release, so check the official docs for updates.
Installation

MMDetection needs to be installed on top of a few base libraries, such as PyTorch and mmcv. Assuming the GPU driver, CUDA, and cuDNN are already installed and configured, the next step is to install Python and a handful of packages. Here we use miniconda for the Python installation and environment setup.
```shell
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
bash ./miniconda.sh -b -p /opt/miniconda
# Single quotes so the variables expand when .bashrc is sourced, not now.
echo 'export MINICONDA_HOME=/opt/miniconda' >> ~/.bashrc
echo 'export PATH=$MINICONDA_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
conda create -n mm python=3.9 -y
conda activate mm

# mmdet 2.x stack
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html
git clone -b v2.28.0 https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -v -e .

# mmdet 3.x alternative: install via openmim
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
```
For version-specific installation instructions, see the official docs: https://mmdetection.readthedocs.io/zh_CN/latest/get_started.html (you can switch versions in the bottom-left corner).
Open a Python prompt and enter the following; if the version is printed correctly, mmdetection is installed successfully:
```python
import mmdet
print(mmdet.__version__)
```
Image Object Detection with MMDetection

After installation succeeds, the cloned repository contains the source code, including config files for many algorithms. Here we take faster-rcnn as the example for image object detection.
First, download a pretrained model: visit mmdetection-model_zoo and select faster-rcnn:
```shell
cd mmdetection
mkdir checkpoints
wget https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth -P checkpoints
```
Then run image object detection (the code below assumes the working directory is mmdetection). Note that the show_result_pyplot function was removed in mmdet 3.x:
```python
from mmdet.apis import inference_detector, init_detector, show_result_pyplot

config = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
device = 'cuda:0'
model = init_detector(config, checkpoint, device=device)

img = 'demo/demo.jpg'
result = inference_detector(model, img)
show_result_pyplot(model, img, result, score_thr=0.9)
```
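Besides drawing the result, you often want the raw detections. In mmdet 2.x, `inference_detector` on a two-stage detector returns a list with one `(N, 5)` array per class, where the columns are `x1, y1, x2, y2, score`. A minimal sketch of filtering detections by score, using a fabricated result in place of real model output (the helper `filter_detections` and the class names are illustrative, not part of the mmdet API):

```python
import numpy as np

def filter_detections(result, class_names, score_thr=0.9):
    """Collect boxes whose score exceeds score_thr, keyed by class name.

    `result` follows the mmdet 2.x convention: one (N, 5) float array
    per class; columns are x1, y1, x2, y2, score.
    """
    kept = {}
    for name, dets in zip(class_names, result):
        high = dets[dets[:, 4] >= score_thr]
        if len(high):
            kept[name] = high
    return kept

# Fabricated stand-in for inference_detector output (2 classes).
fake_result = [
    np.array([[10, 10, 50, 50, 0.95], [0, 0, 5, 5, 0.30]], dtype=np.float32),
    np.empty((0, 5), dtype=np.float32),
]
kept = filter_detections(fake_result, ('person', 'bicycle'), score_thr=0.9)
print(sorted(kept))         # ['person']
print(len(kept['person']))  # 1
```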
Visualization in mmdet 3.x:
```python
import mmcv
from mmdet.apis import inference_detector, init_detector
from mmdet.registry import VISUALIZERS

config = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
device = 'cuda:0'
model = init_detector(config, checkpoint, device=device)

img = 'demo/demo.jpg'
result = inference_detector(model, img)

visualizer = VISUALIZERS.build(model.cfg.visualizer)
visualizer.dataset_meta = model.dataset_meta
image = mmcv.imread(img, channel_order='rgb')
visualizer.add_datasample(
    'result', image, data_sample=result, draw_gt=False, wait_time=0)
visualizer.show()
```
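In mmdet 3.x, `inference_detector` returns a `DetDataSample` whose `pred_instances` field holds `bboxes`, `scores`, and `labels` (as torch tensors). A hedged sketch of thresholding those fields, using a plain stand-in object and numpy arrays instead of a real `DetDataSample`:

```python
from types import SimpleNamespace
import numpy as np

# Stand-in for result.pred_instances (real fields are torch tensors).
pred = SimpleNamespace(
    bboxes=np.array([[10, 10, 50, 50], [0, 0, 5, 5]], dtype=np.float32),
    scores=np.array([0.95, 0.30], dtype=np.float32),
    labels=np.array([0, 1]))

keep = pred.scores >= 0.9          # boolean mask over instances
print(pred.bboxes[keep].shape)     # (1, 4)
print(pred.labels[keep].tolist())  # [0]
```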
RPN Region Proposals

First, download a pretrained model: visit mmdetection-model_zoo and select rpn:
```shell
cd mmdetection
wget https://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_fpn_1x_coco/rpn_r50_fpn_1x_coco_20200218-5525fa2e.pth -P checkpoints
```
Then run region proposal (again assuming the working directory is mmdetection):
```python
from mmdet.apis import inference_detector, init_detector

config = 'configs/rpn/rpn_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/rpn_r50_fpn_1x_coco_20200218-5525fa2e.pth'
device = 'cuda:0'
model = init_detector(config, checkpoint, device=device)

img = 'demo/demo.jpg'
rpn_result = inference_detector(model, img)
model.show_result(img, rpn_result, top_k=100)
```
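`model.show_result(img, rpn_result, top_k=100)` draws the 100 highest-scoring proposals. For an RPN, the result is a single `(N, 5)` array (`x1, y1, x2, y2, objectness`); selecting the top-k by score can be sketched as follows (the helper `top_k_proposals` and the fake data are illustrative, not mmdet API):

```python
import numpy as np

def top_k_proposals(proposals, k):
    """Return the k proposals with the highest objectness score (column 4)."""
    order = np.argsort(proposals[:, 4])[::-1]  # descending by score
    return proposals[order[:k]]

# Fabricated proposals with scores that are exact in float32.
fake_proposals = np.array([
    [0, 0, 10, 10, 0.25],
    [5, 5, 40, 40, 0.75],
    [1, 1, 20, 20, 0.5]], dtype=np.float32)
top = top_k_proposals(fake_proposals, 2)
print(top[:, 4].tolist())  # [0.75, 0.5]
```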
Fine-Tuning a Model with mmdetection

Now that we have the model framework and a pretrained model, all that remains is retraining on our own dataset so the model predicts more accurately on our own data.
For a personal dataset, there are three ways to organize the data for fine-tuning with mmdetection:

1. Convert the dataset to the COCO format;
2. Use the CustomDataset type provided by mmdetection, saving the converted annotations locally as a pkl file;
3. Subclass CustomDataset to write your own dataset type, skipping the pkl file, which saves disk space and speeds things up.
 
First, download the dataset; here we use the kitti_tiny dataset:
```shell
wget https://download.openmmlab.com/mmdetection/data/kitti_tiny.zip -P data
```
Next, unzip the downloaded archive into the data directory:
```shell
cd data
unzip -q kitti_tiny.zip
```
Finally, generate the config file and train the model according to the chosen data organization.
COCO format: organize the data yourself; not covered here.
CustomDataset format. Convert the data to the middle pkl format:
```python
import os.path as osp

import mmcv
import numpy as np


def convert_kitti_to_middle(ann_file, out_file, img_prefix):
    CLASSES = ('Car', 'Pedestrian', 'Cyclist')
    cat2label = {k: i for i, k in enumerate(CLASSES)}

    image_list = mmcv.list_from_file(ann_file)
    data_infos = []
    for image_id in image_list:
        filename = f'{img_prefix}/{image_id}.jpeg'
        image = mmcv.imread(filename)
        height, width = image.shape[:2]
        data_info = dict(filename=f'{image_id}.jpeg', width=width, height=height)

        # KITTI label files sit next to the images: image_2 -> label_2
        label_prefix = img_prefix.replace('image_2', 'label_2')
        lines = mmcv.list_from_file(osp.join(label_prefix, f'{image_id}.txt'))
        content = [line.strip().split(' ') for line in lines]
        bbox_names = [x[0] for x in content]
        bboxes = [[float(info) for info in x[4:8]] for x in content]

        gt_bboxes = []
        gt_labels = []
        gt_bboxes_ignore = []
        gt_labels_ignore = []
        for bbox_name, bbox in zip(bbox_names, bboxes):
            if bbox_name in cat2label:
                gt_labels.append(cat2label[bbox_name])
                gt_bboxes.append(bbox)
            else:
                gt_labels_ignore.append(-1)
                gt_bboxes_ignore.append(bbox)

        data_anno = dict(
            bboxes=np.array(gt_bboxes, dtype=np.float32).reshape(-1, 4),
            labels=np.array(gt_labels, dtype=np.longlong),
            bboxes_ignore=np.array(gt_bboxes_ignore, dtype=np.float32).reshape(-1, 4),
            labels_ignore=np.array(gt_labels_ignore, dtype=np.longlong))
        data_info.update(ann=data_anno)
        data_infos.append(data_info)

    mmcv.dump(data_infos, out_file)


convert_kitti_to_middle('data/kitti_tiny/train.txt',
                        'data/kitti_tiny/train_middle.pkl',
                        'data/kitti_tiny/training/image_2')
convert_kitti_to_middle('data/kitti_tiny/val.txt',
                        'data/kitti_tiny/val_middle.pkl',
                        'data/kitti_tiny/training/image_2')
```
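After conversion it is worth sanity-checking the middle-format file. `mmcv.dump` writes `.pkl` files with Python's pickle, so the standard library can read them back; the sketch below round-trips a tiny in-memory record with the same structure rather than assuming the real file exists (the record's values are made up for illustration):

```python
import pickle

import numpy as np

# One middle-format record, shaped like the converter's output.
record = dict(
    filename='000000.jpeg', width=1242, height=375,
    ann=dict(
        bboxes=np.array([[712.4, 143.0, 810.7, 307.9]], dtype=np.float32),
        labels=np.array([0], dtype=np.longlong),
        bboxes_ignore=np.empty((0, 4), dtype=np.float32),
        labels_ignore=np.empty((0,), dtype=np.longlong)))

blob = pickle.dumps([record])    # what mmcv.dump writes for .pkl
data_infos = pickle.loads(blob)  # what mmcv.load / CustomDataset reads back
print(len(data_infos))                       # 1
print(data_infos[0]['ann']['bboxes'].shape)  # (1, 4)
```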
Generate the config file:
```python
import os

from mmcv import Config
from mmdet.apis import set_random_seed

cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')
cfg.device = 'cuda'

classes = ('Car', 'Pedestrian', 'Cyclist')
cfg.dataset_type = 'CustomDataset'
cfg.data_root = 'data/kitti_tiny'
cfg.classes = classes

dtype = 'CustomDataset'
droot = 'data/kitti_tiny'
cfg.data.test.type = dtype
cfg.data.test.data_root = droot
cfg.data.test.ann_file = 'train_middle.pkl'
cfg.data.test.img_prefix = 'training/image_2'
cfg.data.test.classes = classes
cfg.data.train.type = dtype
cfg.data.train.data_root = droot
cfg.data.train.ann_file = 'train_middle.pkl'
cfg.data.train.img_prefix = 'training/image_2'
cfg.data.train.classes = classes
cfg.data.val.type = dtype
cfg.data.val.data_root = droot
cfg.data.val.ann_file = 'val_middle.pkl'
cfg.data.val.img_prefix = 'training/image_2'
cfg.data.val.classes = classes

cfg.model.roi_head.bbox_head.num_classes = 3
cfg.load_from = 'checkpoints/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-5324cff8.pth'
cfg.work_dir = 'work_dir'
if not os.path.exists(cfg.work_dir):
    os.makedirs(cfg.work_dir)

cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10
cfg.evaluation.metric = 'mAP'
cfg.evaluation.interval = 12
cfg.checkpoint_config.interval = 12
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

print(f'Config:\n{cfg.pretty_text}')
cfg.dump(f'{cfg.work_dir}/customformat.py')
```
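The line `cfg.optimizer.lr = 0.02 / 8` follows the linear scaling rule: the default LR of 0.02 assumes 8 GPUs with 2 images each (total batch 16), so a smaller batch gets a proportionally smaller LR. A quick check of that arithmetic, assuming 1 GPU with `samples_per_gpu=2`:

```python
base_lr = 0.02      # MMDetection default for 8 GPUs x 2 images per GPU
base_batch = 8 * 2
my_batch = 1 * 2    # 1 GPU, samples_per_gpu=2

lr = base_lr * my_batch / base_batch
print(lr)  # 0.0025, i.e. the same as 0.02 / 8
```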
Train the model:
```python
import os.path as osp

import mmcv
from mmcv import Config
from mmdet.apis import train_detector
from mmdet.datasets import build_dataset
from mmdet.models import build_detector

cfg = Config.fromfile('work_dir/customformat.py')
datasets = [build_dataset(cfg.data.train)]
model = build_detector(cfg.model)
model.CLASSES = datasets[0].CLASSES
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)
```
Custom KittiTinyDataset format

```python
import os
import os.path as osp

import mmcv
import numpy as np
from mmcv import Config
from mmdet.apis import set_random_seed, train_detector
from mmdet.datasets import build_dataset
from mmdet.datasets.builder import DATASETS
from mmdet.datasets.custom import CustomDataset
from mmdet.models import build_detector


@DATASETS.register_module()
class KittiTinyDataset(CustomDataset):
    CLASSES = ('Car', 'Pedestrian', 'Cyclist')

    def load_annotations(self, ann_file):
        cat2label = {k: i for i, k in enumerate(self.CLASSES)}
        image_list = mmcv.list_from_file(self.ann_file)
        data_infos = []
        for image_id in image_list:
            filename = f'{self.img_prefix}/{image_id}.jpeg'
            image = mmcv.imread(filename)
            height, width = image.shape[:2]
            data_info = dict(filename=f'{image_id}.jpeg', width=width, height=height)

            label_prefix = self.img_prefix.replace('image_2', 'label_2')
            lines = mmcv.list_from_file(osp.join(label_prefix, f'{image_id}.txt'))
            content = [line.strip().split(' ') for line in lines]
            bbox_names = [x[0] for x in content]
            bboxes = [[float(info) for info in x[4:8]] for x in content]

            gt_bboxes = []
            gt_labels = []
            gt_bboxes_ignore = []
            gt_labels_ignore = []
            for bbox_name, bbox in zip(bbox_names, bboxes):
                if bbox_name in cat2label:
                    gt_labels.append(cat2label[bbox_name])
                    gt_bboxes.append(bbox)
                else:
                    gt_labels_ignore.append(-1)
                    gt_bboxes_ignore.append(bbox)

            data_anno = dict(
                bboxes=np.array(gt_bboxes, dtype=np.float32).reshape(-1, 4),
                labels=np.array(gt_labels, dtype=np.longlong),
                bboxes_ignore=np.array(gt_bboxes_ignore, dtype=np.float32).reshape(-1, 4),
                labels_ignore=np.array(gt_labels_ignore, dtype=np.longlong))
            data_info.update(ann=data_anno)
            data_infos.append(data_info)
        return data_infos


cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')
cfg.device = 'cuda'

classes = ('Car', 'Pedestrian', 'Cyclist')
cfg.dataset_type = 'KittiTinyDataset'
cfg.data_root = 'data/kitti_tiny'
cfg.classes = classes

dtype = 'KittiTinyDataset'
droot = 'data/kitti_tiny'
cfg.data.test.type = dtype
cfg.data.test.data_root = droot
cfg.data.test.ann_file = 'train.txt'
cfg.data.test.img_prefix = 'training/image_2'
cfg.data.test.classes = classes
cfg.data.train.type = dtype
cfg.data.train.data_root = droot
cfg.data.train.ann_file = 'train.txt'
cfg.data.train.img_prefix = 'training/image_2'
cfg.data.train.classes = classes
cfg.data.val.type = dtype
cfg.data.val.data_root = droot
cfg.data.val.ann_file = 'val.txt'
cfg.data.val.img_prefix = 'training/image_2'
cfg.data.val.classes = classes

cfg.model.roi_head.bbox_head.num_classes = 3
cfg.load_from = 'checkpoints/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-5324cff8.pth'
cfg.work_dir = 'work_dir_2'
if not os.path.exists(cfg.work_dir):
    os.makedirs(cfg.work_dir)

cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10
cfg.evaluation.metric = 'mAP'
cfg.evaluation.interval = 12
cfg.checkpoint_config.interval = 12
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

print(f'Config:\n{cfg.pretty_text}')
cfg.dump(f'{cfg.work_dir}/customKittiFormat.py')

datasets = [build_dataset(cfg.data.train)]
model = build_detector(cfg.model)
model.CLASSES = datasets[0].CLASSES
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)
```
Q&A
What is the difference between workflow and evaluation for validation? See: 1. workflow not work #271; 2. Load train_dataloader and val_dataloader #171; 3. Allowing validation dataset for computing validation loss #1093; 4. [Bug] 'ConfigDict' object has no attribute 'dataset' #9633
What do time and data_time in the training log mean?
```
2023-08-06 07:46:54,707 - mmdet - INFO - Epoch [1][9000/29317]  lr: 2.000e-02, eta: 14:11:42, time: 0.229, data_time: 0.006, memory: 3947, loss_rpn_cls: 0.0809, loss_rpn_bbox: 0.0757, loss_cls: 0.3745, acc: 90.4531, loss_bbox: 0.2941, loss: 0.8252
2023-08-06 07:47:17,408 - mmdet - INFO - Epoch [1][9100/29317]  lr: 2.000e-02, eta: 14:06:03, time: 0.227, data_time: 0.006, memory: 3947, loss_rpn_cls: 0.0820, loss_rpn_bbox: 0.0685, loss_cls: 0.3663, acc: 90.9038, loss_bbox: 0.2829, loss: 0.7996
```
"2023-08-06 07:46:54,707" is the current system time; "mmdet" is the OpenMMLab detection package; "Epoch [1][9000/29317]" means the current epoch is 1, and the log interval is 100 iterations, i.e. a line is printed every 100 iterations. My training set here has 117268 images, trained in parallel on 2 GPUs with a per-GPU batch size of 2, so each epoch takes 117268 / 2 / 2 = 29317 iterations. time: 0.229 is the seconds spent per iteration on one batch (the total forward pass and post-processing, excluding data loading), and data_time: 0.006 is the data-loading time. Therefore 100 iterations take roughly 0.227 × 100 = 22.7 seconds, which matches the difference between the two timestamps (07:47:17,408 − 07:46:54,707); this is also the wait the user sees between log lines. For more on log printing, see runner.hooks.logger.text.py in mmcv 1.7.1 for mmdet 2.x; for mmdet 3.x, see mmengine.
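The arithmetic above is easy to reproduce: iterations per epoch = images / (GPUs × samples per GPU), and the wall time between two log lines ≈ per-iteration time × log interval (the variable names below are illustrative):

```python
num_images = 117268
num_gpus = 2
samples_per_gpu = 2

iters_per_epoch = num_images // (num_gpus * samples_per_gpu)
print(iters_per_epoch)  # 29317, matching "Epoch [1][9000/29317]"

log_interval = 100
time_per_iter = 0.227   # "time:" from the second log line
print(round(time_per_iter * log_interval, 1))  # 22.7 seconds between log lines
```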
 
References
open-mmlab/mmdetection; [OpenMMLab open course] Object Detection and MMDetection, part 2