Reading NanoDet (1): Overview

寿亦

2023-12-01

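I. Preface

I needed to read the code of an anchor-free model, and since I had used NanoDet before and it left a strong impression, I decided to revisit its code. A good memory is no match for written notes, so I try to record, summarize, and share more.
As the author's blog puts it: NanoDet as a whole does not have many novel ideas. It is a pure engineering project whose main work is bringing some excellent recent academic papers down to lightweight models for mobile devices.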

II. Main Text

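1. Overall characteristics of the model
The model is lightweight because the author uses:
① A lightweight backbone: classic lightweight networks such as MobileNet and ShuffleNet;
② A lightweight FPN: all convolutions inside the PAN are removed, keeping only the 1x1 convolutions that align the feature channel dimensions; both upsampling and downsampling are done by interpolation;
③ A lightweight head: depthwise convolutions, fewer convolution layers and channels, and a single set of convolutions shared by box regression and classification.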
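To make point ② concrete, here is a minimal sketch of a PAN that contains only 1x1 channel-alignment convolutions and interpolation. It is not NanoDet's actual module: the class name `TinyPAN`, the bilinear interpolation mode, and the channel counts in the usage lines are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyPAN(nn.Module):
    """Hypothetical PAN with no fusion convs: 1x1 lateral convs plus interpolation."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        # One 1x1 conv per backbone level, used only to align channel counts.
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )

    def forward(self, feats):
        # feats: feature maps ordered from highest to lowest resolution.
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        # Top-down path: upsample the coarser map and add it to the finer one.
        for i in range(len(laterals) - 1, 0, -1):
            up = F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:],
                mode="bilinear", align_corners=False,
            )
            laterals[i - 1] = laterals[i - 1] + up
        # Bottom-up path: downsample the finer map and add it to the coarser one.
        outs = [laterals[0]]
        for i in range(1, len(laterals)):
            down = F.interpolate(
                outs[-1], size=laterals[i].shape[-2:],
                mode="bilinear", align_corners=False,
            )
            outs.append(laterals[i] + down)
        return outs


# Usage with assumed channel counts and a 320x320 input (strides 8, 16, 32).
pan = TinyPAN(in_channels=[116, 232, 464], out_channels=96)
feats = [
    torch.randn(1, 116, 40, 40),
    torch.randn(1, 232, 20, 20),
    torch.randn(1, 464, 10, 10),
]
pan_outs = pan(feats)
pan_shapes = [tuple(o.shape) for o in pan_outs]
```

The only learnable parameters are the three 1x1 convolutions; every fusion step is an interpolation followed by an addition, which is where the savings over a conventional PAN come from.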
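In addition, the author chose:
① A suitable loss function, GFocal Loss;
② A suitable positive/negative sample assignment method, ATSS;
③ A lightweight backbone whose accuracy is still respectable;
④ A mature model architecture: backbone + PAN + head;
⑤ A head that does not share weights across feature levels (when the detection head is very lightweight, sharing weights reduces its generalization ability).

As a result, the model is lightweight yet its accuracy does not suffer much.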
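The head design described above (depthwise convolutions, one tower shared by classification and regression, separate weights per feature level) can be sketched as follows. This is an illustrative sketch, not NanoDet's head: the names `TinyHead` and `dw_separable`, the LeakyReLU activation, the channel width of 96, two stacked blocks, and `reg_max=7` (a GFocal-style distribution with 8 bins per box side) are all assumptions.

```python
import torch
import torch.nn as nn


def dw_separable(ch):
    """Depthwise 3x3 conv followed by a pointwise 1x1 conv (hypothetical helper)."""
    return nn.Sequential(
        nn.Conv2d(ch, ch, 3, padding=1, groups=ch, bias=False),
        nn.BatchNorm2d(ch),
        nn.LeakyReLU(0.1),
        nn.Conv2d(ch, ch, 1, bias=False),
        nn.BatchNorm2d(ch),
        nn.LeakyReLU(0.1),
    )


class TinyHead(nn.Module):
    def __init__(self, num_levels=3, ch=96, num_classes=80, reg_max=7, stacked=2):
        super().__init__()
        self.num_classes = num_classes
        self.reg_ch = 4 * (reg_max + 1)  # one discrete distribution per box side
        # One tower per level: weights are NOT shared across levels.
        self.towers = nn.ModuleList(
            [
                nn.Sequential(*[dw_separable(ch) for _ in range(stacked)])
                for _ in range(num_levels)
            ]
        )
        # A single 1x1 conv per level predicts class scores and box
        # distributions together, so both tasks share the same tower.
        self.preds = nn.ModuleList(
            [nn.Conv2d(ch, num_classes + self.reg_ch, 1) for _ in range(num_levels)]
        )

    def forward(self, feats):
        outs = []
        for tower, pred, f in zip(self.towers, self.preds, feats):
            out = pred(tower(f))
            cls_score, bbox_pred = out.split([self.num_classes, self.reg_ch], dim=1)
            outs.append((cls_score, bbox_pred))
        return outs


head = TinyHead().eval()
with torch.no_grad():
    head_outs = head([torch.randn(1, 96, s, s) for s in (40, 20, 10)])
head_shapes = [(tuple(c.shape), tuple(b.shape)) for c, b in head_outs]
```

Splitting one output tensor into class and box channels is what "sharing the same set of convolutions" amounts to here, while the per-level `ModuleList` entries keep the levels independent.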

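2. NanoDet anchor size and generation
Although NanoDet follows the anchor-free route, it still has anchors. They matter mainly during training, in the positive/negative sample assignment (ATSS) stage; everywhere else only the anchor's center coordinates are used (for example, when computing bboxes).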

def get_single_level_center_point(
    self, featmap_size, stride, dtype, device, flatten=True
):
    """
    Generate pixel centers of a single stage feature map.
    :param featmap_size: height and width of the feature map
    :param stride: down sample stride of the feature map
    :param dtype: data type of the tensors
    :param device: device of the tensors
    :param flatten: flatten the x and y tensors
    :return: y and x of the center points
    """
    h, w = featmap_size
    # Add 0.5 so that the output is the center coordinate of each anchor
    x_range = (torch.arange(w, dtype=dtype, device=device) + 0.5) * stride
    y_range = (torch.arange(h, dtype=dtype, device=device) + 0.5) * stride
    y, x = torch.meshgrid(y_range, x_range)
    if flatten:
        y = y.flatten()
        x = x.flatten()
    return y, x

def get_grid_cells(self, featmap_size, scale, stride, dtype, device):
    """
    Generate grid cells of a feature map for target assignment.
    :param featmap_size: Size of a single level feature map.
    :param scale: Grid cell scale.
    :param stride: Down sample stride of the feature map.
    :param dtype: Data type of the tensors.
    :param device: Device of the tensors.
    :return: Grid_cells xyxy position. Size should be [feat_w * feat_h, 4]
    """
    cell_size = stride * scale  # side length of the anchor; scale is a hyperparameter (5 here)
    # Generate the anchor center coordinates
    y, x = self.get_single_level_center_point(
        featmap_size, stride, dtype, device, flatten=True
    )
    # Generate the anchor's top-left and bottom-right coordinates
    grid_cells = torch.stack(
        [
            x - 0.5 * cell_size,  # a square anchor with side cell_size is placed at each cell center
            y - 0.5 * cell_size,
            x + 0.5 * cell_size,
            y + 0.5 * cell_size,
        ],
        dim=-1,
    )
    return grid_cells
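Outside of sample assignment, only the center coordinates are needed. As an illustration of "computing the bbox from the center", the sketch below decodes predicted distances to the four box sides into an xyxy box. The function name `distance2bbox` follows a common convention in detection codebases; treat the exact signature and the (left, top, right, bottom) ordering as assumptions rather than a copy of NanoDet's code.

```python
import torch


def distance2bbox(points, distance):
    """Decode (left, top, right, bottom) distances around center points into xyxy boxes."""
    x1 = points[..., 0] - distance[..., 0]
    y1 = points[..., 1] - distance[..., 1]
    x2 = points[..., 0] + distance[..., 2]
    y2 = points[..., 1] + distance[..., 3]
    return torch.stack([x1, y1, x2, y2], dim=-1)


# The first two centers on a stride-8 level are (0.5 * 8, 0.5 * 8) and (1.5 * 8, 0.5 * 8).
centers = torch.tensor([[4.0, 4.0], [12.0, 4.0]])
# Hypothetical predicted distances, already multiplied by the stride.
dist = torch.tensor([[2.0, 1.0, 3.0, 5.0], [4.0, 4.0, 4.0, 4.0]])
boxes = distance2bbox(centers, dist)
```

Note that the anchor's side length never appears in this decoding step, which is why the model can still be called anchor-free at inference time.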

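The code above shows that NanoDet's anchors have three characteristics:
① A single shape: the anchors on every output level are squares;
② A small count: each output level has only one kind of anchor, so the total number of anchors is much lower;
③ A single size: the anchors on a given output level have only one size, stride * scale.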
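A standalone version of the two functions makes these three points easy to check numerically. The sketch below assumes a 320x320 input and strides of 8, 16, and 32, which are not stated in this article; `scale = 5` is taken from the comment in the code above. The helper name `grid_cells` is hypothetical, and `indexing="ij"` requires PyTorch 1.10 or newer.

```python
import torch


def grid_cells(featmap_size, scale, stride):
    """Standalone re-implementation of the anchor generation shown above."""
    h, w = featmap_size
    xs = (torch.arange(w, dtype=torch.float32) + 0.5) * stride
    ys = (torch.arange(h, dtype=torch.float32) + 0.5) * stride
    y, x = torch.meshgrid(ys, xs, indexing="ij")
    y, x = y.flatten(), x.flatten()
    half = 0.5 * stride * scale
    return torch.stack([x - half, y - half, x + half, y + half], dim=-1)


input_size = 320       # assumed input resolution
strides = [8, 16, 32]  # assumed output strides
scale = 5              # hyperparameter from the code above

per_level = {
    s: grid_cells((input_size // s, input_size // s), scale, s) for s in strides
}
counts = {s: cells.shape[0] for s, cells in per_level.items()}
total = sum(counts.values())
# Side length of the first anchor on each level (x2 - x1).
sides = {s: float(cells[0, 2] - cells[0, 0]) for s, cells in per_level.items()}

print(counts, total, sides)
```

Under these assumptions there is exactly one square anchor per feature-map location, and its side length depends only on the level's stride.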

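This raises a question: why are the anchors square?
My understanding: the anchor's main job is positive/negative sample assignment. If it were shaped with W > H, it could match ground truths with W < H poorly, and vice versa. Making it square is a simple compromise that accommodates ground truths with W < H as well as those with W > H.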
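A small IoU calculation illustrates this intuition. It is only a toy check with made-up box sizes: all boxes have the same area and the same center, and the numbers say nothing about NanoDet's measured accuracy.

```python
def centered_box(cx, cy, w, h):
    """Return an xyxy box of width w and height h centered at (cx, cy)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two xyxy boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


square_anchor = centered_box(0, 0, 40, 40)
wide_anchor = centered_box(0, 0, 80, 20)  # hypothetical W > H anchor
wide_gt = centered_box(0, 0, 80, 20)
tall_gt = centered_box(0, 0, 20, 80)

ious = {
    "square vs wide gt": iou(square_anchor, wide_gt),
    "square vs tall gt": iou(square_anchor, tall_gt),
    "wide vs wide gt": iou(wide_anchor, wide_gt),
    "wide vs tall gt": iou(wide_anchor, tall_gt),
}
for name, value in ious.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the square anchor scores the same IoU against the wide and the tall ground truth, while the wide anchor matches one shape perfectly and the other much worse, so the square has the better worst case.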

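III. Afterword

This was written in a hurry; if anything is missing, please point it out. Thank you!
This series has three parts in total. The other two are:
Reading NanoDet (2): Positive/Negative Sample Assignment (ATSS);
Reading NanoDet (3): Loss Computation and Inference.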
