GFPGAN Source Code Analysis: Part 10

商池暝

2023-12-01

2021SC@SDUSC

Source:

models\gfpgan_model.py

This part continues the analysis of init.py and models\gfpgan_model.py, focusing on

the get_roi_regions() method of class GFPGANModel(BaseModel), along with two related methods:

class GFPGANModel(BaseModel)

construct_img_pyramid(self)

get_roi_regions()

_gram_mat(self, x)

class GFPGANModel(BaseModel)

construct_img_pyramid(self)

Code:

def construct_img_pyramid(self):
    pyramid_gt = [self.gt]
    down_img = self.gt
    for _ in range(0, self.log_size - 3):
        # halve the height and width of down_img with bilinear interpolation
        down_img = F.interpolate(down_img, scale_factor=0.5, mode='bilinear', align_corners=False)
        # insert down_img at the front of pyramid_gt, so the list runs from lowest to highest resolution
        pyramid_gt.insert(0, down_img)
    return pyramid_gt
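
To make the ordering of the pyramid concrete, here is a minimal standalone sketch of the same loop. The function name build_img_pyramid and the toy 64x64 input are assumptions for illustration, and log_size is assumed to be the base-2 logarithm of the image side length (6 for a 64x64 image).

```python
import torch
import torch.nn.functional as F


def build_img_pyramid(gt: torch.Tensor, log_size: int) -> list:
    """Standalone copy of the loop above (hypothetical helper, not part of GFPGAN)."""
    pyramid = [gt]
    down_img = gt
    for _ in range(log_size - 3):
        # halve height and width with bilinear interpolation
        down_img = F.interpolate(down_img, scale_factor=0.5, mode='bilinear', align_corners=False)
        # prepend, so the list runs from lowest to highest resolution
        pyramid.insert(0, down_img)
    return pyramid


# Assumed toy input: a batch of two 64x64 RGB images, so log_size = log2(64) = 6
gt = torch.rand(2, 3, 64, 64)
pyramid = build_img_pyramid(gt, log_size=6)
sizes = [tuple(level.shape[-2:]) for level in pyramid]
print(sizes)  # [(8, 8), (16, 16), (32, 32), (64, 64)]
```

The loop runs log_size - 3 times, so the smallest level is always 8x8 and the last element of the list is the untouched ground-truth image.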

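A closer look at F.interpolate, which resamples a tensor

Purpose: upsamples or downsamples the input tensor by interpolation.
Parameters of F.interpolate:
1. input (Tensor): the tensor to resample.
2. size (int or sequence of ints): the output spatial size.
3. scale_factor (float or sequence of floats): the multiplier applied to the spatial size. Only one of size and scale_factor should be given.
4. mode (str): the interpolation algorithm. 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area'. Default: 'nearest'
5. align_corners (bool): if True, the corner pixels of the input and output are aligned, so the corner values are preserved. Only has an effect for 'linear', 'bilinear', 'bicubic' and 'trilinear'.
6. recompute_scale_factor (bool): if True, scale_factor is used only to compute the output size, and the scale used for interpolation is then recomputed from the input and output sizes.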
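
The effect of size, scale_factor and align_corners is easier to see on a tiny tensor. The following is a small sketch on an assumed 4x4 input; the variable names are illustrative only.

```python
import torch
import torch.nn.functional as F

# Assumed toy input: one 4x4 single-channel image holding the values 0..15
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# size and scale_factor are two ways to request the same output resolution
by_scale = F.interpolate(x, scale_factor=0.5, mode='bilinear', align_corners=False)
by_size = F.interpolate(x, size=(2, 2), mode='bilinear', align_corners=False)

# align_corners=True maps the input corner pixels onto the output corner pixels
aligned = F.interpolate(x, size=(2, 2), mode='bilinear', align_corners=True)

print(by_scale.shape)   # torch.Size([1, 1, 2, 2])
print(by_scale[0, 0])   # each value is the mean of one 2x2 block
print(aligned[0, 0])    # the four corner values of x
```

With align_corners=False and a factor of exactly 0.5, each output pixel falls midway between four input pixels, so the result coincides with 2x2 average pooling. With align_corners=True the output keeps the exact corner values of the input instead.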

get_roi_regions()

Parameters:

self, eye_out_size=80, mouth_out_size=120

1. Hard-coded values (hard code)

rois_eyes = []
rois_mouths = []
for b in range(self.loc_left_eyes.size(0)):  # loop for batch size
    # left eye and right eye
    img_inds = self.loc_left_eyes.new_full((2, 1), b)
    # torch.stack() joins the two (4,) boxes along a new dimension 0
    bbox = torch.stack([self.loc_left_eyes[b, :], self.loc_right_eyes[b, :]], dim=0)  # shape: (2, 4)
    # torch.cat() concatenates img_inds and bbox along the last dimension
    rois = torch.cat([img_inds, bbox], dim=-1)  # shape: (2, 5)
    rois_eyes.append(rois)
    # mouth
    img_inds = self.loc_left_eyes.new_full((1, 1), b)
    # torch.cat() concatenates img_inds and the mouth box along the last dimension
    rois = torch.cat([img_inds, self.loc_mouths[b:b + 1, :]], dim=-1)  # shape: (1, 5)
    rois_mouths.append(rois)

rois_eyes = torch.cat(rois_eyes, 0).to(self.device)
rois_mouths = torch.cat(rois_mouths, 0).to(self.device)

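The two functions used here differ as follows:

torch.cat(): concatenates tensors along an existing dimension, so the result has the same number of dimensions as the inputs.
torch.stack(): joins tensors along a new dimension, so the result has one more dimension than the inputs. All inputs must have the same shape.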
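
The difference shows up directly in the shapes. Below is a minimal sketch that rebuilds one iteration of the loop above on made-up eye boxes; the coordinates and the (x1, y1, x2, y2) box layout are assumptions for illustration.

```python
import torch

# Assumed toy boxes in (x1, y1, x2, y2) layout
left_eye = torch.tensor([10., 20., 50., 60.])
right_eye = torch.tensor([70., 20., 110., 60.])

stacked = torch.stack([left_eye, right_eye], dim=0)  # new dimension: shape (2, 4)
catted = torch.cat([left_eye, right_eye], dim=0)     # same number of dims: shape (8,)

# One loop iteration for batch index b: prepend the image index to each box
b = 3
img_inds = stacked.new_full((2, 1), b)               # shape (2, 1), same dtype as stacked
rois = torch.cat([img_inds, stacked], dim=-1)        # shape (2, 5)

print(stacked.shape, catted.shape, rois.shape)
```

torch.stack() is what turns two flat boxes into a (2, 4) matrix, while torch.cat() is used afterwards because img_inds and bbox already share the row dimension and only need to be joined column-wise.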

3. Real images (ground truth)

all_eyes = roi_align(self.gt, boxes=rois_eyes, output_size=eye_out_size) * face_ratio
self.left_eyes_gt = all_eyes[0::2, :, :, :]
self.right_eyes_gt = all_eyes[1::2, :, :, :]
self.mouths_gt = roi_align(self.gt, boxes=rois_mouths, output_size=mouth_out_size) * face_ratio

4. Output (generated images)

all_eyes = roi_align(self.output, boxes=rois_eyes, output_size=eye_out_size) * face_ratio
self.left_eyes = all_eyes[0::2, :, :, :]
self.right_eyes = all_eyes[1::2, :, :, :]
self.mouths = roi_align(self.output, boxes=rois_mouths, output_size=mouth_out_size) * face_ratio

_gram_mat(self, x)

Computes the Gram matrix of a feature map and returns it.

Parameters:

x (torch.Tensor): Tensor with shape of (n, c, h, w).

Code:

n, c, h, w = x.size()
# view() reshapes the tensor to (n, c, h*w), flattening the spatial dimensions of each channel in row-major order
features = x.view(n, c, w * h)
# swap dimensions 1 and 2 of features, giving shape (n, h*w, c)
features_t = features.transpose(1, 2)
# batched matrix product (n, c, h*w) x (n, h*w, c) -> (n, c, c), normalized by c*h*w
gram = features.bmm(features_t) / (c * h * w)
# return the Gram matrix, shape (n, c, c)
return gram
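
In other words, entry (i, j) of the Gram matrix for one sample is the dot product of the flattened channels i and j, divided by c*h*w. The following sketch wraps the code above in a free function (the name gram_mat and the toy input are assumptions) and checks it against an equivalent einsum formulation.

```python
import torch


def gram_mat(x: torch.Tensor) -> torch.Tensor:
    """Free-function copy of _gram_mat, for illustration only."""
    n, c, h, w = x.size()
    features = x.view(n, c, h * w)           # (n, c, h*w)
    features_t = features.transpose(1, 2)    # (n, h*w, c)
    return features.bmm(features_t) / (c * h * w)


# Assumed toy feature map: batch of 2, 4 channels, 8x8 spatial size
x = torch.rand(2, 4, 8, 8)
gram = gram_mat(x)

# Same quantity written as an explicit sum over spatial positions
reference = torch.einsum('nchw,ndhw->ncd', x, x) / (4 * 8 * 8)

print(gram.shape)  # torch.Size([2, 4, 4])
```

The result is symmetric in its last two dimensions and does not depend on where in the image a feature occurs, which is what makes it useful for comparing the style statistics of two feature maps.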
