Enriching Variety of Layer-Wise Learning Information by Gradient Combination


Chien-Yao Wang, Hong-Yuan Mark Liao, Ping-Yang Chen, Jun-Wei Hsieh



This ICCV Workshop paper is the Open Access version, provided by the Computer Vision Foundation.
Except for this watermark, it is identical to the accepted version; the final published version of the proceedings is available on IEEE Xplore.


This study proposes using the concept of gradient combination to enhance the learning capability of Deep Convolutional Networks (DCNs), and develops four Partial Residual Network-based (PRN-based) architectures to verify this concept. The purpose of PRN is to provide each layer with information that is as rich as possible. During the training phase, we propose propagating gradient combinations rather than feature combinations. PRN can be easily applied to many existing network architectures, such as ResNet and feature pyramid networks, and can effectively improve their performance. More advanced DCNs are now designed to exploit hierarchical semantic information from multiple layers, so models will continue to grow deeper and wider. Because of its neat design, PRN can benefit all models, especially lightweight ones. In the MSCOCO object detection experiments, YOLO-v3-PRN maintains the same accuracy as YOLO-v3 with 55% fewer parameters and 35% less computation, while doubling execution speed. Among lightweight models, YOLO-v3-tiny-PRN maintains the same accuracy as YOLO-v3-tiny with 37% fewer parameters and 38% less computation, and increases the frame rate by up to 12 fps on the NVIDIA Jetson TX2 platform. Pelee-PRN achieves 6.7% higher mAP@0.5 than Pelee, reaching state-of-the-art performance in lightweight object detection. The proposed lightweight object detection model has been integrated with technologies such as multi-object tracking and license plate recognition, and serves as the edge computing component of a commercial intelligent traffic flow analysis system. Three countries and more than ten cities have already deployed this technique in their traffic flow analysis systems.
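The abstract does not spell out how a partial residual connection is formed. One possible reading, assumed here purely for illustration, is that a shortcut with fewer channels is added onto the first channels of a wider feature map, so that the gradient reaching the shortcut source is a channel slice of the downstream gradient. The NumPy sketch below shows that assumed operation and its backward pass. The function names `partial_residual_add` and `partial_residual_backward` and the channel-first layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def partial_residual_add(x, shortcut):
    """Add `shortcut` onto the first channels of `x` (assumed reading of PRN).

    x:        array of shape (C, H, W)
    shortcut: array of shape (c, H, W) with c <= C
    """
    c = shortcut.shape[0]
    if c > x.shape[0] or shortcut.shape[1:] != x.shape[1:]:
        raise ValueError("shortcut needs <= channels and the same spatial size as x")
    out = x.copy()
    out[:c] += shortcut  # only the first c channels receive the shortcut
    return out


def partial_residual_backward(grad_out, c):
    """Backward pass of the addition above.

    The gradient w.r.t. `x` is the full upstream gradient; the gradient
    w.r.t. `shortcut` is just its first c channels.
    """
    return grad_out, grad_out[:c].copy()


# Toy example: a 4-channel feature map and a 2-channel shortcut.
x = np.ones((4, 2, 2))
s = np.full((2, 2, 2), 0.5)
y = partial_residual_add(x, s)
grad_x, grad_s = partial_residual_backward(np.ones_like(y), c=2)
```

Under this assumption, the layer that produced the shortcut receives gradient from only part of the downstream channels, which is one way layers could end up with differing gradient combinations during training.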

