Paper Deep Dive: Learning Efficient Object Detection Models with Knowledge Distillation

ops/2024/9/23 2:02:23/

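Object detection poses particular challenges for knowledge distillation:

(1) Detection labels are more expensive to obtain and therefore usually scarcer, so detection models lose more accuracy when compressed.

(2) Distillation for classification assumes the classes are roughly balanced and equally important, whereas detection suffers from class imbalance: the background class dominates.

(3) Detection is a more complex task that combines class prediction with bounding-box regression.

(4) The paper distills knowledge within a single domain (images from the same dataset) with no additional data or labels, whereas some other work relies on data from other domains; this makes the distillation itself more demanding.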

Abstract

Despite significant accuracy improvement in convolutional neural networks (CNN) based object detectors, they often require prohibitive runtimes to process an image for real-time applications. State-of-the-art models often use very deep networks with a large number of floating point operations. Efforts such as model compression learn compact models with fewer number of parameters, but with much reduced accuracy. In this work, we propose a new framework to learn compact and fast object detection networks with improved accuracy using knowledge distillation [20] and hint learning [34]. Although knowledge distillation has demonstrated excellent improvements for simpler classification setups, the complexity of detection poses new challenges in the form of regression, region proposals and less voluminous labels. We address this through several innovations such as a weighted cross-entropy loss to address class imbalance, a teacher bounded loss to handle the regression component and adaptation layers to better learn from intermediate teacher distributions. We conduct comprehensive empirical evaluation with different distillation configurations over multiple datasets including PASCAL, KITTI, ILSVRC and MS-COCO. Our results show consistent improvement in accuracy-speed trade-offs for modern multi-class detection models.

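In plain terms:

CNN-based object detectors have become much more accurate, but they are often too slow to process images in real-time applications. State-of-the-art models tend to use very deep networks with a large number of floating-point operations. Model compression yields compact models with fewer parameters, but at a large cost in accuracy. The paper proposes a new framework that uses knowledge distillation and hint learning to train compact, fast detection networks with improved accuracy. Knowledge distillation works well in simpler classification settings, but detection adds new difficulties: regression, region proposals, and scarcer labels. The authors address these with a weighted cross-entropy loss for class imbalance, a teacher bounded loss for the regression component, and adaptation layers that help the student learn from the teacher's intermediate distributions. They evaluate different distillation configurations on PASCAL, KITTI, ILSVRC, and MS-COCO, and report consistent improvements in the accuracy-speed trade-off for modern multi-class detectors.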

Introduction

On the other hand, seminal works on knowledge distillation show that a shallow or compressed model trained to mimic the behavior of a deeper or more complex model can recover some or all of the accuracy drop [3, 20, 34]. However, those results are shown only for problems such as classification, using simpler networks without strong regularization such as dropout.

Applying distillation techniques to multi-class object detection, in contrast to image classification, is challenging for several reasons. First, the performance of detection models suffers more degradation with compression, since detection labels are more expensive and thereby, usually less voluminous. Second, knowledge distillation is proposed for classification assuming each class is equally important, whereas that is not the case for detection where the background class is far more prevalent. Third, detection is a more complex task that combines elements of both classification and bounding box regression. Finally, an added challenge is that we focus on transferring knowledge within the same domain (images of the same dataset) with no additional data or labels, as opposed to other works that might rely on data from other domains (such as high-quality and low-quality image domains, or image and depth domains).

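In plain terms:

Influential work on knowledge distillation shows that a shallow or compressed model trained to mimic a deeper or more complex model can recover some or all of the lost accuracy. Those results, however, were shown only for problems such as classification, using simpler networks without strong regularization such as dropout.

Applying distillation to multi-class object detection is harder than applying it to image classification, for four reasons. First, detection labels are more expensive and therefore usually scarcer, so detection models degrade more under compression. Second, knowledge distillation was proposed for classification under the assumption that every class matters equally, which does not hold in detection, where the background class is far more prevalent. Third, detection is a more complex task that combines classification with bounding-box regression. Finally, the paper transfers knowledge within a single domain (images of the same dataset) with no additional data or labels, whereas other work may rely on data from other domains (for example, high-quality and low-quality image domains, or image and depth domains).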

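Summary:

KD was not originally developed for object detection. Detection is more complex than classification, so KD has to be adapted. No pseudo-labels (and no additional data) are used.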

To address the above challenges, we propose a method to train fast models for object detection with knowledge distillation. Our contributions are four-fold: 

• We propose an end-to-end trainable framework for learning compact multi-class object detection models through knowledge distillation (Section 3.1). To the best of our knowledge, this is the first successful demonstration of knowledge distillation for the multi-class object detection problem.

• We propose new losses that effectively address the aforementioned challenges. In particular, we propose a weighted cross entropy loss for classification that accounts for the imbalance in the impact of misclassification for background class as opposed to object classes (Section 3.2), a teacher bounded regression loss for knowledge distillation (Section 3.3) and adaptation layers for hint learning that allows the student to better learn from the distribution of neurons in intermediate layers of the teacher (Section 3.4).

• We perform comprehensive empirical evaluation using multiple large-scale public benchmarks. Our study demonstrates the positive impact of each of the above novel design choices, resulting in significant improvement in object detection accuracy using compressed fast networks, consistently across all benchmarks (Sections 4.1 – 4.3).

• We present insights into the behavior of our framework by relating it to the generalization and under-fitting problems (Section 4.4).

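In plain terms:

To address these challenges, the paper proposes a method for training fast object detection models with knowledge distillation. It claims four contributions:

• An end-to-end trainable framework for learning compact multi-class object detection models through knowledge distillation (Section 3.1). To the authors' knowledge, this is the first successful demonstration of knowledge distillation for multi-class object detection.

• New losses that target the challenges above: a weighted cross-entropy loss for classification that accounts for the unequal impact of misclassifying background versus object classes (Section 3.2), a teacher bounded regression loss for knowledge distillation (Section 3.3), and adaptation layers for hint learning that let the student learn better from the distribution of neurons in the teacher's intermediate layers (Section 3.4).

• A comprehensive empirical evaluation on multiple large-scale public benchmarks. It shows that each of these design choices helps, and that compressed fast networks gain significantly in detection accuracy, consistently across all benchmarks (Sections 4.1 – 4.3).

• Insights into the behavior of the framework, obtained by relating it to generalization and under-fitting (Section 4.4).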

Related Works

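Compared with prior work that combines KD and object detection: the most recent such work uses the detection model for a binary classification problem only, which still differs from the multi-class task addressed here.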

Method

Overall Structure

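For the backbone network, the authors distill with the hint learning from FitNets, adding adaptation layers so that the feature-map dimensions of student and teacher match.

For the classification output, they use a weighted cross-entropy loss to handle the severe class imbalance.

For the regression output, in addition to the standard smooth L1 loss, they propose a teacher bounded regression loss: the teacher's regression error serves as an upper bound, and the loss is zero whenever the student's regression result is better than the teacher's.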

 

Knowledge Distillation for Classification with Imbalanced Classes

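Misclassification involving the background class accounts for a large share of the classification loss, so the authors address the imbalance by increasing the weight of the background class in the distillation cross-entropy.

The class weight w_c is set to 1.5 for the background class and to 1 for the object classes.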
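The weighting can be illustrated with a minimal sketch, assuming the distillation term has the form -Σ_c w_c · P_t(c) · log P_s(c), where P_t and P_s are the teacher's and student's class probabilities for one region proposal. The function name, the convention that index 0 is the background class, and the example probabilities are illustrative assumptions, not the authors' code. The weights (1.5 for background, 1 for objects) follow the stated goal of up-weighting the background class.

```python
import math

def weighted_soft_cross_entropy(student_probs, teacher_probs, class_weights):
    """Class-weighted cross-entropy between teacher and student distributions.

    Computes -sum_c w_c * P_t(c) * log(P_s(c)) for a single region proposal.
    """
    eps = 1e-12  # guards against log(0)
    return -sum(
        w * p_t * math.log(p_s + eps)
        for w, p_t, p_s in zip(class_weights, teacher_probs, student_probs)
    )

# Hypothetical 3-class example; index 0 is assumed to be the background class.
weights = [1.5, 1.0, 1.0]
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

weighted = weighted_soft_cross_entropy(student, teacher, weights)
unweighted = weighted_soft_cross_entropy(student, teacher, [1.0, 1.0, 1.0])
```

With these numbers the weighted loss exceeds the unweighted one only through the background term, so a student that disagrees with the teacher on background proposals is penalized more heavily than one that disagrees on object classes.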

Knowledge Distillation for Regression with Teacher Bounds

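Distilling the regression output needs care: regression outputs are unbounded, and the teacher's prediction may point in the opposite direction from the ground truth. The authors therefore treat the teacher's regression error as an upper bound. The extra loss is applied only when the student's error exceeds that bound; otherwise it is ignored.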
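The rule above can be sketched as follows, assuming squared L2 error against the ground-truth box offsets and a total regression loss of smooth L1 plus a weighted bounded term. The `margin` parameter and the weight `nu` are assumptions introduced for illustration (the notes above mention neither); with `margin=0` the function reduces exactly to the rule as described.

```python
def squared_error(pred, target):
    """Squared L2 distance between a predicted and a target box encoding."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def smooth_l1(pred, target):
    """Standard smooth L1 loss summed over box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def teacher_bounded_loss(student_box, teacher_box, gt_box, margin=0.0):
    """Penalize the student only when it is not better than the teacher."""
    student_err = squared_error(student_box, gt_box)
    teacher_err = squared_error(teacher_box, gt_box)
    if student_err + margin > teacher_err:
        return student_err
    return 0.0  # student already beats the teacher: no extra penalty

def regression_loss(student_box, teacher_box, gt_box, nu=0.5, margin=0.0):
    """Smooth L1 on the ground truth plus the weighted bounded term.

    nu is a hypothetical weighting factor chosen for this sketch.
    """
    bounded = teacher_bounded_loss(student_box, teacher_box, gt_box, margin)
    return smooth_l1(student_box, gt_box) + nu * bounded

# Hypothetical box offsets (dx, dy, dw, dh).
gt = [0.0, 0.0, 0.0, 0.0]
teacher = [0.2, 0.0, 0.0, 0.0]
worse_student = [0.5, 0.0, 0.0, 0.0]
better_student = [0.1, 0.0, 0.0, 0.0]

penalty_worse = teacher_bounded_loss(worse_student, teacher, gt)
penalty_better = teacher_bounded_loss(better_student, teacher, gt)
```

The student that is further from the ground truth than the teacher receives a nonzero bounded term, while the student that is closer receives none and is trained by the smooth L1 term alone.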

Hint Learning with Feature Adaptation

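Adding an adaptation layer gives better results, even when the guided layer and the hint layer already have the same dimensions.

The hint-learning loss as defined in FitNets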
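A minimal sketch of hint learning with an adaptation layer, assuming the squared L2 form of the hint loss, ||V - Z||², where V is the teacher's hint feature and Z is the student's guided feature after adaptation. The adaptation layer is modeled here as a plain linear map applied at one spatial location; the function names and all numbers are hypothetical.

```python
def adapt(student_feat, weight, bias):
    """Linear adaptation layer: maps student features to the teacher's channel count."""
    return [
        sum(w * x for w, x in zip(row, student_feat)) + b
        for row, b in zip(weight, bias)
    ]

def hint_loss(teacher_feat, adapted_feat):
    """Squared L2 distance between the teacher hint and the adapted student feature."""
    return sum((v - z) ** 2 for v, z in zip(teacher_feat, adapted_feat))

# Hypothetical setup: the student layer has 2 channels, the teacher layer has 3.
student_feat = [1.0, 2.0]
teacher_feat = [1.0, 2.0, 3.0]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 x 2 adaptation matrix
bias = [0.0, 0.0, 0.5]

adapted = adapt(student_feat, weight, bias)
loss = hint_loss(teacher_feat, adapted)
```

In training, the adaptation weights would be learned jointly with the student, which is why the layer can help even when the channel counts already match: it gives the student a learned projection into the teacher's feature space instead of forcing a direct match.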

Conclusion

We propose a novel framework for learning compact and fast CNN based object detectors with the knowledge distillation. Highly complicated detector models are used as a teacher to guide the learning process of efficient student models. Combining the knowledge distillation and hint framework together with our newly proposed loss functions, we demonstrate consistent improvements over various experimental setups. Notably, the compact models trained with our learning framework execute significantly faster than the teachers with almost no accuracy compromises at PASCAL dataset. Our empirical analysis reveals the presence of under-fitting issue in object detector learning, which could provide good insights to further advancement in the field.

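In plain terms:

The paper proposes a new framework that uses knowledge distillation to learn compact, fast CNN-based object detectors. Highly complex detectors act as teachers that guide the training of efficient student models. Combining knowledge distillation and hint learning with the newly proposed loss functions yields consistent improvements across the experimental setups. Notably, the compact models trained this way run significantly faster than their teachers with almost no loss of accuracy on the PASCAL dataset. The empirical analysis also reveals under-fitting in the training of object detectors, which may offer useful insight for further progress in the field.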

