Python Object Detection: Non-Maximum Suppression (NMS) and Soft-NMS

2025-06-23 05:24:03

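For most datasets Soft-NMS makes little difference, and the improvement is barely noticeable. It helps in scenes with many densely overlapping objects of the same class. For dense overlap between objects of different classes it does little, because suppression is applied per class anyway. You can use Soft-NMS to deepen your understanding of non-maximum suppression, but there is no strong need to implement it, and I do not recommend fixating on Soft-NMS as a way to improve network performance.

The code in this post has been rewritten. The code implemented in the video is in NumPy and relies on fairly old library versions. Here it has been changed to PyTorch and adapted to current libraries.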

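Preface

Non-maximum suppression is a very, very important part of object detection. It is worth understanding the principle and working through the code by hand!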

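What Is Non-Maximum Suppression (NMS)?

The idea behind non-maximum suppression is easiest to see by comparing two figures:

[Figure: detection results after non-maximum suppression.]

[Figure: detection results before non-maximum suppression.]

It is easy to see that the image without non-maximum suppression has many duplicate boxes, all pointing to the same object!

The function of non-maximum suppression can be summed up in one sentence:

Within a given region, keep only the highest-scoring box of each class.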

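1. How NMS Is Implemented

This post implements multi-class non-maximum suppression, as used in my pytorch-yolov3 example.
The input has shape [batch_size, all_anchors, 5 + num_classes].

The first dimension is the number of images.
The second dimension is all of the predicted boxes.
The third dimension is the prediction result for each predicted box.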
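To make the layout of the last dimension concrete, here is a minimal dependency-free sketch. It assumes the YOLO-style ordering (center x, center y, width, height, objectness, then one score per class) that the implementation further down relies on. The helper name `decode_row` and the sample numbers are made up for illustration and are not part of the original project.

```python
def decode_row(row, num_classes):
    """Split one prediction row of length 5 + num_classes (hypothetical helper)."""
    cx, cy, w, h, obj_conf = row[:5]
    class_scores = row[5:5 + num_classes]
    # Index of the best class and its confidence.
    class_pred = max(range(num_classes), key=lambda k: class_scores[k])
    class_conf = class_scores[class_pred]
    # Convert center/size to top-left and bottom-right corners.
    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return box, obj_conf, class_conf, class_pred


# One made-up row for a 3-class detector: 5 + 3 = 8 values.
row = [100.0, 80.0, 40.0, 20.0, 0.9, 0.1, 0.7, 0.2]
box, obj_conf, class_conf, class_pred = decode_row(row, num_classes=3)
print(box, obj_conf, class_conf, class_pred)
```

After decoding, each box is described by 4 + 1 + 2 values (corners, objectness, class confidence and class index), which is the layout the NMS loop works on.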

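Non-maximum suppression proceeds as follows:
1. Loop over all images.
2. Find the boxes in the image whose score exceeds the threshold. Filtering by score before filtering overlapping boxes greatly reduces the number of boxes.
3. Determine the class and score of the boxes obtained in step 2, then take the box positions from the predictions and stack them together with the class and score. The last dimension changes from 5+num_classes to 4+1+2: four values for the box position, one value for whether the box contains an object, and two values for the class confidence and the class.
4. Loop over the classes. The purpose of non-maximum suppression is to keep the highest-scoring box of each class within a region, so looping over the classes lets us apply non-maximum suppression to each class separately.
5. Sort the boxes of that class by score, from highest to lowest.
6. Repeatedly take the highest-scoring box, compute its overlap with all other predicted boxes, and remove those that overlap too much.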
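The steps above can be sketched for a single image in plain Python. This is a simplified illustration, not the PyTorch implementation from this post: it assumes each detection is already a tuple `(x1, y1, x2, y2, score, cls)` with one combined score, and the names `iou` and `simple_nms` are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / max(union, 1e-6)


def simple_nms(dets, conf_thres=0.5, nms_thres=0.4):
    """Greedy per-class NMS over (x1, y1, x2, y2, score, cls) tuples."""
    # Step 2: drop low-scoring boxes first.
    dets = [d for d in dets if d[4] >= conf_thres]
    kept = []
    # Step 4: handle each class separately.
    for c in sorted({d[5] for d in dets}):
        # Step 5: sort this class by score, highest first.
        cand = sorted((d for d in dets if d[5] == c), key=lambda d: d[4], reverse=True)
        # Step 6: keep the best box, remove boxes that overlap it too much.
        while cand:
            best = cand.pop(0)
            kept.append(best)
            cand = [d for d in cand if iou(best[:4], d[:4]) < nms_thres]
    return kept


dets = [
    (10, 10, 50, 50, 0.9, 0),
    (12, 12, 52, 52, 0.8, 0),      # near-duplicate of the first box
    (100, 100, 140, 140, 0.7, 0),  # same class, no overlap
    (11, 11, 51, 51, 0.6, 1),      # overlaps, but a different class
    (200, 200, 220, 220, 0.3, 1),  # below the score threshold
]
kept = simple_nms(dets)
print(kept)
```

Only the near-duplicate and the low-scoring box are dropped. The box of the other class survives even though it overlaps the first one, because suppression runs per class.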


The implementation is as follows:

import torch
# The commented-out fast path below would additionally need:
# from torchvision.ops import nms

# Both functions are written as methods (note `self`); they belong inside a class
# that also provides yolo_correct_boxes.

def bbox_iou(self, box1, box2, x1y1x2y2=True):
    """
        Compute the IoU
    """
    if not x1y1x2y2:
        # Convert (center x, center y, w, h) to corner coordinates
        b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2
        b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2
        b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2
        b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2
    else:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]

    # Corners of the intersection rectangle
    inter_rect_x1 = torch.max(b1_x1, b2_x1)
    inter_rect_y1 = torch.max(b1_y1, b2_y1)
    inter_rect_x2 = torch.min(b1_x2, b2_x2)
    inter_rect_y2 = torch.min(b1_y2, b2_y2)

    inter_area = torch.clamp(inter_rect_x2 - inter_rect_x1, min=0) * \
                torch.clamp(inter_rect_y2 - inter_rect_y1, min=0)

    b1_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1)
    b2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1)

    # Clamp the union to avoid division by zero
    iou = inter_area / torch.clamp(b1_area + b2_area - inter_area, min = 1e-6)

    return iou

def non_max_suppression(self, prediction, num_classes, input_shape, image_shape, letterbox_image, conf_thres=0.5, nms_thres=0.4):
    #----------------------------------------------------------#
    #   Convert the predictions to top-left / bottom-right
    #   corner format.
    #   prediction  [batch_size, num_anchors, 85]
    #----------------------------------------------------------#
    box_corner          = prediction.new(prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]

    output = [None for _ in range(len(prediction))]
    for i, image_pred in enumerate(prediction):
        #----------------------------------------------------------#
        #   Take the max over the class predictions.
        #   class_conf  [num_anchors, 1]    class confidence
        #   class_pred  [num_anchors, 1]    class index
        #----------------------------------------------------------#
        class_conf, class_pred = torch.max(image_pred[:, 5:5 + num_classes], 1, keepdim=True)

        #----------------------------------------------------------#
        #   First round of filtering, using the confidence
        #----------------------------------------------------------#
        conf_mask = (image_pred[:, 4] * class_conf[:, 0] >= conf_thres).squeeze()

        #----------------------------------------------------------#
        #   Filter the predictions by confidence
        #----------------------------------------------------------#
        image_pred = image_pred[conf_mask]
        class_conf = class_conf[conf_mask]
        class_pred = class_pred[conf_mask]
        if not image_pred.size(0):
            continue
        #-------------------------------------------------------------------------#
        #   detections  [num_anchors, 7]
        #   The 7 values are: x1, y1, x2, y2, obj_conf, class_conf, class_pred
        #-------------------------------------------------------------------------#
        detections = torch.cat((image_pred[:, :5], class_conf.float(), class_pred.float()), 1)

        #------------------------------------------#
        #   Get all classes present in the predictions
        #------------------------------------------#
        unique_labels = detections[:, -1].cpu().unique()

        if prediction.is_cuda:
            unique_labels = unique_labels.cuda()
            detections = detections.cuda()

        for c in unique_labels:
            #------------------------------------------#
            #   Get all predictions of one class that
            #   passed the score filter
            #------------------------------------------#
            detections_class = detections[detections[:, -1] == c]

            # #------------------------------------------#
            # #   The official built-in NMS is faster!
            # #------------------------------------------#
            # keep = nms(
            #     detections_class[:, :4],
            #     detections_class[:, 4] * detections_class[:, 5],
            #     nms_thres
            # )
            # max_detections = detections_class[keep]

            # Sort by score (objectness confidence * class confidence)
            _, conf_sort_index = torch.sort(detections_class[:, 4]*detections_class[:, 5], descending=True)
            detections_class = detections_class[conf_sort_index]
            # Run non-maximum suppression
            max_detections = []
            while detections_class.size(0):
                # Take the highest-confidence box of this class, then check the rest one by one:
                # any box whose overlap with it is not below nms_thres is removed
                max_detections.append(detections_class[0].unsqueeze(0))
                if len(detections_class) == 1:
                    break
                ious = self.bbox_iou(max_detections[-1], detections_class[1:])
                detections_class = detections_class[1:][ious < nms_thres]
            # Stack the kept boxes
            max_detections = torch.cat(max_detections).data

            # Add max detections to outputs
            output[i] = max_detections if output[i] is None else torch.cat((output[i], max_detections))

        if output[i] is not None:
            output[i]           = output[i].cpu().numpy()
            # Convert corners back to center/size, then map the boxes to the original image
            box_xy, box_wh      = (output[i][:, 0:2] + output[i][:, 2:4])/2, output[i][:, 2:4] - output[i][:, 0:2]
            output[i][:, :4]    = self.yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape, letterbox_image)
    return output

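2. How Soft-NMS Is Implemented

Soft-NMS differs little from ordinary non-maximum suppression: only a few lines of code change.

Soft-NMS holds that boxes should not be filtered on overlap alone. As the figure shows, the image clearly contains two horses, but their boxes overlap heavily. With ordinary NMS, the lower-scoring horse in the back is removed outright.

Soft-NMS holds that both the score and the overlap should be taken into account during non-maximum suppression.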
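As a small numeric illustration of combining score and overlap, the sketch below applies the Gaussian weighting that the Soft-NMS code in this section uses, `exp(-(iou * iou) / sigma)`, to a single score. The function name `gaussian_decay` and the sample score of 0.8 are assumptions made for this example.

```python
import math


def gaussian_decay(score, iou, sigma=0.5):
    """Decay a score by a Gaussian of its IoU with the current best box."""
    return score * math.exp(-(iou * iou) / sigma)


# The higher the overlap with the best box, the stronger the decay.
for overlap in (0.0, 0.3, 0.6, 0.9):
    print(overlap, round(gaussian_decay(0.8, overlap), 3))
```

A box that does not overlap the best box keeps its score unchanged, while a heavily overlapping box is pushed down smoothly instead of being removed at a hard IoU cutoff.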

Let's look directly at how the NMS and Soft-NMS code differ:
NMS:

while detections_class.size(0):
    # Take the highest-confidence box of this class, then check the rest one by one:
    # any box whose overlap with it is not below nms_thres is removed
    max_detections.append(detections_class[0].unsqueeze(0))
    if len(detections_class) == 1:
        break
    ious = self.bbox_iou(max_detections[-1], detections_class[1:])
    detections_class = detections_class[1:][ious < nms_thres]

Soft-NMS:

while detections_class.size(0):
    # Take the highest-confidence box of this class, then decay the confidence of the
    # remaining boxes by a Gaussian of their IoU with it instead of removing them outright
    max_detections.append(detections_class[0].unsqueeze(0))
    if len(detections_class) == 1:
        break
    ious                    = self.bbox_iou(max_detections[-1], detections_class[1:])
    detections_class[1:, 4] = torch.exp(-(ious * ious) / sigma) * detections_class[1:, 4]
    detections_class        = detections_class[1:]
    # Drop boxes whose decayed confidence fell below conf_thres, then re-sort
    detections_class        = detections_class[detections_class[:, 4] >= conf_thres]
    arg_sort                = torch.argsort(detections_class[:, 4], descending = True)
    detections_class        = detections_class[arg_sort]

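As we can see, NMS directly removes the other predictions that overlap heavily with the highest-scoring box. Soft-NMS instead applies a weight: it passes the IoU through a Gaussian, exp(-IoU² / sigma), and multiplies the result by the original score. Boxes whose decayed score falls below conf_thres are dropped, the rest are re-sorted, and the loop continues.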
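The difference can be reproduced on a toy version of the two-horse case with a dependency-free sketch. It is a simplification rather than the PyTorch code of this post: each box carries a single score, all boxes belong to one class, and the names `hard_nms`, `soft_nms` and the box coordinates are made up for illustration.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / max(union, 1e-6)


def hard_nms(dets, nms_thres=0.4):
    """Ordinary NMS: remove every box that overlaps the best box too much."""
    cand = sorted((list(d) for d in dets), key=lambda d: d[4], reverse=True)
    kept = []
    while cand:
        best = cand.pop(0)
        kept.append(best)
        cand = [d for d in cand if iou(best[:4], d[:4]) < nms_thres]
    return kept


def soft_nms(dets, conf_thres=0.5, sigma=0.5):
    """Soft-NMS: decay overlapping scores, drop only what falls below conf_thres."""
    cand = sorted((list(d) for d in dets), key=lambda d: d[4], reverse=True)
    kept = []
    while cand:
        best = cand.pop(0)
        kept.append(best)
        for d in cand:
            overlap = iou(best[:4], d[:4])
            d[4] *= math.exp(-(overlap * overlap) / sigma)
        # Re-filter and re-sort, because the scores have changed.
        cand = sorted((d for d in cand if d[4] >= conf_thres),
                      key=lambda d: d[4], reverse=True)
    return kept


# Two overlapping boxes of the same class: [x1, y1, x2, y2, score].
horses = [
    [10, 10, 110, 110, 0.95],  # horse in front
    [50, 10, 150, 110, 0.90],  # horse behind, IoU with the first is about 0.43
]
print(len(hard_nms(horses)), len(soft_nms(horses)))
```

With these numbers ordinary NMS keeps only the front horse, while Soft-NMS keeps both and merely lowers the score of the second box.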

The implementation is as follows:

import torch
# The commented-out fast path below would additionally need:
# from torchvision.ops import nms

# Both functions are written as methods (note `self`); they belong inside a class
# that also provides yolo_correct_boxes.

def bbox_iou(self, box1, box2, x1y1x2y2=True):
    """
        Compute the IoU
    """
    if not x1y1x2y2:
        # Convert (center x, center y, w, h) to corner coordinates
        b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2
        b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2
        b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2
        b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2
    else:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]

    # Corners of the intersection rectangle
    inter_rect_x1 = torch.max(b1_x1, b2_x1)
    inter_rect_y1 = torch.max(b1_y1, b2_y1)
    inter_rect_x2 = torch.min(b1_x2, b2_x2)
    inter_rect_y2 = torch.min(b1_y2, b2_y2)

    inter_area = torch.clamp(inter_rect_x2 - inter_rect_x1, min=0) * \
                torch.clamp(inter_rect_y2 - inter_rect_y1, min=0)

    b1_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1)
    b2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1)

    # Clamp the union to avoid division by zero
    iou = inter_area / torch.clamp(b1_area + b2_area - inter_area, min = 1e-6)

    return iou

# Note: nms_thres is only used by the commented-out official NMS call;
# the Soft-NMS loop itself relies on sigma and conf_thres.
def non_max_suppression(self, prediction, num_classes, input_shape, image_shape, letterbox_image, conf_thres=0.5, nms_thres=0.4, sigma=0.5):
    #----------------------------------------------------------#
    #   Convert the predictions to top-left / bottom-right
    #   corner format.
    #   prediction  [batch_size, num_anchors, 85]
    #----------------------------------------------------------#
    box_corner          = prediction.new(prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]

    output = [None for _ in range(len(prediction))]
    for i, image_pred in enumerate(prediction):
        #----------------------------------------------------------#
        #   Take the max over the class predictions.
        #   class_conf  [num_anchors, 1]    class confidence
        #   class_pred  [num_anchors, 1]    class index
        #----------------------------------------------------------#
        class_conf, class_pred = torch.max(image_pred[:, 5:5 + num_classes], 1, keepdim=True)

        #----------------------------------------------------------#
        #   First round of filtering, using the confidence
        #----------------------------------------------------------#
        conf_mask = (image_pred[:, 4] * class_conf[:, 0] >= conf_thres).squeeze()

        #----------------------------------------------------------#
        #   Filter the predictions by confidence
        #----------------------------------------------------------#
        image_pred = image_pred[conf_mask]
        class_conf = class_conf[conf_mask]
        class_pred = class_pred[conf_mask]
        if not image_pred.size(0):
            continue
        #-------------------------------------------------------------------------#
        #   detections  [num_anchors, 7]
        #   The 7 values are: x1, y1, x2, y2, obj_conf, class_conf, class_pred
        #-------------------------------------------------------------------------#
        detections = torch.cat((image_pred[:, :5], class_conf.float(), class_pred.float()), 1)

        #------------------------------------------#
        #   Get all classes present in the predictions
        #------------------------------------------#
        unique_labels = detections[:, -1].cpu().unique()

        if prediction.is_cuda:
            unique_labels = unique_labels.cuda()
            detections = detections.cuda()

        for c in unique_labels:
            #------------------------------------------#
            #   Get all predictions of one class that
            #   passed the score filter
            #------------------------------------------#
            detections_class = detections[detections[:, -1] == c]

            # #------------------------------------------#
            # #   The official built-in NMS is faster!
            # #------------------------------------------#
            # keep = nms(
            #     detections_class[:, :4],
            #     detections_class[:, 4] * detections_class[:, 5],
            #     nms_thres
            # )
            # max_detections = detections_class[keep]

            # Sort by score (objectness confidence * class confidence)
            _, conf_sort_index = torch.sort(detections_class[:, 4]*detections_class[:, 5], descending=True)
            detections_class = detections_class[conf_sort_index]
            # Run Soft-NMS
            max_detections = []
            while detections_class.size(0):
                # Take the highest-confidence box of this class, then decay the confidence of the
                # remaining boxes by a Gaussian of their IoU with it instead of removing them outright
                max_detections.append(detections_class[0].unsqueeze(0))
                if len(detections_class) == 1:
                    break
                ious                    = self.bbox_iou(max_detections[-1], detections_class[1:])
                detections_class[1:, 4] = torch.exp(-(ious * ious) / sigma) * detections_class[1:, 4]
                detections_class        = detections_class[1:]
                # Drop boxes whose decayed confidence fell below conf_thres, then re-sort
                detections_class        = detections_class[detections_class[:, 4] >= conf_thres]
                arg_sort                = torch.argsort(detections_class[:, 4], descending = True)
                detections_class        = detections_class[arg_sort]
            # Stack the kept boxes
            max_detections = torch.cat(max_detections).data

            # Add max detections to outputs
            output[i] = max_detections if output[i] is None else torch.cat((output[i], max_detections))

        if output[i] is not None:
            output[i]           = output[i].cpu().numpy()
            # Convert corners back to center/size, then map the boxes to the original image
            box_xy, box_wh      = (output[i][:, 0:2] + output[i][:, 2:4])/2, output[i][:, 2:4] - output[i][:, 0:2]
            output[i][:, :4]    = self.yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape, letterbox_image)
    return output
