Atlas800昇腾服务器(型号:3000)—YOLO全系列NPU推理【检测】(五)

embedded/2024/10/21 11:10:51/

服务器配置如下:

CPU/NPU:鲲鹏 CPU(ARM64)+A300I pro推理卡
系统:Kylin V10 SP1【下载链接】【安装链接】
驱动与固件版本版本
Ascend-hdk-310p-npu-driver_23.0.1_linux-aarch64.run【下载链接】
Ascend-hdk-310p-npu-firmware_7.1.0.4.220.run【下载链接】
MCU版本:Ascend-hdk-310p-mcu_23.2.3【下载链接】
CANN开发套件:版本7.0.1【Toolkit下载链接】【Kernels下载链接】

测试om模型环境如下:

Python:版本3.8.11
推理工具:ais_bench
测试YOLO系列:v5/6/7/8/9/10/11

专栏其他文章
Atlas800昇腾服务器(型号:3000)—驱动与固件安装(一)
Atlas800昇腾服务器(型号:3000)—CANN安装(二)
Atlas800昇腾服务器(型号:3000)—YOLO全系列om模型转换测试(三)
Atlas800昇腾服务器(型号:3000)—AIPP加速前处理(四)
Atlas800昇腾服务器(型号:3000)—YOLO全系列NPU推理【检测】(五)
Atlas800昇腾服务器(型号:3000)—YOLO全系列NPU推理【实例分割】(六)
Atlas800昇腾服务器(型号:3000)—YOLO全系列NPU推理【关键点】(七)
Atlas800昇腾服务器(型号:3000)—YOLO全系列NPU推理【跟踪】(八)

全部代码github:https://github.com/Bigtuo/NPU-ais_bench

1 基础环境安装

详情见第(三)章环境安装:https://blog.csdn.net/weixin_45679938/article/details/142966255

2 ais_bench编译安装

注意:目前ais_bench工具只支持单个input的带有动态AIPP配置的模型,只支持静态shape、动态batch、动态宽高三种场景,不支持动态shape场景。
参考链接:https://gitee.com/ascend/tools/tree/master/ais-bench_workload/tool/ais_bench

2.1 安装aclruntime包

在安装环境执行如下命令安装aclruntime包:
说明:若为覆盖安装,请增加“–force-reinstall”参数强制安装.

pip3 install -v 'git+https://gitee.com/ascend/tools.git#egg=aclruntime&subdirectory=ais-bench_workload/tool/ais_bench/backend' -i https://pypi.tuna.tsinghua.edu.cn/simple

在这里插入图片描述

2.2 安装ais_bench推理程序包

在安装环境执行如下命令安装ais_bench推理程序包:

 pip3 install -v 'git+https://gitee.com/ascend/tools.git#egg=ais_bench&subdirectory=ais-bench_workload/tool/ais_bench' -i https://pypi.tuna.tsinghua.edu.cn/simple

在这里插入图片描述
卸载和更新【忽略】:

# 卸载aclruntime
pip3 uninstall aclruntime
# 卸载ais_bench推理程序
pip3 uninstall ais_bench

3 裸代码推理测试

# 1.进入运行环境yolo【普通用户】
conda activate yolo
# 2.激活atc【atc --help测试是否可行】
source ~/bashrc

注意:ais_bench调用和使用方式与onnx-runtime几乎一致,因此可参考进行撰写脚本!

代码逻辑如下
下面代码整个处理过程主要包括:预处理—>推理—>后处理—>画图。
假设图像resize为640×640,
前处理输出结果维度:(1, 3, 640, 640);
YOLOv5/6/7推理输出结果维度:(1, 8400×3, 85),其中85表示4个box坐标信息+置信度分数+80个类别概率,8400×3表示(80×80+40×40+20×20)×3,不同于v8与v9采用类别里面最大的概率作为置信度score;
YOLOv8/9/11推理输出结果维度:(1, 84, 8400),其中84表示4个box坐标信息+80个类别概率,8400表示80×80+40×40+20×20;
YOLOv10推理输出结果维度:(1, 300, 6),其中300是默认输出数量,无nms操作,阈值过滤即可,6是4个box坐标信息+置信度分数+类别。
后处理输出结果维度:(5, 6),其中第一个5表示图bus.jpg检出5个目标,第二个维度6表示(x1, y1, x2, y2, conf, cls)。

完整代码如下
新建YOLO_ais_bench_det_aipp.py,内容如下:

import argparse
import time 
import cv2
import numpy as np
import osfrom ais_bench.infer.interface import InferSessionclass YOLO:"""YOLO object detection model class for handling inference"""def __init__(self, om_model, imgsz=(640, 640), device_id=0, model_ndtype=np.single, mode="static", postprocess_type="v8", aipp=False):"""Initialization.Args:om_model (str): Path to the om model."""# 构建ais_bench推理引擎self.session = InferSession(device_id=device_id, model_path=om_model)# Numpy dtype: support both FP32(np.single) and FP16(np.half) om modelself.ndtype = model_ndtypeself.mode = modeself.postprocess_type = postprocess_typeself.aipp = aipp  self.model_height, self.model_width = imgsz[0], imgsz[1]  # 图像resize大小def __call__(self, im0, conf_threshold=0.4, iou_threshold=0.45):"""The whole pipeline: pre-process -> inference -> post-process.Args:im0 (Numpy.ndarray): original input image.conf_threshold (float): confidence threshold for filtering predictions.iou_threshold (float): iou threshold for NMS.Returns:boxes (List): list of bounding boxes."""# 前处理Pre-processt1 = time.time()im, ratio, (pad_w, pad_h) = self.preprocess(im0)pre_time = round(time.time() - t1, 3)# 推理 inferencet2 = time.time()preds = self.session.infer([im], mode=self.mode)[0]  # mode有动态"dymshape"和静态"static"等det_time = round(time.time() - t2, 3)# 后处理Post-processt3 = time.time()if self.postprocess_type == "v5":boxes = self.postprocess_v5(preds,im0=im0,ratio=ratio,pad_w=pad_w,pad_h=pad_h,conf_threshold=conf_threshold,iou_threshold=iou_threshold,)elif self.postprocess_type == "v8":boxes = self.postprocess_v8(preds,im0=im0,ratio=ratio,pad_w=pad_w,pad_h=pad_h,conf_threshold=conf_threshold,iou_threshold=iou_threshold,)elif self.postprocess_type == "v10":boxes = self.postprocess_v10(preds,im0=im0,ratio=ratio,pad_w=pad_w,pad_h=pad_h,conf_threshold=conf_threshold)else:boxes = []post_time = round(time.time() - t3, 3)return boxes, (pre_time, det_time, post_time)# 前处理,包括:resize, pad, 其中HWC to CHW,BGR to RGB,归一化,增加维度CHW -> BCHW可选择是否开启AIPP加速处理def preprocess(self, img):"""Pre-processes the input image.Args:img (Numpy.ndarray): image about to be processed.Returns:img_process (Numpy.ndarray): image preprocessed for inference.ratio (tuple): width, height ratios in letterbox.pad_w (float): width padding in letterbox.pad_h (float): height padding in letterbox."""# Resize and pad input image using letterbox() (Borrowed from Ultralytics)shape = img.shape[:2]  # original image shapenew_shape = (self.model_height, self.model_width)r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])ratio = r, rnew_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))pad_w, pad_h = (new_shape[1] - new_unpad[0]) / 2, (new_shape[0] - new_unpad[1]) / 2  # wh paddingif shape[::-1] != new_unpad:  # resizeimg = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)top, bottom = int(round(pad_h - 0.1)), int(round(pad_h + 0.1))left, right = int(round(pad_w - 0.1)), int(round(pad_w + 0.1))img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))  # 填充# 是否开启aipp加速预处理,需atc中完成if self.aipp:return img, ratio, (pad_w, pad_h)# Transforms: HWC to CHW -> BGR to RGB -> div(255) -> contiguous -> add axis(optional)img = np.ascontiguousarray(np.einsum('HWC->CHW', img)[::-1], dtype=self.ndtype) / 255.0img_process = img[None] if len(img.shape) == 3 else imgreturn img_process, ratio, (pad_w, pad_h)# YOLOv5/6/7通用后处理,包括:阈值过滤与NMSdef postprocess_v5(self, preds, im0, ratio, pad_w, pad_h, conf_threshold, iou_threshold):"""Post-process the prediction.Args:preds (Numpy.ndarray): predictions come from ort.session.run().im0 (Numpy.ndarray): [h, w, c] original input image.ratio (tuple): width, height ratios in letterbox.pad_w (float): width padding in letterbox.pad_h (float): height padding in letterbox.conf_threshold (float): conf threshold.iou_threshold (float): iou threshold.Returns:boxes (List): list of bounding boxes."""# (Batch_size, Num_anchors, xywh_score_conf_cls), v5和v6的[..., 4]是置信度分数,v8v9采用类别里面最大的概率作为置信度scorex = preds  # outputs: predictions (1, 8400*3, 85)# Predictions filtering by conf-thresholdx = x[x[..., 4] > conf_threshold]# Create a new matrix which merge these(box, score, cls) into one# For more details about `numpy.c_()`: https://numpy.org/doc/1.26/reference/generated/numpy.c_.htmlx = np.c_[x[..., :4], x[..., 4], np.argmax(x[..., 5:], axis=-1)]# NMS filtering# 经过NMS后的值, np.array([[x, y, w, h, conf, cls], ...]), shape=(-1, 4 + 1 + 1)x = x[cv2.dnn.NMSBoxes(x[:, :4], x[:, 4], conf_threshold, iou_threshold)]# 重新缩放边界框,为画图做准备if len(x) > 0:# Bounding boxes format change: cxcywh -> xyxyx[..., [0, 1]] -= x[..., [2, 3]] / 2x[..., [2, 3]] += x[..., [0, 1]]# Rescales bounding boxes from model shape(model_height, model_width) to the shape of original imagex[..., :4] -= [pad_w, pad_h, pad_w, pad_h]x[..., :4] /= min(ratio)# Bounding boxes boundary clampx[..., [0, 2]] = x[:, [0, 2]].clip(0, im0.shape[1])x[..., [1, 3]] = x[:, [1, 3]].clip(0, im0.shape[0])return x[..., :6]  # boxeselse:return []# YOLOv8/9/11通用后处理,包括:阈值过滤与NMSdef postprocess_v8(self, preds, im0, ratio, pad_w, pad_h, conf_threshold, iou_threshold):"""Post-process the prediction.Args:preds (Numpy.ndarray): predictions come from ort.session.run().im0 (Numpy.ndarray): [h, w, c] original input image.ratio (tuple): width, height ratios in letterbox.pad_w (float): width padding in letterbox.pad_h (float): height padding in letterbox.conf_threshold (float): conf threshold.iou_threshold (float): iou threshold.Returns:boxes (List): list of bounding boxes."""x = preds  # outputs: predictions (1, 84, 8400)# Transpose the first output: (Batch_size, xywh_conf_cls, Num_anchors) -> (Batch_size, Num_anchors, xywh_conf_cls)x = np.einsum('bcn->bnc', x)  # (1, 8400, 84)# Predictions filtering by conf-thresholdx = x[np.amax(x[..., 4:], axis=-1) > conf_threshold]# Create a new matrix which merge these(box, score, cls) into one# For more details about `numpy.c_()`: https://numpy.org/doc/1.26/reference/generated/numpy.c_.htmlx = np.c_[x[..., :4], np.amax(x[..., 4:], axis=-1), np.argmax(x[..., 4:], axis=-1)]# NMS filtering# 经过NMS后的值, np.array([[x, y, w, h, conf, cls], ...]), shape=(-1, 4 + 1 + 1)x = x[cv2.dnn.NMSBoxes(x[:, :4], x[:, 4], conf_threshold, iou_threshold)]# 重新缩放边界框,为画图做准备if len(x) > 0:# Bounding boxes format change: cxcywh -> xyxyx[..., [0, 1]] -= x[..., [2, 3]] / 2x[..., [2, 3]] += x[..., [0, 1]]# Rescales bounding boxes from model shape(model_height, model_width) to the shape of original imagex[..., :4] -= [pad_w, pad_h, pad_w, pad_h]x[..., :4] /= min(ratio)# Bounding boxes boundary clampx[..., [0, 2]] = x[:, [0, 2]].clip(0, im0.shape[1])x[..., [1, 3]] = x[:, [1, 3]].clip(0, im0.shape[0])return x[..., :6]  # boxeselse:return []# YOLOv10后处理,包括:阈值过滤-无NMSdef postprocess_v10(self, preds, im0, ratio, pad_w, pad_h, conf_threshold):x = preds  # outputs: predictions (1, 300, 6) -> (xyxy_conf_cls)# Predictions filtering by conf-thresholdx = x[x[..., 4] > conf_threshold]# 重新缩放边界框,为画图做准备if len(x) > 0:# Rescales bounding boxes from model shape(model_height, model_width) to the shape of original imagex[..., :4] -= [pad_w, pad_h, pad_w, pad_h]x[..., :4] /= min(ratio)# Bounding boxes boundary clampx[..., [0, 2]] = x[:, [0, 2]].clip(0, im0.shape[1])x[..., [1, 3]] = x[:, [1, 3]].clip(0, im0.shape[0])return x  # boxeselse:return []if __name__ == '__main__':# Create an argument parser to handle command-line argumentsparser = argparse.ArgumentParser()parser.add_argument('--det_model', type=str, default=r"yolov8s.om", help='Path to OM model')parser.add_argument('--source', type=str, default=r'images', help='Path to input image')parser.add_argument('--out_path', type=str, default=r'results', help='结果保存文件夹')parser.add_argument('--imgsz_det', type=tuple, default=(640, 640), help='Image input size')parser.add_argument('--classes', type=list, default=['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light','fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow','elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee','skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard','tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich','orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed','dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven','toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'], help='类别')parser.add_argument('--conf', type=float, default=0.25, help='Confidence threshold')parser.add_argument('--iou', type=float, default=0.6, help='NMS IoU threshold')parser.add_argument('--device_id', type=int, default=0, help='device id')parser.add_argument('--mode', default='static', help='om是动态dymshape或静态static')parser.add_argument('--model_ndtype', default=np.single, help='om是fp32或fp16')parser.add_argument('--postprocess_type', type=str, default='v8', help='后处理方式, 对应v5/v8/v10三种后处理')parser.add_argument('--aipp', default=False, action='store_true', help='是否开启aipp加速YOLO预处理, 需atc中完成om集成')args = parser.parse_args()# 创建结果保存文件夹if not os.path.exists(args.out_path):os.mkdir(args.out_path)print('开始运行:')# Build modeldet_model = YOLO(args.det_model, args.imgsz_det, args.device_id, args.model_ndtype, args.mode, args.postprocess_type, args.aipp)color_palette = np.random.uniform(0, 255, size=(len(args.classes), 3))  # 为每个类别生成调色板for i, img_name in enumerate(os.listdir(args.source)):try:t1 = time.time()# Read image by OpenCVimg = cv2.imread(os.path.join(args.source, img_name))# 检测Inferenceboxes, (pre_time, det_time, post_time) = det_model(img, conf_threshold=args.conf, iou_threshold=args.iou)print('{}/{} ==>总耗时间: {:.3f}s, 其中, 预处理: {:.3f}s, 推理: {:.3f}s, 后处理: {:.3f}s, 识别{}个目标'.format(i+1, len(os.listdir(args.source)), time.time() - t1, pre_time, det_time, post_time, len(boxes)))# Draw rectanglesfor (*box, conf, cls_) in boxes:cv2.rectangle(img, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])),color_palette[int(cls_)], 2, cv2.LINE_AA)cv2.putText(img, f'{args.classes[int(cls_)]}: {conf:.3f}', (int(box[0]), int(box[1] - 9)),cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2, cv2.LINE_AA)cv2.imwrite(os.path.join(args.out_path, img_name), img)except Exception as e:print(e)     

检测结果可视化如下,效果与GPU上推理几乎一致:
在这里插入图片描述

4 推理耗时

YOLO各系列推理耗时(640*640)如下:
YOLOv5s:8-9ms
YOLOv7-tiny:7-8ms
YOLOv7:14ms
YOLOv8s:6ms
YOLOv9s:12ms
YOLOv10s:6ms
YOLOv11s:8ms
预处理耗时(bus.jpg):12ms
后处理耗时:除YOLOv10几乎无耗时外,其余1-2ms。
注意,上述耗时未使用AIPP进行前处理加速,如YOLOv8s加速后前处理+推理大约6-7ms。


http://www.ppmy.cn/embedded/129250.html

相关文章

基于Python的自然语言处理系列(38):从现有数据训练新的 Tokenizer

在自然语言处理任务中,虽然 Huggingface 提供了大量预训练的 Tokenizer,但有时我们可能需要针对特定语言(例如中文或编程语言)定制一个新的 Tokenizer。本文将展示如何使用 Huggingface 的 AutoTokenizer.train_new_from_iterator…

WPF -- 实现打印实时数据功能

一、实现打印过程 在WPF中,我读取了CSV文件中的内容(主要是表格),通过条件筛选内容之后将其赋值给FlowDocument,再将FlowDocument的内容赋值给RichTextBox 在前端显示,满足了我打印RichTextBox 的内容这个…

asp.net mvc return json()设置maxJsonLength

asp.net mvc异常信息 Error during serialization or deserialization using the JSON JavaScriptSerializer. The length of the string exceeds the value set on the maxJsonLength property. 在ASP.NET MVC中,当你遇到使用JavaScriptSerializer进行JSON序列化…

006_django基于Python的二手房源信息爬取与分析2024_l77153d4

目录 系统展示 开发背景 代码实现 项目案例 获取源码 博主介绍:CodeMentor毕业设计领航者、全网关注者30W群落,InfoQ特邀专栏作家、技术博客领航者、InfoQ新星培育计划导师、Web开发领域杰出贡献者,博客领航之星、开发者头条/腾讯云/AW…

域7:安全运营 第17章 事件的预防和响应

第七域包括 16、17、18、19 章。 事件的预防和响应是安全运营管理的核心环节,对于组织有效识别、评估、控制和减轻网络安全威胁至关重要。这一过程是循环往复的,要求组织不断总结经验,优化策略,提升整体防护能力。通过持续的监测、…

PHP仿爱站网站ICP备案查询源码修复版

源码简介 PHP仿爱站网站ICP备案查询源码,自己小改了一下,内置本地加载(非接口),短时间请求次数多会被限制,无后台。 安装环境 测试搭建环境:php7.0 搭建教程 上传源码压缩包到网站目录并解压即可 首页截图 源码下…

小米电机与STM32——CAN通信

背景介绍:为了利用小米电机,搭建机械臂的关节,需要学习小米电机的使用方法。计划采用STM32驱动小米电机,实现指定运动,为此需要了解他们之间的通信方式,指令写入方法等。花了很多时间学习,但网络…

防范数据泄露,守护安全存储新时代

在这个信息爆炸的时代,数据安全已经成为我们生活中不可或缺的一部分。无论是个人隐私、企业机密还是国家机密,数据的安全存储和传输都是我们最关注的问题。数据泄露、黑客攻击、隐私侵犯...这些词汇听起来就让人不寒而栗。想象一下,这时候一个…