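Training YOLOv5 and Deploying It to the Qingyun 1000 (青云1000)
⚠️ This is only a preliminary, demonstration-level document. A more detailed procedure will be published after testing.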
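Prepare the Dataset (PC)
- Prepare the following on a personal computer (PC):
  - Images to be annotated, stored under a path that contains only English (ASCII) characters
  - AI-assisted annotation tool: X-AnyLabeling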
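The all-English path requirement can be checked before annotation starts. The following is a minimal sketch, not part of the original workflow: the helper name `find_non_ascii_paths` and the list of image suffixes are assumptions made for illustration.

```python
from pathlib import Path


def find_non_ascii_paths(root):
    """Return image files under `root` whose absolute path contains non-ASCII characters."""
    image_suffixes = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image types
    offending = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in image_suffixes:
            # Check the whole absolute path, since parent folders count too.
            if not str(path.resolve()).isascii():
                offending.append(path)
    return offending


if __name__ == "__main__":
    for bad_path in find_non_ascii_paths("."):
        print("non-ASCII path:", bad_path)
```

An empty result means every image found under the folder sits on an ASCII-only path.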
Ascend Model Adaptation Tool (PC)
- Install on the PC
- Installation guide
- Train the object detection model
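Install the CANN Environment (Qingyun)
About CANN
- CANN (Compute Architecture for Neural Networks) is Huawei's heterogeneous computing architecture for AI workloads.
- A program calls the interfaces that CANN provides (or wrappers around them) to run its computation on Ascend NPUs. CANN plays a role similar to NVIDIA CUDA.
- This document uses CANN 6.0.0.alpha006 as the example; see the CANN development documentation.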
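Installing CANN 6.0.0.alpha006
- Install on the Qingyun 1000
- Requires Python 3.7.x (3.7.0-3.7.11), Python 3.8.x (3.8.0-3.8.11), or Python 3.9.x (3.9.0-3.9.7)
  - To install a newer Python 3 on Ubuntu 18.04, see this blog post
- Decide which CANN version to install (CANN 6.0.0.alpha006 in this example)
  - The CANN version must match the Qingyun firmware version
  - For example, firmware 1.0.13.alpha corresponds to CANN 6.0.0.alpha00X
- Download the CANN installation package
- Install CANN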
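Before installing CANN, it can help to confirm that the interpreter falls inside the supported Python ranges listed above. This is a minimal sketch; the function name `is_supported_python` and the `SUPPORTED` table are illustrative and are not part of CANN.

```python
import sys

# Supported ranges from the list above: (major, minor) -> highest supported patch.
SUPPORTED = {(3, 7): 11, (3, 8): 11, (3, 9): 7}


def is_supported_python(version=None):
    """Return True if `version` (default: the running interpreter) is in a supported range."""
    major, minor, patch = (version or sys.version_info)[:3]
    max_patch = SUPPORTED.get((major, minor))
    return max_patch is not None and 0 <= patch <= max_patch


if __name__ == "__main__":
    print(sys.version.split()[0], "supported:", is_supported_python())
```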
Deploy the YOLO Model (Qingyun)
Requirements
- Deploy on the Qingyun 1000
- CANN environment (CANN 6.0.0.alpha006 in this example)
- Python 3.7.x (3.7.0-3.7.11), Python 3.8.x (3.8.0-3.8.11), or Python 3.9.x (3.9.0-3.9.7)
- ais_bench inference tool (Python package)
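A quick way to see whether the Python-side requirements are present is to ask the import system without actually importing anything. The sketch below assumes the import name `ais_bench`, taken from the inference script in the appendix; `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # cv2 and torch are also imported by the appendix inference script.
    print("missing:", missing_packages(["ais_bench", "cv2", "torch"]))
```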
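Deployment and Inference
- Deployment guide
- The Ascend chip in the Qingyun 1000 is the Ascend310 (with no suffix), so before model conversion you must change the SoC model in the configuration file and in the atc.sh model conversion script.
- In the inference code generated by the tool, input image preprocessing (normalization in particular) takes far longer than the model inference itself. The AIPP preprocessing tool can perform the input preprocessing instead.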
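To confirm where the time goes on a given setup, each stage can be timed separately. This is a minimal sketch: `timed` is a hypothetical helper, and `fake_preprocess` and `fake_infer` are stand-ins to be replaced by the real preprocessing and inference calls.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in stages; replace them with the real preprocessing and inference calls.
def fake_preprocess(frame):
    return [value / 255 for value in frame]


def fake_infer(inputs):
    return sum(inputs)


if __name__ == "__main__":
    frame = list(range(256))
    inputs, t_pre = timed(fake_preprocess, frame)
    output, t_infer = timed(fake_infer, inputs)
    print(f"preprocess: {t_pre * 1e3:.3f} ms, inference: {t_infer * 1e3:.3f} ms")
```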
Appendix
infer_project/
Directory structure after extraction
infer_project
├── benchmark.aarch64
├── common
│ ├── eval.sh
│ ├── onnx2om.sh
│ ├── pth2om.sh
│ ├── quantize
│ ├── util
│ ├── world_cup.jpg
│ ├── yolov5_camera.ipynb
│ ├── yolov5_image.ipynb
│ └── yolov5_video.ipynb
├── config.yaml
├── data.yaml
├── edge_infer
│ ├── acl_image.py
│ ├── acl_model.py
│ ├── acl_net_dynamic.py
│ ├── acl_resource.py
│ ├── coco_names.txt
│ ├── constants.py
│ ├── deep_dims.om
│ ├── deepsort
│ ├── DeepSortDetector.py
│ ├── det_utils.py
│ ├── fusion_result.json
│ ├── mAP
│ ├── utils.py
│ ├── v5_object_detect.py
│ ├── video.py
│ ├── yolov5_infer.ipynb
│ └── yolov5s_v6.1_track.ipynb
├── models
│ ├── __init__.py
│ ├── __pycache__
│ ├── common.py
│ ├── experimental.py
│ ├── hub
│ ├── segment
│ ├── tf.py
│ ├── yolo.py
│ ├── yolov5l.yaml
│ ├── yolov5m.yaml
│ ├── yolov5n.yaml
│ ├── yolov5s.yaml
│ └── yolov5x.yaml
├── om_infer.py
├── onnx2om.py
├── run.py
├── test
│ ├── images
│ ├── labels
│ └── test.json
├── utils
│ ├── __init__.py
│ ├── __pycache__
│ ├── activations.py
│ ├── augmentations.py
│ ├── autoanchor.py
│ ├── autobatch.py
│ ├── aws
│ ├── callbacks.py
│ ├── dataloaders.py
│ ├── docker
│ ├── downloads.py
│ ├── flask_rest_api
│ ├── general.py
│ ├── google_app_engine
│ ├── loggers
│ ├── loss.py
│ ├── metrics.py
│ ├── plots.py
│ ├── segment
│ ├── torch_utils.py
│ └── triton.py
├── yolov5s.onnx
└── yolov5s.pt
infer_project/config.yaml
Replace the SoC model in the infer_project/config.yaml configuration file with Ascend310:
sed -i 's/soc: Ascend310.*/soc: Ascend310/g' infer_project/config.yaml
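The same substitution can be done from Python when sed is not convenient. This sketch mirrors the pattern of the sed command above and assumes, as that command does, that the file contains a line of the form `soc: Ascend310...`; `set_soc` is a hypothetical helper name.

```python
import re


def set_soc(config_text, soc="Ascend310"):
    """Replace any 'soc: Ascend310<suffix>' entry with 'soc: <soc>'."""
    # '.' does not match newlines, so only the rest of that line is replaced.
    return re.sub(r"soc: Ascend310.*", f"soc: {soc}", config_text)


if __name__ == "__main__":
    path = "infer_project/config.yaml"
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(set_soc(text))
```

Like the sed command, the replacement is idempotent: running it on a file that already says `soc: Ascend310` changes nothing.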
infer_project/common/util/atc.sh
Adapt the SoC model for the Qingyun 1000 and use AIPP for model input preprocessing.
onnx=$1
om=$2
bs=$3
soc=$4

input_shape="images:${bs},3,640,640"
input_fp16_nodes="images"

if [[ ${soc} == Ascend310B1 ]];then
    atc --model=${onnx} \
        --framework=5 \
        --output=${om}_bs${bs} \
        --input_format=NCHW \
        --input_shape=${input_shape} \
        --log=error \
        --soc_version=${soc} \
        --input_fp16_nodes=${input_fp16_nodes} \
        --output_type=FP16
fi

if [[ ${soc} == Ascend310P? ]];then
    atc --model=${onnx} \
        --framework=5 \
        --output=${om}_bs${bs} \
        --input_format=NCHW \
        --input_shape=${input_shape} \
        --log=error \
        --soc_version=${soc} \
        --input_fp16_nodes=${input_fp16_nodes} \
        --output_type=FP16 \
        --optypelist_for_implmode="Sigmoid" \
        --op_select_implmode=high_performance \
        --fusion_switch_file=common/util/fusion.cfg
fi

# The Qingyun 1000 uses the Ascend310 chip.
# Input image normalization is done inside the model by AIPP (--insert_op_conf),
# so the model input is an integer type and --input_fp16_nodes is deliberately omitted.
# Comments are kept outside the atc command because a comment line between
# backslash continuations would cut the command short.
if [[ ${soc} == Ascend310 ]];then
    atc --model=${onnx} \
        --framework=5 \
        --output=${om}_bs${bs} \
        --input_format=NCHW \
        --input_shape=${input_shape} \
        --log=error \
        --soc_version=${soc} \
        --output_type=FP16 \
        --optypelist_for_implmode="Sigmoid" \
        --op_select_implmode=high_performance \
        --fusion_switch_file=common/util/fusion.cfg \
        --insert_op_conf=common/util/aipp_yolov5s.cfg
fi
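For readers who prefer to drive the conversion from Python, the Ascend310 branch of the script can be expressed as an argument list. This is only a sketch: `build_atc_args` is a hypothetical helper, the flags are copied from the script above, and the file names in the usage lines are examples.

```python
def build_atc_args(onnx, om, bs, soc="Ascend310"):
    """Build the atc command line used for the Ascend310 branch of atc.sh."""
    return [
        "atc",
        f"--model={onnx}",
        "--framework=5",
        f"--output={om}_bs{bs}",
        "--input_format=NCHW",
        f"--input_shape=images:{bs},3,640,640",
        "--log=error",
        f"--soc_version={soc}",
        # No --input_fp16_nodes here: AIPP normalizes the integer input inside the model.
        "--output_type=FP16",
        "--optypelist_for_implmode=Sigmoid",
        "--op_select_implmode=high_performance",
        "--fusion_switch_file=common/util/fusion.cfg",
        "--insert_op_conf=common/util/aipp_yolov5s.cfg",
    ]


if __name__ == "__main__":
    # Example file names; print the command instead of running it.
    print(" ".join(build_atc_args("yolov5s.onnx", "yolov5s", 1)))
```

On a machine where CANN is installed and `atc` is on the PATH, the list could be passed to `subprocess.run(args, check=True)` from inside infer_project, since the configuration paths are relative.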
infer_project/common/util/aipp_yolov5s.cfg
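AIPP preprocessing configuration file (create it manually at the path given in the heading). It normalizes the pixel values of the 8-bit integer, three-channel RGB input image to the range 0-1; the normalized pixel values are half-precision floating-point numbers.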
aipp_op {
    aipp_mode : static
    related_input_rank : 0
    src_image_size_w : 640
    src_image_size_h : 640
    input_format : RGB888_U8
    mean_chn_0 : 0
    mean_chn_1 : 0
    mean_chn_2 : 0
    min_chn_0 : 0
    min_chn_1 : 0
    min_chn_2 : 0
    var_reci_chn_0 : 0.0039216
    var_reci_chn_1 : 0.0039216
    var_reci_chn_2 : 0.0039216
}
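The value 0.0039216 is approximately 1/255, which is why these settings map 8-bit pixels to the range 0-1. The sketch below assumes that AIPP computes each channel as (pixel - mean_chn - min_chn) * var_reci_chn; that formula should be checked against the CANN AIPP documentation, and `aipp_normalize` is a hypothetical name used only for this arithmetic check.

```python
def aipp_normalize(pixel, mean_chn=0.0, min_chn=0.0, var_reci_chn=0.0039216):
    """Per-channel normalization, assuming (pixel - mean - min) * var_reci."""
    return (pixel - mean_chn - min_chn) * var_reci_chn


if __name__ == "__main__":
    for value in (0, 128, 255):
        print(value, "->", round(aipp_normalize(value), 4))
```

With zero mean and min values, the result is simply the pixel value scaled by roughly 1/255, so 0 stays 0 and 255 lands very close to 1.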
infer_project/edge_infer/yolov5_infer.py
The following code is an example of running inference with a camera. Copy it into a new file named yolov5_infer.py in the infer_project/edge_infer/ directory.
First fill in the model file path and the label file path that are left blank in the code. For a template of the label file, see infer_project/edge_infer/coco_names.txt.
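Judging from how the inference script reads the label file (one entry per line, with the line index used as the class ID), a label file for a custom model could be produced as sketched below. The helper names and the example class names are hypothetical; the line order should match the class order used in training.

```python
def write_labels(path, class_names):
    """Write one class name per line; the line index becomes the class ID."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(class_names) + "\n")


def read_labels(path):
    """Read the file back into {class_id: name}, as the inference script does."""
    with open(path, encoding="utf-8") as f:
        return {idx: line.strip() for idx, line in enumerate(f)}


if __name__ == "__main__":
    write_labels("my_names.txt", ["cat", "dog"])  # example class names
    print(read_labels("my_names.txt"))
```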
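To run inference, connect to the Qingyun development board over SSH with MobaXterm, activate the relevant Python environment, change into the infer_project/edge_infer directory, and run DISPLAY=${SSH_CLIENT%% *}:0.0 python3 yolov5_infer.py to start camera inference. SSH_CLIENT holds the client IP address followed by two port numbers, so ${SSH_CLIENT%% *} keeps only the IP address, which points DISPLAY at the X server on the PC (MobaXterm includes one).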
#!/usr/bin/env python
# coding: utf-8

import time

import cv2
import torch
from skvideo.io import vreader, FFmpegWriter
from ais_bench.infer.interface import InferSession

from det_utils import letterbox, scale_coords, nms


def preprocess_image(image, cfg, bgr2rgb=True):
    # Resize with padding to the model input shape; normalization is left to AIPP.
    img, scale_ratio, pad_size = letterbox(image, new_shape=cfg['input_shape'])
    if bgr2rgb:
        img = img[:, :, ::-1]
    img = img.transpose(2, 0, 1)  # HWC to CHW
    return img, scale_ratio, pad_size


def draw_bbox(bbox, img0, color, wt, names):
    # Each row of bbox is [x1, y1, x2, y2, confidence, class_id].
    det_result_str = ''
    for idx, class_id in enumerate(bbox[:, 5]):
        if float(bbox[idx][4]) < 0.05:
            continue
        img0 = cv2.rectangle(img0, (int(bbox[idx][0]), int(bbox[idx][1])), (int(bbox[idx][2]), int(bbox[idx][3])),
                             color, wt)
        img0 = cv2.putText(img0, str(idx) + ' ' + names[int(class_id)], (int(bbox[idx][0]), int(bbox[idx][1] + 16)),
                           cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
        img0 = cv2.putText(img0, '{:.4f}'.format(bbox[idx][4]), (int(bbox[idx][0]), int(bbox[idx][1] + 32)),
                           cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
        det_result_str += '{} {} {} {} {} {}\n'.format(
            names[int(class_id)], str(bbox[idx][4]), bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3])
    return img0


def get_labels_from_txt(path):
    # One label per line; the line index is the class ID.
    labels_dict = dict()
    with open(path) as f:
        for cat_id, label in enumerate(f.readlines()):
            labels_dict[cat_id] = label.strip()
    return labels_dict


def draw_prediction(pred, image, labels):
    img_dw = draw_bbox(pred, image, (0, 255, 0), 2, labels)
    cv2.imshow('result', img_dw)
    cv2.waitKey(0)  # imshow only renders once waitKey runs; wait for a key press


def infer_image(img_path, model, class_names, cfg):
    image = cv2.imread(img_path)
    img, scale_ratio, pad_size = preprocess_image(image, cfg)
    output = model.infer([img])[0]
    output = torch.tensor(output)
    boxout = nms(output, conf_thres=cfg["conf_thres"], iou_thres=cfg["iou_thres"])
    pred_all = boxout[0].numpy()
    # Map the boxes from the letterboxed input back to the original image.
    scale_coords(cfg['input_shape'], pred_all[:, :4], image.shape, ratio_pad=(scale_ratio, pad_size))
    draw_prediction(pred_all, image, class_names)


def infer_frame_with_vis(image, model, labels_dict, cfg, bgr2rgb=True):
    img, scale_ratio, pad_size = preprocess_image(image, cfg, bgr2rgb)
    output = model.infer([img])[0]
    output = torch.tensor(output)
    boxout = nms(output, conf_thres=cfg["conf_thres"], iou_thres=cfg["iou_thres"])
    pred_all = boxout[0].numpy()
    scale_coords(cfg['input_shape'], pred_all[:, :4], image.shape, ratio_pad=(scale_ratio, pad_size))
    img_vis = draw_bbox(pred_all, image, (0, 255, 0), 2, labels_dict)
    return img_vis


def img2bytes(image):
    return bytes(cv2.imencode('.jpg', image)[1])


def infer_video(video_path, model, labels_dict, cfg, output_path='output.mp4'):
    cap = vreader(video_path)
    video_writer = None
    for img_frame in cap:
        image_pred = infer_frame_with_vis(img_frame, model, labels_dict, cfg, bgr2rgb=False)
        cv2.imshow('result', image_pred)
        cv2.waitKey(1)  # let the preview window refresh
        if video_writer is None:
            video_writer = FFmpegWriter(output_path)
        video_writer.writeFrame(image_pred)
    if video_writer is not None:
        video_writer.close()


def infer_camera(model, labels_dict, cfg):
    cap = cv2.VideoCapture(0)
    while True:
        ok, img_frame = cap.read()
        if not ok:
            # Stop when the camera delivers no frame.
            break
        infer_start = time.time()
        image_pred = infer_frame_with_vis(img_frame, model, labels_dict, cfg)
        infer_time = time.time() - infer_start
        print(1 / infer_time)  # frames per second for this frame
        cv2.imshow('result', image_pred)
        cv2.waitKey(1)
    cap.release()


cfg = {
    'conf_thres': 0.4,
    'iou_thres': 0.5,
    'input_shape': [640, 640],
}

model_path = 'path to the .om model file'  # fill in before running
label_path = 'path to the label file'  # fill in before running
model = InferSession(0, model_path)  # 0 is the device ID
labels_dict = get_labels_from_txt(label_path)

infer_mode = 'camera'

if infer_mode == 'image':
    img_path = 'world_cup.jpg'
    infer_image(img_path, model, labels_dict, cfg)
elif infer_mode == 'camera':
    infer_camera(model, labels_dict, cfg)
elif infer_mode == 'video':
    video_path = 'world_cup.mp4'
    infer_video(video_path, model, labels_dict, cfg, output_path='output.mp4')
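The script relies on letterbox and scale_coords from det_utils, whose implementation is not shown here. As a rough illustration of what the scale ratio and padding represent, the sketch below uses the common YOLOv5-style formulation; it is an assumption and may differ in detail from det_utils, and both function names are hypothetical.

```python
def letterbox_params(image_shape, new_shape=(640, 640)):
    """Return (ratio, (pad_w, pad_h)) for fitting an image into new_shape with padding."""
    h, w = image_shape[:2]
    new_h, new_w = new_shape
    ratio = min(new_h / h, new_w / w)  # scale so the whole image fits
    unpad_w, unpad_h = round(w * ratio), round(h * ratio)
    pad_w, pad_h = (new_w - unpad_w) / 2, (new_h - unpad_h) / 2  # padding per side
    return ratio, (pad_w, pad_h)


def to_original_coords(x, y, ratio, pad):
    """Map a point from the letterboxed input back to the original image."""
    return (x - pad[0]) / ratio, (y - pad[1]) / ratio


if __name__ == "__main__":
    ratio, pad = letterbox_params((480, 640))
    print(ratio, pad, to_original_coords(320, 320, ratio, pad))
```

For a 640x480 camera frame, the image already fits the 640-pixel width, so it is padded by 80 pixels at the top and bottom, and detections are shifted back by that amount.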