Using Paddle Lite on ARM, with YOLOv5 as an Example


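Contents

1. Introduction to Paddle Lite

2. Environment Setup

2.1 Local environment setup (Python 3.6)

2.2 Building Paddle Lite on the development board (Python 3.7)

2.2.1 Download link for a prebuilt whl package (ARM, with Python and profiling support)

2.2.2 Building it yourself (native build)

3. Model Conversion (done in the local environment)

4. Model Deployment, Inference, and Application

4.1 Main steps for running inference with Paddle Lite

4.2 Running inference with a .nb model, using YOLOv5 as an example

4.3 Inference results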


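1. Introduction to Paddle Lite

Paddle Lite is a lightweight, flexible, and easily extensible high-performance deep learning inference framework. It supports a range of edge hardware targets such as ARM, OpenCL, and NPU, and it provides graph optimization and inference acceleration.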

2. Environment Setup

2.1 Local environment setup (Python 3.6):

pip3 install paddlelite==2.12 -i http://pypi.douban.com/simple/
pip3 install x2paddle -i http://pypi.douban.com/simple/

2.2 Building Paddle Lite on the development board (Python 3.7):

2.2.1 Download link for a prebuilt whl package (ARM, with Python and profiling support)

https://download.csdn.net/download/m0_46303486/87364716

2.2.2 Building it yourself (native build)

(1) Install the basic build environment (skip if already installed)

sudo apt update
sudo apt-get install -y --no-install-recommends \
    gcc g++ make wget python unzip patchelf python-dev

(2) Install cmake; version 3.10 or later is recommended (skip if already installed)

wget https://www.cmake.org/files/v3.10/cmake-3.10.3.tar.gz
tar -zxvf cmake-3.10.3.tar.gz
cd cmake-3.10.3
./configure
make
sudo make install

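(3) Download the Paddle Lite source and build it

git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
sudo rm -rf third-party
# --with_python=ON and --with_profile=ON are build parameters; the other
# optional parameters are listed in (4) Common build parameters.
# This tutorial uses Python, so the Python package is built.
sudo ./lite/tools/build_linux.sh --with_python=ON --with_profile=ON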

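(4) Common build parameters

Parameter               Description                                                     Options                   Default
arch                    Target hardware architecture                                    armv8 / armv7hf / armv7   armv8
toolchain               C++ compiler toolchain                                          gcc                       gcc
with_python             Build the Python package (set to ON for Python applications)    OFF / ON                  OFF
with_cv                 Include the cv functions in the build                           OFF / ON                  OFF
with_log                Print logs during execution                                     OFF / ON                  ON
with_exception          Enable C++ exceptions                                           OFF / ON                  OFF
with_profile            Enable execution time profiling                                 OFF / ON                  OFF
with_precision_profile  Enable per-layer precision profiling                            OFF / ON                  OFF
with_opencl             Build the inference library with OpenCL support                 OFF / ON                  OFF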
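As a small illustration of how the parameters in the table map onto the command line, the sketch below assembles a build_linux.sh invocation from a dictionary of overrides. The helper name build_command and the DEFAULTS dictionary are hypothetical, not part of Paddle Lite; the defaults are copied from the table above, and the --name=value form follows the command shown in step (3).

```python
# Hypothetical helper (not part of Paddle Lite): turn a dict of overrides
# into a build_linux.sh command line. Defaults are taken from the table above.
DEFAULTS = {
    "arch": "armv8",
    "toolchain": "gcc",
    "with_python": "OFF",
    "with_cv": "OFF",
    "with_log": "ON",
    "with_exception": "OFF",
    "with_profile": "OFF",
    "with_precision_profile": "OFF",
    "with_opencl": "OFF",
}


def build_command(overrides=None, script="./lite/tools/build_linux.sh"):
    """Return the command as a list, emitting only non-default parameters."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError("unknown build parameter(s): " + ", ".join(unknown))
    args = [script]
    for name, value in overrides.items():
        if value != DEFAULTS[name]:
            args.append("--{}={}".format(name, value))
    return args


cmd = build_command({"with_python": "ON", "with_profile": "ON"})
print(" ".join(cmd))
```

Passing a parameter that is already at its default simply leaves it off the command line, which keeps the printed command close to what one would type by hand.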

(5) Build output

After a successful build, the corresponding .whl package is generated under /Paddle-Lite/build.lite.linux.armv8.gcc/inference_lite_lib.armlinux.armv8/python/install/dist. Install it with pip.

A Python demo is generated alongside it.
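For convenience, the wheel can be located programmatically before installing it. The following is a minimal sketch using only the standard library; the function names find_wheels and pip_install_command are hypothetical, and the dist path is the one quoted above, which may differ on other setups.

```python
from pathlib import Path


def find_wheels(dist_dir):
    """Return the .whl files in dist_dir, newest first (empty list if none)."""
    directory = Path(dist_dir)
    if not directory.is_dir():
        return []
    return sorted(directory.glob("*.whl"),
                  key=lambda p: p.stat().st_mtime, reverse=True)


def pip_install_command(wheel_path):
    """Build the pip command as a list; it is only printed here, not executed."""
    return ["python3", "-m", "pip", "install", str(wheel_path)]


# Path taken from the build output described above (assumed armv8 + gcc build).
DIST_DIR = ("/Paddle-Lite/build.lite.linux.armv8.gcc/"
            "inference_lite_lib.armlinux.armv8/python/install/dist")

wheels = find_wheels(DIST_DIR)
if wheels:
    print(" ".join(pip_install_command(wheels[0])))
else:
    print("no .whl found under", DIST_DIR)
```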

3. Model Conversion (done in the local environment)

To run a model from a third-party framework (TensorFlow, Caffe, ONNX, PyTorch) with Paddle Lite, two conversions are generally needed: first use the X2Paddle tool to convert the third-party model to PaddlePaddle format, then use the opt tool to convert the PaddlePaddle model to a format that Paddle Lite supports.

To simplify this process, X2Paddle integrates the opt tool and provides a one-step conversion API. For the APIs that convert TensorFlow, Caffe, and PyTorch models directly to Paddle Lite, see: https://github.com/PaddlePaddle/X2Paddle/blob/develop/docs/inference_model_convertor/convert2lite_api.md

Taking ONNX as an example (most models can be converted to ONNX):

from x2paddle.convert import onnx2paddle

model_path = "/pose/light_pose_sim.onnx"
save_dir = "./paddleLite_models/light_pose_sim_paddle"

onnx2paddle(model_path, save_dir,
            convert_to_lite=True,
            lite_valid_places="arm",
            lite_model_type="naive_buffer")

# model_path (str): path to the ONNX model
# save_dir (str): directory where the converted model is saved
# convert_to_lite (bool): whether to run the opt tool; defaults to False
# lite_valid_places (str): target hardware for the conversion; defaults to arm.
#   Currently supported: arm, opencl, x86, metal, xpu, bm, mlu, intel_fpga,
#   huawei_ascend_npu, imagination_nna, rockchip_npu, mediatek_apu,
#   huawei_kirin_npu, amlogic_npu. Several targets can be given at once,
#   separated by commas with the higher-priority target first; opt then
#   picks the best option automatically.
# lite_model_type (str): output model format, either protobuf or naive_buffer;
#   defaults to naive_buffer

After conversion, a .nb file is generated in the specified directory. This is the model file used when deploying with Paddle Lite.
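Since lite_valid_places takes a comma-separated string with the highest-priority target first, a tiny helper can build and validate that string from a list. This is only a sketch: make_valid_places and SUPPORTED_PLACES are hypothetical names, and the list of targets is copied from the comments in the conversion snippet rather than queried from X2Paddle.

```python
# Targets copied from the conversion notes above; not queried from X2Paddle.
SUPPORTED_PLACES = (
    "arm", "opencl", "x86", "metal", "xpu", "bm", "mlu", "intel_fpga",
    "huawei_ascend_npu", "imagination_nna", "rockchip_npu", "mediatek_apu",
    "huawei_kirin_npu", "amlogic_npu",
)


def make_valid_places(places):
    """Join targets into 'a,b,c', keeping the given priority order."""
    if not places:
        raise ValueError("at least one target is required")
    ordered = []
    for place in places:
        name = place.strip().lower()
        if name not in SUPPORTED_PLACES:
            raise ValueError("unsupported target: " + repr(place))
        if name not in ordered:  # drop duplicates, keep the first occurrence
            ordered.append(name)
    return ",".join(ordered)


# Prefer OpenCL, fall back to the ARM CPU.
print(make_valid_places(["opencl", "arm"]))
```

The resulting string could then be passed as the lite_valid_places argument of onnx2paddle.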

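4. Model Deployment, Inference, and Application

With the steps above, all preparation is done. What remains is to copy the code and the model to the development board.

4.1 Main steps for running inference with Paddle Lite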

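from paddlelite.lite import MobileConfig, create_paddle_predictor

# (1) Set the configuration
config = MobileConfig()
config.set_model_from_file("Your directory/opt.nb")

# (2) Create the predictor
predictor = create_paddle_predictor(config)

# (3) Get a reference to the input tensor and set the input data.
#     The argument is the input index; use 0 for a single-input model.
#     input_data is a placeholder for your preprocessed numpy array.
input_tensor = predictor.get_input(0)
input_tensor.from_numpy(input_data)

# (4) Run inference; this must come after the input data has been set
predictor.run()

# (5) Get a reference to the output tensor and read the output data.
#     The argument is the output index; use 0 for a single-output model.
output_tensor = predictor.get_output(0)

# Convert the tensor to an ndarray
ort_outs = output_tensor.numpy()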

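4.2 Running inference with a .nb model, using YOLOv5 as an example

The code is extracted from the official YOLOv5 source; you only need to change the paths in the main block.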

import cv2
import numpy as np
from paddlelite.lite import MobileConfig, create_paddle_predictor

# COCO class names, indexed by class id
CLASSES = {
    0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorbike', 4: 'aeroplane',
    5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light',
    10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird',
    15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow',
    20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack',
    25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee',
    30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat',
    35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle',
    40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon',
    45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange',
    50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut',
    55: 'cake', 56: 'chair', 57: 'sofa', 58: 'potted plant', 59: 'bed',
    60: 'dining table', 61: 'toilet', 62: 'tvmonitor', 63: 'laptop', 64: 'mouse',
    65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven',
    70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock',
    75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush',
}


def box_iou(box1, box2, eps=1e-7):
    # Pairwise IoU between box1 (N, 4) and box2 (M, 4), both in xyxy format
    a1, a2 = box1[:, None, :2], box1[:, None, 2:]
    b1, b2 = box2[None, :, :2], box2[None, :, 2:]
    inter = (np.minimum(a2, b2) - np.maximum(a1, b1)).clip(0).prod(2)
    return inter / ((a2 - a1).prod(2) + (b2 - b1).prod(2) - inter + eps)


def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True,
              scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)


def onnx_inf(onnxModulePath, data):
    # Optional reference inference with ONNX Runtime, useful for comparing
    # results against Paddle Lite; it is not called in the main block below.
    import onnxruntime as rt  # imported here so the script runs without onnxruntime
    sess = rt.InferenceSession(onnxModulePath)
    input_name = sess.get_inputs()[0].name
    output_name = sess.get_outputs()[0].name
    pred_onnx = sess.run([output_name], {input_name: data.reshape(1, 3, 640, 640).astype(np.float32)})
    return pred_onnx


def xywh2xyxy(x):
    # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    y = np.copy(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2  # top left x
    y[..., 1] = x[..., 1] - x[..., 3] / 2  # top left y
    y[..., 2] = x[..., 0] + x[..., 2] / 2  # bottom right x
    y[..., 3] = x[..., 1] + x[..., 3] / 2  # bottom right y
    return y


def nms_boxes(boxes, scores, iou_thres=0.45):
    # Greedy NMS on xyxy boxes; returns the indices of the boxes to keep
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    areas = w * h
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])
        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1
        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= iou_thres)[0]
        order = order[inds + 1]
    return np.array(keep, dtype=np.int64)


def non_max_suppression(
        prediction,
        conf_thres=0.25,
        iou_thres=0.45,
        classes=None,
        agnostic=False,
        multi_label=False,
        labels=(),
        max_det=300,
        nm=0,  # number of masks
):
    """Non-Maximum Suppression (NMS) on inference results to reject overlapping detections.

    Returns:
        list of detections, one (n, 6) array per image [xyxy, conf, cls]
    """
    # Checks
    assert 0 <= conf_thres <= 1, f'Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0'
    assert 0 <= iou_thres <= 1, f'Invalid IoU {iou_thres}, valid values are between 0.0 and 1.0'
    if isinstance(prediction, (list, tuple)):  # YOLOv5 model in validation mode, output = (inference_out, loss_out)
        prediction = prediction[0]  # select only inference output

    bs = prediction.shape[0]  # batch size
    nc = prediction.shape[2] - nm - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates

    # Settings
    max_wh = 7680  # (pixels) maximum box width and height
    max_nms = 30000  # maximum number of boxes passed to NMS
    redundant = True  # require redundant detections
    multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)
    merge = False  # use merge-NMS

    mi = 5 + nc  # mask start index
    output = [np.zeros((0, 6 + nm))] * bs
    for xi, x in enumerate(prediction):  # image index, image inference
        x = x[xc[xi]]  # confidence

        if labels and len(labels[xi]):
            lb = labels[xi]
            v = np.zeros((len(lb), nc + nm + 5))
            v[:, :4] = lb[:, 1:5]  # box
            v[:, 4] = 1.0  # conf
            v[range(len(lb)), lb[:, 0].astype(int) + 5] = 1.0  # cls
            x = np.concatenate((x, v), 0)

        # If none remain process next image
        if not x.shape[0]:
            continue

        x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

        # Box/Mask
        box = xywh2xyxy(x[:, :4])  # (center_x, center_y, width, height) to (x1, y1, x2, y2)
        mask = x[:, mi:]  # zero columns if no masks

        # Detections matrix nx6 (xyxy, conf, cls)
        if multi_label:
            i, j = (x[:, 5:mi] > conf_thres).nonzero()
            x = np.concatenate((box[i], x[i, 5 + j, None], j[:, None].astype(np.float32), mask[i]), 1)
        else:  # best class only
            conf = x[:, 5:mi].max(1, keepdims=True)
            j = x[:, 5:mi].argmax(1).reshape(-1, 1)
            x = np.concatenate((box, conf, j, mask), 1)[conf.reshape(-1) > conf_thres]

        # Filter by class
        if classes is not None:
            x = x[(x[:, 5:6] == np.array(classes)).any(1)]

        # Check shape
        n = x.shape[0]  # number of boxes
        if not n:  # no boxes
            continue
        index = x[:, 4].argsort()[::-1][:max_nms]  # sort by confidence, keep the top max_nms
        x = x[index]

        # Batched NMS
        c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
        boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
        i = nms_boxes(boxes, scores, iou_thres)
        i = i[:max_det]  # limit detections

        # Used to merge boxes
        if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
            iou = box_iou(boxes[i], boxes) > iou_thres  # iou matrix
            weights = iou * scores[None]  # box weights
            x[i, :4] = np.matmul(weights, x[:, :4]) / weights.sum(1, keepdims=True)  # merged boxes
            if redundant:
                i = i[iou.sum(1) > 1]  # require redundancy

        output[xi] = x[i]
    return output


def clip_boxes(boxes, shape):
    # Clip boxes (xyxy) to image shape (height, width)
    boxes[..., [0, 2]] = boxes[..., [0, 2]].clip(0, shape[1])  # x1, x2
    boxes[..., [1, 3]] = boxes[..., [1, 3]].clip(0, shape[0])  # y1, y2


def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None):
    # Rescale boxes (xyxy) from img1_shape to img0_shape
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain = old / new
        pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2  # wh padding
    else:
        gain = ratio_pad[0][0]
        pad = ratio_pad[1]

    boxes[..., [0, 2]] -= pad[0]  # x padding
    boxes[..., [1, 3]] -= pad[1]  # y padding
    boxes[..., :4] /= gain
    clip_boxes(boxes, img0_shape)
    return boxes


if __name__ == "__main__":
    PaddleLite_ModulePath = "/PATH_to_nb_Model"
    IMG_Path = "/PATH_to_test.jpg"
    imgsz = (640, 640)

    img = cv2.imread(IMG_Path)
    assert img is not None, f'Failed to read image: {IMG_Path}'
    img = cv2.resize(img, (640, 640))

    # Preprocess
    im = letterbox(img, imgsz, auto=True)[0]  # padded resize
    im = im.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
    im = np.ascontiguousarray(im)  # contiguous
    im = im.astype(np.float32)
    im /= 255  # 0 - 255 to 0.0 - 1.0
    if len(im.shape) == 3:
        im = im[None]  # expand for batch dim

    # 1. Set the configuration
    config = MobileConfig()
    config.set_model_from_file(PaddleLite_ModulePath)

    # 2. Create the predictor
    predictor = create_paddle_predictor(config)

    # 3. Get a reference to the input tensor and set the input data
    #    (the argument is the input index; 0 for a single-input model)
    input_tensor = predictor.get_input(0)
    input_tensor.from_numpy(im)

    # 4. Run inference; this must come after the input data has been set
    predictor.run()
    print("predictor:", predictor)

    # 5. Get a reference to the output tensor and read the output data
    #    (the argument is the output index; 0 for a single-output model)
    output_tensor = predictor.get_output(0)
    pred = output_tensor.numpy()

    # NMS
    conf_thres = 0.25  # confidence threshold
    iou_thres = 0.45  # NMS IOU threshold
    max_det = 1000  # maximum detections per image
    classes = None  # filter by class: --class 0, or --class 0 2 3
    agnostic_nms = False  # class-agnostic NMS
    pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)

    # Process predictions
    for det in pred:  # per image
        if len(det):
            # Rescale boxes from the network input size to the size of img
            det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], img.shape).round()

    # Draw every detection with a score of at least 0.4
    outputs = pred[0][:, :6]
    for out in outputs:
        prob = float(out[4])
        cls = int(out[5])
        if prob >= 0.4:
            x1, y1, x2, y2 = (int(v) for v in out[:4])
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 1)
            cv2.putText(img, f'{CLASSES[cls]} {prob:.2f}', (x1, y1),
                        cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0, 255, 0), 1, 4)

    # The output directory must already exist
    cv2.imwrite('./data/images/test_paddle_03.png', img)
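The coordinate bookkeeping in letterbox and scale_boxes is easier to follow on a concrete case. The sketch below redoes the arithmetic in plain Python for a hypothetical 1280x720 frame, assuming full padding to 640x640 (the auto=False case) and no prior cv2.resize; the function names letterbox_geometry and to_original are illustrative and do not appear in the script above.

```python
def letterbox_geometry(orig_hw, new_hw=(640, 640)):
    """Return (gain, (pad_w, pad_h)) for an aspect-preserving resize plus padding."""
    height, width = orig_hw
    gain = min(new_hw[0] / height, new_hw[1] / width)
    resized_w, resized_h = round(width * gain), round(height * gain)
    pad_w = (new_hw[1] - resized_w) / 2  # padding added on the left and right
    pad_h = (new_hw[0] - resized_h) / 2  # padding added on the top and bottom
    return gain, (pad_w, pad_h)


def to_original(box, gain, pad, orig_hw):
    """Map an xyxy box from the padded image back to the original image."""
    height, width = orig_hw
    x1, y1, x2, y2 = box
    x1, x2 = (x1 - pad[0]) / gain, (x2 - pad[0]) / gain
    y1, y2 = (y1 - pad[1]) / gain, (y2 - pad[1]) / gain

    def clip(value, upper):
        return max(0.0, min(float(upper), value))

    return (clip(x1, width), clip(y1, height), clip(x2, width), clip(y2, height))


orig_hw = (720, 1280)  # hypothetical camera frame, (height, width)
gain, pad = letterbox_geometry(orig_hw)
print(gain, pad)
print(to_original((100, 200, 300, 400), gain, pad, orig_hw))
```

In the script above the image is first resized to 640x640, so the gain is 1 and the padding is 0, and the boxes end up in the coordinates of the resized image rather than the original frame. Skipping that resize and passing the original shape to scale_boxes would be one way to get boxes in original coordinates, though that is a change to the posted code rather than something it does.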

4.3 Inference results

 

 

