[YOLOv8 Improvement: Backbone] Using MobileNetV3 to Lighten the YOLOv8 Network and Boost Accuracy

Contents

I MobileNetV3

1 Platform-Aware NAS for Block-Wise Search and NetAdapt

2 Inverted Residuals and Linear Bottlenecks

II Using MobileNetV3 in YOLOv8

1 Overall Modifications

① Add the MobileNetV3.py file

② Modify the ultralytics/nn/tasks.py file

③ Modify the ultralytics/utils/torch_utils.py file

2 Configuration File

3 Training

Other

Errors


I MobileNetV3

Official paper: https://arxiv.org/pdf/1905.02244v5.pdf

Reference code: https://gitcode.com/Shubhamai/pytorchmobilenet/blob/main/MobileNetV3.py

The paper presents the next generation of MobileNets, based on a combination of complementary search techniques and a novel architecture design. MobileNetV3 is first tuned to mobile phone CPUs through a combination of hardware-aware network architecture search (NAS) and the NetAdapt algorithm, and is then improved through novel architecture advances (inverted residual structures and linear bottleneck layers). The paper explores how automated search algorithms and network design can work together to harness complementary approaches and improve the overall state of the art. Through this process, two new MobileNet models are created: MobileNetV3-Large and MobileNetV3-Small, targeting high- and low-resource use cases respectively. These models are then applied to object detection and semantic segmentation. For the task of semantic segmentation (or any dense pixel prediction), a new efficient segmentation decoder, Lite Reduced Atrous Spatial Pyramid Pooling (LR-ASPP), is proposed. The result is a new state of the art for mobile classification, detection, and segmentation.

Compared with MobileNetV2, MobileNetV3-Large is 3.2% more accurate on ImageNet classification while reducing latency by 20%. Compared with a MobileNetV2 model of comparable latency, MobileNetV3-Small is 6.6% more accurate. On COCO detection, MobileNetV3-Large is over 25% faster at roughly the same accuracy as MobileNetV2. On Cityscapes segmentation, at similar accuracy, MobileNetV3-Large LR-ASPP is 34% faster than MobileNetV2 R-ASPP.

1 Platform-Aware NAS for Block-Wise Search and NetAdapt

Network search has proven to be a very powerful tool for discovering and optimizing network architectures. For MobileNetV3, platform-aware NAS is used to search for the global network structure by optimizing each network block. The NetAdapt algorithm is then used to search for the number of filters per layer. These techniques are complementary and can be combined to efficiently find an optimized model for a given hardware platform.

A platform-aware neural architecture search approach is adopted to find the global network structure. Using the same RNN-based controller and the same factorized hierarchical search space, the authors found results similar to [43] (MnasNet: Platform-aware neural architecture search for mobile) for large mobile models targeting a latency of about 80 ms. They therefore simply reuse the same MnasNet-A1 [43] as the initial large mobile model, and then apply NetAdapt and other optimizations on top of it. For small models, accuracy changes much more dramatically with latency, so a smaller weight factor w = -0.15 is needed to compensate for the larger accuracy change at different latencies. Augmented with this new weight factor w, a fresh architecture search is started from scratch to find an initial seed model, after which NetAdapt (which fine-tunes individual layers in a sequential manner, rather than trying to infer a coarse but global architecture) and other optimizations are applied to obtain the final MobileNetV3-Small model.
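Concretely, platform-aware NAS optimizes an approximate multi-objective reward that trades accuracy against measured latency (the MnasNet formulation that MobileNetV3 builds on):

$$\mathrm{reward}(m) = \mathrm{ACC}(m) \times \left[ \frac{\mathrm{LAT}(m)}{\mathrm{TAR}} \right]^{w}$$

Here ACC(m) is the model's accuracy, LAT(m) its measured latency, and TAR the target latency. MnasNet used w = -0.07; as described above, MobileNetV3-Small uses w = -0.15 because accuracy changes more dramatically with latency for small models.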

2 Inverted Residuals and Linear Bottlenecks

MobileNetV2 layer (inverted residual with linear bottleneck): each block consists of a narrow input and output layer (the bottleneck), which carries no non-linearity, followed by expansion to a much higher-dimensional space and projection back to the output. The residual connects the bottlenecks (rather than the expansions).

MobileNetV2 + Squeeze-and-Excite: in contrast to the above, the squeeze-and-excite operation is applied within the residual layer. Different non-linearities are used depending on the layer.
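The non-linearities in question are ReLU and hard-swish (h-swish), which the paper defines as a cheap, quantization-friendly approximation of swish:

$$\text{h-swish}(x) = x \cdot \frac{\mathrm{ReLU6}(x + 3)}{6}$$

This is what the nn.Hardswish and nn.Hardsigmoid modules in the implementation below correspond to.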

YOLOv8" style="margin-left:.0001pt;text-align:left;">二 使用MobileNetV3助力YOLOv8

1 Overall Modifications

① Add the MobileNetV3.py file

Create a new MobileNetV3.py file under the ultralytics/nn/modules directory with the following content:

"""A from-scratch implementation of MobileNetV3 paper ( for educational purposes ).
PaperSearching for MobileNetV3 - https://arxiv.org/abs/1905.02244v5
author : shubham.aiengineer@gmail.com
"""import torch
from torch import nn
from torchsummary import summary__all__ = ['MobileNetV3']class SqueezeExitationBlock(nn.Module):def __init__(self, in_channels: int):"""Constructor for SqueezeExitationBlock.Args:in_channels (int): Number of input channels."""super().__init__()self.pool1 = nn.AdaptiveAvgPool2d(1)self.linear1 = nn.Linear(in_channels, in_channels // 4)  # divide by 4 is mentioned in the paper, 5.3. Large squeeze-and-exciteself.act1 = nn.ReLU()self.linear2 = nn.Linear(in_channels // 4, in_channels)self.act2 = nn.Hardsigmoid()def forward(self, x):"""Forward pass for SqueezeExitationBlock."""identity = xx = self.pool1(x)x = torch.flatten(x, 1)x = self.linear1(x)x = self.act1(x)x = self.linear2(x)x = self.act2(x)x = identity * x[:, :, None, None]return xclass ConvNormActivationBlock(nn.Module):def __init__(self,in_channels: int,out_channels: int,kernel_size: list,stride: int = 1,padding: int = 0,groups: int = 1,bias: bool = False,activation: torch.nn = nn.Hardswish,):"""Constructs a block containing a convolution, batch normalization and activation layerArgs:in_channels (int): number of input channelsout_channels (int): number of output channelskernel_size (list): size of the convolutional kernelstride (int, optional): stride of the convolutional kernel. Defaults to 1.padding (int, optional): padding of the convolutional kernel. Defaults to 0.groups (int, optional): number of groups for depthwise seperable convolution. Defaults to 1.bias (bool, optional): whether to use bias. Defaults to False.activation (torch.nn, optional): activation function. Defaults to nn.Hardswish."""super().__init__()self.conv = nn.Conv2d(in_channels,out_channels,kernel_size,stride=stride,padding=padding,groups=groups,bias=bias,)self.norm = nn.BatchNorm2d(out_channels)self.activation = activation()def forward(self, x):"""Perform forward pass."""x = self.conv(x)x = self.norm(x)x = self.activation(x)return xclass InverseResidualBlock(nn.Module):def __init__(self,in_channels: int,out_channels: int,kernel_size: int,expansion_size: int = 6,stride: int = 1,squeeze_exitation: bool = True,activation: nn.Module = nn.Hardswish,):"""Constructs a inverse residual blockArgs:in_channels (int): number of input channelsout_channels (int): number of output channelskernel_size (int): size of the convolutional kernelexpansion_size (int, optional): size of the expansion factor. Defaults to 6.stride (int, optional): stride of the convolutional kernel. Defaults to 1.squeeze_exitation (bool, optional): whether to add squeeze and exitation block or not. Defaults to True.activation (nn.Module, optional): activation function. 
Defaults to nn.Hardswish."""super().__init__()self.residual = in_channels == out_channels and stride == 1self.squeeze_exitation = squeeze_exitationself.conv1 = (ConvNormActivationBlock(in_channels, expansion_size, (1, 1), activation=activation)if in_channels != expansion_sizeelse nn.Identity())  # If it's not the first layer, then we need to add a 1x1 convolutional layer to expand the number of channelsself.depthwise_conv = ConvNormActivationBlock(expansion_size,expansion_size,(kernel_size, kernel_size),stride=stride,padding=kernel_size // 2,groups=expansion_size,activation=activation,)if self.squeeze_exitation:self.se = SqueezeExitationBlock(expansion_size)self.conv2 = nn.Conv2d(expansion_size, out_channels, (1, 1), bias=False)  # bias is false because we are using batch normalization, which already has biasself.norm = nn.BatchNorm2d(out_channels)def forward(self, x):"""Perform forward pass."""identity = xx = self.conv1(x)x = self.depthwise_conv(x)if self.squeeze_exitation:x = self.se(x)x = self.conv2(x)x = self.norm(x)if self.residual:x = x + identityreturn xclass MobileNetV3(nn.Module):def __init__(self,n_classes: int = 1000,input_channel: int = 3,config: str = "large",dropout: float = 0.8,):"""Constructs MobileNetV3 architectureArgs:`n_classes`: An integer count of output neuron in last layer, default 1000`input_channel`: An integer value input channels in first conv layer, default is 3.`config`: A string value indicating the configuration of MobileNetV3, either `large` or `small`, default is `large`.`dropout` [0, 1] : A float parameter for dropout in last layer, between 0 and 1, default is 0.8."""super().__init__()# The configuration of MobileNetv3.# input channels, kernel size, expension size, output channels, squeeze exitation, activation, strideRE = nn.ReLUHS = nn.Hardswishconfigs_dict = {"small": ((16, 3, 16, 16, True, RE, 2),(16, 3, 72, 24, False, RE, 2),(24, 3, 88, 24, False, RE, 1),(24, 5, 96, 40, True, HS, 2),(40, 5, 240, 40, True, HS, 1),(40, 5, 240, 40, True, HS, 1),(40, 5, 120, 48, True, HS, 1),(48, 5, 144, 48, True, HS, 1),(48, 5, 288, 96, True, HS, 2),(96, 5, 576, 96, True, HS, 1),(96, 5, 576, 96, True, HS, 1),),"large": ((16, 3, 16, 16, False, RE, 1),(16, 3, 64, 24, False, RE, 2),(24, 3, 72, 24, False, RE, 1),(24, 5, 72, 40, True, RE, 2),(40, 5, 120, 40, True, RE, 1),(40, 5, 120, 40, True, RE, 1),(40, 3, 240, 80, False, HS, 2),(80, 3, 200, 80, False, HS, 1),(80, 3, 184, 80, False, HS, 1),(80, 3, 184, 80, False, HS, 1),(80, 3, 480, 112, True, HS, 1),(112, 3, 672, 112, True, HS, 1),(112, 5, 672, 160, True, HS, 2),(160, 5, 960, 160, True, HS, 1),(160, 5, 960, 160, True, HS, 1),),}self.model = nn.Sequential(ConvNormActivationBlock(input_channel, 16, (3, 3), stride=2, padding=1, activation=nn.Hardswish),)for (in_channels,kernel_size,expansion_size,out_channels,squeeze_exitation,activation,stride,) in configs_dict[config]:self.model.append(InverseResidualBlock(in_channels=in_channels,out_channels=out_channels,kernel_size=kernel_size,expansion_size=expansion_size,stride=stride,squeeze_exitation=squeeze_exitation,activation=activation,))hidden_channels = 576 if config == "small" else 960_out_channel = 1024 if config == "small" else 1280self.model.append(ConvNormActivationBlock(out_channels,hidden_channels,(1, 1),bias=False,activation=nn.Hardswish,))if config == 'small':self.index = [16, 24, 48, 576]else:self.index = [24, 40, 112, 960]self.width_list = [i.size(1) for i in self.forward(torch.randn(1, 3, 640, 640))]def forward(self, x):"""Perform forward pass."""results = [None, 
None, None, None]for model in self.model:x = model(x)if x.size(1) in self.index:position = self.index.index(x.size(1))  # Find the position in the index listresults[position] = x# results.append(x)return resultsif __name__ == "__main__":# Generating Sample imageimage_size = (1, 3, 640, 640)image = torch.rand(*image_size)# Modelmobilenet_v3 = MobileNetV3(config="large")# summary(#     mobilenet_v3,#     input_data=image,#     col_names=["input_size", "output_size", "num_params"],#     device="cpu",#     depth=2,# )out = mobilenet_v3(image)print(out)
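A quick way to sanity-check the file is to run it as a small script (a usage sketch; it assumes MobileNetV3.py was saved under ultralytics/nn/modules as described above):

import torch
from ultralytics.nn.modules.MobileNetV3 import MobileNetV3

model = MobileNetV3(config="large")
feats = model(torch.randn(1, 3, 640, 640))
for f in feats:
    print(f.shape)
# Expected: [1, 24, 160, 160] (stride 4), [1, 40, 80, 80] (stride 8),
# [1, 112, 40, 40] (stride 16), [1, 960, 20, 20] (stride 32). These channel
# counts are exactly self.index / width_list, which parse_model() in tasks.py
# uses to wire the YOLOv8 neck and head to the backbone.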

② Modify the ultralytics/nn/tasks.py file

The specific modifications are summarized below; the complete modified tasks.py is provided in full in the "Other" section at the end of this post.
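In short, three things change: MobileNetV3 is imported, parse_model() learns to instantiate it as a multi-output backbone, and BaseModel._predict_once() learns to route its list of feature maps. The excerpts below are taken from the full file (they are fragments for orientation, not a standalone script):

from .modules.MobileNetV3 import MobileNetV3  # import MobileNetV3

# in parse_model(), alongside the other module branches:
elif m in {MobileNetV3}:  # register the MobileNetV3 module
    m = m()
    c2 = m.width_list  # channel list of the backbone's output feature maps
    backbone = True

# in BaseModel._predict_once(), backbone modules return a list of feature maps:
if hasattr(m, 'backbone'):
    x = m(x)
    if len(x) != 5:  # 0 - 5
        x.insert(0, None)
    for index, i in enumerate(x):
        if index in self.save:
            y.append(i)
        else:
            y.append(None)
    x = x[-1]  # pass the last output on to the next layer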

③ Modify the ultralytics/utils/torch_utils.py file

2 Configuration File

The contents of yolov8_MobileNetV3.yaml, compared against the original yolov8.yaml:
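The original post showed this comparison as an image. As a rough sketch only, consistent with the parse_model() logic above (the multi-output MobileNetV3 backbone occupies runtime layer indices 0-4, with indices 1-4 holding the P2-P5 feature maps, and all subsequent "from" indices shifted accordingly), the config could look like the following; the exact head wiring here is an assumption, not the author's original file:

# yolov8_MobileNetV3.yaml -- hypothetical sketch, not the original file
nc: 80  # number of classes

backbone:
  # [from, repeats, module, args]
  - [-1, 1, MobileNetV3, []]                    # 0-4: multi-output backbone
  - [-1, 1, SPPF, [1024, 5]]                    # 5

head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]  # 6
  - [[-1, 3], 1, Concat, [1]]                   # 7: cat backbone P4
  - [-1, 3, C2f, [512]]                         # 8

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]  # 9
  - [[-1, 2], 1, Concat, [1]]                   # 10: cat backbone P3
  - [-1, 3, C2f, [256]]                         # 11 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]                  # 12
  - [[-1, 8], 1, Concat, [1]]                   # 13: cat head P4
  - [-1, 3, C2f, [512]]                         # 14 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]                  # 15
  - [[-1, 5], 1, Concat, [1]]                   # 16: cat head P5
  - [-1, 3, C2f, [1024]]                        # 17 (P5/32-large)

  - [[11, 14, 17], 1, Detect, [nc]]             # 18: Detect(P3, P4, P5)

With this layout, Detect receives the P3/P4/P5 outputs of the C2f layers at runtime indices 11, 14, and 17.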

3 Training

Once the above modifications are complete, start training! 🌺🌺🌺

Training example:

yolo task=detect mode=train model=cfg/models/v8/yolov8_MobileNetV3.yaml data=cfg/datasets/coco128.yaml epochs=100 batch=16 device=cpu project=yolov8
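Equivalently, training can be launched from the Python API (a minimal sketch using the same paths as the CLI command above; no pretrained weights are loaded, since the backbone structure no longer matches the official yolov8n.pt):

from ultralytics import YOLO

# Build the model from the modified config file.
model = YOLO("cfg/models/v8/yolov8_MobileNetV3.yaml")

# Train with the same settings as the CLI example.
model.train(data="cfg/datasets/coco128.yaml", epochs=100, batch=16, device="cpu", project="yolov8")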

Other

If making the piecemeal edits is inconvenient, you can simply copy the following and replace the contents of the original .py file:

  • Modified tasks.py
# Ultralytics YOLO 🚀, AGPL-3.0 license

import contextlib
from copy import deepcopy
from pathlib import Path

import torch
import torch.nn as nn

from ultralytics.nn.modules import (
    AIFI,
    C1,
    C2,
    C3,
    C3TR,
    OBB,
    SPP,
    SPPELAN,
    SPPF,
    ADown,
    Bottleneck,
    BottleneckCSP,
    C2f,
    C2fAttn,
    C3Ghost,
    C3x,
    CBFuse,
    CBLinear,
    Classify,
    Concat,
    Conv,
    Conv2,
    ConvTranspose,
    Detect,
    DWConv,
    DWConvTranspose2d,
    Focus,
    GhostBottleneck,
    GhostConv,
    HGBlock,
    HGStem,
    ImagePoolingAttn,
    Pose,
    RepC3,
    RepConv,
    RepNCSPELAN4,
    ResNetLayer,
    RTDETRDecoder,
    Segment,
    Silence,
    WorldDetect,
)
from ultralytics.utils import DEFAULT_CFG_DICT, DEFAULT_CFG_KEYS, LOGGER, colorstr, emojis, yaml_load
from ultralytics.utils.checks import check_requirements, check_suffix, check_yaml
from ultralytics.utils.loss import v8ClassificationLoss, v8DetectionLoss, v8OBBLoss, v8PoseLoss, v8SegmentationLoss
from ultralytics.utils.plotting import feature_visualization
from ultralytics.utils.torch_utils import (
    fuse_conv_and_bn,
    fuse_deconv_and_bn,
    initialize_weights,
    intersect_dicts,
    make_divisible,
    model_info,
    scale_img,
    time_sync,
)
from .modules.DynamicHead import Detect_DynamicHead  # import the detection head from DynamicHead.py
from .modules.OREPA import OREPA  # import OREPA
from .modules.BiFPN import Bi_FPN  # import Bi_FPN
from .modules.MobileNetV3 import MobileNetV3  # import MobileNetV3
try:
    import thop
except ImportError:
    thop = None


class BaseModel(nn.Module):
    """The BaseModel class serves as a base class for all the models in the Ultralytics YOLO family."""

    def forward(self, x, *args, **kwargs):
        """
        Forward pass of the model on a single scale. Wrapper for `_forward_once` method.

        Args:
            x (torch.Tensor | dict): The input image tensor or a dict including image tensor and gt labels.

        Returns:
            (torch.Tensor): The output of the network.
        """
        if isinstance(x, dict):  # for cases of training and validating while training.
            return self.loss(x, *args, **kwargs)
        return self.predict(x, *args, **kwargs)

    def predict(self, x, profile=False, visualize=False, augment=False, embed=None):
        """
        Perform a forward pass through the network.

        Args:
            x (torch.Tensor): The input tensor to the model.
            profile (bool): Print the computation time of each layer if True, defaults to False.
            visualize (bool): Save the feature maps of the model if True, defaults to False.
            augment (bool): Augment image during prediction, defaults to False.
            embed (list, optional): A list of feature vectors/embeddings to return.

        Returns:
            (torch.Tensor): The last output of the model.
        """
        if augment:
            return self._predict_augment(x)
        return self._predict_once(x, profile, visualize, embed)

    def _predict_once(self, x, profile=False, visualize=False, embed=None):
        """
        Perform a forward pass through the network.

        Args:
            x (torch.Tensor): The input tensor to the model.
            profile (bool): Print the computation time of each layer if True, defaults to False.
            visualize (bool): Save the feature maps of the model if True, defaults to False.
            embed (list, optional): A list of feature vectors/embeddings to return.

        Returns:
            (torch.Tensor): The last output of the model.
        """
        y, dt, embeddings = [], [], []  # outputs
        for m in self.model:
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            if profile:
                self._profile_one_layer(m, x, dt)
            if hasattr(m, 'backbone'):
                x = m(x)
                if len(x) != 5:  # 0 - 5
                    x.insert(0, None)
                for index, i in enumerate(x):
                    if index in self.save:
                        y.append(i)
                    else:
                        y.append(None)
                x = x[-1]  # pass the last output on to the next layer
            else:
                x = m(x)  # run
                y.append(x if m.i in self.save else None)  # save output
            if visualize:
                feature_visualization(x, m.type, m.i, save_dir=visualize)
            if embed and m.i in embed:
                embeddings.append(nn.functional.adaptive_avg_pool2d(x, (1, 1)).squeeze(-1).squeeze(-1))  # flatten
                if m.i == max(embed):
                    return torch.unbind(torch.cat(embeddings, 1), dim=0)
        return x

    def _predict_augment(self, x):
        """Perform augmentations on input image x and return augmented inference."""
        LOGGER.warning(
            f"WARNING ⚠️ {self.__class__.__name__} does not support augmented inference yet. "
            f"Reverting to single-scale inference instead."
        )
        return self._predict_once(x)

    def _profile_one_layer(self, m, x, dt):
        """
        Profile the computation time and FLOPs of a single layer of the model on a given input. Appends the results to
        the provided list.

        Args:
            m (nn.Module): The layer to be profiled.
            x (torch.Tensor): The input data to the layer.
            dt (list): A list to store the computation time of the layer.

        Returns:
            None
        """
        c = m == self.model[-1] and isinstance(x, list)  # is final layer list, copy input as inplace fix
        flops = thop.profile(m, inputs=[x.copy() if c else x], verbose=False)[0] / 1e9 * 2 if thop else 0  # FLOPs
        t = time_sync()
        for _ in range(10):
            m(x.copy() if c else x)
        dt.append((time_sync() - t) * 100)
        if m == self.model[0]:
            LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s}  module")
        LOGGER.info(f"{dt[-1]:10.2f} {flops:10.2f} {m.np:10.0f}  {m.type}")
        if c:
            LOGGER.info(f"{sum(dt):10.2f} {'-':>10s} {'-':>10s}  Total")

    def fuse(self, verbose=True):
        """
        Fuse the `Conv2d()` and `BatchNorm2d()` layers of the model into a single layer, in order to improve the
        computation efficiency.

        Returns:
            (nn.Module): The fused model is returned.
        """
        if not self.is_fused():
            for m in self.model.modules():
                if isinstance(m, (Conv, Conv2, DWConv)) and hasattr(m, "bn"):
                    if isinstance(m, Conv2):
                        m.fuse_convs()
                    m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                    delattr(m, "bn")  # remove batchnorm
                    m.forward = m.forward_fuse  # update forward
                if isinstance(m, ConvTranspose) and hasattr(m, "bn"):
                    m.conv_transpose = fuse_deconv_and_bn(m.conv_transpose, m.bn)
                    delattr(m, "bn")  # remove batchnorm
                    m.forward = m.forward_fuse  # update forward
                if isinstance(m, RepConv):
                    m.fuse_convs()
                    m.forward = m.forward_fuse  # update forward
            self.info(verbose=verbose)
        return self

    def is_fused(self, thresh=10):
        """
        Check if the model has less than a certain threshold of BatchNorm layers.

        Args:
            thresh (int, optional): The threshold number of BatchNorm layers. Default is 10.

        Returns:
            (bool): True if the number of BatchNorm layers in the model is less than the threshold, False otherwise.
        """
        bn = tuple(v for k, v in nn.__dict__.items() if "Norm" in k)  # normalization layers, i.e. BatchNorm2d()
        return sum(isinstance(v, bn) for v in self.modules()) < thresh  # True if < 'thresh' BatchNorm layers in model

    def info(self, detailed=False, verbose=True, imgsz=640):
        """
        Prints model information.

        Args:
            detailed (bool): if True, prints out detailed information about the model. Defaults to False
            verbose (bool): if True, prints out the model information. Defaults to False
            imgsz (int): the size of the image that the model will be trained on. Defaults to 640
        """
        return model_info(self, detailed=detailed, verbose=verbose, imgsz=imgsz)

    def _apply(self, fn):
        """
        Applies a function to all the tensors in the model that are not parameters or registered buffers.

        Args:
            fn (function): the function to apply to the model

        Returns:
            (BaseModel): An updated BaseModel object.
        """
        self = super()._apply(fn)
        m = self.model[-1]  # Detect()
        if isinstance(m, (Detect, Detect_DynamicHead)):  # includes all Detect subclasses like Segment, Pose, OBB, WorldDetect
            m.stride = fn(m.stride)
            m.anchors = fn(m.anchors)
            m.strides = fn(m.strides)
        return self

    def load(self, weights, verbose=True):
        """
        Load the weights into the model.

        Args:
            weights (dict | torch.nn.Module): The pre-trained weights to be loaded.
            verbose (bool, optional): Whether to log the transfer progress. Defaults to True.
        """
        model = weights["model"] if isinstance(weights, dict) else weights  # torchvision models are not dicts
        csd = model.float().state_dict()  # checkpoint state_dict as FP32
        csd = intersect_dicts(csd, self.state_dict())  # intersect
        self.load_state_dict(csd, strict=False)  # load
        if verbose:
            LOGGER.info(f"Transferred {len(csd)}/{len(self.model.state_dict())} items from pretrained weights")

    def loss(self, batch, preds=None):
        """
        Compute loss.

        Args:
            batch (dict): Batch to compute loss on
            preds (torch.Tensor | List[torch.Tensor]): Predictions.
        """
        if not hasattr(self, "criterion"):
            self.criterion = self.init_criterion()

        preds = self.forward(batch["img"]) if preds is None else preds
        return self.criterion(preds, batch)

    def init_criterion(self):
        """Initialize the loss criterion for the BaseModel."""
        raise NotImplementedError("compute_loss() needs to be implemented by task heads")


class DetectionModel(BaseModel):
    """YOLOv8 detection model."""

    def __init__(self, cfg="yolov8n.yaml", ch=3, nc=None, verbose=True):  # model, input channels, number of classes
        """Initialize the YOLOv8 detection model with the given config and parameters."""
        super().__init__()
        self.yaml = cfg if isinstance(cfg, dict) else yaml_model_load(cfg)  # cfg dict

        # Define model
        ch = self.yaml["ch"] = self.yaml.get("ch", ch)  # input channels
        if nc and nc != self.yaml["nc"]:
            LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml["nc"] = nc  # override YAML value
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=ch, verbose=verbose)  # model, savelist
        self.names = {i: f"{i}" for i in range(self.yaml["nc"])}  # default names dict
        self.inplace = self.yaml.get("inplace", True)

        # Build strides
        m = self.model[-1]  # Detect()
        if isinstance(m, (Detect, Detect_DynamicHead)):  # includes all Detect subclasses like Segment, Pose, OBB, WorldDetect
            s = 256  # 2x min stride
            m.inplace = self.inplace
            forward = lambda x: self.forward(x)[0] if isinstance(m, (Segment, Pose, OBB)) else self.forward(x)
            m.stride = torch.tensor([s / x.shape[-2] for x in forward(torch.zeros(1, ch, s, s))])  # forward
            self.stride = m.stride
            m.bias_init()  # only run once
        else:
            self.stride = torch.Tensor([32])  # default stride for i.e. RTDETR

        # Init weights, biases
        initialize_weights(self)
        if verbose:
            self.info()
            LOGGER.info("")

    def _predict_augment(self, x):
        """Perform augmentations on input image x and return augmented inference and train outputs."""
        img_size = x.shape[-2:]  # height, width
        s = [1, 0.83, 0.67]  # scales
        f = [None, 3, None]  # flips (2-ud, 3-lr)
        y = []  # outputs
        for si, fi in zip(s, f):
            xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max()))
            yi = super().predict(xi)[0]  # forward
            yi = self._descale_pred(yi, fi, si, img_size)
            y.append(yi)
        y = self._clip_augmented(y)  # clip augmented tails
        return torch.cat(y, -1), None  # augmented inference, train

    @staticmethod
    def _descale_pred(p, flips, scale, img_size, dim=1):
        """De-scale predictions following augmented inference (inverse operation)."""
        p[:, :4] /= scale  # de-scale
        x, y, wh, cls = p.split((1, 1, 2, p.shape[dim] - 4), dim)
        if flips == 2:
            y = img_size[0] - y  # de-flip ud
        elif flips == 3:
            x = img_size[1] - x  # de-flip lr
        return torch.cat((x, y, wh, cls), dim)

    def _clip_augmented(self, y):
        """Clip YOLO augmented inference tails."""
        nl = self.model[-1].nl  # number of detection layers (P3-P5)
        g = sum(4**x for x in range(nl))  # grid points
        e = 1  # exclude layer count
        i = (y[0].shape[-1] // g) * sum(4**x for x in range(e))  # indices
        y[0] = y[0][..., :-i]  # large
        i = (y[-1].shape[-1] // g) * sum(4 ** (nl - 1 - x) for x in range(e))  # indices
        y[-1] = y[-1][..., i:]  # small
        return y

    def init_criterion(self):
        """Initialize the loss criterion for the DetectionModel."""
        return v8DetectionLoss(self)


class OBBModel(DetectionModel):
    """YOLOv8 Oriented Bounding Box (OBB) model."""

    def __init__(self, cfg="yolov8n-obb.yaml", ch=3, nc=None, verbose=True):
        """Initialize YOLOv8 OBB model with given config and parameters."""
        super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose)

    def init_criterion(self):
        """Initialize the loss criterion for the model."""
        return v8OBBLoss(self)


class SegmentationModel(DetectionModel):
    """YOLOv8 segmentation model."""

    def __init__(self, cfg="yolov8n-seg.yaml", ch=3, nc=None, verbose=True):
        """Initialize YOLOv8 segmentation model with given config and parameters."""
        super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose)

    def init_criterion(self):
        """Initialize the loss criterion for the SegmentationModel."""
        return v8SegmentationLoss(self)


class PoseModel(DetectionModel):
    """YOLOv8 pose model."""

    def __init__(self, cfg="yolov8n-pose.yaml", ch=3, nc=None, data_kpt_shape=(None, None), verbose=True):
        """Initialize YOLOv8 Pose model."""
        if not isinstance(cfg, dict):
            cfg = yaml_model_load(cfg)  # load model YAML
        if any(data_kpt_shape) and list(data_kpt_shape) != list(cfg["kpt_shape"]):
            LOGGER.info(f"Overriding model.yaml kpt_shape={cfg['kpt_shape']} with kpt_shape={data_kpt_shape}")
            cfg["kpt_shape"] = data_kpt_shape
        super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose)

    def init_criterion(self):
        """Initialize the loss criterion for the PoseModel."""
        return v8PoseLoss(self)


class ClassificationModel(BaseModel):
    """YOLOv8 classification model."""

    def __init__(self, cfg="yolov8n-cls.yaml", ch=3, nc=None, verbose=True):
        """Init ClassificationModel with YAML, channels, number of classes, verbose flag."""
        super().__init__()
        self._from_yaml(cfg, ch, nc, verbose)

    def _from_yaml(self, cfg, ch, nc, verbose):
        """Set YOLOv8 model configurations and define the model architecture."""
        self.yaml = cfg if isinstance(cfg, dict) else yaml_model_load(cfg)  # cfg dict

        # Define model
        ch = self.yaml["ch"] = self.yaml.get("ch", ch)  # input channels
        if nc and nc != self.yaml["nc"]:
            LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml["nc"] = nc  # override YAML value
        elif not nc and not self.yaml.get("nc", None):
            raise ValueError("nc not specified. Must specify nc in model.yaml or function arguments.")
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=ch, verbose=verbose)  # model, savelist
        self.stride = torch.Tensor([1])  # no stride constraints
        self.names = {i: f"{i}" for i in range(self.yaml["nc"])}  # default names dict
        self.info()

    @staticmethod
    def reshape_outputs(model, nc):
        """Update a TorchVision classification model to class count 'n' if required."""
        name, m = list((model.model if hasattr(model, "model") else model).named_children())[-1]  # last module
        if isinstance(m, Classify):  # YOLO Classify() head
            if m.linear.out_features != nc:
                m.linear = nn.Linear(m.linear.in_features, nc)
        elif isinstance(m, nn.Linear):  # ResNet, EfficientNet
            if m.out_features != nc:
                setattr(model, name, nn.Linear(m.in_features, nc))
        elif isinstance(m, nn.Sequential):
            types = [type(x) for x in m]
            if nn.Linear in types:
                i = types.index(nn.Linear)  # nn.Linear index
                if m[i].out_features != nc:
                    m[i] = nn.Linear(m[i].in_features, nc)
            elif nn.Conv2d in types:
                i = types.index(nn.Conv2d)  # nn.Conv2d index
                if m[i].out_channels != nc:
                    m[i] = nn.Conv2d(m[i].in_channels, nc, m[i].kernel_size, m[i].stride, bias=m[i].bias is not None)

    def init_criterion(self):
        """Initialize the loss criterion for the ClassificationModel."""
        return v8ClassificationLoss()


class RTDETRDetectionModel(DetectionModel):
    """
    RTDETR (Real-time DEtection and Tracking using Transformers) Detection Model class.

    This class is responsible for constructing the RTDETR architecture, defining loss functions, and facilitating both
    the training and inference processes. RTDETR is an object detection and tracking model that extends from the
    DetectionModel base class.

    Attributes:
        cfg (str): The configuration file path or preset string. Default is 'rtdetr-l.yaml'.
        ch (int): Number of input channels. Default is 3 (RGB).
        nc (int, optional): Number of classes for object detection. Default is None.
        verbose (bool): Specifies if summary statistics are shown during initialization. Default is True.

    Methods:
        init_criterion: Initializes the criterion used for loss calculation.
        loss: Computes and returns the loss during training.
        predict: Performs a forward pass through the network and returns the output.
    """

    def __init__(self, cfg="rtdetr-l.yaml", ch=3, nc=None, verbose=True):
        """
        Initialize the RTDETRDetectionModel.

        Args:
            cfg (str): Configuration file name or path.
            ch (int): Number of input channels.
            nc (int, optional): Number of classes. Defaults to None.
            verbose (bool, optional): Print additional information during initialization. Defaults to True.
        """
        super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose)

    def init_criterion(self):
        """Initialize the loss criterion for the RTDETRDetectionModel."""
        from ultralytics.models.utils.loss import RTDETRDetectionLoss

        return RTDETRDetectionLoss(nc=self.nc, use_vfl=True)

    def loss(self, batch, preds=None):
        """
        Compute the loss for the given batch of data.

        Args:
            batch (dict): Dictionary containing image and label data.
            preds (torch.Tensor, optional): Precomputed model predictions. Defaults to None.

        Returns:
            (tuple): A tuple containing the total loss and main three losses in a tensor.
        """
        if not hasattr(self, "criterion"):
            self.criterion = self.init_criterion()

        img = batch["img"]
        # NOTE: preprocess gt_bbox and gt_labels to list.
        bs = len(img)
        batch_idx = batch["batch_idx"]
        gt_groups = [(batch_idx == i).sum().item() for i in range(bs)]
        targets = {
            "cls": batch["cls"].to(img.device, dtype=torch.long).view(-1),
            "bboxes": batch["bboxes"].to(device=img.device),
            "batch_idx": batch_idx.to(img.device, dtype=torch.long).view(-1),
            "gt_groups": gt_groups,
        }

        preds = self.predict(img, batch=targets) if preds is None else preds
        dec_bboxes, dec_scores, enc_bboxes, enc_scores, dn_meta = preds if self.training else preds[1]
        if dn_meta is None:
            dn_bboxes, dn_scores = None, None
        else:
            dn_bboxes, dec_bboxes = torch.split(dec_bboxes, dn_meta["dn_num_split"], dim=2)
            dn_scores, dec_scores = torch.split(dec_scores, dn_meta["dn_num_split"], dim=2)

        dec_bboxes = torch.cat([enc_bboxes.unsqueeze(0), dec_bboxes])  # (7, bs, 300, 4)
        dec_scores = torch.cat([enc_scores.unsqueeze(0), dec_scores])

        loss = self.criterion(
            (dec_bboxes, dec_scores), targets, dn_bboxes=dn_bboxes, dn_scores=dn_scores, dn_meta=dn_meta
        )
        # NOTE: There are like 12 losses in RTDETR, backward with all losses but only show the main three losses.
        return sum(loss.values()), torch.as_tensor(
            [loss[k].detach() for k in ["loss_giou", "loss_class", "loss_bbox"]], device=img.device
        )

    def predict(self, x, profile=False, visualize=False, batch=None, augment=False, embed=None):
        """
        Perform a forward pass through the model.

        Args:
            x (torch.Tensor): The input tensor.
            profile (bool, optional): If True, profile the computation time for each layer. Defaults to False.
            visualize (bool, optional): If True, save feature maps for visualization. Defaults to False.
            batch (dict, optional): Ground truth data for evaluation. Defaults to None.
            augment (bool, optional): If True, perform data augmentation during inference. Defaults to False.
            embed (list, optional): A list of feature vectors/embeddings to return.

        Returns:
            (torch.Tensor): Model's output tensor.
        """
        y, dt, embeddings = [], [], []  # outputs
        for m in self.model[:-1]:  # except the head part
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            if profile:
                self._profile_one_layer(m, x, dt)
            x = m(x)  # run
            y.append(x if m.i in self.save else None)  # save output
            if visualize:
                feature_visualization(x, m.type, m.i, save_dir=visualize)
            if embed and m.i in embed:
                embeddings.append(nn.functional.adaptive_avg_pool2d(x, (1, 1)).squeeze(-1).squeeze(-1))  # flatten
                if m.i == max(embed):
                    return torch.unbind(torch.cat(embeddings, 1), dim=0)
        head = self.model[-1]
        x = head([y[j] for j in head.f], batch)  # head inference
        return x


class WorldModel(DetectionModel):
    """YOLOv8 World Model."""

    def __init__(self, cfg="yolov8s-world.yaml", ch=3, nc=None, verbose=True):
        """Initialize YOLOv8 world model with given config and parameters."""
        self.txt_feats = torch.randn(1, nc or 80, 512)  # features placeholder
        self.clip_model = None  # CLIP model placeholder
        super().__init__(cfg=cfg, ch=ch, nc=nc, verbose=verbose)

    def set_classes(self, text, batch=80, cache_clip_model=True):
        """Set classes in advance so that model could do offline-inference without clip model."""
        try:
            import clip
        except ImportError:
            check_requirements("git+https://github.com/ultralytics/CLIP.git")
            import clip

        if (
            not getattr(self, "clip_model", None) and cache_clip_model
        ):  # for backwards compatibility of models lacking clip_model attribute
            self.clip_model = clip.load("ViT-B/32")[0]
        model = self.clip_model if cache_clip_model else clip.load("ViT-B/32")[0]
        device = next(model.parameters()).device
        text_token = clip.tokenize(text).to(device)
        txt_feats = [model.encode_text(token).detach() for token in text_token.split(batch)]
        txt_feats = txt_feats[0] if len(txt_feats) == 1 else torch.cat(txt_feats, dim=0)
        txt_feats = txt_feats / txt_feats.norm(p=2, dim=-1, keepdim=True)
        self.txt_feats = txt_feats.reshape(-1, len(text), txt_feats.shape[-1])
        self.model[-1].nc = len(text)

    def predict(self, x, profile=False, visualize=False, txt_feats=None, augment=False, embed=None):
        """
        Perform a forward pass through the model.

        Args:
            x (torch.Tensor): The input tensor.
            profile (bool, optional): If True, profile the computation time for each layer. Defaults to False.
            visualize (bool, optional): If True, save feature maps for visualization. Defaults to False.
            txt_feats (torch.Tensor): The text features, use it if it's given. Defaults to None.
            augment (bool, optional): If True, perform data augmentation during inference. Defaults to False.
            embed (list, optional): A list of feature vectors/embeddings to return.

        Returns:
            (torch.Tensor): Model's output tensor.
        """
        txt_feats = (self.txt_feats if txt_feats is None else txt_feats).to(device=x.device, dtype=x.dtype)
        if len(txt_feats) != len(x):
            txt_feats = txt_feats.repeat(len(x), 1, 1)
        ori_txt_feats = txt_feats.clone()
        y, dt, embeddings = [], [], []  # outputs
        for m in self.model:  # except the head part
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            if profile:
                self._profile_one_layer(m, x, dt)
            if isinstance(m, C2fAttn):
                x = m(x, txt_feats)
            elif isinstance(m, WorldDetect):
                x = m(x, ori_txt_feats)
            elif isinstance(m, ImagePoolingAttn):
                txt_feats = m(x, txt_feats)
            else:
                x = m(x)  # run

            y.append(x if m.i in self.save else None)  # save output
            if visualize:
                feature_visualization(x, m.type, m.i, save_dir=visualize)
            if embed and m.i in embed:
                embeddings.append(nn.functional.adaptive_avg_pool2d(x, (1, 1)).squeeze(-1).squeeze(-1))  # flatten
                if m.i == max(embed):
                    return torch.unbind(torch.cat(embeddings, 1), dim=0)
        return x

    def loss(self, batch, preds=None):
        """
        Compute loss.

        Args:
            batch (dict): Batch to compute loss on.
            preds (torch.Tensor | List[torch.Tensor]): Predictions.
        """
        if not hasattr(self, "criterion"):
            self.criterion = self.init_criterion()

        if preds is None:
            preds = self.forward(batch["img"], txt_feats=batch["txt_feats"])
        return self.criterion(preds, batch)


class Ensemble(nn.ModuleList):
    """Ensemble of models."""

    def __init__(self):
        """Initialize an ensemble of models."""
        super().__init__()

    def forward(self, x, augment=False, profile=False, visualize=False):
        """Function generates the YOLO network's final layer."""
        y = [module(x, augment, profile, visualize)[0] for module in self]
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.stack(y).mean(0)  # mean ensemble
        y = torch.cat(y, 2)  # nms ensemble, y shape(B, HW, C)
        return y, None  # inference, train output


# Functions ------------------------------------------------------------------------------------------------------------


@contextlib.contextmanager
def temporary_modules(modules=None):
    """
    Context manager for temporarily adding or modifying modules in Python's module cache (`sys.modules`).

    This function can be used to change the module paths during runtime. It's useful when refactoring code,
    where you've moved a module from one location to another, but you still want to support the old import
    paths for backwards compatibility.

    Args:
        modules (dict, optional): A dictionary mapping old module paths to new module paths.

    Example:
        ```python
        with temporary_modules({'old.module.path': 'new.module.path'}):
            import old.module.path  # this will now import new.module.path
        ```

    Note:
        The changes are only in effect inside the context manager and are undone once the context manager exits.
        Be aware that directly manipulating `sys.modules` can lead to unpredictable results, especially in larger
        applications or libraries. Use this function with caution.
    """
    if not modules:
        modules = {}

    import importlib
    import sys

    try:
        # Set modules in sys.modules under their old name
        for old, new in modules.items():
            sys.modules[old] = importlib.import_module(new)

        yield
    finally:
        # Remove the temporary module paths
        for old in modules:
            if old in sys.modules:
                del sys.modules[old]


def torch_safe_load(weight):
    """
    This function attempts to load a PyTorch model with the torch.load() function. If a ModuleNotFoundError is raised,
    it catches the error, logs a warning message, and attempts to install the missing module via the
    check_requirements() function. After installation, the function again attempts to load the model using torch.load().

    Args:
        weight (str): The file path of the PyTorch model.

    Returns:
        (dict): The loaded PyTorch model.
    """
    from ultralytics.utils.downloads import attempt_download_asset

    check_suffix(file=weight, suffix=".pt")
    file = attempt_download_asset(weight)  # search online if missing locally
    try:
        with temporary_modules(
            {
                "ultralytics.yolo.utils": "ultralytics.utils",
                "ultralytics.yolo.v8": "ultralytics.models.yolo",
                "ultralytics.yolo.data": "ultralytics.data",
            }
        ):  # for legacy 8.0 Classify and Pose models
            ckpt = torch.load(file, map_location="cpu")

    except ModuleNotFoundError as e:  # e.name is missing module name
        if e.name == "models":
            raise TypeError(
                emojis(
                    f"ERROR ❌️ {weight} appears to be an Ultralytics YOLOv5 model originally trained "
                    f"with https://github.com/ultralytics/yolov5.\nThis model is NOT forwards compatible with "
                    f"YOLOv8 at https://github.com/ultralytics/ultralytics."
                    f"\nRecommend fixes are to train a new model using the latest 'ultralytics' package or to "
                    f"run a command with an official YOLOv8 model, i.e. 'yolo predict model=yolov8n.pt'"
                )
            ) from e
        LOGGER.warning(
            f"WARNING ⚠️ {weight} appears to require '{e.name}', which is not in ultralytics requirements."
            f"\nAutoInstall will run now for '{e.name}' but this feature will be removed in the future."
            f"\nRecommend fixes are to train a new model using the latest 'ultralytics' package or to "
            f"run a command with an official YOLOv8 model, i.e. 'yolo predict model=yolov8n.pt'"
        )
        check_requirements(e.name)  # install missing module
        ckpt = torch.load(file, map_location="cpu")

    if not isinstance(ckpt, dict):
        # File is likely a YOLO instance saved with i.e. torch.save(model, "saved_model.pt")
        LOGGER.warning(
            f"WARNING ⚠️ The file '{weight}' appears to be improperly saved or formatted. "
            f"For optimal results, use model.save('filename.pt') to correctly save YOLO models."
        )
        ckpt = {"model": ckpt.model}

    return ckpt, file  # load


def attempt_load_weights(weights, device=None, inplace=True, fuse=False):
    """Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a."""
    ensemble = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        ckpt, w = torch_safe_load(w)  # load ckpt
        args = {**DEFAULT_CFG_DICT, **ckpt["train_args"]} if "train_args" in ckpt else None  # combined args
        model = (ckpt.get("ema") or ckpt["model"]).to(device).float()  # FP32 model

        # Model compatibility updates
        model.args = args  # attach args to model
        model.pt_path = w  # attach *.pt file path to model
        model.task = guess_model_task(model)
        if not hasattr(model, "stride"):
            model.stride = torch.tensor([32.0])

        # Append
        ensemble.append(model.fuse().eval() if fuse and hasattr(model, "fuse") else model.eval())  # model in eval mode

    # Module updates
    for m in ensemble.modules():
        if hasattr(m, "inplace"):
            m.inplace = inplace
        elif isinstance(m, nn.Upsample) and not hasattr(m, "recompute_scale_factor"):
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility

    # Return model
    if len(ensemble) == 1:
        return ensemble[-1]

    # Return ensemble
    LOGGER.info(f"Ensemble created with {weights}\n")
    for k in "names", "nc", "yaml":
        setattr(ensemble, k, getattr(ensemble[0], k))
    ensemble.stride = ensemble[int(torch.argmax(torch.tensor([m.stride.max() for m in ensemble])))].stride
    assert all(ensemble[0].nc == m.nc for m in ensemble), f"Models differ in class counts {[m.nc for m in ensemble]}"
    return ensemble


def attempt_load_one_weight(weight, device=None, inplace=True, fuse=False):
    """Loads a single model weights."""
    ckpt, weight = torch_safe_load(weight)  # load ckpt
    args = {**DEFAULT_CFG_DICT, **(ckpt.get("train_args", {}))}  # combine model and default args, preferring model args
    model = (ckpt.get("ema") or ckpt["model"]).to(device).float()  # FP32 model

    # Model compatibility updates
    model.args = {k: v for k, v in args.items() if k in DEFAULT_CFG_KEYS}  # attach args to model
    model.pt_path = weight  # attach *.pt file path to model
    model.task = guess_model_task(model)
    if not hasattr(model, "stride"):
        model.stride = torch.tensor([32.0])

    model = model.fuse().eval() if fuse and hasattr(model, "fuse") else model.eval()  # model in eval mode

    # Module updates
    for m in model.modules():
        if hasattr(m, "inplace"):
            m.inplace = inplace
        elif isinstance(m, nn.Upsample) and not hasattr(m, "recompute_scale_factor"):
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility

    # Return model and ckpt
    return model, ckpt


def parse_model(d, ch, verbose=True):  # model_dict, input_channels(3)
    """Parse a YOLO model.yaml dictionary into a PyTorch model."""
    import ast

    # Args
    max_channels = float("inf")
    nc, act, scales = (d.get(x) for x in ("nc", "activation", "scales"))
    depth, width, kpt_shape = (d.get(x, 1.0) for x in ("depth_multiple", "width_multiple", "kpt_shape"))
    if scales:
        scale = d.get("scale")
        if not scale:
            scale = tuple(scales.keys())[0]
            LOGGER.warning(f"WARNING ⚠️ no model scale passed. Assuming scale='{scale}'.")
        depth, width, max_channels = scales[scale]

    if act:
        Conv.default_act = eval(act)  # redefine default activation, i.e. Conv.default_act = nn.SiLU()
        if verbose:
            LOGGER.info(f"{colorstr('activation:')} {act}")  # print

    if verbose:
        LOGGER.info(f"\n{'':>3}{'from':>20}{'n':>3}{'params':>10}  {'module':<45}{'arguments':<30}")
    ch = [ch]
    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    backbone = False
    for i, (f, n, m, args) in enumerate(d["backbone"] + d["head"]):  # from, number, module, args
        t = m
        m = getattr(torch.nn, m[3:]) if "nn." in m else globals()[m]  # get module
        for j, a in enumerate(args):
            if isinstance(a, str):
                with contextlib.suppress(ValueError):
                    args[j] = locals()[a] if a in locals() else ast.literal_eval(a)

        n = n_ = max(round(n * depth), 1) if n > 1 else n  # depth gain
        if m in {
            Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, Focus,
            BottleneckCSP, C1, C2, C2f, RepNCSPELAN4, ADown, SPPELAN, C2fAttn, C3, C3TR, C3Ghost,
            nn.ConvTranspose2d, DWConvTranspose2d, C3x, RepC3,
        }:
            c1, c2 = ch[f], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)
            if m is C2fAttn:
                args[1] = make_divisible(min(args[1], max_channels // 2) * width, 8)  # embed channels
                args[2] = int(
                    max(round(min(args[2], max_channels // 2 // 32)) * width, 1) if args[2] > 1 else args[2]
                )  # num heads

            args = [c1, c2, *args[1:]]
            if m in {BottleneckCSP, C1, C2, C2f, C2fAttn, C3, C3TR, C3Ghost, C3x, RepC3}:
                args.insert(2, n)  # number of repeats
                n = 1
        elif m in {MobileNetV3}:  # register the MobileNetV3 module
            m = m()
            c2 = m.width_list
            backbone = True
        elif m is AIFI:
            args = [ch[f], *args]
        elif m in {OREPA}:
            args = [ch[f], *args]
        elif m in {HGStem, HGBlock}:
            c1, cm, c2 = ch[f], args[0], args[1]
            args = [c1, cm, c2, *args[2:]]
            if m is HGBlock:
                args.insert(4, n)  # number of repeats
                n = 1
        elif m in {Bi_FPN}:  # register the Bi_FPN module
            args = [len([ch[x] for x in f])]
        elif m is ResNetLayer:
            c2 = args[1] if args[3] else args[1] * 4
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum(ch[x] for x in f)
        elif m in {Detect, WorldDetect, Segment, Pose, OBB, ImagePoolingAttn, Detect_DynamicHead}:
            args.append([ch[x] for x in f])
            if m is Segment:
                args[2] = make_divisible(min(args[2], max_channels) * width, 8)
        elif m is RTDETRDecoder:  # special case, channels arg must be passed in index 1
            args.insert(1, [ch[x] for x in f])
        elif m is CBLinear:
            c2 = args[0]
            c1 = ch[f]
            args = [c1, c2, *args[1:]]
        elif m is CBFuse:
            c2 = ch[f[-1]]
        else:
            c2 = ch[f]

        if isinstance(c2, list):
            m_ = m
            m_.backbone = True
        else:
            m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
            t = str(m)[8:-2].replace('__main__.', '')  # module type
        m.np = sum(x.numel() for x in m_.parameters())  # number params
        m_.i, m_.f, m_.type = i + 4 if backbone else i, f, t  # attach index, 'from' index, type
        if verbose:
            LOGGER.info(f'{i:>3}{str(f):>20}{n_:>3}{m.np:10.0f}  {t:<45}{str(args):<30}')  # print
        save.extend(
            x % (i + 4 if backbone else i) for x in ([f] if isinstance(f, int) else f) if x != -1
        )  # append to savelist
        layers.append(m_)
        if i == 0:
            ch = []
        if isinstance(c2, list):
            ch.extend(c2)
            if len(c2) != 5:
                ch.insert(0, 0)
        else:
            ch.append(c2)
    return nn.Sequential(*layers), sorted(save)


def yaml_model_load(path):
    """Load a YOLOv8 model from a YAML file."""
    import re

    path = Path(path)
    if path.stem in (f"yolov{d}{x}6" for x in "nsmlx" for d in (5, 8)):
        new_stem = re.sub(r"(\d+)([nslmx])6(.+)?$", r"\1\2-p6\3", path.stem)
        LOGGER.warning(f"WARNING ⚠️ Ultralytics YOLO P6 models now use -p6 suffix. Renaming {path.stem} to {new_stem}.")
        path = path.with_name(new_stem + path.suffix)

    unified_path = re.sub(r"(\d+)([nslmx])(.+)?$", r"\1\3", str(path))  # i.e. yolov8x.yaml -> yolov8.yaml
    yaml_file = check_yaml(unified_path, hard=False) or check_yaml(path)
    d = yaml_load(yaml_file)  # model dict
    d["scale"] = guess_model_scale(path)
    d["yaml_file"] = str(path)
    return d


def guess_model_scale(model_path):
    """
    Takes a path to a YOLO model's YAML file as input and extracts the size character of the model's scale. The
    function uses regular expression matching to find the pattern of the model scale in the YAML file name, which is
    denoted by n, s, m, l, or x. The function returns the size character of the model scale as a string.

    Args:
        model_path (str | Path): The path to the YOLO model's YAML file.

    Returns:
        (str): The size character of the model's scale, which can be n, s, m, l, or x.
    """
    with contextlib.suppress(AttributeError):
        import re

        return re.search(r"yolov\d+([nslmx])", Path(model_path).stem).group(1)  # n, s, m, l, or x
    return ""


def guess_model_task(model):
    """
    Guess the task of a PyTorch model from its architecture or configuration.

    Args:
        model (nn.Module | dict): PyTorch model or model configuration in YAML format.

    Returns:
        (str): Task of the model ('detect', 'segment', 'classify', 'pose').

    Raises:
        SyntaxError: If the task of the model could not be determined.
    """

    def cfg2task(cfg):
        """Guess from YAML dictionary."""
        m = cfg["head"][-1][-2].lower()  # output module name
        if m in {"classify", "classifier", "cls", "fc"}:
            return "classify"
        if m == "detect":
            return "detect"
        if m == "segment":
            return "segment"
        if m == "pose":
            return "pose"
        if m == "obb":
            return "obb"
        else:
            return "detect"

    # Guess from model cfg
    if isinstance(model, dict):
        with contextlib.suppress(Exception):
            return cfg2task(model)

    # Guess from PyTorch model
    if isinstance(model, nn.Module):  # PyTorch model
        for x in "model.args", "model.model.args", "model.model.model.args":
            with contextlib.suppress(Exception):
                return eval(x)["task"]
        for x in "model.yaml", "model.model.yaml", "model.model.model.yaml":
            with contextlib.suppress(Exception):
                return cfg2task(eval(x))
        for m in model.modules():
            if isinstance(m, Segment):
                return "segment"
            elif isinstance(m, Classify):
                return "classify"
            elif isinstance(m, Pose):
                return "pose"
            elif isinstance(m, OBB):
                return "obb"
            elif isinstance(m, (Detect, WorldDetect, Detect_DynamicHead)):
                return "detect"

    # Guess from model filename
    if isinstance(model, (str, Path)):
        model = Path(model)
        if "-seg" in model.stem or "segment" in model.parts:
            return "segment"
        elif "-cls" in model.stem or "classify" in model.parts:
            return "classify"
        elif "-pose" in model.stem or "pose" in model.parts:
            return "pose"
        elif "-obb" in model.stem or "obb" in model.parts:
            return "obb"
        elif "detect" in model.parts:
            return "detect"

    # Unable to determine task from model
    LOGGER.warning(
        "WARNING ⚠️ Unable to automatically guess model task, assuming 'task=detect'. "
        "Explicitly define task for your model, i.e. 'task=detect', 'segment', 'classify','pose' or 'obb'."
    )
    return "detect"  # assume detect

Errors

[Error] ❤️ ❤️ ❤️

ModuleNotFoundError: No module named 'torchsummary'

[Solution] 💚 💚 💚

pip --default-timeout=100 install torchsummary -i https://pypi.tuna.tsinghua.edu.cn/simple
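Alternatively, since summary is only used by the commented-out torchsummary demo at the bottom of MobileNetV3.py, you can avoid the dependency entirely by deleting the from torchsummary import summary line from that file.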

That wraps up everything shared in this post. Meeting you here is fate, and I'm grateful for it!!! 💛 💙 💜 ❤️ 💚

