YOLOv3 Model Construction: Mastering the Building-Block Approach to Quickly Assembling Networks


(1) Designing Conv2dBatchLeaky

1. Understanding the LeakyReLU activation function

The LeakyReLU activation layer creates a callable object that computes LeakyReLU of the input x, where x is the input Tensor.

It is quite similar to the PaddlePaddle API, so comparing the two helps with understanding:

(Figure: plot of the LeakyReLU activation function)

 Examples:

import torch
import torch.nn as nn
m = nn.LeakyReLU(0.1)
input = torch.randn(2)
output = m(input)
print(input)
print(output)

Demo output (screenshot in the original post):
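
Since the screenshot is not reproduced here, a quick sanity check of the same call (my own sketch, not from the original post) verifies the element-wise definition LeakyReLU(x) = max(0, x) + negative_slope * min(0, x):

import torch
import torch.nn as nn

m = nn.LeakyReLU(0.1)
x = torch.randn(5)
# element-wise definition: keep positive values, scale negative values by 0.1
manual = torch.where(x > 0, x, 0.1 * x)
print(torch.allclose(m(x), manual))  # expected: True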

 

2. Understanding the isinstance built-in function

'''
isinstance() is a Python built-in that checks whether an object is of a given type,
similar to type(). The difference shows up with inheritance:
1. isinstance() treats an instance of a subclass as an instance of its parent class
   (it follows the inheritance relationship).
2. type() does not; it only matches the exact class.
'''
a = 2
print(isinstance(a, int))                # True
print(isinstance(a, str))                # False
print(isinstance(a, (str, int, list)))   # True: matches one of the types in the tuple

print("=======================================")

class Parent:
    pass

class Son(Parent):
    pass

print(isinstance(Parent(), Parent))  # returns True
print(type(Parent()) == Parent)      # returns True
print(isinstance(Son(), Parent))     # returns True
print(type(Son()) == Parent)         # returns False

3. Understanding self.padding = int(kernel_size/2)

Recommended reading:

Convolution layer computation details in CNNs (Zhihu): https://zhuanlan.zhihu.com/p/29119239

Convolutional neural networks: padding and the padding formula (CSDN): https://blog.csdn.net/qq_37031892/article/details/109141826

 

padding=kernel_size//2 in PyTorch (CSDN): https://blog.csdn.net/qq_36249824/article/details/107005949
In short: padding = (kernel_size - 1) / 2, so for the common 7x7, 5x5, 3x3, and 1x1 kernels the padding is 3, 2, 1, and 0. nn.Conv2d zero-pads before the convolution; torch.nn.functional.pad can be used to pad with non-zero values.

Notes summary:
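
The notes boil down to this: with stride 1, padding = kernel_size // 2 keeps the feature map size unchanged. A minimal sketch (my own, not from the linked posts) verifying this:

import torch
import torch.nn as nn

x = torch.randn(1, 3, 416, 416)
for k in (1, 3, 5, 7):
    conv = nn.Conv2d(3, 8, kernel_size=k, stride=1, padding=k // 2)
    # out = (in + 2*padding - kernel_size) / stride + 1 = in, since padding = (k - 1) // 2 for odd k
    print(k, conv(x).shape)  # every kernel size keeps the 416 x 416 spatial size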

4. torch.nn.Sequential(*args)

A sequential container. Modules are added to it in the order they are passed in the constructor; alternatively, an OrderedDict of modules can be passed in. The forward() method of Sequential accepts any input and forwards it to the first module it contains. It then chains the output to the input of each subsequent module in turn, finally returning the output of the last module.

The value a Sequential provides over manually calling a sequence of modules is that it allows treating the whole container as a single module, so that a transformation applied to the Sequential applies to each of the modules it stores (each of which is a registered submodule of the Sequential).

What is the difference between Sequential and torch.nn.ModuleList? A ModuleList is exactly what it sounds like: a list for storing modules. A Sequential, on the other hand, connects its layers in a cascading way.

from collections import OrderedDict
import torch.nn as nn

# Using Sequential to create a small model. When `model` is run,
# input will first be passed to `Conv2d(1,20,5)`. The output of
# `Conv2d(1,20,5)` will be used as the input to the first
# `ReLU`; the output of the first `ReLU` will become the input
# for `Conv2d(20,64,5)`. Finally, the output of
# `Conv2d(20,64,5)` will be used as input to the second `ReLU`
model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)

# Using Sequential with OrderedDict. This is functionally the
# same as the above code
model = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU())
]))
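
To make the ModuleList/Sequential distinction above concrete, here is a minimal sketch (my own example): a ModuleList only registers the layers, so you chain them yourself in forward(), whereas a Sequential does the chaining for you.

import torch
import torch.nn as nn

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # just a container of registered submodules; it has no forward of its own
        self.layers = nn.ModuleList([nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2)])

    def forward(self, x):
        for layer in self.layers:  # chain the layers manually
            x = layer(x)
        return x

seq = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))  # chained automatically
x = torch.randn(4, 10)
print(WithModuleList()(x).shape, seq(x).shape)  # both: torch.Size([4, 2])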


5. torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)

See PaddlePaddle's explanation of Conv2d for comparison:

 Examples:

# With square kernels and equal stride
m = nn.Conv2d(16, 33, 3, stride=2)
# non-square kernels and unequal stride and with padding
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
# non-square kernels and unequal stride and with padding and dilation
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
input = torch.randn(20, 16, 50, 100)
output = m(input)
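
For the non-square example above, the output shape can be checked against the usual convolution formula. A small sketch (my own, reusing the same parameters):

import torch
import torch.nn as nn

m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
x = torch.randn(20, 16, 50, 100)
# H_out = floor((50 + 2*4 - 3*(3-1) - 1) / 2 + 1) = 26
# W_out = floor((100 + 2*2 - 1*(5-1) - 1) / 1 + 1) = 100
print(m(x).shape)  # expected: torch.Size([20, 33, 26, 100])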

6. torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)

import torch
import torch.nn as nn
# With Learnable Parameters
m = nn.BatchNorm2d(100)
# Without Learnable Parameters
m = nn.BatchNorm2d(100, affine=False)
input = torch.randn(20, 100, 35, 45)
output = m(input)
print(output)
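
As a quick check (my own sketch, not from the original post), after BatchNorm2d each channel of the output should have roughly zero mean and unit standard deviation, computed over the (N, H, W) dimensions:

import torch
import torch.nn as nn

m = nn.BatchNorm2d(100, affine=False)
x = torch.randn(20, 100, 35, 45) * 3.0 + 5.0  # shifted and scaled input
y = m(x)
print(y.mean(dim=(0, 2, 3))[:3])  # approximately 0 per channel
print(y.std(dim=(0, 2, 3))[:3])   # approximately 1 per channel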

7. Designing Conv2dBatchLeaky (a custom layer as a building block)

Question: in practice we often want to design custom layers of our own. How do we do that?

When we need common building blocks such as nn.Conv2d, BatchNorm2d, and LeakyReLU to assemble a network, we can design a single layer called Conv2dBatchLeaky(), so that one block does the work of three.

'''
in_channels  : number of input channels
out_channels : number of output channels
kernel_size  : kernel size
stride       : stride
leaky_slope  : defaults to 0.1
'''
class Conv2dBatchLeaky(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, leaky_slope=0.1):
        super(Conv2dBatchLeaky, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        if isinstance(kernel_size, (list, tuple)):
            self.padding = [int(ii/2) for ii in kernel_size]
            if flag_yolo_structure:
                print('------------------->>>> Conv2dBatchLeaky isinstance')
        else:
            self.padding = int(kernel_size/2)
        self.leaky_slope = leaky_slope

        # Layer
        # LeakyReLU : y = max(0, x) + leaky_slope*min(0, x)
        self.layers = nn.Sequential(
            nn.Conv2d(self.in_channels, self.out_channels, self.kernel_size, self.stride, self.padding, bias=False),
            nn.BatchNorm2d(self.out_channels),
            nn.LeakyReLU(self.leaky_slope, inplace=True)
        )

    def forward(self, x):
        x = self.layers(x)
        return x
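
A minimal usage sketch (my own, assuming the Conv2dBatchLeaky class above and flag_yolo_structure = False are defined): a 3x3 block with stride 1 keeps the spatial size, while stride 2 halves it.

import torch

# assumes Conv2dBatchLeaky (and flag_yolo_structure = False) are defined as above
x = torch.randn(1, 3, 416, 416)
block1 = Conv2dBatchLeaky(3, 32, 3, 1)   # 3x3, stride 1 -> 416 x 416 x 32
block2 = Conv2dBatchLeaky(32, 64, 3, 2)  # 3x3, stride 2 -> 208 x 208 x 64
print(block1(x).shape)           # torch.Size([1, 32, 416, 416])
print(block2(block1(x)).shape)   # torch.Size([1, 64, 208, 208])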

(2) Designing ResBlockSum to implement the repeated "black box" structure in List 0 (speaking loosely, since it is hard to describe precisely)

For example, when designing list 0, the three black boxes in the figure share the same structure: two Convolutional layers plus one Residual connection, where the first Convolutional uses 32 filters of size 1x1 and the second uses 64 filters of size 3x3.

Looking at list 0, the three black boxes not only share the same structure but are also used repeatedly, so we can wrap them in a reusable module. That is what ResBlockSum is for.

class ResBlockSum(nn.Module):
    def __init__(self, nchannels):
        super().__init__()
        self.block = nn.Sequential(
            Conv2dBatchLeaky(nchannels, int(nchannels/2), 1, 1),
            Conv2dBatchLeaky(int(nchannels/2), nchannels, 3, 1)
        )

    def forward(self, x):
        return x + self.block(x)

Core implementation:

self.block = nn.Sequential(
    Conv2dBatchLeaky(nchannels, int(nchannels/2), 1, 1),
    Conv2dBatchLeaky(int(nchannels/2), nchannels, 3, 1)
)

Passing nchannels=64 and calling ResBlockSum(64) once gives

self.block = nn.Sequential(
            Conv2dBatchLeaky(64, 32, 1, 1),
            Conv2dBatchLeaky(32, 64, 3, 1)
            )

 

which implements that structure.

Passing nchannels=128 and calling ResBlockSum(128) twice gives

self.block = nn.Sequential(
            Conv2dBatchLeaky(128, 64, 1, 1),
            Conv2dBatchLeaky(64, 128, 3, 1)
            )

which implements that structure.

Passing nchannels=256 and calling ResBlockSum(256) eight times gives

self.block = nn.Sequential(
            Conv2dBatchLeaky(256, 128, 1, 1),
            Conv2dBatchLeaky(128, 256, 3, 1)
            )

which implements that structure. A quick shape check of ResBlockSum is sketched below.
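
A minimal sanity check (my own sketch, assuming Conv2dBatchLeaky and ResBlockSum are defined as above): the residual sum requires the block to preserve both spatial size and channel count, which is exactly what the 1x1-then-3x3 pair does.

import torch

# assumes Conv2dBatchLeaky and ResBlockSum are defined as above
x = torch.randn(1, 64, 208, 208)
res = ResBlockSum(64)
print(res(x).shape)  # torch.Size([1, 64, 208, 208]): same shape, so x + block(x) is valid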

(3) Designing HeadBody to implement the Convolutional Set in List 2, List 6, and List 10

 

 

class HeadBody(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(HeadBody, self).__init__()
        self.layer = nn.Sequential(
            Conv2dBatchLeaky(in_channels, out_channels, 1, 1),
            Conv2dBatchLeaky(out_channels, out_channels*2, 3, 1),
            Conv2dBatchLeaky(out_channels*2, out_channels, 1, 1),
            Conv2dBatchLeaky(out_channels, out_channels*2, 3, 1),
            Conv2dBatchLeaky(out_channels*2, out_channels, 1, 1)
        )

    def forward(self, x):
        x = self.layer(x)
        return x
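
A quick shape check of the Convolutional Set (my own sketch, assuming the classes above are defined): HeadBody(1024, 512) maps the 13 x 13 x 1024 feature map to 13 x 13 x 512, alternating 1x1 and 3x3 convolutions without changing the spatial size.

import torch

# assumes Conv2dBatchLeaky and HeadBody are defined as above
x = torch.randn(1, 1024, 13, 13)
head_body = HeadBody(in_channels=1024, out_channels=512)
print(head_body(x).shape)  # torch.Size([1, 512, 13, 13])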

(4) Implementing Upsample (the network upsamples twice)

Recommended reading:

interpolate API docs (PaddlePaddle), which resizes the images in a batch: https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/functional/interpolate_cn.html

torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None, recompute_scale_factor=None)

torch.nn.functional.interpolate (PyTorch 1.10 documentation): https://pytorch.org/docs/1.10/generated/torch.nn.functional.interpolate.html?highlight=f%20interpolate#torch.nn.functional.interpolate

Getting to know torch's Upsample API:

import torch
import torch.nn as nn

# output_shape = [64, 48]
# up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
up = nn.Upsample(scale_factor=2)
input = torch.rand(32, 17, 32, 24)
output = up(input)
print(output.shape)

import torch
import torch.nn.functional as F

input = torch.rand(32, 17, 32, 24)
output = F.interpolate(input,scale_factor=2)
print(output.shape)

class Upsample(nn.Module):
    # Custom Upsample layer (nn.Upsample gives a deprecated warning message)
    def __init__(self, scale_factor=1, mode='nearest'):
        super(Upsample, self).__init__()
        self.scale_factor = scale_factor
        self.mode = mode

    def forward(self, x):
        return F.interpolate(x, scale_factor=self.scale_factor, mode=self.mode)
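
A minimal usage sketch of the custom Upsample wrapper (my own, assuming the class above is defined): it simply delegates to F.interpolate, doubling the spatial resolution when scale_factor=2.

import torch

# assumes the custom Upsample class above is defined
up = Upsample(scale_factor=2)
x = torch.randn(1, 256, 13, 13)
print(up(x).shape)  # torch.Size([1, 256, 26, 26]): 13x13 doubled to 26x26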

(5) Implementing YOLOLayer to obtain the anchors and num_classes for each object scale

# default anchors=[(10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198), (373,326)]
class YOLOLayer(nn.Module):
    def __init__(self, anchors, nC):
        super(YOLOLayer, self).__init__()
        self.anchors = torch.FloatTensor(anchors)
        self.nA = len(anchors)  # number of anchors (3)
        self.nC = nC  # number of classes
        self.img_size = 0
        if flag_yolo_structure:
            print('init YOLOLayer ------ >>> ')
            print('anchors  : ', self.anchors)
            print('nA       : ', self.nA)
            print('nC       : ', self.nC)
            print('img_size : ', self.img_size)

    def forward(self, p, img_size, var=None):
        # p : feature map
        bs, nG = p.shape[0], p.shape[-1]  # batch_size, grid
        if flag_yolo_structure:
            print('bs, nG --->>> ', bs, nG)
        if self.img_size != img_size:
            create_grids(self, img_size, nG, p.device)

        # p.view(bs, 255, 13, 13) --> (bs, 3, 13, 13, 85)
        # (bs, anchors, grid, grid, xywh + confidence + classes)
        p = p.view(bs, self.nA, self.nC + 5, nG, nG).permute(0, 1, 3, 4, 2).contiguous()  # prediction

        if self.training:
            return p
        else:  # inference
            io = p.clone()  # inference output
            io[..., 0:2] = torch.sigmoid(io[..., 0:2]) + self.grid_xy  # xy
            io[..., 2:4] = torch.exp(io[..., 2:4]) * self.anchor_wh  # wh yolo method
            io[..., 4:] = torch.sigmoid(io[..., 4:])  # p_conf, p_cls
            io[..., :4] *= self.stride
            if self.nC == 1:
                io[..., 5] = 1  # single-class model
            # flatten prediction, reshape from [bs, nA, nG, nG, nC] to [bs, nA * nG * nG, nC]
            return io.view(bs, -1, 5 + self.nC), p


def create_grids(self, img_size, nG, device='cpu'):
    # self.nA : len(anchors)  # number of anchors (3)
    # self.nC : nC  # number of classes
    # nG : feature map grid  13*13  26*26  52*52
    self.img_size = img_size
    self.stride = img_size / nG
    if flag_yolo_structure:
        print('create_grids stride : ', self.stride)

    # build xy offsets
    grid_x = torch.arange(nG).repeat((nG, 1)).view((1, 1, nG, nG)).float()
    grid_y = grid_x.permute(0, 1, 3, 2)
    self.grid_xy = torch.stack((grid_x, grid_y), 4).to(device)
    if flag_yolo_structure:
        print('grid_x : ', grid_x.size(), grid_x)
        print('grid_y : ', grid_y.size(), grid_y)
        print('grid_xy : ', self.grid_xy.size(), self.grid_xy)

    # build wh gains
    self.anchor_vec = self.anchors.to(device) / self.stride  # normalized by the stride
    # print('self.anchor_vec:', self.anchor_vec)
    self.anchor_wh = self.anchor_vec.view(1, self.nA, 1, 1, 2).to(device)
    self.nG = torch.FloatTensor([nG]).to(device)


def get_yolo_layer_index(module_list):
    yolo_layer_index = []
    for index, l in enumerate(module_list):
        try:
            a = l[0].img_size and l[0].nG  # only yolo layers have img_size and nG
            yolo_layer_index.append(index)
        except:
            pass
    assert len(yolo_layer_index) > 0, "can not find yolo layer"
    return yolo_layer_index
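
The key step in YOLOLayer.forward is the reshape of the raw head output. A standalone sketch of that view/permute (my own, independent of the class) for a 13x13 grid, 3 anchors, and 80 classes:

import torch

bs, nA, nC, nG = 2, 3, 80, 13
p = torch.randn(bs, nA * (nC + 5), nG, nG)  # raw head output: (2, 255, 13, 13)
p = p.view(bs, nA, nC + 5, nG, nG).permute(0, 1, 3, 4, 2).contiguous()
print(p.shape)                        # (2, 3, 13, 13, 85): anchors, grid, grid, xywh + conf + classes
print(p.view(bs, -1, 5 + nC).shape)   # flattened for inference: (2, 507, 85), where 507 = 3 * 13 * 13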

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>Getting the anchors and num_classes for large objects>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

[1] Implementing List 0

Recommended reading:

Pytorch: quickly building networks with building blocks (CSDN): https://blog.csdn.net/w55100/article/details/89083776

1. Understanding OrderedDict (see the article above; running the example gives the result below)

import torch
import torch.nn as nn
from collections import OrderedDict

layer_list = [
    # Sequence 1 :
    OrderedDict([
        ('1_conv2d', nn.Conv2d(32, 64, 1, 1)),
        ('2_Relu', nn.ReLU(inplace=True)),
    ]),
    # Sequence 2 :
    OrderedDict([
        ('3_conv2d', nn.Conv2d((4 * 64) + 1024, 1024, 3, 1)),
        ('4_bn', nn.BatchNorm2d(1024, 20, 1, 1, 0)),
    ]),
]

sequence_list = [nn.Sequential(layer_dict) for layer_dict in layer_list]
print(sequence_list)

>>>>>>>>>>>>>>>>>>>>>>Ready: building the List 0 structure>>>>>>>>>>>>>>>>>>>>>>>

# list 0
layer_list.append(OrderedDict([
    ('0_stage1_conv', Conv2dBatchLeaky(3, 32, 3, 1, 1)),   # 416 x 416 x 32   # Convolutional
    ("0_stage2_conv", Conv2dBatchLeaky(32, 64, 3, 2)),     # 208 x 208 x 64   # Convolutional
    ("0_stage2_ressum1", ResBlockSum(64)),                                    # Convolutional*2 + Residual
    ("0_stage3_conv", Conv2dBatchLeaky(64, 128, 3, 2)),    # 104 x 104 x 128  # Convolutional
    ("0_stage3_ressum1", ResBlockSum(128)),
    ("0_stage3_ressum2", ResBlockSum(128)),                                   # (Convolutional*2 + Residual)**2
    ("0_stage4_conv", Conv2dBatchLeaky(128, 256, 3, 2)),   # 52 x 52 x 256    # Convolutional
    ("0_stage4_ressum1", ResBlockSum(256)),
    ("0_stage4_ressum2", ResBlockSum(256)),
    ("0_stage4_ressum3", ResBlockSum(256)),
    ("0_stage4_ressum4", ResBlockSum(256)),
    ("0_stage4_ressum5", ResBlockSum(256)),
    ("0_stage4_ressum6", ResBlockSum(256)),
    ("0_stage4_ressum7", ResBlockSum(256)),
    ("0_stage4_ressum8", ResBlockSum(256)),  # 52 x 52 x 256 output_feature_0  # (Convolutional*2 + Residual)**8
]))
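
A quick way to confirm the 52 x 52 x 256 output feature of list 0 (my own sketch, assuming Conv2dBatchLeaky and ResBlockSum are defined; list0_dict is a hypothetical name for the OrderedDict appended above): the three stride-2 convolutions take 416 down to 416 / 2 / 2 / 2 = 52.

import torch
import torch.nn as nn

# list0_dict is assumed to be the OrderedDict appended above
stage0 = nn.Sequential(list0_dict)
x = torch.randn(1, 3, 416, 416)
print(stage0(x).shape)  # torch.Size([1, 256, 52, 52]): output_feature_0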

[2] Implementing List 1

>>>>>>>>>>>>>>>>>>>>>>Ready: building the List 1 structure>>>>>>>>>>>>>>>>>>>>>>>

# list 1
layer_list.append(OrderedDict([("1_stage5_conv", Conv2dBatchLeaky(256, 512, 3, 2)),  # 26 x 26 x 512         # Convolutional("1_stage5_ressum1", ResBlockSum(512)),("1_stage5_ressum2", ResBlockSum(512)),("1_stage5_ressum3", ResBlockSum(512)),("1_stage5_ressum4", ResBlockSum(512)),("1_stage5_ressum5", ResBlockSum(512)),("1_stage5_ressum6", ResBlockSum(512)),("1_stage5_ressum7", ResBlockSum(512)),("1_stage5_ressum8", ResBlockSum(512)),  # 26 x 26 x 512 output_feature_1     # (Convolutional*2 + Resiudal)**8]))

[3] Implementing List 2

>>>>>>>>>>>>>>>>>>>>>>Ready: building the List 2 structure>>>>>>>>>>>>>>>>>>>>>>>

# list 2
layer_list.append(OrderedDict([("2_stage6_conv", Conv2dBatchLeaky(512, 1024, 3, 2)),  # 13 x 13 x 1024      # Convolutional("2_stage6_ressum1", ResBlockSum(1024)),("2_stage6_ressum2", ResBlockSum(1024)),("2_stage6_ressum3", ResBlockSum(1024)),("2_stage6_ressum4", ResBlockSum(1024)),  # 13 x 13 x 1024 output_feature_2 # (Convolutional*2 + Resiudal)**4("2_headbody1", HeadBody(in_channels=1024, out_channels=512)), # 13 x 13 x 512  # Convalutional Set = Conv2dBatchLeaky * 5]))

[4] Implementing List 3

>>>>>>>>>>>>>>>>>>>>>>Ready: building the List 3 structure>>>>>>>>>>>>>>>>>>>>>>>

# list 3
layer_list.append(OrderedDict([("3_conv_1", Conv2dBatchLeaky(in_channels=512, out_channels=1024, kernel_size=3, stride=1)),("3_conv_2", nn.Conv2d(in_channels=1024, out_channels=len(anchor_mask1) * (num_classes + 5), kernel_size=1, stride=1, padding=0, bias=True)),
])) # predict one

[5] Implementing List 4: the anchors and num_classes for large objects

# list 4
layer_list.append(OrderedDict([("4_yolo", YOLOLayer([anchors[i] for i in anchor_mask1], num_classes))
])) # 3*((x, y, w, h, confidence) + classes )
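
The out_channels of the 1x1 prediction conv in list 3 and the comment after list 4 come from the same arithmetic. A tiny sketch (my own) with the default 80 COCO classes:

num_classes = 80
anchor_mask1 = [6, 7, 8]  # the 3 large-object anchors
out_channels = len(anchor_mask1) * (num_classes + 5)
print(out_channels)  # 255 = 3 * ((x, y, w, h, confidence) + 80 classes)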

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>Getting the anchors and num_classes for medium objects>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

[1] Implementing List 5

# list 5
layer_list.append(OrderedDict([("5_conv", Conv2dBatchLeaky(512, 256, 1, 1)),("5_upsample", Upsample(scale_factor=2)),
]))

[2] Implementing List 6

# list 6
layer_list.append(OrderedDict([("6_head_body2", HeadBody(in_channels=768, out_channels=256)) # Convalutional Set = Conv2dBatchLeaky * 5
]))

[3] Implementing List 7

# list 7
layer_list.append(OrderedDict([("7_conv_1", Conv2dBatchLeaky(in_channels=256, out_channels=512, kernel_size=3, stride=1)),("7_conv_2", nn.Conv2d(in_channels=512, out_channels=len(anchor_mask2) * (num_classes + 5), kernel_size=1, stride=1, padding=0, bias=True)),
])) # predict two

[4] Implementing List 8: the anchors and num_classes for medium objects

# list 8
layer_list.append(OrderedDict([("8_yolo", YOLOLayer([anchors[i] for i in anchor_mask2], num_classes))
])) # 3*((x, y, w, h, confidence) + classes )

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>Getting the anchors and num_classes for small objects>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

[1] Implementing List 9

# list 9
layer_list.append(OrderedDict([("9_conv", Conv2dBatchLeaky(256, 128, 1, 1)),("9_upsample", Upsample(scale_factor=2)),
]))

[2] Implementing List 10

# list 10
layer_list.append(OrderedDict([("10_head_body3", HeadBody(in_channels=384, out_channels=128))  # Convalutional Set = Conv2dBatchLeaky * 5
]))

[3] Implementing List 11

# list 11
layer_list.append(OrderedDict([("11_conv_1", Conv2dBatchLeaky(in_channels=128, out_channels=256, kernel_size=3, stride=1)),("11_conv_2", nn.Conv2d(in_channels=256, out_channels=len(anchor_mask3) * (num_classes + 5), kernel_size=1, stride=1, padding=0, bias=True)),
])) # predict three

[4] Implementing List 12: the anchors and num_classes for small objects

# list 12
layer_list.append(OrderedDict([("12_yolo", YOLOLayer([anchors[i] for i in anchor_mask3], num_classes))
])) # 3*((x, y, w, h, confidence) + classes )

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>Combining List 0 - List 12>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

nn.ModuleList is similar to Python's list type: it simply stores a sequence of layers in a list. It does not implement forward(), so by itself it has no effect on the network model.

# nn.ModuleList is similar to Python's list type: it just stores a sequence of layers;
# it does not implement forward(), so by itself it has no effect on the network model
self.module_list = nn.ModuleList([nn.Sequential(i) for i in layer_list])
self.yolo_layer_index = get_yolo_layer_index(self.module_list)
if flag_yolo_structure:
    print('yolo_layer : ', len(layer_list), '\n')
    print(self.module_list[4])
    print(self.module_list[8])
    print(self.module_list[12])

Implement yolov3's forward() function, assembling the list structures stored in module_list into the network model like stacking building blocks.

    def forward(self, x):
        img_size = x.shape[-1]
        if flag_yolo_structure:
            print('forward img_size : ', img_size, x.shape)
        output = []

        x = self.module_list[0](x)
        x_route1 = x
        x = self.module_list[1](x)
        x_route2 = x
        x = self.module_list[2](x)

        yolo_head = self.module_list[3](x)
        if flag_yolo_structure:
            print('mask1 yolo_head : ', yolo_head.size())
        yolo_head_out_13x13 = self.module_list[4][0](yolo_head, img_size)
        output.append(yolo_head_out_13x13)

        x = self.module_list[5](x)
        x = torch.cat([x, x_route2], 1)
        x = self.module_list[6](x)

        yolo_head = self.module_list[7](x)
        if flag_yolo_structure:
            print('mask2 yolo_head : ', yolo_head.size())
        yolo_head_out_26x26 = self.module_list[8][0](yolo_head, img_size)
        output.append(yolo_head_out_26x26)

        x = self.module_list[9](x)
        x = torch.cat([x, x_route1], 1)
        x = self.module_list[10](x)

        yolo_head = self.module_list[11](x)
        if flag_yolo_structure:
            print('mask3 yolo_head : ', yolo_head.size())
        yolo_head_out_52x52 = self.module_list[12][0](yolo_head, img_size)
        output.append(yolo_head_out_52x52)

        if self.training:
            return output
        else:
            io, p = list(zip(*output))  # inference output, training output
            return torch.cat(io, 1), p
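
The in_channels of the two HeadBody modules (768 in list 6, 384 in list 10) follow directly from the route concatenations in forward(). A small sketch of that channel arithmetic (my own, with dummy tensors standing in for the real feature maps):

import torch

# my own sketch of the two route concatenations in forward()
up_26 = torch.randn(1, 256, 26, 26)   # output of module_list[5] (conv 512->256 + upsample)
route2 = torch.randn(1, 512, 26, 26)  # x_route2, the 26 x 26 x 512 feature from list 1
print(torch.cat([up_26, route2], 1).shape)  # (1, 768, 26, 26) -> HeadBody(in_channels=768)

up_52 = torch.randn(1, 128, 52, 52)   # output of module_list[9] (conv 256->128 + upsample)
route1 = torch.randn(1, 256, 52, 52)  # x_route1, the 52 x 52 x 256 feature from list 0
print(torch.cat([up_52, route1], 1).shape)  # (1, 384, 52, 52) -> HeadBody(in_channels=384)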

Full code

import os
import numpy as np
from collections import OrderedDict

import torch
import torch.nn.functional as F
import torch.nn as nn

flag_yolo_structure = False  # set True to print the network structure log

'''
in_channels  : number of input channels
out_channels : number of output channels
kernel_size  : kernel size
stride       : stride
leaky_slope  : defaults to 0.1
'''
class Conv2dBatchLeaky(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, leaky_slope=0.1):
        super(Conv2dBatchLeaky, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        if isinstance(kernel_size, (list, tuple)):
            self.padding = [int(ii/2) for ii in kernel_size]
            if flag_yolo_structure:
                print('------------------->>>> Conv2dBatchLeaky isinstance')
        else:
            self.padding = int(kernel_size/2)
        self.leaky_slope = leaky_slope

        # Layer
        # LeakyReLU : y = max(0, x) + leaky_slope*min(0, x)
        self.layers = nn.Sequential(
            nn.Conv2d(self.in_channels, self.out_channels, self.kernel_size, self.stride, self.padding, bias=False),
            nn.BatchNorm2d(self.out_channels),
            nn.LeakyReLU(self.leaky_slope, inplace=True)
        )

    def forward(self, x):
        x = self.layers(x)
        return x


class ResBlockSum(nn.Module):
    def __init__(self, nchannels):
        super().__init__()
        self.block = nn.Sequential(
            Conv2dBatchLeaky(nchannels, int(nchannels/2), 1, 1),
            Conv2dBatchLeaky(int(nchannels/2), nchannels, 3, 1)
        )

    def forward(self, x):
        return x + self.block(x)


class HeadBody(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(HeadBody, self).__init__()
        self.layer = nn.Sequential(
            Conv2dBatchLeaky(in_channels, out_channels, 1, 1),
            Conv2dBatchLeaky(out_channels, out_channels*2, 3, 1),
            Conv2dBatchLeaky(out_channels*2, out_channels, 1, 1),
            Conv2dBatchLeaky(out_channels, out_channels*2, 3, 1),
            Conv2dBatchLeaky(out_channels*2, out_channels, 1, 1)
        )

    def forward(self, x):
        x = self.layer(x)
        return x


class Upsample(nn.Module):
    # Custom Upsample layer (nn.Upsample gives a deprecated warning message)
    def __init__(self, scale_factor=1, mode='nearest'):
        super(Upsample, self).__init__()
        self.scale_factor = scale_factor
        self.mode = mode

    def forward(self, x):
        return F.interpolate(x, scale_factor=self.scale_factor, mode=self.mode)


# default anchors=[(10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198), (373,326)]
class YOLOLayer(nn.Module):
    def __init__(self, anchors, nC):
        super(YOLOLayer, self).__init__()
        self.anchors = torch.FloatTensor(anchors)
        self.nA = len(anchors)  # number of anchors (3)
        self.nC = nC  # number of classes
        self.img_size = 0
        if flag_yolo_structure:
            print('init YOLOLayer ------ >>> ')
            print('anchors  : ', self.anchors)
            print('nA       : ', self.nA)
            print('nC       : ', self.nC)
            print('img_size : ', self.img_size)

    def forward(self, p, img_size, var=None):
        # p : feature map
        bs, nG = p.shape[0], p.shape[-1]  # batch_size, grid
        if flag_yolo_structure:
            print('bs, nG --->>> ', bs, nG)
        if self.img_size != img_size:
            create_grids(self, img_size, nG, p.device)

        # p.view(bs, 255, 13, 13) --> (bs, 3, 13, 13, 85)
        # (bs, anchors, grid, grid, xywh + confidence + classes)
        p = p.view(bs, self.nA, self.nC + 5, nG, nG).permute(0, 1, 3, 4, 2).contiguous()  # prediction

        if self.training:
            return p
        else:  # inference
            io = p.clone()  # inference output
            io[..., 0:2] = torch.sigmoid(io[..., 0:2]) + self.grid_xy  # xy
            io[..., 2:4] = torch.exp(io[..., 2:4]) * self.anchor_wh  # wh yolo method
            io[..., 4:] = torch.sigmoid(io[..., 4:])  # p_conf, p_cls
            io[..., :4] *= self.stride
            if self.nC == 1:
                io[..., 5] = 1  # single-class model
            # flatten prediction, reshape from [bs, nA, nG, nG, nC] to [bs, nA * nG * nG, nC]
            return io.view(bs, -1, 5 + self.nC), p


def create_grids(self, img_size, nG, device='cpu'):
    # self.nA : len(anchors)  # number of anchors (3)
    # self.nC : nC  # number of classes
    # nG : feature map grid  13*13  26*26  52*52
    self.img_size = img_size
    self.stride = img_size / nG
    if flag_yolo_structure:
        print('create_grids stride : ', self.stride)

    # build xy offsets
    grid_x = torch.arange(nG).repeat((nG, 1)).view((1, 1, nG, nG)).float()
    grid_y = grid_x.permute(0, 1, 3, 2)
    self.grid_xy = torch.stack((grid_x, grid_y), 4).to(device)
    if flag_yolo_structure:
        print('grid_x : ', grid_x.size(), grid_x)
        print('grid_y : ', grid_y.size(), grid_y)
        print('grid_xy : ', self.grid_xy.size(), self.grid_xy)

    # build wh gains
    self.anchor_vec = self.anchors.to(device) / self.stride  # normalized by the stride
    # print('self.anchor_vec:', self.anchor_vec)
    self.anchor_wh = self.anchor_vec.view(1, self.nA, 1, 1, 2).to(device)
    self.nG = torch.FloatTensor([nG]).to(device)


def get_yolo_layer_index(module_list):
    yolo_layer_index = []
    for index, l in enumerate(module_list):
        try:
            a = l[0].img_size and l[0].nG  # only yolo layers have img_size and nG
            yolo_layer_index.append(index)
        except:
            pass
    assert len(yolo_layer_index) > 0, "can not find yolo layer"
    return yolo_layer_index


class Yolov3(nn.Module):
    '''
    There are 9 anchors, in three sizes: small, medium, and large.
    small  : (10,13), (16,30), (33,23)
    medium : (30,61), (62,45), (59,119)
    large  : (116,90), (156,198), (373,326)
    Why three sizes? Because anchors have to cover different aspect ratios: some objects are
    tall (height greater than width), some are wide, and some are roughly square. The YOLO
    authors derived these commonly used anchors and aspect ratios empirically. Faces, for
    example, appear at varying angles, so their boxes can likewise be tall, wide, or close
    to square.
    '''
    def __init__(self, num_classes=80,
                 anchors=[(10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119),
                          (116, 90), (156, 198), (373, 326)]):
        super().__init__()
        '''
        anchor_mask1 : large-object anchors  [6, 7, 8] ---> anchors[6] anchors[7] anchors[8] ---> (116,90), (156,198), (373,326)
        anchor_mask2 : medium-object anchors [3, 4, 5] ---> anchors[3] anchors[4] anchors[5] ---> (30,61), (62,45), (59,119)
        anchor_mask3 : small-object anchors  [0, 1, 2] ---> anchors[0] anchors[1] anchors[2] ---> (10,13), (16,30), (33,23)
        '''
        anchor_mask1 = [i for i in range(2 * len(anchors) // 3, len(anchors), 1)]       # [6, 7, 8]
        anchor_mask2 = [i for i in range(len(anchors) // 3, 2 * len(anchors) // 3, 1)]  # [3, 4, 5]
        anchor_mask3 = [i for i in range(0, len(anchors) // 3, 1)]                      # [0, 1, 2]
        if flag_yolo_structure:
            print('anchor_mask1 : ', anchor_mask1)  # large-object anchors
            print('anchor_mask2 : ', anchor_mask2)  # medium-object anchors
            print('anchor_mask3 : ', anchor_mask3)  # small-object anchors

        # Network
        # OrderedDict is a subclass of dict; its key feature is that it preserves the order in which key-value pairs are added
        layer_list = []

        '''
        ******      Conv2dBatchLeaky      ******
        op     : Conv2d, BatchNorm2d, LeakyReLU
        inputs : in_channels, out_channels, kernel_size, stride, leaky_slope
        '''
        '''
        ******      ResBlockSum      ******
        op     : Conv2dBatchLeaky * 2 + x
        inputs : nchannels
        '''
        # list 0
        layer_list.append(OrderedDict([
            ('0_stage1_conv', Conv2dBatchLeaky(3, 32, 3, 1, 1)),   # 416 x 416 x 32   # Convolutional
            ("0_stage2_conv", Conv2dBatchLeaky(32, 64, 3, 2)),     # 208 x 208 x 64   # Convolutional
            ("0_stage2_ressum1", ResBlockSum(64)),                                    # Convolutional*2 + Residual
            ("0_stage3_conv", Conv2dBatchLeaky(64, 128, 3, 2)),    # 104 x 104 x 128  # Convolutional
            ("0_stage3_ressum1", ResBlockSum(128)),
            ("0_stage3_ressum2", ResBlockSum(128)),                                   # (Convolutional*2 + Residual)**2
            ("0_stage4_conv", Conv2dBatchLeaky(128, 256, 3, 2)),   # 52 x 52 x 256    # Convolutional
            ("0_stage4_ressum1", ResBlockSum(256)),
            ("0_stage4_ressum2", ResBlockSum(256)),
            ("0_stage4_ressum3", ResBlockSum(256)),
            ("0_stage4_ressum4", ResBlockSum(256)),
            ("0_stage4_ressum5", ResBlockSum(256)),
            ("0_stage4_ressum6", ResBlockSum(256)),
            ("0_stage4_ressum7", ResBlockSum(256)),
            ("0_stage4_ressum8", ResBlockSum(256)),  # 52 x 52 x 256 output_feature_0  # (Convolutional*2 + Residual)**8
        ]))

        # list 1
        layer_list.append(OrderedDict([
            ("1_stage5_conv", Conv2dBatchLeaky(256, 512, 3, 2)),  # 26 x 26 x 512  # Convolutional
            ("1_stage5_ressum1", ResBlockSum(512)),
            ("1_stage5_ressum2", ResBlockSum(512)),
            ("1_stage5_ressum3", ResBlockSum(512)),
            ("1_stage5_ressum4", ResBlockSum(512)),
            ("1_stage5_ressum5", ResBlockSum(512)),
            ("1_stage5_ressum6", ResBlockSum(512)),
            ("1_stage5_ressum7", ResBlockSum(512)),
            ("1_stage5_ressum8", ResBlockSum(512)),  # 26 x 26 x 512 output_feature_1  # (Convolutional*2 + Residual)**8
        ]))

        '''
        ******      HeadBody      ******
        op     : Conv2dBatchLeaky * 5
        inputs : in_channels, out_channels
        '''
        # list 2
        layer_list.append(OrderedDict([
            ("2_stage6_conv", Conv2dBatchLeaky(512, 1024, 3, 2)),  # 13 x 13 x 1024  # Convolutional
            ("2_stage6_ressum1", ResBlockSum(1024)),
            ("2_stage6_ressum2", ResBlockSum(1024)),
            ("2_stage6_ressum3", ResBlockSum(1024)),
            ("2_stage6_ressum4", ResBlockSum(1024)),  # 13 x 13 x 1024 output_feature_2  # (Convolutional*2 + Residual)**4
            ("2_headbody1", HeadBody(in_channels=1024, out_channels=512)),  # 13 x 13 x 512  # Convolutional Set = Conv2dBatchLeaky * 5
        ]))

        # list 3
        layer_list.append(OrderedDict([
            ("3_conv_1", Conv2dBatchLeaky(in_channels=512, out_channels=1024, kernel_size=3, stride=1)),
            ("3_conv_2", nn.Conv2d(in_channels=1024, out_channels=len(anchor_mask1) * (num_classes + 5),
                                   kernel_size=1, stride=1, padding=0, bias=True)),
        ]))  # predict one

        # list 4
        layer_list.append(OrderedDict([
            ("4_yolo", YOLOLayer([anchors[i] for i in anchor_mask1], num_classes))
        ]))  # 3*((x, y, w, h, confidence) + classes)

        # list 5
        layer_list.append(OrderedDict([
            ("5_conv", Conv2dBatchLeaky(512, 256, 1, 1)),
            ("5_upsample", Upsample(scale_factor=2)),
        ]))

        # list 6
        layer_list.append(OrderedDict([
            ("6_head_body2", HeadBody(in_channels=768, out_channels=256))  # Convolutional Set = Conv2dBatchLeaky * 5
        ]))

        # list 7
        layer_list.append(OrderedDict([
            ("7_conv_1", Conv2dBatchLeaky(in_channels=256, out_channels=512, kernel_size=3, stride=1)),
            ("7_conv_2", nn.Conv2d(in_channels=512, out_channels=len(anchor_mask2) * (num_classes + 5),
                                   kernel_size=1, stride=1, padding=0, bias=True)),
        ]))  # predict two

        # list 8
        layer_list.append(OrderedDict([
            ("8_yolo", YOLOLayer([anchors[i] for i in anchor_mask2], num_classes))
        ]))  # 3*((x, y, w, h, confidence) + classes)

        # list 9
        layer_list.append(OrderedDict([
            ("9_conv", Conv2dBatchLeaky(256, 128, 1, 1)),
            ("9_upsample", Upsample(scale_factor=2)),
        ]))

        # list 10
        layer_list.append(OrderedDict([
            ("10_head_body3", HeadBody(in_channels=384, out_channels=128))  # Convolutional Set = Conv2dBatchLeaky * 5
        ]))

        # list 11
        layer_list.append(OrderedDict([
            ("11_conv_1", Conv2dBatchLeaky(in_channels=128, out_channels=256, kernel_size=3, stride=1)),
            ("11_conv_2", nn.Conv2d(in_channels=256, out_channels=len(anchor_mask3) * (num_classes + 5),
                                    kernel_size=1, stride=1, padding=0, bias=True)),
        ]))  # predict three

        # list 12
        layer_list.append(OrderedDict([
            ("12_yolo", YOLOLayer([anchors[i] for i in anchor_mask3], num_classes))
        ]))  # 3*((x, y, w, h, confidence) + classes)

        # nn.ModuleList is similar to Python's list type: it just stores a sequence of layers;
        # it does not implement forward(), so by itself it has no effect on the network model
        self.module_list = nn.ModuleList([nn.Sequential(i) for i in layer_list])
        self.yolo_layer_index = get_yolo_layer_index(self.module_list)
        if flag_yolo_structure:
            print('yolo_layer : ', len(layer_list), '\n')
            print(self.module_list[4])
            print(self.module_list[8])
            print(self.module_list[12])
            # print('self.module_list  -------->>> ', self.module_list)
            # print('self.yolo_layer_index  -------->>> ', self.yolo_layer_index)

    def forward(self, x):
        img_size = x.shape[-1]
        if flag_yolo_structure:
            print('forward img_size : ', img_size, x.shape)
        output = []

        x = self.module_list[0](x)
        x_route1 = x
        x = self.module_list[1](x)
        x_route2 = x
        x = self.module_list[2](x)

        yolo_head = self.module_list[3](x)
        if flag_yolo_structure:
            print('mask1 yolo_head : ', yolo_head.size())
        yolo_head_out_13x13 = self.module_list[4][0](yolo_head, img_size)
        output.append(yolo_head_out_13x13)

        x = self.module_list[5](x)
        x = torch.cat([x, x_route2], 1)
        x = self.module_list[6](x)

        yolo_head = self.module_list[7](x)
        if flag_yolo_structure:
            print('mask2 yolo_head : ', yolo_head.size())
        yolo_head_out_26x26 = self.module_list[8][0](yolo_head, img_size)
        output.append(yolo_head_out_26x26)

        x = self.module_list[9](x)
        x = torch.cat([x, x_route1], 1)
        x = self.module_list[10](x)

        yolo_head = self.module_list[11](x)
        if flag_yolo_structure:
            print('mask3 yolo_head : ', yolo_head.size())
        yolo_head_out_52x52 = self.module_list[12][0](yolo_head, img_size)
        output.append(yolo_head_out_52x52)

        if self.training:
            return output
        else:
            io, p = list(zip(*output))  # inference output, training output
            return torch.cat(io, 1), p


if __name__ == "__main__":
    dummy_input = torch.Tensor(5, 3, 416, 416)
    model = Yolov3(num_classes=80)

    params = list(model.parameters())
    k = 0
    for i in params:
        l = 1
        for j in i.size():
            l *= j
        # print("layer shape: {}, parameter count: {}".format(str(list(i.size())), str(l)))
        k = k + l
    print("----------------------")
    print("total number of parameters: " + str(k))

    print("-----------yolo layer")
    for index in model.yolo_layer_index:
        print(model.module_list[index])

    print("-----------train")
    model.train()
    for res in model(dummy_input):
        print("res:", np.shape(res))

    print("-----------eval")
    model.eval()
    inference_out, train_out = model(dummy_input)
    print("inference_out:", np.shape(inference_out))
    for o in train_out:
        print("train_out:", np.shape(o))

If you like this post, please give it a like 👍 and bookmark it; that keeps me motivated to write more!

