Weekly Deep Learning Study Notes J5 (DenseNet-121 + SE in Practice and Analysis: Monkeypox Recognition)

  • 🍨 This post is a learning-record entry from the 🔗365-day deep learning training camp
  • 🍖 Original author: K同学啊 | tutoring and custom projects available

0. Summary

Data import and preprocessing: this time the data do not come from a torchvision built-in dataset, so the raw data must be processed by hand: importing the images, inspecting the class layout, defining transforms, converting data types, and so on.

Splitting the dataset: after carving out the training and test sets, load each with DataLoader() from torch.utils.data and check the batch dimensions.

Model construction: DenseNet-121 + SE module.

Setting hyperparameters: before training, define the loss function, the learning rate (a dynamic learning rate here), and an optimizer driven by that learning rate (e.g. SGD, stochastic gradient descent) to update the parameters and minimize the loss during training; a sketch follows below.
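A minimal sketch of this setup. The initial learning rate and the decay rule (shrinking every two epochs) are placeholder assumptions for illustration, not values taken from this run; `model` is the network built in section 4.

```python
loss_fn = nn.CrossEntropyLoss()               # classification loss
learn_rate = 1e-4                             # hypothetical initial learning rate
lambda1 = lambda epoch: 0.92 ** (epoch // 2)  # hypothetical decay rule: shrink every 2 epochs
optimizer = torch.optim.SGD(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1)  # dynamic learning rate
```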

Defining the training function: it takes four arguments: the configured DataLoader(), the model, the loss function, and the optimizer. Inside, initialize the loss and accuracy to 0, then loop: fetch one batch from the DataLoader(), run it through the model to get predictions, and compute the loss with the loss function. Then backpropagate and let the optimizer update the parameters. Zeroing the gradients may go either before backpropagation or after the optimizer step; placing it before backpropagation is the usual default. See the sketch below.
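A sketch of the training function as described; `device` is assumed to be defined as in section 1, and the imports come from the code block further down.

```python
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # total number of training samples
    num_batches = len(dataloader)    # number of batches
    train_loss, train_acc = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        pred = model(X)              # forward pass
        loss = loss_fn(pred, y)      # compute the loss
        optimizer.zero_grad()        # clear stale gradients (the usual default position)
        loss.backward()              # backpropagation
        optimizer.step()             # update the parameters
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
    return train_acc / size, train_loss / num_batches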

Defining the test function: compared with the training function it drops the optimizer, taking only the configured DataLoader(), the model, and the loss function. Aside from omitting gradient zeroing, backpropagation, and the optimizer step for each batch, it mirrors the training function, as sketched below.
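The matching evaluation sketch; wrapping the loop in torch.no_grad() skips gradient tracking entirely.

```python
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, test_acc = 0, 0
    with torch.no_grad():            # no gradients needed during evaluation
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            test_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
    return test_acc / size, test_loss / num_batches
```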

Training loop: set the number of epochs (each epoch is one full pass over the dataset) and initialize four empty lists to hold the per-epoch training and test accuracy and loss. Call model.train() to enter training mode and run the training function; call model.eval() to switch to evaluation mode and run the test function. Append the results to the corresponding lists and print them together, giving the accuracy and loss after each full pass. A sketch follows.
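A sketch of that loop, using the train/test functions above; the epoch count is an arbitrary placeholder.

```python
epochs = 20  # hypothetical epoch count
train_loss, train_acc, test_loss, test_acc = [], [], [], []
for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    print(f'Epoch:{epoch+1:2d}, Train_acc:{epoch_train_acc*100:.1f}%, Train_loss:{epoch_train_loss:.3f}, '
          f'Test_acc:{epoch_test_acc*100:.1f}%, Test_loss:{epoch_test_loss:.3f}')
```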

Visualizing the results.

Saving, loading, and using the model: in PyTorch, the usual pattern is torch.save(model.state_dict(), 'model.pth') to save the parameters and model.load_state_dict(torch.load('model.pth')) to load them back, as below.
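A short save/restore sketch; the map_location argument is an optional safeguard when loading onto a different device.

```python
torch.save(model.state_dict(), 'model.pth')                           # save the parameters only
model.load_state_dict(torch.load('model.pth', map_location=device))  # restore into the same architecture
model.eval()                                                          # switch to evaluation mode before inference
```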

Things to improve: keep the model and the data consistent, both on the GPU or both on the CPU; do not leave num_classes at the default 1000 but set it from the actual dataset, and remember to pass num_classes when instantiating the model; finally, note that a test input has shape (3, 224, 224), where 3 is the channel count. This differs from TensorFlow's (224, 224, 3) ordering, so take care when porting code, as the snippet below illustrates.
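A minimal, self-contained illustration of the channel-order difference using a dummy tensor:

```python
import torch

# PyTorch expects (N, C, H, W); TensorFlow typically uses (N, H, W, C).
img_tf = torch.zeros(1, 224, 224, 3)   # TensorFlow-style layout
img_pt = img_tf.permute(0, 3, 1, 2)    # reorder axes -> (1, 3, 224, 224)
print(img_pt.shape)                    # torch.Size([1, 3, 224, 224])
```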

python">import torch
import torch.nn as nn
import torchvision
from torchvision import datasets,transforms
from torch.utils.data import DataLoader
import torchvision.models as models
import torch.nn.functional as F
from collections import OrderedDict import os,PIL,pathlib
import matplotlib.pyplot as plt
import warningswarnings.filterwarnings('ignore') # 忽略警告信息plt.rcParams['font.sans-serif'] = ['SimHei'] # 用来正常显示中文标签
plt.rcParams['axes.unicode_minus'] = False   # 用来正常显示负号
plt.rcParams['figure.dpi'] = 100 # 分辨率

1. Set up the GPU

python">device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
device(type='cuda')

2. Data import and preprocessing

python"># 获取数据分布情况
path_dir = './data/mpox_recognize/'
path_dir = pathlib.Path(path_dir)paths = list(path_dir.glob('*'))
# classNames = [str(path).split("\\")[-1] for path in paths] # ['Bananaquit', 'Black Skimmer', 'Black Throated Bushtiti', 'Cockatoo']
classNames = [path.parts[-1] for path in paths]
classNames
['Monkeypox', 'Others']
python"># 定义transforms 并处理数据
train_transforms = transforms.Compose([transforms.Resize([224,224]),      # 将输入图片resize成统一尺寸transforms.RandomHorizontalFlip(), # 随机水平翻转transforms.ToTensor(),             # 将PIL Image 或 numpy.ndarray 装换为tensor,并归一化到[0,1]之间transforms.Normalize(              # 标准化处理 --> 转换为标准正太分布(高斯分布),使模型更容易收敛mean = [0.485,0.456,0.406],    # 其中 mean=[0.485,0.456,0.406]与std=[0.229,0.224,0.225] 从数据集中随机抽样计算得到的。std = [0.229,0.224,0.225])
])
test_transforms = transforms.Compose([transforms.Resize([224,224]),transforms.ToTensor(),transforms.Normalize(mean = [0.485,0.456,0.406],std = [0.229,0.224,0.225])
])
total_data = datasets.ImageFolder('./data/mpox_recognize/',transform = train_transforms)
total_data
Dataset ImageFolderNumber of datapoints: 2142Root location: ./data/mpox_recognize/StandardTransform
Transform: Compose(Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=True)RandomHorizontalFlip(p=0.5)ToTensor()Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]))
python">total_data.class_to_idx
{'Monkeypox': 0, 'Others': 1}

3. Splitting the dataset

python"># 划分数据集
train_size = int(len(total_data) * 0.8)
test_size = len(total_data) - train_sizetrain_dataset,test_dataset = torch.utils.data.random_split(total_data,[train_size,test_size])
train_dataset,test_dataset
(<torch.utils.data.dataset.Subset at 0x18230109120>,<torch.utils.data.dataset.Subset at 0x182300d2cb0>)
python"># 定义DataLoader用于数据集的加载batch_size = 32train_dl = torch.utils.data.DataLoader(train_dataset,batch_size = batch_size,shuffle = True,num_workers = 1
)
test_dl = torch.utils.data.DataLoader(test_dataset,batch_size = batch_size,shuffle = True,num_workers = 1
)
python"># 观察数据维度
for X,y in test_dl:print("Shape of X [N,C,H,W]: ",X.shape)print("Shape of y: ", y.shape,y.dtype)break
Shape of X [N,C,H,W]:  torch.Size([32, 3, 224, 224])
Shape of y:  torch.Size([32]) torch.int64

4. Model construction

The SE module

Code walkthrough:

  1. Squeeze: nn.AdaptiveAvgPool2d(1) performs global average pooling, collapsing the input tensor's spatial dimensions (H x W) to 1x1 while keeping each channel's mean.
  2. Excitation: the pooled output passes through two fully connected layers (fc1 and fc2). The first outputs filter_sq features and is followed by a ReLU; the second maps back to one value per channel, and a Sigmoid squashes each value into [0, 1] to serve as the channel weight.
  3. Scale: the input feature map is reweighted channel by channel, yielding the weighted feature map.

Running the example:

The code creates a SqueezeExcitationLayer instance and tests it with an input tensor of shape (1, 32, 32, 32). The output shape matches the input, because the SE module only reweights channels and leaves the spatial dimensions unchanged.

python"># import torch
# import torch.nn as nn
# import torch.nn.functional as F# class SqueezeExcitationLayer(nn.Module):
#     def __init__(self, filter_sq):
#         # filter_sq 是 Excitation 中第一个全连接层的输出通道数
#         super(SqueezeExcitationLayer, self).__init__()
#         self.filter_sq = filter_sq
#         self.global_avg_pool = nn.AdaptiveAvgPool2d(1)  # 等效于全局平均池化
#         self.fc1 = nn.Linear(1, filter_sq)  # 输入通道数是1(全局池化后的输出),输出通道数是filter_sq
#         self.relu = nn.ReLU()
#         self.fc2 = nn.Linear(filter_sq, 1)  # 最后的输出通道数为1(每个通道的权重)
#         self.sigmoid = nn.Sigmoid()#     def forward(self, x):
#         # Squeeze阶段
#         squeeze = self.global_avg_pool(x)  # Shape: (batch_size, channels, 1, 1)
#         squeeze = squeeze.view(squeeze.size(0), -1)  # 拉平成(batch_size, channels)#         # Excitation阶段
#         excitation = self.fc1(squeeze)  # Shape: (batch_size, filter_sq)
#         excitation = self.relu(excitation)
#         excitation = self.fc2(excitation)  # Shape: (batch_size, 1)
#         excitation = self.sigmoid(excitation)  # Shape: (batch_size, 1)#         # Reshape back to match input dimensions for element-wise multiplication
#         excitation = excitation.view(excitation.size(0), excitation.size(1), 1, 1)  # Shape: (batch_size, channels, 1, 1)#         # Scale input with excitation weights
#         scale = x * excitation  # Element-wise multiplication#         return scale# # 示例:创建一个SqueezeExcitation层并通过它传入一个dummy输入
# SE = SqueezeExcitationLayer(16)
# inputs = torch.zeros((1, 32, 32, 32))  # 输入张量,形状为 (batch_size, channels, height, width)
# output = SE(inputs)  # 执行前向传播
# print(output.shape)  # 输出形状
python">class SqueezeExcitationLayer(nn.Module):def __init__(self, num_input_features, filter_sq):super(SqueezeExcitationLayer, self).__init__()self.filter_sq = filter_sqself.global_avg_pool = nn.AdaptiveAvgPool2d(1)  # 等效于全局平均池化self.fc1 = nn.Linear(num_input_features, filter_sq)  # 输入特征为num_input_features,输出特征为filter_sqself.relu = nn.ReLU()self.fc2 = nn.Linear(filter_sq, num_input_features)  # 最后的输出通道数与输入的通道数相同self.sigmoid = nn.Sigmoid()def forward(self, x):# Squeeze阶段squeeze = self.global_avg_pool(x)  # Shape: (batch_size, channels, 1, 1)squeeze = squeeze.view(squeeze.size(0), -1)  # 拉平成(batch_size, channels)# Excitation阶段excitation = self.fc1(squeeze)  # Shape: (batch_size, filter_sq)excitation = self.relu(excitation)excitation = self.fc2(excitation)  # Shape: (batch_size, num_input_features)excitation = self.sigmoid(excitation)  # Shape: (batch_size, num_input_features)# Reshape back to match input dimensions for element-wise multiplicationexcitation = excitation.view(excitation.size(0), excitation.size(1), 1, 1)  # Shape: (batch_size, channels, 1, 1)# Scale input with excitation weightsscale = x * excitation  # Element-wise multiplicationreturn scale# 调用SE模块时,确保传入的参数正确
inputs = torch.zeros((1, 32, 32, 32))  # 示例输入张量,注意channels的位置
inputs = inputs.permute(0, 3, 1, 2)  # 将输入的维度从 (batch_size, height, width, channels) 转换为 (batch_size, channels, height, width)se = SqueezeExcitationLayer(32, 16)  # 32是输入通道数,16是filter_sq
output = se(inputs)
print(output.shape)
torch.Size([1, 32, 32, 32])

The error RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x32 and 1x16) arises because the input shape and the weight-matrix shape do not match at the SE module's fc1 layer. A failure at a fully connected layer like this usually means the input size is not what the layer expects.

Root cause:
In SqueezeExcitationLayer, global average pooling yields an output of shape (batch_size, channels, 1, 1), which is then flattened to (batch_size, channels) and fed to the fully connected layer fc1. But fc1's input feature count must equal the channel count, and in the first version it did not; that mismatch is the root of the error.

Fix:
Construct fc1 with the correct input size, i.e. num_input_features, which should equal the channel count of the input tensor, so that the flattened pooling output matches fc1's expected input features.

Key changes:
Input layout: when calling SqueezeExcitationLayer, the input tensor must have shape (batch_size, channels, height, width), since PyTorch's usual image layout is (batch_size, channels, height, width) rather than (batch_size, height, width, channels).
fc1 input size: fc1's input feature count must equal the input tensor's channel count (num_input_features); the call above ensures this.
With these changes, the forward pass runs and produces the expected output shape.

The improved DenseNet

To add the SE (Squeeze-and-Excitation) module to the existing DenseNet code, each _DenseLayer needs a small modification so that an SE module follows every DenseLayer. The SE module learns per-channel importance and rescales each channel's weight accordingly; we attach it to the output of every _DenseLayer.

Concrete steps:

  1. Add an SE module inside _DenseLayer, so that every DenseLayer's output is reweighted by it.
  2. In the DenseNet constructor, pass the SE parameters when instantiating each _DenseLayer.
python"># class _DenseLayer(nn.Sequential):
#     """Basic unit of DenseBlock (using bottleneck layer) """
#     def __init__(self, num_input_features, growth_rate, bn_size, drop_rate):
#         super(_DenseLayer, self).__init__()
#         self.add_module("norm1", nn.BatchNorm2d(num_input_features))
#         self.add_module("relu1", nn.ReLU(inplace=True))
#         self.add_module("conv1", nn.Conv2d(num_input_features, bn_size*growth_rate,
#                                            kernel_size=1, stride=1, bias=False))
#         self.add_module("norm2", nn.BatchNorm2d(bn_size*growth_rate))
#         self.add_module("relu2", nn.ReLU(inplace=True))
#         self.add_module("conv2", nn.Conv2d(bn_size*growth_rate, growth_rate,
#                                            kernel_size=3, stride=1, padding=1, bias=False))
#         self.drop_rate = drop_rate#     def forward(self, x):
#         new_features = super(_DenseLayer, self).forward(x)
#         if self.drop_rate > 0:
#             new_features = F.dropout(new_features, p=self.drop_rate, training=self.training)
#         return torch.cat([x, new_features], 1)# class _DenseBlock(nn.Sequential):
#     """DenseBlock"""
#     def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate):
#         super(_DenseBlock, self).__init__()
#         for i in range(num_layers):
#             layer = _DenseLayer(num_input_features+i*growth_rate, growth_rate, bn_size,
#                                 drop_rate)
#             self.add_module("denselayer%d" % (i+1,), layer)# class _Transition(nn.Sequential):
#     """Transition layer between two adjacent DenseBlock"""
#     def __init__(self, num_input_feature, num_output_features):
#         super(_Transition, self).__init__()
#         self.add_module("norm", nn.BatchNorm2d(num_input_feature))
#         self.add_module("relu", nn.ReLU(inplace=True))
#         self.add_module("conv", nn.Conv2d(num_input_feature, num_output_features,
#                                           kernel_size=1, stride=1, bias=False))
#         self.add_module("pool", nn.AvgPool2d(2, stride=2))# class DenseNet(nn.Module):
#     "DenseNet-BC model"
#     def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64,
#                  bn_size=4, compression_rate=0.5, drop_rate=0, num_classes=1000):
#         """
#         :param growth_rate: (int) number of filters used in DenseLayer, `k` in the paper
#         :param block_config: (list of 4 ints) number of layers in each DenseBlock
#         :param num_init_features: (int) number of filters in the first Conv2d
#         :param bn_size: (int) the factor using in the bottleneck layer
#         :param compression_rate: (float) the compression rate used in Transition Layer
#         :param drop_rate: (float) the drop rate after each DenseLayer
#         :param num_classes: (int) number of classes for classification
#         """
#         super(DenseNet, self).__init__()
#         # first Conv2d
#         self.features = nn.Sequential(OrderedDict([
#             ("conv0", nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False)),
#             ("norm0", nn.BatchNorm2d(num_init_features)),
#             ("relu0", nn.ReLU(inplace=True)),
#             ("pool0", nn.MaxPool2d(3, stride=2, padding=1))
#         ]))#         # DenseBlock
#         num_features = num_init_features
#         for i, num_layers in enumerate(block_config):
#             block = _DenseBlock(num_layers, num_features, bn_size, growth_rate, drop_rate)
#             self.features.add_module("denseblock%d" % (i + 1), block)
#             num_features += num_layers*growth_rate
#             if i != len(block_config) - 1:
#                 transition = _Transition(num_features, int(num_features*compression_rate))
#                 self.features.add_module("transition%d" % (i + 1), transition)
#                 num_features = int(num_features * compression_rate)#         # final bn+ReLU
#         self.features.add_module("norm5", nn.BatchNorm2d(num_features))
#         self.features.add_module("relu5", nn.ReLU(inplace=True))#         # classification layer
#         self.classifier = nn.Linear(num_features, num_classes)#         # params initialization
#         for m in self.modules():
#             if isinstance(m, nn.Conv2d):
#                 nn.init.kaiming_normal_(m.weight)
#             elif isinstance(m, nn.BatchNorm2d):
#                 nn.init.constant_(m.bias, 0)
#                 nn.init.constant_(m.weight, 1)
#             elif isinstance(m, nn.Linear):
#                 nn.init.constant_(m.bias, 0)#     def forward(self, x):
#         features = self.features(x)
#         out = F.avg_pool2d(features, 7, stride=1).view(features.size(0), -1)
#         out = self.classifier(out)
#         return out
python">import torch
import torch.nn as nn
import torch.nn.functional as F
from collections import OrderedDictclass SqueezeExcitationLayer(nn.Module):def __init__(self, num_input_features, filter_sq):super(SqueezeExcitationLayer, self).__init__()self.filter_sq = filter_sqself.global_avg_pool = nn.AdaptiveAvgPool2d(1)  # 等效于全局平均池化self.fc1 = nn.Linear(num_input_features, filter_sq)  # 输入通道数是num_input_features,输出通道数是filter_sqself.relu = nn.ReLU()self.fc2 = nn.Linear(filter_sq, num_input_features)  # 最后的输出通道数与输入的通道数相同self.sigmoid = nn.Sigmoid()def forward(self, x):# Squeeze阶段squeeze = self.global_avg_pool(x)  # Shape: (batch_size, channels, 1, 1)squeeze = squeeze.view(squeeze.size(0), -1)  # 拉平成(batch_size, channels)# Excitation阶段excitation = self.fc1(squeeze)  # Shape: (batch_size, filter_sq)excitation = self.relu(excitation)excitation = self.fc2(excitation)  # Shape: (batch_size, num_input_features)excitation = self.sigmoid(excitation)  # Shape: (batch_size, num_input_features)# Reshape back to match input dimensions for element-wise multiplicationexcitation = excitation.view(excitation.size(0), excitation.size(1), 1, 1)  # Shape: (batch_size, channels, 1, 1)# Scale input with excitation weightsscale = x * excitation  # Element-wise multiplicationreturn scaleclass _DenseLayer(nn.Sequential):"""Basic unit of DenseBlock (using bottleneck layer) """def __init__(self, num_input_features, growth_rate, bn_size, drop_rate, se_filter_sq=16):super(_DenseLayer, self).__init__()self.add_module("norm1", nn.BatchNorm2d(num_input_features))self.add_module("relu1", nn.ReLU(inplace=True))self.add_module("conv1", nn.Conv2d(num_input_features, bn_size*growth_rate,kernel_size=1, stride=1, bias=False))self.add_module("norm2", nn.BatchNorm2d(bn_size*growth_rate))self.add_module("relu2", nn.ReLU(inplace=True))self.add_module("conv2", nn.Conv2d(bn_size*growth_rate, growth_rate,kernel_size=3, stride=1, padding=1, bias=False))# 添加SE模块self.se = SqueezeExcitationLayer(growth_rate, se_filter_sq)self.drop_rate = drop_ratedef forward(self, x):new_features = super(_DenseLayer, self).forward(x)new_features = self.se(new_features)  # 将SE模块加到特征图上if self.drop_rate > 0:new_features = F.dropout(new_features, p=self.drop_rate, training=self.training)return torch.cat([x, new_features], 1)class _DenseBlock(nn.Sequential):"""DenseBlock"""def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate, se_filter_sq=16):super(_DenseBlock, self).__init__()for i in range(num_layers):layer = _DenseLayer(num_input_features+i*growth_rate, growth_rate, bn_size,drop_rate, se_filter_sq)self.add_module("denselayer%d" % (i+1,), layer)class _Transition(nn.Sequential):"""Transition layer between two adjacent DenseBlock"""def __init__(self, num_input_feature, num_output_features):super(_Transition, self).__init__()self.add_module("norm", nn.BatchNorm2d(num_input_feature))self.add_module("relu", nn.ReLU(inplace=True))self.add_module("conv", nn.Conv2d(num_input_feature, num_output_features,kernel_size=1, stride=1, bias=False))self.add_module("pool", nn.AvgPool2d(2, stride=2))class DenseNet(nn.Module):"DenseNet-BC model"def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64,bn_size=4, compression_rate=0.5, drop_rate=0, num_classes=1000, se_filter_sq=16):""":param growth_rate: (int) number of filters used in DenseLayer, `k` in the paper:param block_config: (list of 4 ints) number of layers in each DenseBlock:param num_init_features: (int) number of filters in the first Conv2d:param bn_size: (int) the factor using in the bottleneck layer:param compression_rate: (float) the 
compression rate used in Transition Layer:param drop_rate: (float) the drop rate after each DenseLayer:param num_classes: (int) number of classes for classification:param se_filter_sq: (int) the number of filters used in SE module's fully connected layer"""super(DenseNet, self).__init__()# first Conv2dself.features = nn.Sequential(OrderedDict([ ("conv0", nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False)),("norm0", nn.BatchNorm2d(num_init_features)),("relu0", nn.ReLU(inplace=True)),("pool0", nn.MaxPool2d(3, stride=2, padding=1))]))# DenseBlocknum_features = num_init_featuresfor i, num_layers in enumerate(block_config):block = _DenseBlock(num_layers, num_features, bn_size, growth_rate, drop_rate, se_filter_sq)self.features.add_module("denseblock%d" % (i + 1), block)num_features += num_layers * growth_rateif i != len(block_config) - 1:transition = _Transition(num_features, int(num_features * compression_rate))self.features.add_module("transition%d" % (i + 1), transition)num_features = int(num_features * compression_rate)# final bn+ReLUself.features.add_module("norm5", nn.BatchNorm2d(num_features))self.features.add_module("relu5", nn.ReLU(inplace=True))# classification layerself.classifier = nn.Linear(num_features, num_classes)# params initializationfor m in self.modules():if isinstance(m, nn.Conv2d):nn.init.kaiming_normal_(m.weight)elif isinstance(m, nn.BatchNorm2d):nn.init.constant_(m.bias, 0)nn.init.constant_(m.weight, 1)elif isinstance(m, nn.Linear):nn.init.constant_(m.bias, 0)def forward(self, x):features = self.features(x)out = F.avg_pool2d(features, 7, stride=1).view(features.size(0), -1)out = self.classifier(out)return out

Code walkthrough:

  1. SqueezeExcitationLayer: implements the SE module, which derives a weight for each channel from its importance (via global average pooling and two fully connected layers).
  2. _DenseLayer: an SE module is added after each DenseLayer; its output multiplies the new feature maps, weighting every channel.
  3. _DenseBlock: every _DenseLayer inside each DenseBlock now applies the SE module.
  4. DenseNet: the DenseNet class takes an se_filter_sq parameter, which controls the size of the SE module's fully connected layers.

With this, the DenseNet can use SE modules to strengthen its representational power.

python"># # Now, instantiate and use the model
# densenet121 = DenseNet(num_init_features=64, # init_channel=64,
#                        growth_rate=32,
#                        block_config=(6,12,24,16),
#                        num_classes=len(classNames))  # model = densenet121.to(device)
# model
python"># Now, instantiate and use the model
se_filter_sq = 16  # 可以根据需要调整SE模块的输出大小densenet121 = DenseNet(num_init_features=64,  # init_channel=64,growth_rate=32,block_config=(6, 12, 24, 16),num_classes=len(classNames),  # 根据您的分类任务设置类别数se_filter_sq=se_filter_sq  # 传递SE模块的参数
)model = densenet121.to(device)  # 将模型移动到指定的设备上
model
```
DenseNet(
  (features): Sequential(
    (conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (denseblock1): _DenseBlock(
      (denselayer1): _DenseLayer(
        (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU(inplace=True)
        (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU(inplace=True)
        (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (se): SqueezeExcitationLayer(
          (global_avg_pool): AdaptiveAvgPool2d(output_size=1)
          (fc1): Linear(in_features=32, out_features=16, bias=True)
          (relu): ReLU()
          (fc2): Linear(in_features=16, out_features=32, bias=True)
          (sigmoid): Sigmoid()
        )
      )
      ... (denselayer2-6 repeat the same pattern with growing input channels)
    )
    ... (transition1-3 and denseblock2-4 omitted here; every dense layer ends in the same SE block)
    (norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu5): ReLU(inplace=True)
  )
  (classifier): Linear(in_features=1024, out_features=2, bias=True)
)
```

Explanation:

- se_filter_sq: this argument is threaded through the model to control the width of the SE module's internal fully connected bottleneck (fc1's out_features, 16 in the printout above). You can tune it as your experiments require; a minimal sketch of the SE layer follows below.
- The remaining arguments (num_init_features, growth_rate, block_config, num_classes, etc.) can likewise be adjusted to your needs.
- With these changes, the model now correctly includes the SE modules and runs as expected.
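
The printed module structure above fully determines the SE layer's forward path, so as a reference here is a minimal sketch consistent with that printout. The submodule names (global_avg_pool, fc1, relu, fc2, sigmoid) mirror the repr, but the constructor signature is an assumption and may differ from the original notebook's class.

```python
import torch
import torch.nn as nn

class SqueezeExcitationLayer(nn.Module):
    """Channel attention: squeeze (global pooling) + excitation (two FC layers).
    `filter_sq` is the bottleneck width (fc1 out_features, 16 in the printout);
    the signature is an assumption, not necessarily the notebook's exact one."""
    def __init__(self, channels, filter_sq=16):
        super().__init__()
        self.global_avg_pool = nn.AdaptiveAvgPool2d(1)  # squeeze: B,C,H,W -> B,C,1,1
        self.fc1 = nn.Linear(channels, filter_sq)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(filter_sq, channels)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.global_avg_pool(x).view(b, c)          # per-channel statistics, B,C
        w = self.sigmoid(self.fc2(self.relu(self.fc1(w))))
        return x * w.view(b, c, 1, 1)                   # reweight the input channels
```

Because the module returns an input-shaped tensor, it can be appended after conv2 inside each _DenseLayer without changing the dense block's concatenation logic.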

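Before calling the summary, the model has to be instantiated and moved to the device. The call below is a hypothetical sketch: the class name DenseNet and the se_filter_sq keyword follow the explanation above, and every value is read off the printout (64 stem channels, growth rate 32, the (6, 12, 24, 16) DenseNet-121 layout, a 2-way classifier, SE bottleneck of 16); the notebook's actual signature may differ.

```python
# Hypothetical instantiation -- class name and keyword arguments follow the
# explanation above and may differ from the notebook's actual signature.
model = DenseNet(num_init_features=64,          # stem conv output channels (Conv2d-1)
                 growth_rate=32,                # channels added per dense layer
                 block_config=(6, 12, 24, 16),  # DenseNet-121 block layout
                 num_classes=len(classNames),   # 2 classes: Monkeypox / Others
                 se_filter_sq=16                # SE bottleneck width (fc1 out_features)
                 ).to(device)
```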
python"># 查看模型详情
import torchsummary as summary
summary.summary(model,(3,224,224))
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
       BatchNorm2d-5           [-1, 64, 56, 56]             128
              ReLU-6           [-1, 64, 56, 56]               0
            Conv2d-7          [-1, 128, 56, 56]           8,192
       BatchNorm2d-8          [-1, 128, 56, 56]             256
              ReLU-9          [-1, 128, 56, 56]               0
           Conv2d-10           [-1, 32, 56, 56]          36,864
 AdaptiveAvgPool2d-11             [-1, 32, 1, 1]               0
            Linear-12                   [-1, 16]             528
              ReLU-13                   [-1, 16]               0
            Linear-14                   [-1, 32]             544
           Sigmoid-15                   [-1, 32]               0
SqueezeExcitationLayer-16           [-1, 32, 56, 56]               0
... layers 17 through 1060 repeat the BatchNorm2d / ReLU / Conv2d / BatchNorm2d / ReLU / Conv2d / SqueezeExcitationLayer pattern through the four dense blocks and three transition layers; omitted here for readability ...
     BatchNorm2d-1061           [-1, 1024, 7, 7]           2,048
            ReLU-1062           [-1, 1024, 7, 7]               0
          Linear-1063                    [-1, 2]           2,050
================================================================
Total params: 7,080,258
Trainable params: 7,080,258
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 311.15
Params size (MB): 27.01
Estimated Total Size (MB): 338.73
----------------------------------------------------------------
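
For reference, a per-layer table like the one above is what the torchsummary package prints; a minimal sketch of the call that presumably produced it (assuming the model instance built in the previous section):

```python
from torchsummary import summary

# channels-first input shape (C, H, W) -- note this differs from TensorFlow's (H, W, C)
summary(model, (3, 224, 224))
```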

5. Setting Hyperparameters: Loss Function, Learning Rate, and Optimizer

# loss_fn">
```python
# loss_fn = nn.CrossEntropyLoss() # create the loss function

# learn_rate = 1e-3 # initial learning rate
# def adjust_learning_rate(optimizer, epoch, start_lr):
#     # decay the lr to 0.92 of its previous value every two epochs
#     lr = start_lr * (0.92 ** (epoch // 2))
#     for param_group in optimizer.param_groups:
#         param_group['lr'] = lr

# optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)
```

```python
# Using the official scheduler API instead
loss_fn = nn.CrossEntropyLoss()

learn_rate = 1e-4
lambda1 = lambda epoch: (0.92 ** (epoch // 2))

optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1)  # choose the scheduling rule
```
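
As a quick sanity check that LambdaLR reproduces the intended "decay by 0.92 every two epochs" rule, the schedule can be run standalone; a minimal sketch with a hypothetical dummy parameter (not part of the original code):

```python
# standalone check of the schedule above, using a hypothetical dummy parameter
dummy = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.Adam([dummy], lr=1e-4)
sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda=lambda epoch: 0.92 ** (epoch // 2))

for epoch in range(6):
    opt.step()    # step the optimizer first to avoid the scheduler-order warning
    sched.step()
    print(f'epoch {epoch+1}: lr = {opt.param_groups[0]["lr"]:.2E}')
# prints 1.00E-04, 9.20E-05, 9.20E-05, 8.46E-05, ... matching the Lr column in the log below
```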

6. Training Function

# 训练函数">
```python
# training function
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # size of the training set
    num_batches = len(dataloader)    # number of batches

    train_loss, train_acc = 0, 0

    for X, y in dataloader:
        X, y = X.to(device), y.to(device)

        # compute the prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # accumulate accuracy and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()

    train_acc /= size
    train_loss /= num_batches

    return train_acc, train_loss
```
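
Before committing to a 40-epoch run, the function can be smoke-tested on a single fake batch; a minimal sketch (assuming model, loss_fn, optimizer, and device from above; the random data below is hypothetical, and this does perform one real weight update):

```python
from torch.utils.data import DataLoader, TensorDataset

# hypothetical batch shaped like the real data: 8 RGB 224x224 images, labels in {0, 1}
fake_ds = TensorDataset(torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,)))
fake_dl = DataLoader(fake_ds, batch_size=4)

acc, loss = train(fake_dl, model, loss_fn, optimizer)
print(f'smoke test -- acc: {acc:.3f}, loss: {loss:.3f}')  # loss near ln(2)≈0.693 for an untrained 2-class head
```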

7. Test Function

# 测试函数">
```python
# test function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)

    test_acc, test_loss = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)

            # compute loss
            pred = model(X)
            loss = loss_fn(pred, y)

            test_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
            test_loss += loss.item()

    test_acc /= size
    test_loss /= num_batches

    return test_acc, test_loss
```
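
Note that torch.no_grad() inside test() only disables gradient tracking; switching BatchNorm and Dropout to inference behaviour is still the caller's responsibility via model.eval(), which the training loop below takes care of. A minimal usage sketch:

```python
model.eval()                                 # inference mode for BatchNorm / Dropout
acc, loss = test(test_dl, model, loss_fn)    # gradients are skipped inside via torch.no_grad()
print(f'test acc: {acc:.3f}, test loss: {loss:.3f}')
```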

8. Training

import copyepochs">
```python
import copy

epochs = 40

train_acc = []
train_loss = []
test_acc = []
test_loss = []

best_acc = 0.0

for epoch in range(epochs):
    # update the learning rate -- only when using the custom schedule
    # adjust_learning_rate(optimizer, epoch, learn_rate)

    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    scheduler.step()  # update the learning rate -- when using the official scheduler

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    # keep a copy of the best model so far in best_model
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    # read back the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']

    template = ('Epoch:{:2d},Train_acc:{:.1f}%,Train_loss:{:.3f},Test_acc:{:.1f}%,Test_loss:{:.3f},Lr:{:.2E}')
    print(template.format(epoch+1, epoch_train_acc*100, epoch_train_loss, epoch_test_acc*100, epoch_test_loss, lr))

print('Done')
```
Epoch: 1,Train_acc:65.3%,Train_loss:0.633,Test_acc:68.5%,Test_loss:0.595,Lr:1.00E-04
Epoch: 2,Train_acc:70.1%,Train_loss:0.578,Test_acc:71.3%,Test_loss:0.562,Lr:9.20E-05
Epoch: 3,Train_acc:72.3%,Train_loss:0.542,Test_acc:75.5%,Test_loss:0.510,Lr:9.20E-05
Epoch: 4,Train_acc:74.0%,Train_loss:0.502,Test_acc:80.0%,Test_loss:0.457,Lr:8.46E-05
Epoch: 5,Train_acc:76.6%,Train_loss:0.474,Test_acc:76.7%,Test_loss:0.488,Lr:8.46E-05
Epoch: 6,Train_acc:79.3%,Train_loss:0.434,Test_acc:79.0%,Test_loss:0.440,Lr:7.79E-05
Epoch: 7,Train_acc:80.9%,Train_loss:0.423,Test_acc:83.0%,Test_loss:0.397,Lr:7.79E-05
Epoch: 8,Train_acc:83.3%,Train_loss:0.375,Test_acc:78.1%,Test_loss:0.433,Lr:7.16E-05
Epoch: 9,Train_acc:83.6%,Train_loss:0.360,Test_acc:82.5%,Test_loss:0.374,Lr:7.16E-05
Epoch:10,Train_acc:84.8%,Train_loss:0.333,Test_acc:88.3%,Test_loss:0.320,Lr:6.59E-05
Epoch:11,Train_acc:88.1%,Train_loss:0.294,Test_acc:87.4%,Test_loss:0.337,Lr:6.59E-05
Epoch:12,Train_acc:87.3%,Train_loss:0.293,Test_acc:84.6%,Test_loss:0.364,Lr:6.06E-05
Epoch:13,Train_acc:89.1%,Train_loss:0.257,Test_acc:88.6%,Test_loss:0.269,Lr:6.06E-05
Epoch:14,Train_acc:90.3%,Train_loss:0.238,Test_acc:84.6%,Test_loss:0.356,Lr:5.58E-05
Epoch:15,Train_acc:91.2%,Train_loss:0.210,Test_acc:84.4%,Test_loss:0.328,Lr:5.58E-05
Epoch:16,Train_acc:91.8%,Train_loss:0.202,Test_acc:89.3%,Test_loss:0.279,Lr:5.13E-05
Epoch:17,Train_acc:93.3%,Train_loss:0.165,Test_acc:89.3%,Test_loss:0.277,Lr:5.13E-05
Epoch:18,Train_acc:93.5%,Train_loss:0.168,Test_acc:89.5%,Test_loss:0.324,Lr:4.72E-05
Epoch:19,Train_acc:93.7%,Train_loss:0.173,Test_acc:87.9%,Test_loss:0.293,Lr:4.72E-05
Epoch:20,Train_acc:93.8%,Train_loss:0.156,Test_acc:90.7%,Test_loss:0.249,Lr:4.34E-05
Epoch:21,Train_acc:95.2%,Train_loss:0.122,Test_acc:89.3%,Test_loss:0.266,Lr:4.34E-05
Epoch:22,Train_acc:96.2%,Train_loss:0.123,Test_acc:90.7%,Test_loss:0.270,Lr:4.00E-05
Epoch:23,Train_acc:95.9%,Train_loss:0.124,Test_acc:89.5%,Test_loss:0.290,Lr:4.00E-05
Epoch:24,Train_acc:96.0%,Train_loss:0.118,Test_acc:91.4%,Test_loss:0.296,Lr:3.68E-05
Epoch:25,Train_acc:95.2%,Train_loss:0.131,Test_acc:91.4%,Test_loss:0.248,Lr:3.68E-05
Epoch:26,Train_acc:95.7%,Train_loss:0.113,Test_acc:90.4%,Test_loss:0.306,Lr:3.38E-05
Epoch:27,Train_acc:97.6%,Train_loss:0.077,Test_acc:93.7%,Test_loss:0.226,Lr:3.38E-05
Epoch:28,Train_acc:96.6%,Train_loss:0.089,Test_acc:91.8%,Test_loss:0.286,Lr:3.11E-05
Epoch:29,Train_acc:97.3%,Train_loss:0.084,Test_acc:92.8%,Test_loss:0.243,Lr:3.11E-05
Epoch:30,Train_acc:96.6%,Train_loss:0.093,Test_acc:91.8%,Test_loss:0.227,Lr:2.86E-05
Epoch:31,Train_acc:97.4%,Train_loss:0.075,Test_acc:93.7%,Test_loss:0.236,Lr:2.86E-05
Epoch:32,Train_acc:97.6%,Train_loss:0.073,Test_acc:92.1%,Test_loss:0.246,Lr:2.63E-05
Epoch:33,Train_acc:97.8%,Train_loss:0.066,Test_acc:93.0%,Test_loss:0.223,Lr:2.63E-05
Epoch:34,Train_acc:98.4%,Train_loss:0.053,Test_acc:92.1%,Test_loss:0.265,Lr:2.42E-05
Epoch:35,Train_acc:98.4%,Train_loss:0.056,Test_acc:91.6%,Test_loss:0.250,Lr:2.42E-05
Epoch:36,Train_acc:98.2%,Train_loss:0.062,Test_acc:92.5%,Test_loss:0.301,Lr:2.23E-05
Epoch:37,Train_acc:97.6%,Train_loss:0.068,Test_acc:93.5%,Test_loss:0.236,Lr:2.23E-05
Epoch:38,Train_acc:98.1%,Train_loss:0.049,Test_acc:91.8%,Test_loss:0.244,Lr:2.05E-05
Epoch:39,Train_acc:98.9%,Train_loss:0.043,Test_acc:94.2%,Test_loss:0.216,Lr:2.05E-05
Epoch:40,Train_acc:98.7%,Train_loss:0.045,Test_acc:92.8%,Test_loss:0.245,Lr:1.89E-05
Done
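
A quick way to locate the best checkpoint in a log like this is to scan the recorded test accuracies; a short sketch over the lists filled in during training:

```python
# epoch (1-based) with the highest test accuracy
best_epoch = test_acc.index(max(test_acc)) + 1
print(f'best epoch: {best_epoch}, best test acc: {max(test_acc)*100:.1f}%')  # epoch 39, 94.2% in this run
```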

9. Visualizing the Results

epochs_range">
```python
epochs_range = range(epochs)

plt.figure(figsize=(12, 3))

plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='lower right')
plt.title('Training and Validation Loss')
plt.show()
```

[Figure: training/validation accuracy (left) and loss (right) curves over 40 epochs]

10. Saving the Model

# 自定义模型保存">
```python
# saving the custom model
# state-dict saving: keep the parameters only
torch.save(model.state_dict(), './模型参数/J5_densenet121&SE_model_state_dict.pth')

# re-instantiate the model to load the parameters into
best_model = DenseNet(
    num_init_features=64,           # init_channel = 64
    growth_rate=32,
    block_config=(6, 12, 24, 16),
    num_classes=len(classNames),    # set the number of classes to match your own task
    se_filter_sq=se_filter_sq       # pass the SE-module reduction parameter through
).to(device)

best_model.load_state_dict(torch.load('./模型参数/J5_densenet121&SE_model_state_dict.pth'))  # load the state dict into the model
```
<All keys matched successfully>
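
One caveat: the assignment above reuses the name best_model and loads the final-epoch weights into it, overwriting the deep copy kept during training. To preserve the actual best checkpoint, its state dict can be saved right after the training loop, before the variable is reused; a minimal sketch with a hypothetical file name:

```python
# save the best-accuracy checkpoint captured by copy.deepcopy() in the training loop
# (hypothetical file name, same directory convention as above)
torch.save(best_model.state_dict(), './模型参数/J5_densenet121&SE_best_state_dict.pth')
```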

11. Making Predictions with the Trained Model

# 指定路径图片预测">
```python
# predict a single image from a given path
from PIL import Image
import torchvision.transforms as transforms

classes = list(total_data.class_to_idx)  # class names, in index order

def predict_one_image(image_path, model, transform, classes):
    test_img = Image.open(image_path).convert('RGB')
    # plt.imshow(test_img)  # display the image to be predicted

    test_img = transform(test_img)
    img = test_img.to(device).unsqueeze(0)

    model.eval()
    output = model(img)
    print(output)  # inspect the raw model output

    _, pred = torch.max(output, 1)
    pred_class = classes[pred]
    print(f'Predicted class: {pred_class}')
```
# 预测训练集中的某张照片">
```python
# predict one image from the training set
predict_one_image(image_path='./data/mpox_recognize/Monkeypox/M01_01_04.jpg',
                  model=model,
                  transform=test_transforms,
                  classes=classes)
```
tensor([[ 2.6228, -3.6656]], device='cuda:0', grad_fn=<AddmmBackward0>)
Predicted class: Monkeypox
classes">
```python
classes
```
['Monkeypox', 'Others']
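
The raw output printed above is a pair of logits, not probabilities. To read it as class confidences, a softmax can be applied over the class dimension; a minimal sketch using the tensor shown above:

```python
# F is torch.nn.functional, imported at the top of the notebook
logits = torch.tensor([[2.6228, -3.6656]])  # the raw model output printed above
probs = F.softmax(logits, dim=1)            # normalize the logits into probabilities
print(probs)                                # ≈ [[0.9981, 0.0019]] -> 'Monkeypox' with ~99.8% confidence
```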
">