深度学习 - PyTorch简介

基础知识

1. PyTorch简介

PyTorch的特点和优势：
- 动态计算图、易用性、强大的社区支持、与NumPy兼容。

安装和环境配置：
安装和验证PyTorch：

pip install torch torchvision

验证安装：

import torch
print(torch.__version__)

运行结果：

1.9.0  # 具体版本可能不同

配置虚拟环境（推荐使用venv或conda）：

# 创建虚拟环境
python -m venv myenv
# 激活虚拟环境
source myenv/bin/activate  # on macOS/Linux
myenv\Scripts\activate  # on Windows

基本数据类型和张量操作：

import torch# 从列表创建张量
a = torch.tensor([1, 2, 3], dtype=torch.int32)
print(a)

运行结果：

tensor([1, 2, 3], dtype=torch.int32)

2. 张量操作

创建张量：

import torch# 创建全零张量
b = torch.zeros(2, 3)
print(b)# 创建全一张量
c = torch.ones(2, 3)
print(c)# 创建随机张量
d = torch.rand(2, 3)
print(d)# 从NumPy数组创建张量
import numpy as np
e = torch.tensor(np.array([1, 2, 3]), dtype=torch.float32)
print(e)

运行结果：

tensor([[0., 0., 0.],[0., 0., 0.]])
tensor([[1., 1., 1.],[1., 1., 1.]])
tensor([[0.4387, 0.4984, 0.5247],[0.2885, 0.3548, 0.9963]])
tensor([1., 2., 3.])

基本张量操作：

# 张量加法
f = a + torch.tensor([4, 5, 6], dtype=torch.int32)
print(f)# 张量乘法
g = b * 3
print(g)# 索引
h = d[1, 2]
print(h)# 切片
i = d[:, 1]
print(i)# 张量形状
shape = d.shape
print(shape)# 重新调整张量形状
j = d.view(3, 2)
print(j)

运行结果：

tensor([5, 7, 9], dtype=torch.int32)
tensor([[0., 0., 0.],[0., 0., 0.]])
tensor(0.9963)
tensor([0.4984, 0.3548])
torch.Size([2, 3])
tensor([[0.4387, 0.4984],[0.5247, 0.2885],[0.3548, 0.9963]])

广播机制：

a = torch.tensor([1, 2, 3])
b = torch.tensor([[1], [2], [3]])
c = a + b
print(c)

运行结果：

tensor([[2, 3, 4],[3, 4, 5],[4, 5, 6]])

GPU支持与CUDA：

# 检查GPU是否可用
if torch.cuda.is_available():device = torch.device("cuda")
else:device = torch.device("cpu")# 将张量移动到GPU
a = a.to(device)
print(a)
print(a.device)

运行结果：

tensor([1, 2, 3], device='cuda:0')  # 如果有GPU
cuda:0
# 或者
tensor([1, 2, 3])  # 如果没有GPU
cpu

常见问题及解决方案

安装问题：无法安装PyTorch。
- 解决方案：确保你使用的Python版本兼容PyTorch，并且已更新pip。尝试使用官方推荐的安装命令。
```
pip install torch torchvision
```
CUDA不可用：无法检测到GPU。
- 解决方案：检查CUDA和cuDNN的安装。确保你的PyTorch版本与CUDA版本兼容。
```
import torch
print(torch.cuda.is_available())  # 应该返回True
```
运行结果：
```
True  # 表示CUDA可用
False  # 表示CUDA不可用
```
张量形状不匹配：操作中的张量形状不匹配。
- 解决方案：使用.view()或.reshape()调整张量形状，使其匹配。
```
a = torch.rand(2, 3)
b = a.view(3, 2)
print(b)
```
运行结果：
```
tensor([[0.9500, 0.2225],[0.6182, 0.6810],[0.5123, 0.9282]])
```
内存不足：在GPU上运行大模型时，内存不足。
- 解决方案：尝试减少批量大小，或者使用分布式训练。
```
batch_size = 16  # 尝试减少批量大小
```
梯度计算问题：梯度爆炸或消失。
- 解决方案：使用梯度裁剪技术，或者尝试不同的优化器。
```
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)
```
模型过拟合：训练误差低，但验证误差高。
- 解决方案：增加正则化（如Dropout），或者使用数据增强技术。
```
model = torch.nn.Sequential(torch.nn.Linear(784, 256),torch.nn.ReLU(),torch.nn.Dropout(0.5),torch.nn.Linear(256, 10)
)
```
训练速度慢：模型训练速度慢。
- 解决方案：确保使用了GPU，加大批量大小，或者使用混合精度训练。
```
scaler = torch.cuda.amp.GradScaler()
```
数据加载缓慢：数据加载成为瓶颈。
- 解决方案：使用多线程数据加载器，并确保数据已预处理好。
```
train_loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4
)
```
张量转换问题：从NumPy到张量转换时出现问题。
- 解决方案：确保数据类型一致，并使用.astype()方法进行类型转换。
```
import numpy as np
a = np.array([1, 2, 3], dtype=np.float32)
b = torch.from_numpy(a)
print(b)
```
运行结果：
```
tensor([1., 2., 3.])
```
模型保存与加载问题：模型保存后加载失败。
- 解决方案：使用torch.save和torch.load进行模型保存和加载，确保路径正确。
```
torch.save(model.state_dict(), 'model.pth')
model.load_state_dict(torch.load('model.pth'))
```