1. How Python assignment works
In Python, an assignment such as x = something does not copy any data. It binds the name x to the object that something refers to; reading x then accesses that object's contents through the reference.
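A minimal pure-Python sketch of this binding model (the list here is only an illustration):
a = [1, 2, 3]
b = a                    # b is bound to the same list object as a
print(id(a) == id(b))    # True: one object, two names
b.append(4)
print(a)                 # [1, 2, 3, 4] -- a sees the change
b = [9, 9]               # rebinding b leaves a untouched
print(a)                 # still [1, 2, 3, 4]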
The same behavior can be observed with PyTorch tensors:
# Case 1: the variable keeps its address, but its content is modified
import torch
a = torch.rand(4)
print('before a', a, id(a))
b = a
print('before b', b, id(b))
a[2] = 5
print('after a', a, id(a))
print('after b', b, id(b))
Output:
before a tensor([0.7519, 0.1700, 0.7580, 0.2318]) 4613605632
before b tensor([0.7519, 0.1700, 0.7580, 0.2318]) 4613605632
after a tensor([0.7519, 0.1700, 5.0000, 0.2318]) 4613605632
after b tensor([0.7519, 0.1700, 5.0000, 0.2318]) 4613605632
Analysis: the id values show that after b = a, a and b refer to the same object. When a[2] is modified, the content at that shared address changes, so the change is visible through both a and b. This is the classic aliasing behavior of plain assignment, often described as a shallow copy.
# Case 2: the original variable is rebound to a new address
a = torch.rand(4)
print('before a', a, id(a))
b = a
print('before b', b, id(b))
a = torch.rand(4)
print('after a', a, id(a))
print('after b', b, id(b))
Output:
before a tensor([0.7991, 0.6592, 0.4349, 0.6903]) 4615172560
before b tensor([0.7991, 0.6592, 0.4349, 0.6903]) 4615172560
after a tensor([0.4795, 0.3145, 0.6954, 0.3496]) 4615149584
after b tensor([0.7991, 0.6592, 0.4349, 0.6903]) 4615172560
Analysis: when b = a is executed, b is bound to the object a pointed to at that moment. The later a = torch.rand(4) rebinds a to a brand-new tensor; this only changes what a refers to and has no effect on b.
2. copy()
copy() is a common Python function that performs a shallow copy. Two situations arise, as the code below demonstrates:
- Case 1: when the copied object contains no nested sub-objects, changes to the original do not affect the shallow copy, and the original and the copy have different ids;
- Case 2: when the copied object contains nested sub-objects (such as an inner list), modifying a nested sub-object changes the shallow copy as well, while modifying a non-nested element does not. The reason is that copy() duplicates only the top-level container; the nested sub-objects are shared by reference, so any change to a shared sub-object is visible through every container that references it.
import copy
a = [[1, 2], 3, 4]
print('before a', a, id(a))
b = copy.copy(a)
print('before b', b, id(b))
a[0][0] = 0
print('after a', a, id(a))
print('after b', b, id(b))
a[2] = 5
print('after a', a, id(a))
print('after b', b, id(b))
Output:
before a [[1, 2], 3, 4] 4615194240
before b [[1, 2], 3, 4] 4614455104
after a [[0, 2], 3, 4] 4615194240
after b [[0, 2], 3, 4] 4614455104
after a [[0, 2], 3, 5] 4615194240
after b [[0, 2], 3, 4] 4614455104
3. deepcopy()
With deepcopy(), the object is duplicated in full, recursively, producing a completely independent copy: no part of it, nested or not, is shared with the original.
from copy import deepcopy
a = [[1, 2], 3, 4]
print('before a', a, id(a))
b = deepcopy(a)
print('before b', b, id(b))
a[0][0] = 0
print('after a', a, id(a))
print('after b', b, id(b))
a[2] = 5
print('after a', a, id(a))
print('after b', b, id(b))
Output:
before a [[1, 2], 3, 4] 4614452352
before b [[1, 2], 3, 4] 4614564032
after a [[0, 2], 3, 4] 4614452352
after b [[1, 2], 3, 4] 4614564032
after a [[0, 2], 3, 5] 4614452352
after b [[1, 2], 3, 4] 4614564032
4. Deep and shallow copies in PyTorch
4.1. inplace=True
inplace=True means the operation modifies the tensor in place instead of allocating a new one, which saves memory. Note that x = x + 5 is not in-place: it creates a new tensor and rebinds x. In-place forms are x += 5 or, in PyTorch, the underscore-suffixed methods such as x.add_(5).
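A minimal sketch contrasting the two, using data_ptr() to inspect the underlying storage address:
import torch

x = torch.zeros(3)
ptr = x.data_ptr()
x = x + 5                      # out-of-place: a new tensor is allocated
print(x.data_ptr() == ptr)     # False: the storage changed

x = torch.zeros(3)
ptr = x.data_ptr()
x += 5                         # in-place: the same storage is modified
x.add_(1)                      # underscore-suffixed ops are also in-place
print(x.data_ptr() == ptr)     # True: still the original storage
print(x)                       # tensor([6., 6., 6.])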
4.2. Differences between .Tensor, .tensor, .from_numpy, and .as_tensor
torch.Tensor and torch.tensor make a deep copy: they allocate a separate copy of the data, so the resulting tensor does not share memory with the source array and is unaffected by later changes to it. (torch.Tensor additionally converts the data to the default float dtype, as the output below shows.)
torch.from_numpy and torch.as_tensor make a shallow copy: they share memory with the source array (as_tensor shares it whenever the requested dtype and device allow; otherwise it falls back to copying).
import numpy as np
import torch

a = np.array([1, 2, 3, 4])
a1 = torch.from_numpy(a)
a2 = torch.Tensor(a)
a3 = torch.tensor(a)
a4 = torch.as_tensor(a)
print('before a', a, id(a))
print('before a1', a1, id(a1))
print('before a2', a2, id(a2))
print('before a3', a3, id(a3))
print('before a4', a4, id(a4))

a[1] = 0
print('after a', a, id(a))
print('after a1', a1, id(a1))
print('after a2', a2, id(a2))
print('after a3', a3, id(a3))
print('after a4', a4, id(a4))
Output:
before a [1 2 3 4] 4615260944
before a1 tensor([1, 2, 3, 4]) 4615062928
before a2 tensor([1., 2., 3., 4.]) 4614685696
before a3 tensor([1, 2, 3, 4]) 4615208048
before a4 tensor([1, 2, 3, 4]) 4615436944
after a [1 0 3 4] 4615260944
after a1 tensor([1, 0, 3, 4]) 4615062928
after a2 tensor([1., 2., 3., 4.]) 4614685696
after a3 tensor([1, 2, 3, 4]) 4615208048
after a4 tensor([1, 0, 3, 4]) 4615436944
As the output shows, a shares memory with a1 and a4: modifying a changes both of them, while a2 and a3 keep their original values.
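The sharing also works in the other direction, and tensor.numpy() behaves like from_numpy in reverse; a quick sketch:
import numpy as np
import torch

t = torch.ones(4)
n = t.numpy()            # .numpy() shares memory with the (CPU) tensor
t[0] = 7
print(n)                 # [7. 1. 1. 1.]

a = np.array([1, 2, 3])
a1 = torch.from_numpy(a)
a1[0] = 9                # modifying the tensor also changes the source array
print(a)                 # [9 2 3]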
4.3. .detach() and .clone()
.clone() is a deep copy: it allocates new storage rather than keeping a reference to the old tensor. For autograd, the clone acts as an intermediate node: during backpropagation it passes its gradient back to the source tensor, where it accumulates, but it does not store a grad of its own, which stays None.
.detach() is a shallow copy: the new tensor shares storage with the original but is cut off from the computation graph and takes no part in gradient computation.
import torch

x = torch.tensor([2.0, 3.0, 4.0], requires_grad=True)
clone_x = x.clone()
detach_x = x.detach()
clone_detach_x = x.clone().detach()

y = 2*x + 10
y.backward(torch.FloatTensor([1.0, 2.0, 1.0]))

print(x.grad)
print(clone_x.requires_grad)
print(clone_x.grad)
print(detach_x.requires_grad)
print(clone_detach_x.requires_grad)
Output:
tensor([2., 4., 2.])
True
None
False
False
/var/folders/ll/nwlhsbw14ds8cs419pf18fn80000gn/T/ipykernel_20906/929467759.py:13: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at /Users/runner/work/_temp/anaconda/conda-bld/pytorch_1670525473998/work/build/aten/src/ATen/core/TensorBody.h:485.)
  print(clone_x.grad)
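Because detach() shares storage while clone() does not, an in-place edit tells them apart; a minimal check (no backward involved):
import torch

x = torch.tensor([2.0, 3.0, 4.0], requires_grad=True)
detach_x = x.detach()
detach_x[0] = 100.0      # in-place edit through the detached view
print(x)                 # tensor([100., 3., 4.], requires_grad=True)

clone_x = x.clone()
clone_x[0] = -1.0        # the clone has its own storage...
print(x)                 # ...so x is unchanged
This is also why x.clone().detach(), as used for clone_detach_x above, is the usual way to obtain a fully independent copy that is both outside the graph and in its own storage.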
4.4. The contiguous() function
Many PyTorch operations follow the shallow-copy idea: they return a view of the same data and merely redefine the mapping between indices and elements, without copying the underlying storage. transpose is one such operation:
import torch

x = torch.randn(3,2)
y = torch.transpose(x, 0, 1)
print('before')
print('x:', x)
print('y:', y)

print('after')
y[0, 0] = 12
print('x:', x)
print('y:', y)
Running this shows that writing y[0, 0] = 12 also changes x[0, 0], because y is a view of x's storage. After adding contiguous(), y gets its own contiguous copy of the data, so modifying y no longer affects x:
import torch

x = torch.randn(3,2)
y = torch.transpose(x, 0, 1).contiguous()
print('before')
print('x:', x)
print('y:', y)

print('after')
y[0, 0] = 12
print('x:', x)
print('y:', y)
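A quick way to verify the difference is is_contiguous() together with data_ptr(); a short sketch:
import torch

x = torch.randn(3, 2)
y = torch.transpose(x, 0, 1)
print(y.is_contiguous())              # False: y is a strided view of x
print(y.data_ptr() == x.data_ptr())   # True: same underlying storage

z = y.contiguous()                    # copies into a fresh contiguous block
print(z.is_contiguous())              # True
print(z.data_ptr() == x.data_ptr())   # False: z no longer shares storage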