Papers with Code model rankings across the eight major crowd-counting datasets: https://paperswithcode.com/task/crowd-counting
Search keywords
"crowd counting" (where "crowd" means people, not congestion; also "counting people"); "pedestrian flow counting"; "pedestrian volume statistics" (flow statistics; pedestrian volume statistics); "passenger flow statistics"; "passenger volume statistics"; "person counter"; "passenger flow counter"; crowd density
"real-time pedestrian analysis"; Crowd Counting + Localization = Crowd Analysis; analyzing pedestrian attributes: Pedestrian Attribute Recognition
Technical building blocks: object detection, multi-object tracking, de-duplication
(Reference code) CommissarMa/Crowd_counting_from_scratch
https://github.com/CommissarMa/Crowd_counting_from_scratch
Loading the datasets requires PyTorch, but my Ubuntu machine doesn't have it installed, and I'm not sure which Ubuntu version to pick so that the dozen-plus models used later will all work.
(S1) Load the dataset and build the density-map ground truth (for the ShanghaiTech dataset you don't need to build it; the official release provides it).
The repo ships no train.py or test.py.
dataloader_example.py shows how to load the datasets.
fdst_densitymap_prepare.py loads the FDST dataset and builds its ground truth.
The Gaussian-kernel methods for building the density-map ground truth are in "k_nearest_gaussian_kernel.py" (k-nearest-neighbor adaptive bandwidth) and "same_gaussian_kernel.py" (fixed bandwidth).
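A minimal sketch of the fixed-bandwidth variant (my own scipy-based illustration, not the repo's exact code; the function name is made up):

import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points, h, w, sigma=4.0):
    # points: iterable of (x, y) head annotations for one image.
    # Returns an (h, w) map whose integral equals the number of heads,
    # because each Gaussian kernel integrates to 1.
    dm = np.zeros((h, w), dtype=np.float32)
    for x, y in points:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= yi < h and 0 <= xi < w:
            dm[yi, xi] += 1.0
    return gaussian_filter(dm, sigma)

The k-nearest variant differs only in that sigma is set per head from the mean distance to its k nearest annotated neighbors instead of being fixed.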
(S2) Feed the dataset and ground truth into the author's model and train.
/crowd_model/mcnn_model.py defines the network structure, the forward pass, and the final prediction, but there is no backward pass or parameter-update code.
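A minimal sketch of the missing training loop, assuming the MCNN class from mcnn_model.py and a dataloader yielding (image, density map) pairs; the import path, train_loader, and the hyperparameters are placeholders, not the repo's code:

import torch
import torch.nn as nn
from crowd_model.mcnn_model import MCNN  # import path is an assumption

model = MCNN().cuda()
criterion = nn.MSELoss()  # pixel-wise loss against the GT density map
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

for epoch in range(100):
    for img, gt_dmap in train_loader:  # train_loader: hypothetical dataloader
        img, gt_dmap = img.cuda(), gt_dmap.cuda()
        pred_dmap = model(img)         # forward pass (provided by the repo)
        loss = criterion(pred_dmap, gt_dmap)
        optimizer.zero_grad()
        loss.backward()                # the backward pass the repo omits
        optimizer.step()               # the parameter update the repo omits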
(S3) Feed the test-set images into the trained model; it outputs a density map. Integrate (sum) the map to get the predicted count, compare it against the ground truth, and compute the error score.
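A minimal evaluation sketch for S3 (model and test_loader are the hypothetical names from the sketch above; MAE/RMSE are the usual crowd-counting metrics):

import torch

mae, se, n = 0.0, 0.0, 0
with torch.no_grad():
    for img, gt_count in test_loader:
        pred_count = model(img.cuda()).sum().item()  # integral of the density map
        mae += abs(pred_count - gt_count)
        se += (pred_count - gt_count) ** 2
        n += 1
print('MAE %.2f, RMSE %.2f' % (mae / n, (se / n) ** 0.5))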
(Reference code) C3F
Understand the design and logic of the C3F code (read the author's Zhihu blog) and try to get it running.
Introductory Chinese blog post: https://zhuanlan.zhihu.com/p/65650998
Framework repo: https://github.com/gjy3035/C-3-Framework
Web-based crowd-counting annotation tool: https://github.com/Elin24/cclabeler
Step 1: set up the environment. Python 3.x; PyTorch 1.0 (some networks only support 0.4): http://pytorch.org. Other libs are in requirements.txt; run pip install -r requirements.txt (torch, torchvision, tensorboardX, tensorboard, tensorflow, easydict, pandas, numpy, scipy).
Step 2: download the datasets and change the dataset paths in the code to where I stored them, then run the code to generate the density maps. My datasets are at /media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/open_dataset; write this path into the density-map generation code.
For how to fill the path in, see my modified file: .\C-3-Framework\datasets\SHHA\preapre_SHHA.m
preapre_SHHA.m (done): F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\C-3-Framework\datasets\data\768x1024RGB-k15-s4\shanghaitech_part_A
preapre_SHHB.m (done): F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\C-3-Framework\datasets\data\768x1024RGB-k15-s4\shanghaitech_part_B
prepare_GCC.m (not done): the GCC dataset runs 3-6 GB per part, too large; leave it for last.
preapre_QNRF.m (done):
    test:  F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\C-3-Framework\datasets\UCF-qnrf\1024x1024_mod16\test
    train: F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\C-3-Framework\datasets\UCF-qnrf-processed\train
preapre_UCF_CC_50.m (done): besides density-map generation, this one also involves segmentation.
    The MATLAB code expects the dataset to ship an ann.mat, but the copy I obtained has many scenes, each with many videos, and no ann.mat annotation file; download the dataset the author provides on OneDrive instead.
    Done, though I never ran the .m script or fetched the raw dataset; I downloaded the author's prepared ground truth directly.
    F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\C-3-Framework\datasets\UCF\UCF_CC_50
preapre_WE.m (done): WE = WorldExpo'10. I didn't download this dataset either; I used the ground truth the author provides.
Put the processed ground-truth folders into
/ProcessedData
The folder structure is below; rename each dataset folder to match.
+-- C-3-Framework
|   +-- datasets
|   +-- misc
|   +-- ......
+-- ProcessedData
|   +-- shanghaitech_part_A
|   +-- shanghaitech_part_B
|   +-- UCF_CC_50
|   +-- UCF-QNRF-1024x1024-mod16
|   +-- WE_blurred
Pretrained Model
Some counting networks (such as VGG, CSRNet, and so on) adopt models pre-trained on ImageNet. You can download them from TorchVision and place the downloaded weights in ~/.cache/torch/checkpoints/ (only for Linux OS).
This location also exists on Windows 10 (C:\Users\hasee\.cache\torch\hub\checkpoints), but the URL given above, https://github.com/pytorch/vision/tree/main/torchvision/models, only contains the .py files defining the networks, not the pretrained weights. Is downloading from there even necessary?
Another Windows 10 virtual environment (r-reticulate) wants the model file at C:\Users\hasee\.torch\models.
Download this weight file and put it in the Windows path above: https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
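For what it's worth, the .py files on GitHub are only architecture definitions; the actual weights are the .pth files on download.pytorch.org, and torchvision fetches them itself on first use. A sketch:

import torchvision.models as models

# pretrained=True downloads and caches resnet101-5d3b4d8f.pth automatically
# (into ~/.torch/models on old torchvision, or the checkpoints cache dirs
# mentioned above on newer versions), so a manual download may be unnecessary.
resnet101 = models.resnet101(pretrained=True)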
Training
Set the parameters in config.py and ./datasets/XXX/setting.py (if you want to reproduce their results, you are recommended to use the parameters in ./results_reports). Run python train.py. It fails with the error below.
My guess: the authors have several GPUs and trained on four in parallel, while I have only one, so the multi-GPU device list is invalid.
Traceback (most recent call last):
  File "train.py", line 62, in <module>
    cc_trainer = Trainer(loading_data,cfg_data,pwd)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\trainer.py", line 25, in __init__
    self.net = CrowdCounter(cfg.GPU_ID,self.net_name).cuda()
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\models\CC.py", line 29, in __init__
    self.CCN = torch.nn.DataParallel(self.CCN, device_ids=gpus).cuda()
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\nn\parallel\data_parallel.py", line 142, in __init__
    _check_balance(self.device_ids)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\nn\parallel\data_parallel.py", line 23, in _check_balance
    dev_props = _get_devices_properties(device_ids)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\_utils.py", line 458, in _get_devices_properties
    return [_get_device_attr(lambda m: m.get_device_properties(i)) for i in device_ids]
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\_utils.py", line 458, in <listcomp>
    return [_get_device_attr(lambda m: m.get_device_properties(i)) for i in device_ids]
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\_utils.py", line 441, in _get_device_attr
    return get_member(torch.cuda)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\_utils.py", line 458, in <lambda>
    return [_get_device_attr(lambda m: m.get_device_properties(i)) for i in device_ids]
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\cuda\__init__.py", line 299, in get_device_properties
    raise AssertionError("Invalid device id")
AssertionError: Invalid device id
In F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\config.py, line 31, change it as below to go from two GPUs to one:
    #__C.GPU_ID = [0,1]
    __C.GPU_ID = [0]
After the change, running again produces the warning below. I suspect my PyTorch version is too new, so some older function interfaces no longer behave the same way.
D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\optim\lr_scheduler.py:136: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html
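The warning just asks for the two calls to be reordered; a sketch of the post-1.1 ordering (all names here are placeholders, not C3F's variables):

for epoch in range(num_epochs):
    for img, gt in train_loader:
        loss = criterion(model(img), gt)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()    # first update the parameters...
    scheduler.step()        # ...then advance the learning-rate schedule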
(Try 1): install the versions the official code asks for.
pytorch 1.0.0 (released 8 December 2018): conda install pytorch==1.0.0 torchvision==0.2.1 -c pytorch (network problems; the download stalls forever and the package won't install). Fetching the tarball straight from https://anaconda.org/pytorch/pytorch/files?version=1.0.0 is faster. Virtual environment name: (r-reticulate), Python 3.6.13.
torchvision==0.2.2 wheel download: https://pypi.org/project/torchvision/0.2.2/#files
Install it with pip install "<path to the .whl>". Do NOT install with conda install torchvision=0.2.2 -c pytorch: that command silently upgrades pytorch to the latest 1.10.1 and wrecks the 1.0.0 install. Pick the newest torchvision version released before Jul 26, 2019.
tensorboardX: pick a release that predates pytorch 1.0.0 (released 8 December 2018): tensorboardX==1.4 (released Aug 9, 2018)
tensorboard (not installed)
tensorflow (not installed)
easydict==1.9: pip install easydict==1.9
For a source tarball: pip install <name>.tar.gz,
or unpack it; the directory contains setup.py.
cd into that directory (open cmd there, or type cmd in the Explorer address bar),
then run python setup.py install.
pandas==0.24.2 (Mar 14, 2019): pip install pandas==0.24.2
numpy: 1.16.0 (Jan 14, 2019) is the recommendation, but numpy comes along when pandas is installed; the version actually installed is 1.19.5.
scipy==1.2.1 (Feb 9, 2019). Then the error below appears; it looks related to reading and saving files.
Read the framework's blog first to understand the design.
My suspicion: shutil.copytree(file, dst_file) is copying into a destination directory that already exists; this can be fixed either through the function's arguments or by adding an if-check that skips existing targets.
The official documentation for shutil.copytree() (quoted below) says that with the default dirs_exist_ok=False, an existing destination raises FileExistsError, while dirs_exist_ok=True lets the copy continue.
https://docs.python.org/3/library/shutil.html
If dirs_exist_ok is false (the default) and dst already exists, a FileExistsError is raised. If dirs_exist_ok is true, the copying operation will continue if it encounters existing directories, and files within the dst tree will be overwritten by corresponding files from the src tree.
But shutil.copytree() only gained the dirs_exist_ok parameter in newer Pythons (it was added in 3.8; I was reading the 3.11.1 source); my current 3.6.13 doesn't have it.
(1) Look at how the newer Python threads this parameter through; if it doesn't pull in too many dependent functions, imitate it in my system's copy.
I backed up the original shutil.py of Python 3.6.13 to the desktop.
Rewrite every function related to copytree() to match the Python 3.11.0 version.
(2) Check how the parameter was added: with if/else or with try/except?
(3) If the parameter really can't be added to shutil.copytree(file, dst_file), skip the error with try/except or an if-check instead.
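A minimal sketch of option (3), wrapping the call at its call site in misc/utils.py instead of patching shutil (file and dst_file are the variables already used there):

import shutil

try:
    shutil.copytree(file, dst_file)
except FileExistsError:
    pass  # destination was already copied on an earlier run; skip it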
One possibility: this isn't Linux, and Windows path separators look different from what the code expects.
Set up a Linux virtual environment with exactly the same version numbers and try there; if that fails, think of something else.
Another possibility: add an if-check so the directory isn't created when it already exists.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\train.py", line 62, in <module>
    cc_trainer = Trainer(loading_data,cfg_data,pwd)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\trainer.py", line 52, in __init__
    self.writer, self.log_txt = logger(self.exp_path, self.exp_name, self.pwd, 'exp', resume=cfg.RESUME)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\misc\utils.py", line 76, in logger
    copy_cur_env(work_dir, exp_path+ '/' + exp_name + '/code', exception)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\misc\utils.py", line 249, in copy_cur_env
    shutil.copytree(file, dst_file)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\shutil.py", line 321, in copytree
    os.makedirs(dst)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\os.py", line 220, in makedirs
    mkdir(name, mode)
FileExistsError: [WinError 183] 当文件已存在时,无法创建该文件。(Cannot create a file when that file already exists.): './exp/12-08_16-20_SHHB_Res101_SFCN_1e-05/code\\.git'
Running on, the next error appears:
File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\shutil.py", line 343, in _copytree raise Error(errors) shutil.Error: [('F:\\FILES_OF_ALBERT\\IT_paid_class\\graduation_thesis\\model_innov\\2021_C-3-Framework\\C-3-Framework\\.git\\objects\\pack\\pack-f045ed913ad90b5201f2b6628b5474cac03e5607.idx', './exp/12-08_20-16_SHHB_Res101_SFCN_1e-05/code\\.git\\objects\\pack\\pack-f045ed913ad90b5201f2b6628b5474cac03e5607.idx', "[Errno 13] Permission denied: './exp/12-08_20-16_SHHB_Res101_SFCN_1e-05/code\\\\.git\\\\objects\\\\pack\\\\pack-f045ed913ad90b5201f2b6628b5474cac03e5607.idx'"), ('F:\\FILES_OF_ALBERT\\IT_paid_class\\graduation_thesis\\model_innov\\2021_C-3-Framework\\C-3-Framework\\.git\\objects\\pack\\pack-f045ed913ad90b5201f2b6628b5474cac03e5607.pack', './exp/12-08_20-16_SHHB_Res101_SFCN_1e-05/code\\.git\\objects\\pack\\pack-f045ed913ad90b5201f2b6628b5474cac03e5607.pack', "[Errno 13] Permission denied: './exp/12-08_20-16_SHHB_Res101_SFCN_1e-05/code\\\\.git\\\\objects\\\\pack\\\\pack-f045ed913ad90b5201f2b6628b5474cac03e5607.pack'")]
Either run the code from a cmd opened with administrator privileges,
or clear the read-only attribute in the folder's properties.
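Another option worth trying: keep .git out of the snapshot copy altogether, since it's exactly the read-only pack files that raise Errno 13. A sketch with shutil.ignore_patterns (src_dir and dst_dir are placeholders for the repo's arguments):

import shutil

# Skip the .git directory when copying the code into ./exp; its read-only
# pack files are what trigger [Errno 13] Permission denied.
shutil.copytree(src_dir, dst_dir, ignore=shutil.ignore_patterns('.git'))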
With that fixed, the run continues into a CUDA out-of-memory error.
(Several processes crash at once here and their tracebacks interleave, together with repeated "fatal: Memory allocation failure" messages; de-duplicated, the traceback reads:)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\train.py", line 64, in <module>
    cc_trainer = Trainer(loading_data,cfg_data,pwd)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\trainer.py", line 26, in __init__
    self.net = CrowdCounter(cfg.GPU_ID,self.net_name).cuda()
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\models\CC.py", line 31, in __init__
    self.CCN=self.CCN.cuda()
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\site-packages\torch\nn\modules\module.py", line 260, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
    module._apply(fn)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\site-packages\torch\nn\modules\module.py", line 193, in _apply
    param.data = fn(param.data)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\site-packages\torch\nn\modules\module.py", line 260, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
Push the amount of data read per batch to the minimum and see whether memory still overflows. It still does.
As far as I can tell, each batch reads only one image (I found no batch-size description; only config.py line 71 has __C.VISIBLE_NUM_IMGS = 1, which seems to mean one image at a time), yet a single image plus the assorted computation exhausts the 4 GB of VRAM.
I read train.py: there is a forward pass and a backward pass (in trainer.py, Trainer.__init__ sets up an Adam optimizer, i.e. backprop), but train.py afterwards only calls Trainer().forward(). That forward(), however, drives every method of class Trainer, including train() and validate_V1(); there is no model saving. Is this really training?
Which dataset (ShanghaiTech part B) and which model (Res101_SFCN) is it using?
Current suspicion: the data-loading step reads dozens of images as one batch. If it truly read one image at a time, a SHHB image is about 160 KB, its density-map CSV about 2 MB, and the ResNet-101 pretrained .pth about 170 MB; even with miscellaneous buffers that shouldn't exceed 4 GB. Where is this used? train.py line 27 imports:
from datasets.SHHB.loading_data import loading_data
from datasets.SHHB.setting import cfg_data
Where to change it: F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\datasets\SHHB\setting.py, line 20. Change it as below, from 6 images to 1.
    # __C_SHHB.TRAIN_BATCH_SIZE = 6  # imgs
    __C_SHHB.TRAIN_BATCH_SIZE = 1  # imgs
    # __C_SHHB.VAL_BATCH_SIZE = 6
    __C_SHHB.VAL_BATCH_SIZE = 1
A new error appears:
(Each spawned DataLoader worker prints the same traceback and the copies interleave; de-duplicated, it reads:)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\train.py", line 65, in <module>
    cc_trainer.forward()
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\trainer.py", line 66, in forward
    self.train()
  File "F:\FILES_OF_ALBERT\IT_paid_class\graduation_thesis\model_innov\2021_C-3-Framework\C-3-Framework\trainer.py", line 87, in train
    for i, data in enumerate(self.train_loader, 0):
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "D:\Programing_File\Anaconda3\envs\r-reticulate\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable.
On Win10 I skimmed this error output; it's hard to solve and I can't see what causes it. I want to try whether Linux hits the same error (the code will most likely end up running on Linux anyway, so this doubles as a rehearsal). If Linux still shows it, I'll keep digging myself, and failing that ask my advisor to debug it.
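For the record, the standard fix for this Windows spawn error is to guard the entry point of train.py so DataLoader worker processes can re-import it safely; a sketch, not the repo's code:

# at the bottom of train.py
if __name__ == '__main__':
    cc_trainer = Trainer(loading_data, cfg_data, pwd)
    cc_trainer.forward()

Setting the DataLoader's num_workers to 0 would avoid spawning workers entirely, which also sidesteps the error.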
Setting up the Linux environment
Install Python
Python: the code was written around Jul 2, 2019; install Python 3.4.10 (released 18 March 2019)? https://www.python.org/downloads/
What actually got installed is Python 3.4.5, because that version exists on the Anaconda servers and 3.4.10 does not.
    conda create -n py34tc100 python=3.4
Install pytorch=1.0.0
https://anaconda.org/pytorch/pytorch/files?version=1.0.0
The lowest Python 3.x matching the Linux builds of pytorch 1.0.0 is 3.5.
Current python version = 3.4.5
Current CUDA version, as queried = 10.1.168
How do I query the cuDNN version? Only once I know it can I tell whether everything matches.
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
The cuDNN version found is 7.6.5.
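A cross-check from inside any working PyTorch environment (these attributes exist in torch 1.0):

import torch
print(torch.__version__)               # e.g. 1.0.0
print(torch.version.cuda)              # CUDA version the binary was built against
print(torch.backends.cudnn.version())  # e.g. 7401 for cuDNN 7.4.1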
The official downloadable pytorch=1.0.0 packages for Linux:
linux-64/pytorch-1.0.0-py3.7_cuda8.0.61_cudnn7.1.2_1.tar.bz2
linux-64/pytorch-1.0.0-py3.6_cuda8.0.61_cudnn7.1.2_1.tar.bz2
linux-64/pytorch-1.0.0-py3.5_cuda8.0.61_cudnn7.1.2_1.tar.bz2
linux-64/pytorch-1.0.0-py2.7_cuda8.0.61_cudnn7.1.2_1.tar.bz2
linux-64/pytorch-1.0.0-py3.7_cuda10.0.130_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py3.6_cuda10.0.130_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py3.5_cuda10.0.130_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py2.7_cuda10.0.130_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py2.7_cuda9.0.176_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py3.5_cuda9.0.176_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py3.7_cuda9.0.176_cudnn7.4.1_1.tar.bz2
linux-64/pytorch-1.0.0-py3.6_cuda9.0.176_cudnn7.4.1_1.tar.bz2
Conclusion 3: whichever Python you install, check its release date and make sure it predates the date the code was published.
The C3F code was published on GitHub on Jul 2, 2019.
pytorch 1.0.0 requires Python >= 3.5 (3.5.0 released Sept. 13, 2015).
After which release date should the Python be? Most of this code appeared after 2018, so install a Python released around 2017.
Conclusion: install Python 3.6.0, released Dec. 23, 2016.
https://www.python.org/downloads/
Delete the environment py34tc100.
Create a new virtual environment with python 3.6.0:
    conda create --name my_first_env python=3.6.0
Install pytorch; the official commands are listed at https://pytorch.org/get-started/previous-versions/
conda install pytorch==1.0.0 torchvision==0.2.1 cuda100 -c pytorch
This command actually downloads the package pytorch-1.0.0-py3.6_cuda10.0.130_cudnn7.4.1_1.
torchvision: already installed by the command above.
tensorboardX: pick a release that predates pytorch 1.0.0 (released 8 December 2018): tensorboardX==1.4 (released Aug 9, 2018)
conda install tensorboardX==1.4
easydict==1.9:
pip install easydict==1.9
pandas==0.24.2 (Mar 14, 2019):
pip install pandas==0.24.2
numpy: installed along with pandas; version 1.19.2
scipy==1.2.1 (Feb 9, 2019):
pip install scipy==1.2.1
Download the pretrained model into the expected location
Put the pretrained model (https://download.pytorch.org/models/resnet101-5d3b4d8f.pth) at /home/albert/.torch/models/resnet101-5d3b4d8f.pth in the Linux environment.
Run the code
Fix the dataset path: /media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/SHHB/prepareSHHB (the .m file itself needs no changes). Running then reports the error below.
/2021_C-3-Framework/C-3-Framework/misc/utils.py, line 253, in copy_cur_env: shutil.copytree(file, dst_file, dirs_exist_ok=True). Remove the dirs_exist_ok argument.
Traceback (most recent call last):
  File "train.py", line 64, in <module>
    cc_trainer = Trainer(loading_data,cfg_data,pwd)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 53, in __init__
    self.writer, self.log_txt = logger(self.exp_path, self.exp_name, self.pwd, 'exp', resume=cfg.RESUME)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py", line 77, in logger
    copy_cur_env(work_dir, exp_path+ '/' + exp_name + '/code', exception)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py", line 253, in copy_cur_env
    shutil.copytree(file, dst_file, dirs_exist_ok=True)
TypeError: copytree() got an unexpected keyword argument 'dirs_exist_ok'
The next error (below) is a Python 3.6.0 bug; upgrading to Python 3.7 resolves it.
The traceback doesn't point at any line of my own code, so it looks like an error internal to Python.
Was python 3.7.0 released before the code was committed?
python 3.7 release date: June 27, 2018
code commit date: Jul 2, 2019
Yes, python 3.7 predates the code commit, so it's safe to install.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/albert/anaconda3/envs/py360tc100/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/albert/anaconda3/envs/py360tc100/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/albert/anaconda3/envs/py360tc100/lib/python3.6/multiprocessing/resource_sharer.py", line 139, in _serve
    signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
  File "/home/albert/anaconda3/envs/py360tc100/lib/python3.6/signal.py", line 60, in pthread_sigmask
    sigs_set = _signal.pthread_sigmask(how, mask)
ValueError: signal number 32 out of range
Create a python 3.7.0 virtual environment to resolve the error above:
conda create --name py370tc100 python=3.7.0
Install torch 1.0.0
From https://pypi.org/project/torch/1.0.0/#files download torch-1.0.0-cp37-cp37m-manylinux1_x86_64.whl
pip install torch-1.0.0-cp37-cp37m-manylinux1_x86_64.whl
Install the other packages
tensorboardX
conda install tensorboardX==1.4
easydict==1.9:
pip install easydict==1.9
pandas==0.24.2 (Mar 14, 2019):
pip install pandas==0.24.2
numpy: installed along with pandas; version 1.19.2
scipy==1.2.1 (Feb 9, 2019):
pip install scipy==1.2.1
Run the code
python train.py
It errors out like this:
Traceback (most recent call last):
  File "train.py", line 27, in <module>
    from datasets.SHHB.loading_data import loading_data
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/SHHB/loading_data.py", line 1, in <module>
    import torchvision.transforms as standard_transforms
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/__init__.py", line 2, in <module>
    from torchvision import datasets
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/datasets/__init__.py", line 9, in <module>
    from .fakedata import FakeData
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/datasets/fakedata.py", line 3, in <module>
    from .. import transforms
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/transforms/__init__.py", line 1, in <module>
    from .transforms import *
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 16, in <module>
    from . import functional as F
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 5, in <module>
    from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
ImportError: cannot import name 'PILLOW_VERSION' from 'PIL' (/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/PIL/__init__.py)
Fix: in /home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torchvision/transforms/functional.py, change PILLOW_VERSION (at the import flagged in the traceback) to __version__.
A more thorough fix is described at https://blog.csdn.net/weixin_45021364/article/details/104600802 and https://stackoverflow.com/questions/59659146/could-not-import-pillow-version-from-pil
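The one-line patch, for reference (Pillow 7 removed the PILLOW_VERSION constant; aliasing __version__ keeps any later references to the name working):

# torchvision/transforms/functional.py
# old: from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
from PIL import Image, ImageOps, ImageEnhance, __version__ as PILLOW_VERSION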
Check the Ubuntu version, then find and install a GPU-memory-usage visualization tool that matches it.
Another error; apparently the GPU memory is insufficient:
Traceback (most recent call last):
  File "train.py", line 65, in <module>
    cc_trainer.forward()
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 66, in forward
    self.train()
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 94, in train
    pred_map = self.net(img, gt_map)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/CC.py", line 39, in forward
    density_map = self.CCN(img)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/SCC_Model/Res101_SFCN.py", line 46, in forward
    x = self.own_reslayer_3(x)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/SCC_Model/Res101_SFCN.py", line 124, in forward
    out = self.conv3(out)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 48.00 MiB (GPU 0; 3.95 GiB total capacity; 3.44 GiB already allocated; 12.94 MiB free; 541.00 KiB cached)
(Final check) Read the code line by line again and confirm that the number of images per batch really is set to one in that file.
After changing the image count to 60, the shutil error shows up instead:
Traceback (most recent call last):
  File "train.py", line 64, in <module>
    cc_trainer = Trainer(loading_data,cfg_data,pwd)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 53, in __init__
    self.writer, self.log_txt = logger(self.exp_path, self.exp_name, self.pwd, 'exp', resume=cfg.RESUME)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py", line 77, in logger
    copy_cur_env(work_dir, exp_path+ '/' + exp_name + '/code', exception)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py", line 251, in copy_cur_env
    shutil.copytree(file, dst_file)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/shutil.py", line 315, in copytree
    os.makedirs(dst)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: './exp/12-14_15-25_SHHB_Res101_SFCN_1e-05/code/.git'
Two-step plan:
Step 1: add the dirs_exist_ok=True argument where shutil is called.
Open /media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py, line 251, and change the first line below into the second; an already-existing destination directory then no longer aborts the copy.
#shutil.copytree(file, dst_file)
shutil.copytree(file, dst_file, dirs_exist_ok=True)
Step 2: take the copytree() implementation from Python 3.11.0's shutil.
The file to edit: /home/albert/anaconda3/envs/py370tc100/lib/python3.7/shutil.py
Back up shutil.py first.
The target code is published in the CPython sources linked from the official docs: https://docs.python.org/3/library/shutil.html
The pasted-in code looks like this:
def _copytree(entries, src, dst, symlinks, ignore, copy_function,
              ignore_dangling_symlinks, dirs_exist_ok=False):
    if ignore is not None:
        ignored_names = ignore(os.fspath(src), [x.name for x in entries])
    else:
        ignored_names = set()

    os.makedirs(dst, exist_ok=dirs_exist_ok)
    errors = []
    use_srcentry = copy_function is copy2 or copy_function is copy

    for srcentry in entries:
        if srcentry.name in ignored_names:
            continue
        srcname = os.path.join(src, srcentry.name)
        dstname = os.path.join(dst, srcentry.name)
        srcobj = srcentry if use_srcentry else srcname
        try:
            is_symlink = srcentry.is_symlink()
            if is_symlink and os.name == 'nt':
                # Special check for directory junctions, which appear as
                # symlinks but we want to recurse.
                lstat = srcentry.stat(follow_symlinks=False)
                if lstat.st_reparse_tag == stat.IO_REPARSE_TAG_MOUNT_POINT:
                    is_symlink = False
            if is_symlink:
                linkto = os.readlink(srcname)
                if symlinks:
                    # We can't just leave it to `copy_function` because legacy
                    # code with a custom `copy_function` may rely on copytree
                    # doing the right thing.
                    os.symlink(linkto, dstname)
                    copystat(srcobj, dstname, follow_symlinks=not symlinks)
                else:
                    # ignore dangling symlink if the flag is on
                    if not os.path.exists(linkto) and ignore_dangling_symlinks:
                        continue
                    # otherwise let the copy occur. copy2 will raise an error
                    if srcentry.is_dir():
                        copytree(srcobj, dstname, symlinks, ignore,
                                 copy_function, ignore_dangling_symlinks,
                                 dirs_exist_ok)
                    else:
                        copy_function(srcobj, dstname)
            elif srcentry.is_dir():
                copytree(srcobj, dstname, symlinks, ignore, copy_function,
                         ignore_dangling_symlinks, dirs_exist_ok)
            else:
                # Will raise a SpecialFileError for unsupported file types
                copy_function(srcobj, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error as err:
            errors.extend(err.args[0])
        except OSError as why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    except OSError as why:
        # Copying file access times may fail on Windows
        if getattr(why, 'winerror', None) is None:
            errors.append((src, dst, str(why)))
    if errors:
        raise Error(errors)
    return dst

def copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2,
             ignore_dangling_symlinks=False, dirs_exist_ok=False):
    """Recursively copy a directory tree and return the destination directory.

    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the source tree
    result in symbolic links in the destination tree; if it is false, the
    contents of the files pointed to by symbolic links are copied. If the
    file pointed by the symlink doesn't exist, an exception will be added in
    the list of errors raised in an Error exception at the end of the copy
    process. You can set the optional ignore_dangling_symlinks flag to true
    if you want to silence this exception. Notice that this has no effect on
    platforms that don't support os.symlink.

    The optional ignore argument is a callable. If given, it is called with
    the `src` parameter, which is the directory being visited by copytree(),
    and `names` which is the list of `src` contents, as returned by
    os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be called once
    for each directory that is copied. It returns a list of names relative
    to the `src` directory that should not be copied.

    The optional copy_function argument is a callable that will be used to
    copy each file. It will be called with the source path and the
    destination path as arguments. By default, copy2() is used, but any
    function that supports the same signature (like copy()) can be used.

    If dirs_exist_ok is false (the default) and `dst` already exists, a
    `FileExistsError` is raised. If `dirs_exist_ok` is true, the copying
    operation will continue if it encounters existing directories, and files
    within the `dst` tree will be overwritten by corresponding files from
    the `src` tree.
    """
    sys.audit("shutil.copytree", src, dst)
    with os.scandir(src) as itr:
        entries = list(itr)
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
                     ignore=ignore, copy_function=copy_function,
                     ignore_dangling_symlinks=ignore_dangling_symlinks,
                     dirs_exist_ok=dirs_exist_ok)
Then it errors with 'sys' has no attribute 'audit' (sys.audit only exists from Python 3.8 on), so open shutil.py and comment out the line sys.audit("shutil.copytree", src, dst):
Traceback (most recent call last):
  File "train.py", line 64, in <module>
    cc_trainer = Trainer(loading_data,cfg_data,pwd)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 53, in __init__
    self.writer, self.log_txt = logger(self.exp_path, self.exp_name, self.pwd, 'exp', resume=cfg.RESUME)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py", line 77, in logger
    copy_cur_env(work_dir, exp_path+ '/' + exp_name + '/code', exception)
  File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/misc/utils.py", line 253, in copy_cur_env
    shutil.copytree(file, dst_file, dirs_exist_ok=True)
  File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/shutil.py", line 372, in copytree
    sys.audit("shutil.copytree", src, dst)
AttributeError: module 'sys' has no attribute 'audit'
The next error was GPU out of memory.
With 60 images per batch it looks like this:
Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 66, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 94, in train pred_map = self.net(img, gt_map) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/CC.py", line 39, in forward density_map = self.CCN(img) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/SCC_Model/Res101_SFCN.py", line 44, in forward x = self.frontend(x) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward input = module(input) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 320, in forward self.padding, self.dilation, self.groups) RuntimeError: CUDA out of memory. Tried to allocate 2.81 GiB (GPU 0; 3.95 GiB total capacity; 867.61 MiB already allocated; 2.60 GiB free; 788.00 KiB cached)
At 1 image:
RuntimeError: CUDA out of memory. Tried to allocate 48.00 MiB (GPU 0; 3.95 GiB total capacity; 3.44 GiB already allocated; 12.94 MiB free; 541.00 KiB cached)
At 2 images:
RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 3.95 GiB total capacity; 3.45 GiB already allocated; 2.94 MiB free; 685.00 KiB cached)
At 6 images:
RuntimeError: CUDA out of memory. Tried to allocate 288.00 MiB (GPU 0; 3.95 GiB total capacity; 3.24 GiB already allocated; 218.94 MiB free; 775.00 KiB cached)
At 12 images:
RuntimeError: CUDA out of memory. Tried to allocate 576.00 MiB (GPU 0; 3.95 GiB total capacity; 2.96 GiB already allocated; 478.94 MiB free; 27.77 MiB cached)
At 60 images (same error as above):
RuntimeError: CUDA out of memory. Tried to allocate 2.81 GiB (GPU 0; 3.95 GiB total capacity; 867.61 MiB already allocated; 2.60 GiB free; 788.00 KiB cached)
At 400 images:
RuntimeError: CUDA out of memory. Tried to allocate 3.52 GiB (GPU 0; 3.95 GiB total capacity; 147.61 MiB already allocated; 3.31 GiB free; 788.00 KiB cached)
Images per batch | Memory torch tried to allocate | Free memory reported
1 | 48 MB | 12.94 MB
2 | 96 MB | 2.94 MB
6 | 288 MB | 218.94 MB
12 | 576 MB | 478.94 MB
60 | 2.81 GB | 2.60 GB
400 | 3.52 GB | 3.31 GB
(Up to 60 images the request scales linearly, at roughly 48 MB per image.)
File sizes of the available datasets:
Dataset | img | density map
SHHB | ~200 kB | 2.2 MB
UCF50 | 48 kB–300 kB | 900 kB–5.1 MB
UCF-QNRF | 70 kB–290 kB | 1.1 MB–6 MB
WE | 23 kB–71 kB | 800 kB–1.3 MB
Try one of the smaller datasets instead.
Switched to WE:
In /media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework, file config.py, line 13, change the dataset to WE:
# __C.DATASET = 'SHHB'
__C.DATASET = 'WE'
Run; it reports the error below:
Traceback (most recent call last): File "train.py", line 64, in <module> cc_trainer = Trainer(loading_data,cfg_data,pwd) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 40, in __init__ self.train_loader, self.val_loader, self.restore_transform = dataloader() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/WE/loading_data.py", line 30, in loading_data train_set = WE(cfg_data.DATA_PATH+'/train', 'train',main_transform=train_main_transform, img_transform=img_transform, gt_transform=gt_transform) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/WE/WE.py", line 18, in __init__ self.data_files = [filename for filename in os.listdir(self.img_path) \ FileNotFoundError: [Errno 2] No such file or directory: '/media/D/DataSet/CC/WE_2019/train/img'
So it can't find where the dataset lives. Go to line 10 of setting.py in
/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/WE
and change the data path:
# __C_WE.DATA_PATH = '/media/D/DataSet/CC/WE_2019'
__C_WE.DATA_PATH = '../ProcessedData/WE_blurred'
It reports GPU out of memory again:
RuntimeError: CUDA out of memory. Tried to allocate 126.00 MiB (GPU 0; 3.95 GiB total capacity; 3.34 GiB already allocated; 100.94 MiB free; 773.00 KiB cached)
Drop WE's per-step image count to 1. In line 21 of
/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/WE/setting.py:
# __C_WE.TRAIN_BATCH_SIZE = 12  # imgs
__C_WE.TRAIN_BATCH_SIZE = 1
# __C_WE.VAL_BATCH_SIZE = 8
__C_WE.VAL_BATCH_SIZE = 1
To keep the run short, set the epoch count to 1. In model_innov/2021_C-3-Framework/C-3-Framework/config.py, line 43:
# __C.MAX_EPOCH = 200
__C.MAX_EPOCH = 1
Run again — it works now.
The pretrained model used by Res101_SFCN is resnet101-5d3b4d8f.pth.
The trained model is saved to:
/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/exp/12-14_21-22_WE_Res101_SFCN_1e-05/all_ep_1_mae_19.3_mse_0.0.pth
Colab offers GPUs with more than 4 GB of VRAM, so Colab is an option.
Switch the model to MCNN and train on the WE dataset. In
/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/config.py, line 26:
# __C.NET = 'Res101_SFCN'  # net selection: MCNN, AlexNet, VGG, VGG_DECODER, Res50, CSRNet, SANet
__C.NET = 'MCNN'
Run results: see the comparison table further below.
Train AlexNet on the WE dataset.
Download the pretrained model from https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth and put the file at:
/home/albert/.torch/models
Train VGG on WE.
Pretrained model file: https://download.pytorch.org/models/vgg16-397923af.pth
Train VGG_DECODER on WE.
Train Res50 on WE.
Pretrained model file: https://download.pytorch.org/models/resnet50-19c8e357.pth
Run CSRNet on WE.
Run SANet on WE.
Error:
Traceback (most recent call last): File "train.py", line 64, in <module> cc_trainer = Trainer(loading_data,cfg_data,pwd) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer_for_M2TCC.py", line 31, in __init__ self.net = CrowdCounter(cfg.GPU_ID,self.net_name,loss_1_fn,loss_2_fn).cuda() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/M2TCC.py", line 13, in __init__ from M2TCC_Model.SANet import SANet as net ModuleNotFoundError: No module named 'M2TCC_Model'
In models/M2TCC.py, line 13, prefix the import with models. : the script being run is /media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/train.py, so imports resolve from that directory — Python must first enter the models folder, then M2TCC_Model, and finally SANet.py. Hence the models. prefix:
if model_name == 'SANet':  # newly modified 2022-12-17-8:44
    # from M2TCC_Model.SANet import SANet as net
    from models.M2TCC_Model.SANet import SANet as net
In trainer_for_M2TCC.py, line 193, add a [0]: self.net.loss returns a one-element tuple, so the element has to be indexed out with [0] before .item() can be called on it:
# losses.update(self.net.loss.item(), i_sub)
losses.update(self.net.loss[0].item(), i_sub)
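A minimal illustration of the tuple issue (my own toy example, not the repo's code): .item() is a tensor method, so the tensor must be pulled out of the tuple first.

import torch

loss = (torch.tensor(0.5),)   # shaped like what self.net.loss returns here: a 1-tuple
print(loss[0].item())         # -> 0.5
# loss.item() would raise AttributeError: 'tuple' object has no attribute 'item'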
Then it ran successfully.
Working models (all trained on the World Expo dataset):
Model | Per batch | Total train time | Saved model size | Pretrained model size | GPU memory / utilization | MAE
Res101_SFCN | 0.95 s | 3147 s | 155 MB | 180 MB | — | 19.33
MCNN | 0.13 s | 364 s | 540 kB | none | — | 35.54
AlexNet | 0.03 s | 12 s | 10 MB | 245 MB | — | 19.88
VGG | 0.36 s | 760 s | 30 MB | 553 MB | 1879 MB, 100% | 14.17
VGG_DECODER | 0.4 s | 850 s | 34 MB | unclear whether it reuses VGG's | 1989 MB, 100% | 64.78
Res50 | 0.24 s | 800 s | 100 MB | 100 MB | 1625 MB, 99% | 20.14
CSRNet | 0.73 s | 1500 s | 65 MB | probably used one, unsure which | 2073 MB, 100% | 16.07
SANet | 0.4 s | 970 s | 5.6 MB | probably used one, unsure which | 1633 MB, 100% | 38.3
—— Inspect the training curves with TensorBoardX.
It errors because tensorboard is not installed and asks you to install it.
Which tensorboardX version is installed? —— 1.4 (released 2018-08-09).
How do I query the installed tensorboardX version? —— Honestly, I found no way to query it. —— Which tensorboard version matches tensorboardX 1.4? —— Installing tensorboard requires tensorflow anyway, so just install tensorflow and get tensorboard along with it.
Command to query the tensorboard version: from tensorboard import version; print(version.VERSION)
tensorflow only started supporting Python 3.7 with 1.13.1 (2019-02-27); older versions will not even install.
pip install tensorflow==1.13.1
It runs now, but the plots are useless —— I only trained one epoch, so there is nothing to see; with more epochs the curves will become informative.
—— For the models with the shortest per-batch time, raise the batch size to 3, 4, … up to the largest value that still runs; I want to see whether the 4 GB of VRAM can be filled. —— This improves training efficiency.
Change this file:
/2021_C-3-Framework/C-3-Framework/datasets/WE/setting.py
__C_WE.RESUME_MODEL = ''  # model path
# __C_WE.TRAIN_BATCH_SIZE = 6  # imgs
__C_WE.TRAIN_BATCH_SIZE = 6
# __C_WE.VAL_BATCH_SIZE = 6
__C_WE.VAL_BATCH_SIZE = 6
Select the model in 2021_C-3-Framework/C-3-Framework/config.py. Use AlexNet — it is the smallest and the fastest per batch:
__C.NET = 'AlexNet'
Model | Batch size | VRAM used (4 GB card)
AlexNet | 6 | 947 MB
AlexNet | 24 | 2151 MB
AlexNet | 48 | 3297 MB
AlexNet | 55 | 3555 MB
AlexNet | 60 | 3923 MB
AlexNet | 70 | 3515 MB
AlexNet | 80 | 4025 MB
AlexNet | 85 | no error, but the machine freezes and the run never starts
This shows one thing: the earlier "100% GPU utilization" reading was misleading. The MB figure is the real memory footprint, and only filling the full 4 GB counts as maximal utilization.
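A small sketch of how the footprint could be read from inside the training loop rather than eyeballed from a utilization percentage, assuming the torch 1.x API this repo uses (memory_cached() is the old name; later versions renamed it memory_reserved()):

import torch

print('allocated: %.0f MiB' % (torch.cuda.memory_allocated() / 2**20))
print('cached:    %.0f MiB' % (torch.cuda.memory_cached() / 2**20))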
—— Try the remaining datasets with a small model and see whether they run. If a small model runs, the dataset is usable and my code is fine — it is only VRAM that falls short; the big models can be trained on a rented server.
With AlexNet, batch_size = 1.
To switch datasets, edit 2021_C-3-Framework/C-3-Framework/config.py:
__C.DATASET = 'WE'
SHHA errors out.
I don't know where this .mat file is supposed to come from:
Traceback (most recent call last): File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop samples = collate_fn([dataset[i] for i in batch_indices]) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp> samples = collate_fn([dataset[i] for i in batch_indices]) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/SHHA/SHHA.py", line 27, in __getitem__ img, den = self.read_image_and_gt(fname) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/datasets/SHHA/SHHA.py", line 44, in read_image_and_gt den = sio.loadmat(os.path.join(self.gt_path,os.path.splitext(fname)[0] + '.mat')) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/scipy/io/matlab/mio.py", line 207, in loadmat MR, file_opened = mat_reader_factory(file_name, appendmat, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/scipy/io/matlab/mio.py", line 62, in mat_reader_factory byte_stream, file_opened = _open_file(file_name, appendmat) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/scipy/io/matlab/mio.py", line 37, in _open_file return open(file_like, 'rb'), True FileNotFoundError: [Errno 2] No such file or directory: '../ProcessedData/shanghaitech_part_A/train/den/254.mat'
Found the cause: in SHHA.py, line 47, comment out the first two lines and enable the last one, so the ground truth is read from CSV files instead:
# den = sio.loadmat(os.path.join(self.gt_path, os.path.splitext(fname)[0] + '.mat'))
# den = den['map']
den = pd.read_csv(os.path.join(self.gt_path, os.path.splitext(fname)[0] + '.csv'), sep=',', header=None).values
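A sanity check I would run at this point (my own sketch, not part of SHHA.py): the integral of a density map should be close to the annotated head count of that image. The file 254.csv and the path are taken from the error above and are assumptions about the processed layout.

import pandas as pd

den = pd.read_csv('../ProcessedData/shanghaitech_part_A/train/den/254.csv',
                  sep=',', header=None).values
print('count recovered from the density map:', den.sum())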
Now it errors with mismatched tensor sizes.
How can the sizes drift apart mid-computation? I don't get it.
My suspicion: the ground truth being loaded and the image being predicted are not the same image, so the sizes don't match.
Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 66, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/trainer.py", line 94, in train pred_map = self.net(img, gt_map) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/CC.py", line 40, in forward self.loss_mse= self.build_loss(density_map.squeeze(), gt_map.squeeze()) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2021_C-3-Framework/C-3-Framework/models/CC.py", line 44, in build_loss loss_mse = self.loss_mse_fn(density_map, gt_data) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 435, in forward return F.mse_loss(input, target, reduction=self.reduction) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/functional.py", line 2155, in mse_loss expanded_input, expanded_target = torch.broadcast_tensors(input, target) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/functional.py", line 50, in broadcast_tensors return torch._C._VariableFunctions.broadcast_tensors(tensors) RuntimeError: The size of tensor a (688) must match the size of tensor b (683) at non-singleton dimension 0
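A debugging sketch of my own to test that suspicion: list every image whose density map has a different shape, to see which pairs disagree in size. The folder paths and the .csv extension are assumptions about the processed SHHA layout.

import os
import pandas as pd
from PIL import Image

img_dir = '../ProcessedData/shanghaitech_part_A/train/img'
den_dir = '../ProcessedData/shanghaitech_part_A/train/den'
for name in sorted(os.listdir(img_dir)):
    stem = os.path.splitext(name)[0]
    w, h = Image.open(os.path.join(img_dir, name)).size   # PIL gives (width, height)
    den = pd.read_csv(os.path.join(den_dir, stem + '.csv'),
                      sep=',', header=None).values
    if den.shape != (h, w):
        print(name, 'img (h, w) =', (h, w), 'den =', den.shape)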
UCF50
AlexNet + dataset | Result
WE | OK
SHHB | OK
SHHA | errors, not fixed yet
UCF50 | OK
QNRF | OK
Mall | no processed data
UCSD | no processed data
—— For these eight models, alternate between the theory and the code: (1) once the code is fully understood it can be reused later; (2) only after understanding other people's model code can I improve on top of it.
Read the code line by line, until model theory and code correspond one-to-one.
Under the WE dataset, try those three models.
The current code implements these models for crowd counting: MCNN, CSRNet, Res101_SFCN, AlexNet, VGG, VGG_DECODER, Res50, SANet.
I want to know how the MCNN code is implemented.
train.py ——> all of these models go through trainer.py ('MCNN', 'AlexNet', 'VGG', 'VGG_DECODER', 'Res50', 'Res101', 'CSRNet', 'Res101_SFCN') ——> say it is MCNN ——> trainer.py ——> can the training code for all these models really be identical?
(Universal code) 2019_Video-Crowd-Counting
zzpustc/Video-Crowd-Counting
This code currently lives on my machine at: /media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting
Since current datasets are mostly single-image based, only Mall, UCSD and FDST are video-based.
FDST
Set up the environment.
The author asks for the environment below, but try Python 3.7 first and only fall back to 3.6 if it fails; the required packages are roughly tensorboard, easydict and the like —— no switch was needed, the Python 3.7 env works:
conda create -n cascadecnn python=3.6
pip install -r requirements.txt
Download the datasets.
Video crowd counting uses the Mall and FDST datasets.
Preprocess the datasets with the provided code.
The code is incomplete: the two folders it mentions are not provided.
I created the folders myself and moved the existing files into them.
Mirror the C3F layout: whatever other repos keep under datasets, build the same here.
config.py line 28 notes that the pretrained model should be stored at './best_pre_model/Mall_pre_model.pth'.
Download the FDST dataset (https://github.com/sweetyy83/Lstn_fdst_dataset).
Process FDST once with the MATLAB script PrepareFDST.m.
Add the jsonlab-1.5 toolbox to MATLAB, since the script depends on it:
Link: https://pan.baidu.com/s/1dZBi5j04dMLW3huxY8uhvQ  extraction code: o909
Put jsonlab-1.5 under F:\Programming_software\MATLAB\toolbox, then run in the MATLAB console:
addpath('F:\Programming_software\MATLAB\toolbox\jsonlab-1.5')
savepath
rehash toolboxcache   % refresh the toolbox cache
Then change line 3 of the script to the same path:
addpath('F:\Programming_software\MATLAB\toolbox\jsonlab-1.5')
Create the following folders alongside the data:
\FDST\train\den and \FDST\test\den
In PrepareFDST.m, line 8, change train to test to process the test set the same way.
Put the MATLAB-processed FDST data under datasets/ProcessedData/FDST/train/
Then run:
python train.py
Error:
Traceback (most recent call last): File "train.py", line 64, in <module> cc_trainer = Trainer(loading_data('DA'),cfg_data,pwd) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 25, in __init__ self.net = CrowdCounter(cfg.GPU_ID,self.net_name).cuda() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/CC.py", line 31, in __init__ self.CCN = torch.nn.DataParallel(self.CCN, device_ids=gpus).cuda() File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 131, in __init__ _check_balance(self.device_ids) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 18, in _check_balance dev_props = [torch.cuda.get_device_properties(i) for i in device_ids] File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 18, in <listcomp> dev_props = [torch.cuda.get_device_properties(i) for i in device_ids] File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/cuda/__init__.py", line 301, in get_device_properties raise AssertionError("Invalid device id") AssertionError: Invalid device id
Go to config.py, line 37:
# __C.GPU_ID = [0,1,2,3]  # single gpu: [0], [1] ...; multi gpus: [0,1]
__C.GPU_ID = [0]
Run again; a new error:
Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 69, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 102, in train pred_map = self.net(img, gt_map, img_p) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/CC.py", line 41, in forward density_map = self.CCN(img,img_p) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py", line 124, in forward x_p = self.stn(x_p) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py", line 99, in stn xs = xs.view(-1, 10*11*16) # for Mall, it is 10*11*16, for FDST it's 10*7*16 RuntimeError: shape '[-1, 1760]' is invalid for input of size 8960
My feeling is the data dimensions don't match the network. In /models/SCC_Model/CascadeCNN.py, line 99, switch to the dimensions the author documents for FDST (the failing input has 8960 = 8 × 1120 = 8 × 10·7·16 elements, which matches the FDST comment):
# line 99
# xs = xs.view(-1, 10*11*16)  # for Mall, it is 10*11*16, for FDST it's 10*7*16
xs = xs.view(-1, 10*7*16)
# line 114
# xs = xs.view(-1, 10*11*16)
xs = xs.view(-1, 10*7*16)
Still errors — still a size mismatch:
Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 69, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 102, in train pred_map = self.net(img, gt_map, img_p) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/CC.py", line 41, in forward density_map = self.CCN(img,img_p) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py", line 133, in forward x_p = self.stn(x_p) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py", line 104, in stn theta = self.fc_loc(xs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward input = module(input) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 67, in forward return F.linear(input, self.weight, self.bias) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/functional.py", line 1352, in linear ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t()) RuntimeError: size mismatch, m1: [1 x 1120], m2: [1760 x 32] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:266
In /models/SCC_Model/CascadeCNN.py, line 26, change the localization head to match:
self.fc_loc = nn.Sequential(
    # nn.Linear(10 * 11 * 16, 32),
    nn.Linear(10 * 7 * 16, 32),
Errors again:
Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 69, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 102, in train pred_map = self.net(img, gt_map, img_p) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/CC.py", line 41, in forward density_map = self.CCN(img,img_p) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py", line 138, in forward x_p1 = self.stn_early(x_p1) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py", line 124, in stn_early theta = self.fc_loc_1(xs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward input = module(input) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__ result = self.forward(*input, **kwargs) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 67, in forward return F.linear(input, self.weight, self.bias) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/nn/functional.py", line 1352, in linear ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t()) RuntimeError: size mismatch, m1: [1 x 1120], m2: [1760 x 32] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:266
In /models/SCC_Model/CascadeCNN.py, line 46, change the dimensions the same way, so the input and the network finally agree:
self.fc_loc_1 = nn.Sequential(
    # nn.Linear(10 * 11 * 16, 32),
    nn.Linear(10 * 7 * 16, 32),
Then it runs end to end.
A few small tweaks.
Set max_epoch to 1:
config.py, line 45:
# __C.MAX_EPOCH = 200
__C.MAX_EPOCH = 1
Set batch_size to 4:
# __C_MALL.TRAIN_BATCH_SIZE = 1  # imgs
__C_MALL.TRAIN_BATCH_SIZE = 4
# __C_MALL.VAL_BATCH_SIZE = 1
__C_MALL.VAL_BATCH_SIZE = 4
Download the VGG pretrained weights and let the network use them:
/models/SCC_Model/CascadeCNN.py, line 9:
# model_path = '../PyTorch_Pretrained/vgg16-397923af.pth'
model_path = './home/albert/.torch/models/vgg16-397923af.pth'  # note: the leading '.' makes this relative; '/home/albert/...' is probably what was intended
Run; error:
Traceback (most recent call last):
File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 69, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 93, in train for i, data in enumerate(self.train_loader, 0): File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in __next__ batch = self.collate_fn([self.dataset[i] for i in indices]) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp> batch = self.collate_fn([self.dataset[i] for i in indices]) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/datasets/FDST/FDST.py", line 37, in __getitem__ pname = self.data_files[index] IndexError: list index out of range
—— Print the index: what does it look like, how does it go out of range, what should it be, and how do I fix it? —— Limit an epoch to a handful of samples. —— Where is the loop, and what decides which image comes first?
The loop sits in trainer.py at line 97.
At trainer.py line 99, add the line below so training stops after 20 samples:
if i == 20: break
Then this error:
Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 87, in forward torch.save(self.net.cpu().state_dict(),"./weights/Pre_model_{}.pth".format(epoch+1)) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/serialization.py", line 218, in save return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol)) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/serialization.py", line 141, in _with_file_like f = open(f, mode) FileNotFoundError: [Errno 2] No such file or directory: './weights/Pre_model_1.pth'
The ./weights/ folder doesn't exist, so the model can't be saved. Create a weights folder next to train.py and run again.
It runs cleanly now, no errors.
Train on 200 samples instead:
trainer.py, line 101: if i == 200: break
Runs fine.
Remove the limit and read the whole dataset.
It errors again, at around sample 4500:
[ep 1][it 4460][loss 0.0030][lr 0.1000][0.20s] [cnt: gt: 27.0 pred: 24.70] [ep 1][it 4470][loss 0.0041][lr 0.1000][0.20s] [cnt: gt: 36.0 pred: 36.27] [ep 1][it 4480][loss 0.0049][lr 0.1000][0.21s] [cnt: gt: 42.9 pred: 40.18] [ep 1][it 4490][loss 0.0032][lr 0.1000][0.15s] [ep 1][it 4500][loss 0.0052][lr 0.1000][0.12s] [cnt: gt: 41.0 pred: 37.46] Traceback (most recent call last): File "train.py", line 65, in <module> cc_trainer.forward() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 70, in forward self.train() File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/trainer.py", line 97, in train for i, data in enumerate(self.train_loader, 0): File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in __next__ batch = self.collate_fn([self.dataset[i] for i in indices]) File "/home/albert/anaconda3/envs/py370tc100/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp> batch = self.collate_fn([self.dataset[i] for i in indices]) File "/media/F:/FILES_OF_ALBERT/IT_paid_class/graduation_thesis/model_innov/2019_Video-Crowd-Counting/Video-Crowd-Counting/datasets/FDST/FDST.py", line 39, in __getitem__ pname = self.data_files[index] IndexError: list index out of range
Run it once more and note where it stops this time.
At Video-Crowd-Counting/datasets/FDST/FDST.py, line 39, I printed the index: it is a random number between 0 and 5000.
My suspicion was that the FDST ground truth was built from only a selected part of the dataset, so the sampler can hand out an index larger than the number of samples — list index out of range. —— Adjust the MATLAB side so the ground truth covers the full dataset. —— The subsetting is real, but it is not what causes list index out of range; the actual cause is the index starting from 1.
Original dataset —
train: 1,2,3,6,7,8,11,12,13,16,17,18,21,22,23,26,27,28,31,32,33,36,37,38,41,42,43,46,47,48,51,52,53,56,57,58,61,62
Processed dataset —
train: 6(1-150),7(1-150),8,47,48,51,52,53,56,57,58,61,62
—— He really did drop part of the original data during processing (the missing IDs above). But I read his MATLAB code and could not find any line that chooses which scenes to keep or drop. I'm puzzled how that happened.
The last index before the error was 5250. Could it be that indexing starts at 0, so the last valid index is 5249? —— Subtract 1 from the index. Yes: after that change the error is gone.
In /Video-Crowd-Counting/datasets/FDST/FDST.py, line 46, subtract 1 from the index:
# pname = self.data_files[index]
pname = self.data_files[index-1]  # modified by Albert
# fname = self.data_files[index-1]  # why minus 1, I don't know
fname = self.data_files[index-2]
# My question: why is pname data_files[index-1] while fname is data_files[index-2]?
# (My later guess: the network takes a frame pair — img and img_p in trainer.py —
# so one entry would be the current frame and the other the previous frame.)
The error is indeed gone, but the model now seems useless: the loss score stays sky-high, as if it was never trained.
print(index) 3102 print(index) train time: 1747.25s ==================== Best model: 0 MAE: 100000.0 MSE: 100000.0
Next attempt: still subtract one so the index starts at 0, but restore the earlier offset:
# Originally uncommented; commented out now, otherwise the last image overflows the index
# if num == 0:
#     index += 1
pname = self.data_files[index]  # modified by Albert
# pname = self.data_files[index-1]  # modified by Albert
fname = self.data_files[index-1]  # why minus 1, I don't know
# fname = self.data_files[index-2]
And the result? Still as if nothing was learned; the final score is absurdly high:
[cnt: gt: 39.8 pred: 36.79]
[ep 1][it 1270][loss 0.0037][lr 0.1000][0.72s] [cnt: gt: 30.0 pred: 28.50]
[ep 1][it 1280][loss 0.0036][lr 0.1000][0.50s] [cnt: gt: 26.0 pred: 22.78]
[ep 1][it 1290][loss 0.0045][lr 0.1000][0.77s] [cnt: gt: 37.9 pred: 35.58]
[ep 1][it 1300][loss 0.0037][lr 0.1000][0.85s] [cnt: gt: 18.0 pred: 18.69]
[ep 1][it 1310][loss 0.0038][lr 0.1000][0.79s] [cnt: gt: 22.0 pred: 22.20]
train time: 1780.58s
====================
Best model: 0 MAE: 100000.0 MSE: 100000.0
Training one epoch takes about 30 minutes.
Check that the processed FDST train/test sets really match the originals one-to-one.
Use try/except: when list index out of range is raised, automatically fall back to index minus one:
try:
    pname = self.data_files[index]
    fname = self.data_files[index-1]  # why minus 1, I don't know
    # fname = self.data_files[index-2]  # modified by Albert
except IndexError:
    print("IndexError is raised and the index number is ", index)
    pname = self.data_files[index-1]
    fname = self.data_files[index-2]  # why minus 1, I don't know
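An alternative sketch (my own, under the same frame-pair assumption as above): clamp the index instead of catching IndexError, so the current/previous pair stays consistent at the boundaries.

# inside __getitem__, replacing the try/except above
index = min(index, len(self.data_files) - 1)
pname = self.data_files[index]               # current frame (assumption)
fname = self.data_files[max(index - 1, 0)]   # previous frame (assumption)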
Ran again; the loss is still high. I don't know how to handle it.
[ep 1][it 1240][loss 0.0049][lr 0.1000][0.86s] [cnt: gt: 29.0 pred: 24.67]
[ep 1][it 1250][loss 0.0036][lr 0.1000][0.87s] [cnt: gt: 30.0 pred: 26.48]
[ep 1][it 1260][loss 0.0046][lr 0.1000][0.90s] [cnt: gt: 39.9 pred: 36.57]
[ep 1][it 1270][loss 0.0038][lr 0.1000][0.87s] [cnt: gt: 30.0 pred: 30.45]
[ep 1][it 1280][loss 0.0036][lr 0.1000][0.86s] [cnt: gt: 26.0 pred: 24.42]
[ep 1][it 1290][loss 0.0043][lr 0.1000][0.90s] [cnt: gt: 37.7 pred: 32.65]
[ep 1][it 1300][loss 0.0036][lr 0.1000][0.84s] [cnt: gt: 18.0 pred: 17.51]
[ep 1][it 1310][loss 0.0038][lr 0.1000][0.86s] [cnt: gt: 22.0 pred: 20.35]
train time: 1740.76s
====================
Best model: 0 MAE: 100000.0 MSE: 100000.0
I now strongly suspect gt: 22 is the actual head count and pred the predicted count.
—— Could it simply be too few epochs to converge? Possibly: the official setting is 30 epochs.
Train 30 epochs (about 15 hours) and check the final score.
config.py, line 30:
# __C.MAX_EPOCH = 1
__C.MAX_EPOCH = 30
After training, the MAE and MSE really did come down:
[ep 30][it 1250][loss 0.0036][lr 0.0865][0.91s] [cnt: gt: 29.0 pred: 31.48] [ep 30][it 1260][loss 0.0040][lr 0.0865][0.92s] [cnt: gt: 36.0 pred: 33.92] [ep 30][it 1270][loss 0.0038][lr 0.0865][0.90s] [cnt: gt: 32.0 pred: 27.08] [ep 30][it 1280][loss 0.0024][lr 0.0865][0.94s] [cnt: gt: 24.0 pred: 28.90] [ep 30][it 1290][loss 0.0034][lr 0.0865][0.91s] [cnt: gt: 53.0 pred: 49.13] [ep 30][it 1300][loss 0.0034][lr 0.0865][0.91s] [cnt: gt: 39.0 pred: 38.39] [ep 30][it 1310][loss 0.0037][lr 0.0865][0.92s] [cnt: gt: 23.0 pred: 22.75] train time: 1765.95s ==================== Best model: 10 MAE: 1.7196840407986116 MSE: 2.1560562334881412
Is this model "really" video crowd counting? If I feed it a short video clip, can it count the people in it?
The open problem remains: the printed loss stays high and the first epoch does not converge.
—— Go through the code line by line and look for commented-out code that might need enabling.
—— Comment out the line that loads the VGG pretrained weights and train one epoch to compare scores. This run, with the weights kept, the score did come down after enough epochs; would removing them bring the first-epoch loss down too? (I suspect the VGG-16 ImageNet weights may not suit this crowd-counting scene.)
Comment out /Video-Crowd-Counting/models/SCC_Model/CascadeCNN.py, line 11:
# model_path = '../PyTorch_Pretrained/vgg16-397923af.pth'
# model_path = './home/albert/.torch/models/vgg16-397923af.pth'
And in config.py, line 49, change 30 back to 1:
__C.MAX_EPOCH = 1
The start of training:
[ep 1][it 10][loss 0.0191][lr 0.1000][0.45s] [cnt: gt: 38.0 pred: 149.47] [ep 1][it 20][loss 0.0130][lr 0.1000][0.76s] [cnt: gt: 27.0 pred: 56.28] [ep 1][it 30][loss 0.0102][lr 0.1000][0.82s] [cnt: gt: 35.0 pred: 30.14] [ep 1][it 40][loss 0.0088][lr 0.1000][0.81s] [cnt: gt: 29.0 pred: 65.56] [ep 1][it 50][loss 0.0082][lr 0.1000][0.80s] [cnt: gt: 35.0 pred: 20.24] [ep 1][it 60][loss 0.0079][lr 0.1000][0.87s] [cnt: gt: 22.0 pred: 12.62] [ep 1][it 70][loss 0.0069][lr 0.1000][0.71s] [cnt: gt: 31.0 pred: 20.80] [ep 1][it 80][loss 0.0075][lr 0.1000][0.73s] [cnt: gt: 27.0 pred: 2.85] [ep 1][it 90][loss 0.0068][lr 0.1000][0.72s] [cnt: gt: 30.0 pred: 6.35] [ep 1][it 100][loss 0.0055][lr 0.1000][0.45s] [cnt: gt: 11.0 pred: 10.80] [ep 1][it 110][loss 0.0054][lr 0.1000][0.61s] [cnt: gt: 38.0 pred: 39.72] [ep 1][it 120][loss 0.0069][lr 0.1000][0.59s] [cnt: gt: 46.0 pred: 41.28] [ep 1][it 130][loss 0.0078][lr 0.1000][0.65s] [cnt: gt: 36.0 pred: 23.80] [ep 1][it 140][loss 0.0056][lr 0.1000][0.77s] [cnt: gt: 23.0 pred: 14.88] [ep 1][it 150][loss 0.0064][lr 0.1000][0.46s] [cnt: gt: 32.0 pred: 23.16] [ep 1][it 160][loss 0.0062][lr 0.1000][0.60s] [cnt: gt: 25.0 pred: 15.43] [ep 1][it 170][loss 0.0055][lr 0.1000][0.61s] [cnt: gt: 31.0 pred: 21.56] [ep 1][it 180][loss 0.0073][lr 0.1000][0.57s] [cnt: gt: 38.0 pred: 22.98]
After the full epoch the loss is still high, so skipping the pretrained weights does not speed things up; one epoch is still not enough to converge:
[ep 1][it 1260][loss 0.0048][lr 0.1000][0.89s] [cnt: gt: 39.9 pred: 31.28] [ep 1][it 1270][loss 0.0039][lr 0.1000][0.78s] [cnt: gt: 30.0 pred: 27.76] [ep 1][it 1280][loss 0.0036][lr 0.1000][0.77s] [cnt: gt: 26.0 pred: 24.10] [ep 1][it 1290][loss 0.0043][lr 0.1000][0.86s] [cnt: gt: 37.7 pred: 37.51] [ep 1][it 1300][loss 0.0037][lr 0.1000][0.83s] [cnt: gt: 18.0 pred: 18.61] [ep 1][it 1310][loss 0.0037][lr 0.1000][0.87s] [cnt: gt: 22.0 pred: 19.71] train time: 1806.04s ==================== Best model: 0 MAE: 100000.0 MSE: 100000.0
—— Could the loss the author prints at the end simply be defined wrong?
—— Run test.py on each of the saved checkpoints and see what score each model gets.
—— If FDST won't work: with the Mall dataset (paths, sizes and so on all adjusted), does training also end with an absurdly high score?
—— I suspect the training images and the ground truth are mismatched (image A paired with image B's annotations), so the loss can never come down. —— Read the code line by line and find where the matching happens. —— Judging by the current scores that is not it, but I still don't know how the matching actually works.
In /Video-Crowd-Counting/datasets/FDST, lines 48 and 52: why do index and index-1 denote the picture and the frame? Could the two be completely unrelated data?
—— First read the single-image crowd counting code line by line until it is understood.
—— Then come back and try to fix the video crowd counting code.
—— If this video CC code can't be fixed, look for other video crowd counting code and try to run it without errors.
Mall
Process the raw Mall dataset with MATLAB.
Point the loader at the right dataset folders.
Fix the data dimensions.
Run.
Look up more papers, check whether they release code, list every model that does, and run them one by one.
Run the code.
(Universal code) gjy3035/NWPU-Crowd-Sample-Code
gjy3035/NWPU-Crowd-Sample-Code
Dataset: https://gjy3035.github.io/NWPU-Crowd-Sample-Code/
(Universal code) surajdakua/Crowd-Counting-Using-Pytorch
https://github.com/surajdakua/Crowd-Counting-Using-Pytorch
https://www.analyticsvidhya.com/blog/2019/02/building-crowd-counting-model-python/
(Universal code) miao0913/C3-framework-trees
miao0913/C3-framework-trees
Jupyter notebooks walk through data preprocessing, training and testing — very usable.
It requires an old CUDA version, so a second CUDA install is needed.
Guide to running the code: https://medium.com/@kaamyakpant_67666/building-a-crowd-counting-model-using-deep-learning-e8b8e925674e
(1) Creating the density maps — follow make_dataset.ipynb to generate the ground truth. Generating the dynamic ground truth takes a while.
Python: 2.7, PyTorch: 0.4.0, CUDA: 9.2
(Universal code) MRJTM/crowd_counting
https://github.com/MRJTM/crowd_counting
(Universal code) 2020_gaoguangshuai/survey-for-crowd-counting
2020_cite=125_unpublished_Gao — CNN-based Density Estimation and Crowd Counting: A Survey
On the NWPU validation set they provide several mainstream algorithms: (1) code to generate density maps and crowd-count predictions, and (2) tools to evaluate how good the predictions are.
gaoguangshuai/survey-for-crowd-counting
FairMOT under Baidu Paddle
First get Baidu's ready-made code running: https://aistudio.baidu.com/aistudio/projectdetail/2421822
Do TensorFlow or PyTorch ship official crowd counting examples? — Are there English blog posts with both explanation and code, i.e. personal PyTorch/TensorFlow implementations?
- Very detailed run instructions (Jupyter notebook): https://aistudio.baidu.com/aistudio/projectdetail/2421822
- Even more detailed than the one below: https://aistudio.baidu.com/aistudio/projectdetail/4171185
- PaddlePaddle/PaddleDetection: https://github.com/PaddlePaddle/paddledetection#%E4%BA%
- PP-Human real-time pedestrian analysis, end to end: installation and usage are both written up very well: https://aistudio.baidu.com/aistudio/projectdetail/3842982
- Real-time pedestrian analysis PP-Human — https://toscode.gitee.com/zyt111/PaddleDetection/tree/release/2.4/deploy/pphuman
- People-flow statistics on video with Baidu AI (static + dynamic), code and demo: https://blog.csdn.net/weixin_419
- People-flow counting / human detection with PaddleDetection: https://blog.csdn.net/m0_63642362/article/details/121434604
(A very detailed, well-rounded article!)
A people-flow counting task must
(1) detect each target's class and location, and
(2) associate detections across frames, so that the same person in a video is never recognized and counted more than once. This case picks the FairMOT model from PaddleDetection's multi-object tracking algorithms.
(3) FairMOT builds on the anchor-free CenterNet detector.
(4) Fusing deep and shallow features lets the detection and ReID tasks each obtain the features they need, treats the two tasks fairly, and reaches a higher level of real-time multi-object tracking accuracy.
(5) The case designs different training recipes for camera angle (eye-level vs. overhead) and crowd density:
- For relatively sparse crowds: train on Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT16 and MOT17, detecting and tracking pedestrians' full bodies. As in Figure 2, the model marks each detected pedestrian and shows the per-frame pedestrian count in the top-left corner, giving the people-flow statistic.
- For relatively dense crowds: person-to-person occlusion becomes severe, and whole-body detection misses too many people, so this scenario tracks heads instead: train on the HT-21 dataset, detect and track heads, and count the flow from the detected heads, as in Figure 3.
Model selection
For multi-object tracking, PaddleDetection mainly provides three models: DeepSORT, JDE and FairMOT.
- DeepSORT (Deep Cosine Metric Learning SORT) (1) extends the original SORT (Simple Online and Realtime Tracking) algorithm and (2) adds a CNN that extracts features from the person crops produced by the detector, integrating appearance information (deep appearance descriptors) to assign and update detections onto existing tracks — in effect a ReID task. DeepSORT can take boxes from any detector; it then reads the saved detections plus the video frames and runs tracking. The ReID model here is the PCB+Pyramid ResNet101 model from PaddleClas.
- JDE (Joint Detection and Embedding) (1) learns the detection task (an anchor-based YOLOv3 detector) and the embedding task (a ReID branch) in a single shared network and (2) outputs detections and their matching appearance embeddings at the same time. One model, two outputs, so training becomes a multi-task joint learning problem. The benefit: a balance of accuracy and speed.
- FairMOT [the one used in the end] (1) builds on the anchor-free CenterNet detector, (2) avoids the anchor/feature misalignment of anchor-based frameworks, (3) fuses deep and shallow features so detection and ReID each get the features they need, (4) uses low-dimensional ReID features, and (5) proposes a simple baseline of two homogeneous branches predicting pixel-level objectness scores and ReID features, achieving fairness between the tasks and a higher level of real-time multi-object tracking accuracy. Balancing accuracy and speed, we pick FairMOT for people-flow statistics / human detection.
Run the code C3F cites first:
- rbgirshick/py-faster-rcnn
- zijundeng/pytorch-semantic-segmentation
- leeyeehoo/CSRNet-pytorch
- BIGKnight/SANet_implementation
- gjy3035/enet.pytorch
- gjy3035/GCC-SFCN
- gjy3035/PCC-Net (paper not yet published, so the source is not yet public)
2021_SASNet (runnable)
- Code: TencentYoutuResearch/CrowdCounting-SASNet; dataset ShanghaiTech (obtained): https://drive.google.com/drive/folders/17WobgYjekLTq3QIRW3wPyNByq9NJTmZ9?usp=sharing
- How to run:
# 1. Download SASNet from GitHub, unzip, and rename the folder to SASNet_ROOT (the author's convention)
# 2. Prepare the data (generating the density maps):
#    the raw data is expected at \home\teddy\UCF-QNRF_ECCV18 by default — you must put it there;
#    the processed data lands in F:\home\teddy\UCF-Train-Val-Test
python prepare_dataset.py --data_path ./datas/part_A_final
python prepare_dataset.py --data_path ./datas/part_B_final
# 3. Train. Note the command is python, not python3
python main.py --data_path ./datas/part_A_final --model_path ./models/SHHA.pth
python main.py --data_path ./datas/part_B_final --model_path ./models/SHHB.pth
- At first it reported that CUDA ran out of memory — presumably once the dataset and model were loaded there was no room left to train:
Traceback (most recent call last): File "main.py", line 105, in <module> main(args) File "main.py", line 81, in main pred_map = model(img) File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_SASNet\SASNet_ROOT\model.py", line 173, in forward x1_density = self.density_head1(x1_out) File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\nn\modules\container.py", line 117, in forward input = module(input) File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_SASNet\SASNet_ROOT\model.py", line 239, in forward return torch.cat(outputs, 1) RuntimeError: CUDA out of memory. Tried to allocate 942.00 MiB (GPU 0; 4.00 GiB total capacity; 1.63 GiB already allocated; 0 bytes free; 3.05 GiB reserved in total by PyTorch)
(Fix 1) Make what goes into CUDA smaller — e.g. lower batch_size so fewer images are loaded per training step. I pushed these main.py arguments as low as I could, but CUDA still runs out:
# The author's default batch_size is 4; with my 4 GB GPU the images load but training then OOMs, so I set it to 1
# Even at 1 image it still OOMs; I also tried log_para and block_size:
parser.add_argument('--batch_size', type=int, default=1, help='batch size in training')
# log_para was originally 1000; I changed it
parser.add_argument('--log_para', type=int, default=10000, help='magnify the target density map')
# block_size was originally 32; I changed it to 160
parser.add_argument('--block_size', type=int, default=160, help='patch size for feature level selection')
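One more lever worth trying (my addition, not the author's code): the traceback above shows the OOM inside pred_map = model(img), i.e. during a forward pass. If that pass is evaluation rather than training, running it under no_grad() stops autograd from keeping activations and cuts memory use substantially. 'model' and 'img' are the names from the repo's main.py.

import torch

with torch.no_grad():
    pred_map = model(img)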
(Fix 2) If the GPU can't hold it, run the CPU build of PyTorch. The error asks you to map the model onto the CPU, which I didn't know how to do (there are simple tutorials online):
D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs: D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\numpy\.libs\libopenblas.NOIJJG62EMASZI6NYURL6JBKM4EVBGM7.gfortran-win_amd64.dll D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\numpy\.libs\libopenblas.QVLO2T66WEPI7JZ63PS3HMOHFEY472BC.gfortran-win_amd64.dll stacklevel=1) Traceback (most recent call last): File "main.py", line 112, in <module> main(args) File "main.py", line 69, in main model.load_state_dict(torch.load(args.model_path)) File "D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\torch\serialization.py", line 713, in load return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args) File "D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\torch\serialization.py", line 930, in _legacy_load result = unpickler.load() File "D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\torch\serialization.py", line 876, in persistent_load wrap_storage=restore_location(obj, location), File "D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\torch\serialization.py", line 175, in default_restore_location result = fn(storage, location) File "D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\torch\serialization.py", line 152, in _cuda_deserialize device = validate_cuda_device(location) File "D:\Programing_File\Anaconda3\envs\paddle\lib\site-packages\torch\serialization.py", line 136, in validate_cuda_device raise RuntimeError('Attempting to deserialize object on a CUDA ' RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
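A sketch of the CPU fallback the error message itself suggests; 'model' and 'args.model_path' are the names used in the repo's main.py (line 69 in the traceback above):

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
state = torch.load(args.model_path, map_location=device)  # maps CUDA tensors to CPU when no GPU
model.load_state_dict(state)
model.to(device)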
(Fix 3) Use the university's HPC or the course's server; with 6–8 GB of CUDA memory this shouldn't happen.
I can log in to the school's server, but I don't know how to batch-upload my local files.
I zipped the code and dataset, but the upload never finishes (it shows "uploading" forever).
The school's server does not let me install anything or modify the environment — no package installs at all (still checking whether conda is allowed).
On a rented server, with all the data uploaded, it ran through. Training shows no progress output, just a final score.
Also: change the code to the CPU build and try running locally.
2018_CSRNet
2018_Cite=1004_CVPR_Li — CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
Official implementation: leeyeehoo/CSRNet-pytorch
The official code has bugs; fixes: xr0927/chapter5-learning_CSRNet
Other implementations:
CommissarMa/CSRNet-pytorch
Keras implementation: DiaoXY/CSRnet
Keras in Jupyter: RTalha/CROWD-COUNTING-USING-CSRNET
With optimizations: karanjsingh/Improved-CSRNet
Jupyter: krutikabapat/Crowd_Counting
Jupyter: Neerajj9/CSRNet-keras
Jupyter: dattatrayshinde/oc_sd
Teaches you how to debug: Bazingaliu/learning_CSRNet
Saritus/Crowd-Counter
CS3244-AY2021-SEM-1/csrnet-tensorflow
CS3244-AY2021-SEM-1/csrnet-pytorch
2016_MCNN
2016_Cite=1484_Zhang — Single Image Crowd Counting via Multi-Column Convolutional Neural Network
Official: https://github.com/svishwa/crowdcount-mcnn
Other implementations:
CommissarMa/MCNN-pytorch
https://github.com/CommissarMa/Crowd_counting_from_scratch/blob/master/crowd_model/mcnn_model.py
CS3244-AY2021-SEM-1/mcnn-pytorch
svishwa/crowdcount-mcnn
mindspore-ai/models
2021_P2PNet (data)
(the dataset layout is wrong + it sits in the wrong place)
- Other people's implementations on GitHub take different routes.
- Datasets: (1) NWPU-Crowd (have it; used this one) (2) SHTech Part A / Part B (have) (3) UCF_CC_50 (have) (4) UCF_QNRF (have) ———— https://pan.baidu.com/s/1c2eLEE7leN0jz-fM38zyIQ; GitHub: TencentYoutuResearch/CrowdCounting-P2PNet
- Where it is stuck — the error:
It says num_samples=0, hence the failure. But (1) there are several hundred images in the folder, not zero, and (2) I couldn't quickly work out from the code how num_samples is derived — most likely my dataset sits in the wrong place, so it counted zero images.
My guess for P2PNet:
Traceback (most recent call last): File "train.py", line 222, in <module> main(args) File "train.py", line 123, in main sampler_train = torch.utils.data.RandomSampler(train_set) File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\utils\data\sampler.py", line 104, in __init__ "value, but got num_samples={}".format(self.num_samples)) ValueError: num_samples should be a positive integer value, but got num_samples=0
Found the cause: this file is empty, hence the error: P2PNET_ROOT\new_public_density_data\shanghai_tech_part_a_train.list. The code that reads it is P2PNET_ROOT\crowd_datasets\SHHA\SHHA.py. You need to understand the expected format before you can build shanghai_tech_part_a_train.list yourself and make it run.
I guessed what the two files should look like and set them up; a new error appeared.
The new error:
At first glance it stems from the Python/PyTorch versions; try rebuilding the environment exactly as the author requires (I later did exactly that and it still failed — see below).
Start training — the DataLoader worker processes keep dying while importing torch; the same two errors repeat, interleaved across the workers:

OSError: [WinError 1455] 页面文件太小,无法完成操作。(The paging file is too small to complete the operation.) Error loading "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" (and likewise cudnn_adv_infer64_8.dll) or one of its dependencies.

The main process then fails when printing the averaged stats:

Averaged stats: Traceback (most recent call last):
  File "train.py", line 228, in <module>
    main(args)
  File "train.py", line 167, in main
    args.clip_max_norm)
  File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\engine.py", line 120, in train_one_epoch
    print("Averaged stats:", metric_logger)
  File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\util\misc.py", line 186, in __str__
    "{}: {}".format(name, str(meter))
  File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\util\misc.py", line 85, in __str__
    median=self.median,
  File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\util\misc.py", line 64, in median
    return d.median().item()
RuntimeError: median cannot be called with empty tensor
The author requires python=3.6.5, torch=1.5.0; I was on python=3.7.11, torch=1.7.1.
I created a new virtual env "py3.6.5_torch1.5" with python=3.6.5.
Confirmed the Python version really is 3.6.5 — yes.
Then installed pytorch 1.5.0 — yes, though only the CPU build (installing several CUDA versions on one machine: https://www.cnblogs.com/yuyingblogs/p/16323438.html).
# My CUDA is 11.0; the code uses CUDA, so I need the GPU build of PyTorch
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=11.0 -c pytorch
(PyTorch 1.5.0 was only published for CUDA 10.2 and below, which would explain why conda fell back to the CPU build.)
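A quick check of what actually got installed — the version string, the CUDA version the build was compiled against (None on a CPU build), and whether a GPU is usable:

import torch

print(torch.__version__, torch.version.cuda, torch.cuda.is_available())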
Running with the freshly configured env gives the error below.
It shows an empty tensor being reduced, which means the data path is still wrong:
Traceback (most recent call last): File "train.py", line 226, in <module> main(args) File "train.py", line 165, in main args.clip_max_norm) File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\engine.py", line 120, in train_one_epoch print("Averaged stats:", metric_logger) File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\util\misc.py", line 190, in __str__ "{}: {}".format(name, str(meter)) File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\util\misc.py", line 89, in __str__ median=self.median, File "F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\模型创新\2021_P2PNet\P2PNET_ROOT\util\misc.py", line 68, in median return d.median().item() RuntimeError: median cannot be called with empty tensor
Maybe a scene folder level is needed under the dataset — is the folder layout wrong? I reorganized the files as below, but the same error persists:
train/scene01/img01.jpg
train/scene01/img01.txt
train/scene01/img02.jpg
train/scene01/img02.txt
...
train/scene02/img01.jpg
train/scene02/img01.txt
Maybe the failure comes from the run command I used; try this one instead:
python train.py --data_root $DATA_ROOT \
    --dataset_file SHHA \
    --epochs 3500 \
    --lr_drop 3500 \
    --output_dir ./logs \
    --checkpoints_dir ./weights \
    --tensorboard_dir ./logs \
    --lr 0.0001 \
    --lr_backbone 0.00001 \
    --batch_size 8 \
    --eval_freq 1 \
    --gpu_id 0
Read the code again: where does it pull the data from?
In train.py, line 126, "train_set, val_set = loading_data(args.data_root)"; printing args.data_root shows the data is loaded from "./new_public_density_data".
At this point the images and the ground truth have to be paired — did I build the files wrong?
My strong suspicion for why it won't run: shanghai_tech_part_a_train.list should contain, per line, one image name and one ground-truth file name.
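A hypothetical sketch of that guess: one "image annotation" pair per line. The folder layout follows the scene01/scene02 structure shown above; the extensions and the space separator are assumptions, not confirmed from the repo.

import os

root = './new_public_density_data'
with open(os.path.join(root, 'shanghai_tech_part_a_train.list'), 'w') as f:
    for scene in sorted(os.listdir(os.path.join(root, 'train'))):
        scene_dir = os.path.join(root, 'train', scene)
        for name in sorted(os.listdir(scene_dir)):
            if name.endswith('.jpg'):
                img = 'train/%s/%s' % (scene, name)
                f.write('%s %s\n' % (img, img.replace('.jpg', '.txt')))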
C-3-Framework is worth running; reportedly it runs through all of the datasets.
2022_CLTR (data issue)
-
Code: dk-liang/CLTR; datasets: JHU-CROWD++, NWPU-Crowd
-
[paper][code][project]
The very first step, preparing the dataset, already errors out:

python prepare_jhu.py --data_path /xxx/xxx/jhu_crowd_v2.0

[ WARN:0@0.670] global D:\a\opencv-python\opencv-python\opencv\modules\imgcodecs\src\loadsave.cpp (239) cv::findDecoder imread_('F:\FILES_OF_ALBERT\IT_paid_class\鐐瑰ご-姣曚笟璁烘枃\鍏紑鏁版嵁闆哱jhu_crowd_v2.0/test/images\0002.jpg'): can't open/read file: check file path/integrity
F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\公开数据集\jhu_crowd_v2.0/test/images\0002.jpg
Traceback (most recent call last):
  File "prepare_jhu.py", line 64, in <module>
    if img.shape[1] >= img.shape[0] and img.shape[1] >= 2048:
AttributeError: 'NoneType' object has no attribute 'shape'

The mojibake in the warning ("鐐瑰ご-姣曚笟璁烘枃" is "点头-毕业论文" mis-decoded) points at the likely cause: on Windows, cv2.imread cannot open paths containing non-ASCII characters, so it returns None, and the script then dereferences img.shape.
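A workaround sketch for the non-ASCII path problem (np.fromfile accepts Unicode paths, and cv2.imdecode then decodes the raw bytes; both are standard NumPy/OpenCV APIs):

    import cv2
    import numpy as np

    def imread_unicode(path):
        # Read raw bytes first (handles Chinese characters in the path),
        # then decode; returns None on failure, same contract as cv2.imread.
        data = np.fromfile(path, dtype=np.uint8)
        return cv2.imdecode(data, cv2.IMREAD_COLOR)

    img = imread_unicode(r'F:\FILES_OF_ALBERT\IT_paid_class\点头-毕业论文\公开数据集\jhu_crowd_v2.0\test\images\0002.jpg')
    print(None if img is None else img.shape)

Swapping the cv2.imread call in prepare_jhu.py for a helper like this should get past the NoneType error; the alternative is moving the dataset to an ASCII-only path.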
2020_M-SFANet (data issue)
[M-SFANet] Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting (ICPR) [paper][code]
-
Code: https://github.com/Pongpisit-Thanasutives/Variations-of-SFANet-for-Crowd-Counting; datasets: Shanghaitech datasets (A&B) (obtained), the Beijing-BRT dataset (have it)
-
My guess at the cause: the data folders were not placed where the code requires them, so it cannot find any data, reports a sample count of 0, and errors out.
File "train.py", line 60, in <module> trainer.setup() File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2020_M-SFANet\utils\regression_trainer.py", line 66, in setup for x in ['train', 'val']} File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2020_M-SFANet\utils\regression_trainer.py", line 66, in <dictcomp> for x in ['train', 'val']} File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\utils\data\dataloader.py", line 262, in __init__ sampler = RandomSampler(dataset, generator=generator) # type: ignore File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\utils\data\sampler.py", line 104, in __init__ "value, but got num_samples={}".format(self.num_samples)) ValueError: num_samples should be a positive integer value, but got num_samples=0
2019_BL (environment issue)
[BL] Bayesian Loss for Crowd Count Estimation with Point Supervision (ICCV(oral)) [paper][code]
GitHub: https://github.com/ZhihengCV/Bayesian-Crowd-Counting; dataset: UCF-QNRF
The dataset ended up at (F:\home\teddy); I do not know how to tell the code to use the location where I actually want to keep the data.
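These repos usually declare the data location through argparse rather than hard-coding it, so the clean fix is to override it at launch instead of mirroring the default path (which appears to be something under /home/teddy, hence the data landing at F:\home\teddy). The flag names below are assumptions; check the argparse block in train.py for the real ones:

    python train.py --data-dir F:\home\teddy\UCF-QNRF_ECCV18-processed --save-dir .\ckpts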
I think this failure comes from the environment not matching the requirements; every error surfaces inside the py37_torch171 packages (the multiprocessing package is what breaks, multiprocessing\spawn.py) — or else my machine is simply not up to training this. The torch package itself also errors (OSError: [WinError 1455] 页面文件太小,无法完成操作。 Error loading "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.). WinError 1455 means the Windows paging file is too small: each spawned DataLoader worker loads its own copy of torch's DLLs, the commit charge exceeds the page-file limit, the worker dies, and the parent then sees the broken pipe.
(two interleaved tracebacks, separated here: first the spawned worker dying while importing torch, then the main process losing the pipe to it)

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2019_BL\train.py", line 1, in <module>
    from utils.regression_trainer import RegTrainer
  File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2019_BL\utils\regression_trainer.py", line 6, in <module>
    import torch
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\__init__.py", line 117, in <module>
    raise err
OSError: [WinError 1455] 页面文件太小,无法完成操作。 Error loading "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.

Traceback (most recent call last):
  File "train.py", line 58, in <module>
    trainer.train()
  File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2019_BL\utils\regression_trainer.py", line 88, in train
    self.train_eopch()
  File "F:\FILES OF ALBERT\IT_paid_class\点头-毕业论文\模型创新\2019_BL\utils\regression_trainer.py", line 100, in train_eopch
    for step, (inputs, points, targets, st_sizes) in enumerate(self.dataloaders['train']):
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\utils\data\dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\utils\data\dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __init__
    w.start()
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\Programing_File\Anaconda3\envs\py37_torch171\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
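A common way around this pair of errors on Windows is to drop the worker processes entirely. num_workers is standard torch.utils.data.DataLoader API; a minimal, self-contained sketch:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.zeros(8, 3, 224, 224))

    # num_workers=0 loads data in the main process: no spawn, no extra copies
    # of torch in memory, so neither WinError 1455 nor the broken pipe appears.
    loader = DataLoader(dataset, batch_size=2, num_workers=0)

    for (batch,) in loader:
        print(batch.shape)

Enlarging the Windows paging file (virtual memory settings) is the other standard fix if multiple workers are really needed.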
2022_GauNet (needs an Ubuntu environment)
(it imitates the DAU-ConvNet model)
Ubuntu-only environment; switch to Ubuntu before running it.
[GauNet] Rethinking Spatial Invariance of Convolutional Networks for Object Counting (CVPR) [paper][code]
-
Code: zhiqic/Rethinking-Counting; datasets: QNRF, UCF50, SHHA, SHHB (I assume SHHA/SHHB mean ShanghaiTech part A and part B)
-
The environment-setup process.
Environment requirements:
Python = 3.7 (the README first said 3.5)
TensorFlow = 1.13.1 (this TF version supports Python 3.3–3.7)
Ubuntu 16.04 (not tested on other OS and other versions)
C++11
CMake 2.8 or newer (tested on version 3.5)
CUDA SDK Toolkit (tested on version 8.0 and 9.0)
BLAS (ATLAS or OpenBLAS)
cuBLAS
# Create the virtual environment with Python 3.5
conda create --name py3.5_tf1.13.1 python=3.5
# the version after "python=" must be written as exactly 3.5; writing 3.5.0 fails to install
Download the whl file from https://github.com/skokec/DAU-ConvNet/releases and install it with pip (installing the whl also pulls TensorFlow 1.13.1 in as a dependency, so before this step, …)
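The install step itself, sketched (the whl filename is hypothetical; use whichever file the release page actually offers for the chosen Python version):

    # download from https://github.com/skokec/DAU-ConvNet/releases, then:
    pip install dau_conv-<version>-cp35-cp35m-manylinux1_x86_64.whl

Since the whl drags in tensorflow==1.13.1 anyway, run the install inside the py3.5_tf1.13.1 environment.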
2019_SFCN (environment issue)
-
Code: gjy3035/GCC-SFCN; dataset: UCF-QNRF (have it)
It requires python 2.7 and pytorch 0.5.
But once py2.7 is installed, pytorch 0.5 cannot be installed, and neither can the latest pytorch (official PyTorch binaries for Windows were only ever built for Python 3, and no 0.5 release was ever published — the releases jump from 0.4.1 to 1.0 — which is what the conda conflict report below boils down to).
The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package vc conflicts for:
  torchvision==0.4.0 -> numpy[version='>=1.11'] -> vc[version='14.*|9.*|>=14.1,<15.0a0']
  python=2.7 -> sqlite[version='>=3.30.1,<4.0a0'] -> vc[version='>=14.1,<15.0a0']
  pytorch==1.2.0 -> cffi -> vc[version='14.*|9.*|>=14.1,<15.0a0']
  cudatoolkit=11.0 -> vc[version='>=14.1,<15.0a0']
  python=2.7 -> vc=9
Package zlib conflicts for:
  python=2.7 -> sqlite[version='>=3.30.1,<4.0a0'] -> zlib[version='>=1.2.11,<1.3.0a0']
  torchvision==0.4.0 -> pillow[version='>=4.1.1'] -> zlib[version='>=1.2.11,<1.3.0a0']
Package vs2015_runtime conflicts for:
  cudatoolkit=11.0 -> vc[version='>=14.1,<15.0a0'] -> vs2015_runtime[version='>=14.15.26706|>=14.27.29016|>=14.16.27012']
  cudatoolkit=11.0 -> vs2015_runtime[version='>=14.16.27012,<15.0a0']
Package cudatoolkit conflicts for:
  pytorch==1.2.0 -> cudatoolkit[version='>=10.0,<10.1|>=9.2,<9.3']
  torchvision==0.4.0 -> cudatoolkit[version='>=10.0,<10.1|>=9.2,<9.3']
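If py2.7 plus an old torch is truly required, the only realistic route is Linux, where Python 2.7 wheels were actually published. This is my suggestion rather than anything from the repo; 0.4.1 is the closest published release to the requested 0.5:

    pip install torch==0.4.1 torchvision==0.2.1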