(work2) andy@gpu-machine:~/deepFM_CTR_beat/model_train$ python beat_deepFM_train.py
Starting model training!!!!
Traceback (most recent call last):
File "beat_deepFM_train.py", line 155, in <module>
model = model.to(device)
File "/home/andy/.conda/envs/fun/lib/python3.8/site-packages/torch/nn/modules/module.py", line 907, in to
return self._apply(convert)
File "/home/andy/.conda/envs/fun/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
module._apply(fn)
File "/home/andy/.conda/envs/fun/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
module._apply(fn)
File "/home/andy/.conda/envs/fun/lib/python3.8/site-packages/torch/nn/modules/module.py", line 601, in _apply
param_applied = fn(param)
File "/home/andy/.conda/envs/fun/lib/python3.8/site-packages/torch/nn/modules/module.py", line 905, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 5.96 GiB (GPU 0; 9.78 GiB total capacity; 1.07 GiB already allocated; 5.81 GiB free; 1.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
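As the error message itself suggests, one mitigation is to cap the caching allocator's split size through the PYTORCH_CUDA_ALLOC_CONF environment variable to reduce fragmentation. A minimal sketch; the 128 MiB value is an arbitrary starting point, not a tuned recommendation:

```shell
# Set before launching training; limits the size of blocks the caching
# allocator will split, which can reduce fragmentation-induced OOMs.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python beat_deepFM_train.py
```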
Two quick remedies. First, clear PyTorch's cached allocator blocks before moving the model to the GPU:

if hasattr(torch.cuda, 'empty_cache'):
    torch.cuda.empty_cache()

Second, check which processes are holding GPU memory (e.g. with nvidia-smi), then run

kill -9 <your PID here>
to kill the corresponding process. The batch_size parameter (advantages of a larger batch size):
(1) Higher memory utilization: large matrix multiplications are parallelized more efficiently.
(2) The computed gradient direction is more accurate, so training oscillates less.
(3) Fewer iterations are needed per epoch, so the same amount of data is processed faster.
Drawbacks: memory overflow becomes more likely; reaching the same accuracy requires more and more epochs; training tends to fall into local optima; generalization suffers.
Setting batch_size: typically between 10 and 100, and usually a power of 2 (2^n).
Reason: GPU and CPU memory are addressed in binary, so power-of-two batch sizes tend to align with the memory layout and speed up computation.
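Point (3) in the list above is easy to check with a few lines of pure Python; the 100,000-sample dataset size here is hypothetical, chosen just for illustration:

```python
import math

def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

# A larger batch_size means fewer (but heavier) iterations per epoch.
for bs in (16, 64, 256):
    print(bs, iterations_per_epoch(100_000, bs))
```

Each step also costs more memory and compute, which is exactly the OOM trade-off described above: the per-epoch iteration count shrinks, but the size of every forward/backward pass grows with batch_size.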