Problem
While implementing the DeepMC model, I ran into a situation where I needed to declare a list of layers.
The input is decomposed into multiple scales with WPD (seven scales in this experiment), and each scale has to be fed through a CNN stack with an identical structure.
So I declared the CNN stacks like this:
self.CNNstacks = [CNNstack(self.num_encoder_feature)
                  for _ in range(self.num_of_CNN_stacks)]
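For context, this declaration lives in the model's __init__ and builds one identical stack per WPD scale. The following is only a minimal sketch of that setup, with a hypothetical Conv1d-based CNNstack and made-up feature/stack counts standing in for the real DeepMC code:

import torch.nn as nn

class CNNstack(nn.Module):
    # Hypothetical per-scale encoder; placeholder for the real CNNstack.
    def __init__(self, num_features):
        super().__init__()
        self.sequence = nn.Sequential(
            nn.Conv1d(num_features, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, WPD):
        return self.sequence(WPD)

class DeepMC(nn.Module):
    def __init__(self, num_encoder_feature=8, num_of_CNN_stacks=7):
        super().__init__()
        self.num_encoder_feature = num_encoder_feature
        self.num_of_CNN_stacks = num_of_CNN_stacks
        # One stack per WPD scale, kept in a plain Python list.
        # (forward, which concatenates X and U per scale and applies each stack, is omitted here)
        self.CNNstacks = [CNNstack(self.num_encoder_feature)
                          for _ in range(self.num_of_CNN_stacks)]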
Then, when I ran the model's fit, it threw RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same.
The full traceback is as follows:
File "trainer.py", line 81, in <module>
trainer.fit(deepmc, datamodule=dl)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 460, in fit
self._run(model)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 758, in _run
self.dispatch()
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 799, in dispatch
self.accelerator.start_training(self)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
self.training_type_plugin.start_training(trainer)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
self._results = trainer.run_stage()
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in run_stage
return self.run_train()
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 844, in run_train
self.run_sanity_check(self.lightning_module)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in run_sanity_check
self.run_evaluation()
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 967, in run_evaluation
output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 174, in evaluation_step
output = self.trainer.accelerator.validation_step(args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 226, in validation_step
return self.training_type_plugin.validation_step(*args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in validation_step
return self.lightning_module.validation_step(*args, **kwargs)
File "/home/ubuntu/jini1114/DeepMC/net/deepmc.py", line 150, in validation_step
y_hat = self([X,U])
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/jini1114/DeepMC/net/deepmc.py", line 101, in forward
CNNs = [
File "/home/ubuntu/jini1114/DeepMC/net/deepmc.py", line 102, in <listcomp>
self.CNNstacks[i](torch.cat((X[:,self.X_levels[i],:,:],U[:,self.U_levels[i],:,:]),1)) for i in range(self.num_of_CNN_stacks)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/jini1114/DeepMC/net/encoder.py", line 68, in forward
return self.sequence(WPD)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 263, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 259, in _conv_forward
return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
As the traceback shows, the error occurred during the forward pass of a CNN stack.
The input type was cuda, but the weight type was not, which is what caused the problem.
So I tried everything I could think of: building the CNN stacks with .cuda() attached ahead of time and deepcopying them, calling .cuda() inside the list comprehension, and so on.
On top of that, layers declared this way were not recognized in the model summary.
CNNstack and Scaled_guided_attention were clearly declared using a layer list, yet their parameters were not being picked up.
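The root cause is generic: sub-modules stored in a plain Python list are never registered on the parent nn.Module, so they are invisible to parameters(), to the summary, and to .cuda()/.to(device). A tiny self-contained repro (unrelated to DeepMC, just nn.Linear layers) shows both symptoms:

import torch
import torch.nn as nn

class PlainListModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Stored in a plain list: PyTorch never sees these as children.
        self.layers = [nn.Linear(4, 4) for _ in range(3)]

model = PlainListModel()
print(sum(p.numel() for p in model.parameters()))  # 0 -> nothing registered
if torch.cuda.is_available():
    model.cuda()                              # moves nothing, since no children are registered
    print(model.layers[0].weight.is_cuda)     # False -> weights stay on the CPU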
Solution
The answer turned out to be surprisingly simple.
Declaring the layers as a list is not a problem in itself, but one extra step is needed:
the layer list has to be wrapped in torch.nn.ModuleList.
self.CNNstacks = [CNNstack(self.num_encoder_feature)
                  for _ in range(self.num_of_CNN_stacks)]
self.CNNstacks = torch.nn.ModuleList(self.CNNstacks)
With this change, the error goes away and the parameters are recognized correctly.
CNNstack and Scaled_guided_attention, whose parameters were missing before, now show up in the summary.
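To double-check the fix, the same toy nn.Linear example as above can be rerun with the list wrapped in nn.ModuleList; this is only an illustrative sketch, not the actual DeepMC model, but both symptoms disappear:

import torch
import torch.nn as nn

class ModuleListModel(nn.Module):
    def __init__(self):
        super().__init__()
        layers = [nn.Linear(4, 4) for _ in range(3)]
        self.layers = nn.ModuleList(layers)  # registers every sub-layer as a child

model = ModuleListModel()
print(sum(p.numel() for p in model.parameters()))  # 60 = 3 * (4*4 + 4)
if torch.cuda.is_available():
    model = model.cuda()                   # now moves the sub-layer weights too
    x = torch.randn(2, 4, device="cuda")
    print(model.layers[0](x).is_cuda)      # True -> no more device mismatch

nn.ModuleDict works the same way when the sub-layers are keyed by name instead of index.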