Fixing "RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time."
은긔짜응
2022. 6. 14. 15:50
Problem
This is an error I ran into while trying to train a DeepMC model with PyTorch Lightning.
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time.
The message means backward was called a second time through the same graph (for example, twice before the optimizer step). By default, PyTorch frees the saved intermediate results after the first backward pass, so a second pass fails unless retain_graph=True was passed on the first call.
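To see where the message comes from, here is a minimal plain-PyTorch sketch (independent of Lightning and DeepMC) that reproduces the situation:

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x ** 2  # the forward pass saves intermediates needed for backward

y.backward(retain_graph=True)  # keep the saved intermediates alive
y.backward()                   # second pass is fine; without retain_graph
                               # above, this line raises the RuntimeError
print(x.grad)                  # tensor([8.]) -- both passes accumulate 2*x = 4

Note that retain_graph=True keeps the whole graph in memory, so it comes with a memory cost.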
https://github.com/PyTorchLightning/pytorch-lightning/discussions/8549
Solution
As the message suggests, pass retain_graph=True when calling backward.
def training_step(self, batch, batch_idx):
    X = batch[0]
    U = batch[1]
    Target = batch[2]
    y_hat = self([X, U])
    loss = self.loss(y_hat, Target.unsqueeze(2))
    self.log("training_loss", loss, on_step=True, on_epoch=True, sync_dist=True)
    # added: run backward manually, keeping the graph alive
    self.manual_backward(loss, retain_graph=True)
    return loss
Once you call manual_backward, you will probably run into another error next; I cover that one in the next post.
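For reference, manual_backward belongs to Lightning's manual optimization mode, which expects self.automatic_optimization = False and an explicit optimizer step. Below is a minimal sketch of that usual pattern, with a placeholder linear model and MSE loss standing in for the DeepMC code above:

import pytorch_lightning as pl
import torch

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.automatic_optimization = False  # required to use manual_backward
        self.layer = torch.nn.Linear(8, 1)   # placeholder model

    def training_step(self, batch, batch_idx):
        x, y = batch
        opt = self.optimizers()              # optimizer from configure_optimizers
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.manual_backward(loss)           # Lightning's wrapper around loss.backward()
        opt.step()
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)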