To fix the PyTorch model training error RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR, try any of the methods below.
Error log
self.dropout, self.training, self.bidirectional, self.batch_first)
else:
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
or
RuntimeError: CUDA error: unspecified launch failure
How to solve RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR ?
The error RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR is hard to troubleshoot, but it is usually caused by running out of memory. Normally you would get an out-of-memory error, but depending on where the failure occurs, PyTorch may be unable to intercept it and so cannot report a meaningful error message.
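If memory is the cause, a smaller batch size is often the quickest thing to test. The sketch below is one way to automate that check, not an official PyTorch recipe: run_with_smaller_batches and fake_train_step are hypothetical names, and fake_train_step only simulates a failure so the example runs without a GPU. In a real run, a CUDA failure can leave the process in a bad state, so you may need to restart the script with the smaller batch size instead of retrying in place.

```python
def run_with_smaller_batches(train_step, batch_size, min_batch_size=1):
    """Call train_step(batch_size), halving the batch size after each RuntimeError.

    train_step is a hypothetical callable standing in for one training iteration.
    Returns the batch size that worked together with train_step's result.
    """
    while True:
        try:
            return batch_size, train_step(batch_size)
        except RuntimeError as err:
            if batch_size <= min_batch_size:
                raise  # nothing left to shrink; surface the original error
            print(f"batch_size={batch_size} failed ({err}); retrying with {batch_size // 2}")
            batch_size //= 2


# Stand-in for a real training step: pretends sizes above 8 exhaust memory.
def fake_train_step(batch_size):
    if batch_size > 8:
        raise RuntimeError("cuDNN error: CUDNN_STATUS_INTERNAL_ERROR")
    return 0.5  # dummy loss value


used_batch_size, loss = run_with_smaller_batches(fake_train_step, batch_size=64)
print(used_batch_size, loss)
```

If the error disappears at a smaller batch size, that supports the out-of-memory explanation.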
Try reducing the number of workers, and also try downgrading cudatoolkit to version 10.1. To do that, reinstall PyTorch with cudatoolkit 10.1:
conda install pytorch torchvision cudatoolkit=10.1
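As for reducing the number of workers, this most likely refers to the num_workers argument of torch.utils.data.DataLoader; that reading is an assumption, since the post does not say which workers it means. The sketch below only builds the keyword arguments so it runs without PyTorch installed. worker_counts_to_try is a hypothetical helper and the batch size of 16 is an arbitrary example value.

```python
def worker_counts_to_try(num_workers):
    """Return worker counts to try in order, halving each time down to 0."""
    counts = [num_workers]
    while counts[-1] > 0:
        counts.append(counts[-1] // 2)
    return counts


for workers in worker_counts_to_try(4):
    # num_workers=0 makes DataLoader load data in the main process.
    loader_kwargs = {"batch_size": 16, "num_workers": workers}
    print(loader_kwargs)
    # Assumed usage in a real script:
    # loader = torch.utils.data.DataLoader(dataset, **loader_kwargs)
```

Starting from the current setting and working down to 0 shows whether the worker processes are involved at all.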
Hope it helps.