CUDA error during inference #7

@wsong1106

Thanks for this great work. I was following the setup instructions and hit this error when running inference on my data. It might be due to a mismatch between the dimensions of the input tensor and the dimensions of the weight matrix. Do you know how we can resolve this? Thanks.

Begin report impression labeling. The progress bar counts the # of batches completed:
The batch size is 18
  0%|                                                                                                                              | 0/1 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "label.py", line 147, in <module>
    y_pred = label(checkpoint_path, csv_path)
  File "label.py", line 96, in label
    out = model(batch, attn_mask)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/CheXbert/src/models/bert_labeler.py", line 44, in forward
    final_hidden = self.bert(source_padded, attention_mask=attention_mask)[0]
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 790, in forward
    encoder_attention_mask=encoder_extended_attention_mask,
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 407, in forward
    hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 368, in forward
    self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 314, in forward
    hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/transformers/modeling_bert.py", line 216, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/ENTER/envs/chexbert/lib/python3.7/site-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
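
One way to test the dimension-mismatch hypothesis is to compare the shapes going into the failing `F.linear` call. The traceback shows that call computing `input.matmul(weight.t())`, so the last dimension of the input must equal the second dimension of the weight. The sketch below is a minimal, pure-Python check on shape tuples. The helper name `linear_shapes_compatible` is hypothetical. The sequence length of 512 and hidden size of 768 are assumed for illustration and are not taken from the report; only the batch size of 18 is.

```python
def linear_shapes_compatible(input_shape, weight_shape):
    """Return True if the shapes are compatible with input.matmul(weight.t()).

    A linear layer's weight has shape (out_features, in_features), so the
    last dimension of the input must equal weight_shape[1].
    """
    if len(weight_shape) != 2 or len(input_shape) < 1:
        return False
    return input_shape[-1] == weight_shape[1]


# Illustrative shapes only: batch size 18 comes from the log above;
# sequence length 512 and hidden size 768 are assumptions.
hidden_states_shape = (18, 512, 768)
query_weight_shape = (768, 768)

# Matching last dimension: compatible.
print(linear_shapes_compatible(hidden_states_shape, query_weight_shape))

# Mismatched last dimension (1024 vs. 768): not compatible.
print(linear_shapes_compatible((18, 512, 1024), query_weight_shape))
```

With real tensors, the same check could be run just before the model call by passing `tuple(tensor.shape)` for the hidden states and for the layer's weight. If the shapes agree, a dimension mismatch is unlikely to be the cause and the problem probably lies elsewhere; this sketch does not diagnose that.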
