Evaluate MedQA_USMLE on a saved model #13

Open

manusikka opened this issue Mar 15, 2023 · 1 comment

@manusikka commented Mar 15, 2023

Hello,

We followed your steps with DeepSpeed and were able to create a fine-tuned model, which the run saved as a checkpoint. We then loaded the saved model like this:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("/content/drive/MyDrive/Colab Notebooks/SavedModel")
model = GPT2LMHeadModel.from_pretrained("/content/drive/MyDrive/Colab Notebooks/SavedModel")

Now we wanted to run a sample question through this model for inference, following the multiple-choice inference guide:
https://huggingface.co/docs/transformers/tasks/multiple_choice#inference

Here is our code:

import torch

prompt = ("A 20-year-old woman presents with menorrhagia for the past several years. "
"She says that her menses “have always been heavy”, and she has experienced easy bruising for as long as she can remember. "
"Family history is significant for her mother, who had similar problems with bruising easily. "
"The patient's vital signs include: heart rate 98/min, respiratory rate 14/min, temperature 36.1°C (96.9°F),"
" and blood pressure 110/87 mm Hg. Physical examination is unremarkable. "
"Laboratory tests show the following: platelet count 200,000/mm3, PT 12 seconds,"
" and PTT 43 seconds. Which of the following is the most likely cause of this patient’s symptoms?")
candidate1 = "Factor V Leiden"
candidate2 = "Hemophilia A"
candidate3 = "Lupus anticoagulant"
candidate4 = "Protein C deficiency"
candidate5 = "Von Willebrand disease"

inputs = tokenizer([[prompt, candidate1], [prompt, candidate2],[prompt, candidate3],[prompt, candidate4],[prompt, candidate5]], return_tensors="pt", padding=True)
labels = torch.tensor(0).unsqueeze(0)

outputs = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}, labels=labels)
logits = outputs.logits

However, we get this error:

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      2
      3 # model = AutoModelForMultipleChoice.from_pretrained("my_awesome_swag_model")
----> 4 outputs = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}, labels=labels)
      5 logits = outputs.logits

4 frames
/usr/local/lib/python3.9/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2844     if size_average is not None or reduce is not None:
   2845         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   2847
   2848

ValueError: Expected input batch_size (840) to match target batch_size (0).

Do you have a recommendation on how to run a sample question inference on this model?

@githubusera commented Mar 16, 2023

We've got it figured out: instead of GPT2LMHeadModel, we had to use GPT2ForMultipleChoice. GPT2LMHeadModel treats labels as per-token language-modeling targets shaped like input_ids, so a single class index can never line up with the flattened token logits; the multiple-choice head produces one logit per candidate instead.

import sys
sys.path.insert(0, '/content/BioMedLM/finetune')

import torch
from transformers import GPT2Tokenizer
from utils.custom_modeling_gpt2 import GPT2ForMultipleChoice

tokenizer = GPT2Tokenizer.from_pretrained("/content/drive/MyDrive/Colab Notebooks/SavedModel100")
model = GPT2ForMultipleChoice.from_pretrained("/content/drive/MyDrive/Colab Notebooks/SavedModel100")
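
From there, a minimal inference sketch (our assumptions: the custom GPT2ForMultipleChoice follows the standard transformers multiple-choice interface and returns logits of shape (batch, num_choices), and no pad token was set during fine-tuning, so we reuse GPT-2's EOS token for padding):

# Score each (prompt, candidate) pair and pick the highest-scoring choice.
tokenizer.pad_token = tokenizer.eos_token  # assumption: GPT-2 tokenizer has no pad token by default
candidates = [candidate1, candidate2, candidate3, candidate4, candidate5]
inputs = tokenizer([[prompt, c] for c in candidates], return_tensors="pt", padding=True)
# add the leading batch dimension: (1, num_choices, seq_len)
inputs = {k: v.unsqueeze(0) for k, v in inputs.items()}
with torch.no_grad():
    logits = model(**inputs).logits  # expected shape: (1, num_choices)
print(candidates[logits.argmax(dim=-1).item()])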

We are good now.
