Add code for running the Eval Harness in t5x #10
Conversation
Haven't read the entire code, but it would be nice to check that the whole pipeline works. For example, are you able to load one of the checkpoints and run evaluation? Otherwise awesome work!
num_partitions = 4
model_parallel_submesh = (2, 1, 1, 1)

TASK_FEATURE_LENGTHS = {"inputs": 512, "targets": 114}
I'm confused by this at inference: how do you make a sample fit inside these lengths?
Not quite sure what you mean, but fit as in fit in memory?
In that case I didn't play with it too much, but since we're not storing gradients and the batch size is small, everything seems to work out fine even for the XXL model. I just reduced it because we can't partition the small model four ways.
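To illustrate why a 4-way partition can fail for a small model, here is a minimal sketch. The submesh shape and the head count of 6 are illustrative assumptions (not taken from this PR): the model-parallel submesh multiplies out to the partition count, and each sharded dimension of the model must be divisible by it.

```python
import math

# model_parallel_submesh = (x, y, z, cores); its product is the number of
# devices each model replica is sharded over.
model_parallel_submesh = (2, 1, 1, 1)
num_partitions = math.prod(model_parallel_submesh)  # 2

# Hypothetical attention-head count for a small model (assumption for
# illustration only): heads must divide evenly across partitions.
num_heads = 6
assert num_heads % num_partitions == 0  # 6 heads split 2 ways: OK
assert num_heads % 4 != 0               # 6 heads cannot split 4 ways
```

This is why the config above reduces `num_partitions` for the small model while the XXL model tolerates a larger submesh.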
If you mean the length of the features, I should probably find a way to make sure input is never truncated. The tasks I've looked at in the Eval Harness are quite short, so it didn't seem to be an issue, but I should probably add an assert. It will be more of an issue if we look at few-shot instead of zero-shot.
Ah okay, I see, it automatically pads to those sequence lengths, right? As for the truncation problem: we tried tracking the length of each task in a Google sheet (shared internally), and it seems to be okay-ish to truncate (most samples will fit); RACE might be problematic though.
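The assert discussed above could look something like the following sketch. The helper name and the example format (a dict of feature name to token-id list) are assumptions for illustration; shorter sequences are assumed to be padded up to the fixed lengths, so only over-length examples need to be caught.

```python
# Fixed feature lengths from the gin config above.
TASK_FEATURE_LENGTHS = {"inputs": 512, "targets": 114}

def check_no_truncation(examples, feature_lengths=TASK_FEATURE_LENGTHS):
    """Raise if any feature of any example exceeds its fixed length.

    `examples` is an iterable of dicts mapping feature name -> token-id list.
    Anything longer than the fixed length would be silently truncated.
    """
    for i, ex in enumerate(examples):
        for feature, max_len in feature_lengths.items():
            n = len(ex[feature])
            if n > max_len:
                raise ValueError(
                    f"example {i}: '{feature}' has {n} tokens but the fixed "
                    f"length is {max_len}; it would be truncated")

# A 500-token input fits; a 600-token input would raise ValueError.
check_no_truncation([{"inputs": [0] * 500, "targets": [0] * 100}])
```

Running such a check once over each task before evaluation would flag the few-shot case, where concatenated demonstrations can easily exceed 512 input tokens.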
utils.RestoreCheckpointConfig:
  path = %CHECKPOINT_PATH
  mode = 'specific'
  dtype = 'bfloat16'
I'm saving them in float32; I don't know if it has any impact if you load a float32 checkpoint in bfloat16. I have some earlier checkpoints, and if you could run inference on them that'd be awesome!
Good point, but I don't think it should be an issue: since training runs in bfloat16, inference should work in it as well. I'll check and see if it makes any difference though.
Sure, send me a path and I can test it.
Running on checkpoints is just a matter of running python3 ${T5X_DIR}/t5x/eval_harness.py. Not tested on our current checkpoints, but it should just be a matter of changing the checkpoint path.
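For intuition on what loading a float32 checkpoint with dtype = 'bfloat16' does to each weight, here is a pure-Python sketch (not the t5x implementation): bfloat16 keeps the float32 sign and exponent but only the top 7 mantissa bits, so each value is rounded to roughly 2-3 significant decimal digits.

```python
import struct

def to_bfloat16(x):
    """Round a float32 value to the nearest representable bfloat16."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round-to-nearest-even on the 16 low mantissa bits being discarded.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(to_bfloat16(1.0))  # exactly representable: 1.0
print(to_bfloat16(0.1))  # rounds to 0.10009765625
```

Since bfloat16 shares float32's exponent range, the cast only loses precision, never range, which is why bfloat16 training (and hence bfloat16 inference) tends to be well-behaved.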
Based on successful evaluation runs (that seemed coherent), let's merge this to allow others to run evaluation more easily. I haven't had the time to review it in detail though.
Adding support for running the EleutherAI Evaluation Harness directly, addressing issue #4.