Reuse cache for different generation size
leng-yue committed Dec 20, 2023
1 parent 95d90c8 commit 0dafcf3
Showing 1 changed file with 1 addition and 1 deletion.
tools/llama/generate.py

@@ -163,7 +163,7 @@ def decode_n_tokens(
     **sampling_kwargs,
 ):
     previous_tokens = torch.zeros(
-        (model.config.num_codebooks + 1, num_new_tokens),
+        (model.config.num_codebooks + 1, model.config.max_seq_len),
         dtype=torch.int,
         device=cur_token.device,
     )
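The change above allocates `previous_tokens` at the model's fixed maximum sequence length instead of at the per-request `num_new_tokens`, so the buffer's shape no longer varies between calls and the same allocation can be reused for generations of different sizes. A minimal sketch of that pattern is below; the concrete sizes (`NUM_CODEBOOKS`, `MAX_SEQ_LEN`) are assumptions standing in for `model.config.num_codebooks` and `model.config.max_seq_len`, and `record_step` is a hypothetical helper, not code from this repository:

```python
import torch

# Assumed stand-ins for model.config.num_codebooks and model.config.max_seq_len.
NUM_CODEBOOKS = 8
MAX_SEQ_LEN = 2048

# Allocate once at the maximum size; the shape is independent of any
# single request's num_new_tokens, so the buffer is reusable across calls.
previous_tokens = torch.zeros(
    (NUM_CODEBOOKS + 1, MAX_SEQ_LEN),
    dtype=torch.int,
)

def record_step(step: int, token: torch.Tensor) -> None:
    # Hypothetical helper: write one decoded token column into the buffer.
    previous_tokens[:, step] = token

# Two "generations" of different sizes reuse the same allocation;
# each call only reads back the slice it actually filled.
for num_new_tokens in (16, 32):
    previous_tokens.zero_()
    for i in range(num_new_tokens):
        record_step(i, torch.full((NUM_CODEBOOKS + 1,), i, dtype=torch.int))
    window = previous_tokens[:, :num_new_tokens]
    assert window.shape == (NUM_CODEBOOKS + 1, num_new_tokens)
```

Keeping the shape fixed also avoids re-tracing or re-specializing any compiled decode path on a new buffer shape each time the requested generation length changes.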
