Setting ignore_eos for llama serverless endpoint #51

vidhishanair · 2024-11-15T22:39:55Z

eureka-ml-insights/eureka_ml_insights/models/models.py

Line 279 in 1713e79

ignore_eos: str = "false"

LlamaServerlessAzureRestEndpointModel, which is used to run 405B models, sets ignore_eos: str = "false" by default. The value is passed to the API as a string rather than a boolean, so the flag is not set correctly. As a result, the model continues past EOS and generates random tokens until it hits the max_token limit.

Fix: Need to set ignore_eos: bool = False. I have tested this fix for Calendar Planning.
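
The difference shows up in the serialized request body. Below is a minimal sketch, assuming the body is built with json.dumps (the request-building code is not quoted in this issue). `build_body` is a hypothetical helper for illustration, not a function from the repository, and the truthiness check at the end is only one possible explanation of how a server might misread the string.

```python
import json


def build_body(ignore_eos):
    """Hypothetical helper: serialize a request body containing only ignore_eos."""
    return json.dumps({"ignore_eos": ignore_eos})


# Current default: a str, so the JSON value is the quoted string "false".
body_str = build_body("false")

# Proposed default: a bool, so the JSON value is the literal false.
body_bool = build_body(False)

print(body_str)   # {"ignore_eos": "false"}
print(body_bool)  # {"ignore_eos": false}

# A non-empty string is truthy in Python. A server that applies a plain
# truthiness check to the decoded value would read "false" as enabled.
# Whether the endpoint does this is an assumption, not confirmed by the issue.
print(bool(json.loads(body_str)["ignore_eos"]))   # True
print(bool(json.loads(body_bool)["ignore_eos"]))  # False
```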

We will need to test other bool flags such as skip_special_tokens and use_beam_search as well. Similar str flags exist in the Mistral model class too.
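
One way to find the remaining candidates is to scan a class for fields whose default is the string "true" or "false". This is a rough sketch, assuming the model classes are dataclasses. `ExampleEndpointConfig` is a hypothetical stand-in, and its field defaults are illustrative rather than copied from models.py.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleEndpointConfig:
    """Hypothetical stand-in for a model class; not the repository's class."""
    ignore_eos: bool = False              # already switched to a bool
    skip_special_tokens: str = "false"    # assumed to still be a str flag
    use_beam_search: str = "false"        # assumed to still be a str flag
    max_tokens: int = 2000                # illustrative value


def find_str_bool_flags(cls):
    """Return names of fields whose default is the string "true" or "false"."""
    return [
        f.name
        for f in fields(cls)
        if isinstance(f.default, str) and f.default.lower() in ("true", "false")
    ]


print(find_str_bool_flags(ExampleEndpointConfig))
# ['skip_special_tokens', 'use_beam_search']
```

Each flagged field would still need to be tested against the endpoint after changing its type, since the scan only identifies candidates.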

nushib linked a pull request Nov 16, 2024 that will close this issue

Serverless fix #52

Open
