For Ollama add a configuration parameter for context size #1253

Open
rubixhacker opened this issue Feb 16, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@rubixhacker

Ollama defaults to a context size of 2048 tokens, and Goose often exceeds that window. When this happens, Ollama truncates the input, which leads to suboptimal results from the LLM.

Output from Ollama when this occurs:
time=2025-02-16T13:27:35.103Z level=WARN source=runner.go:129 msg="truncating input prompt" limit=2048 prompt=3520 keep=4 new=2048

When sending a request to Ollama, the context size can be specified by adding the following to the payload:
"options": { "num_ctx": 4096 }

@yingjiehe-xyz added the enhancement label Feb 17, 2025
addhyh commented Feb 20, 2025

Agreed. I'd also like to see the LLM's context usage displayed, like Gemini Studio does.

tiensi (Contributor) commented Feb 22, 2025

Did a bit of a dive into this. Unfortunately, Goose is hitting Ollama through the OpenAI-compatible v1/chat/completions API, which does not expose any direct way of modifying the model's context size.

The recommendation (seen in the above link) is to create your own Ollama model with a larger context size and point to it with the same API. This seems convoluted and not the right solution for this problem. A short-term solution, assuming you're running an Ollama service locally, would be to update your existing instance with the desired context window. That workaround looks roughly like the sketch below.
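A sketch of the custom-model workaround, for anyone who needs it in the meantime. The model names and context size are illustrative; PARAMETER num_ctx comes from Ollama's Modelfile reference.

```sh
# Bake a larger context window into a derived model via a Modelfile
# (names and the context size are illustrative).
cat > Modelfile <<'EOF'
FROM llama3.2
PARAMETER num_ctx 4096
EOF

ollama create llama3.2-4k -f Modelfile
# Then point Goose at "llama3.2-4k" through the same OpenAI-compatible endpoint.
```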

There was work planned to expose num_ctx through the OpenAI-compatible endpoint, but it was discarded. Instead, what seems to be in progress is setting the context length via an environment variable.
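If that work lands, usage would presumably look something like the following. This is purely hypothetical: the variable name here is my assumption and may not match what finally ships.

```sh
# Hypothetical usage if the environment-variable approach ships:
# set a server-wide default context length before starting Ollama.
# The variable name is an assumption, not a confirmed interface.
OLLAMA_CONTEXT_LENGTH=8192 ollama serve
```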

If the above work goes through, we could perhaps add a helper in Goose to set the environment variable, but I'll defer to the main Goose team to decide whether that's appropriate.

@CrazyBoyM

Ollama is just a toy for beginner developers; a 2k context size can't do anything, and people don't want to change their OpenAI SDK to Ollama's.
