Replies: 2 comments
-
To limit the maximum token size of a session and prevent it from becoming too large during extended conversations, you can use the `TokenBufferMemory` class. Here's a relevant code snippet:

```python
class TokenBufferMemory:
    def __init__(self, conversation: Conversation, model_instance: ModelInstance) -> None:
        self.conversation = conversation
        self.model_instance = model_instance

    def get_history_prompt_messages(self, max_token_limit: int = 2000,
                                    message_limit: Optional[int] = None) -> list[PromptMessage]:
        # fetch limited messages, and return reversed
        query = db.session.query(Message).filter(
            Message.conversation_id == self.conversation.id,
            Message.answer != ''
        ).order_by(Message.created_at.desc())

        if message_limit and message_limit > 0:
            messages = query.limit(message_limit).all()
        else:
            messages = query.all()

        messages = list(reversed(messages))

        # interleave user and assistant messages so the history keeps its turn order
        prompt_messages: list[PromptMessage] = []
        for message in messages:
            prompt_messages.append(UserPromptMessage(content=message.query))
            prompt_messages.append(AssistantPromptMessage(content=message.answer))

        # prune the oldest messages while the history exceeds the max token limit
        curr_message_tokens = self.model_instance.get_llm_num_tokens(prompt_messages)
        while curr_message_tokens > max_token_limit and prompt_messages:
            prompt_messages.pop(0)
            curr_message_tokens = self.model_instance.get_llm_num_tokens(prompt_messages)

        return prompt_messages
```
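For illustration, here is a minimal usage sketch (not from the Dify codebase); the `conversation` and `model_instance` variables stand in for objects you would already have in an app's request context:

```python
# Hypothetical usage: `conversation` is a Conversation row and
# `model_instance` a ModelInstance for the app's configured model.
memory = TokenBufferMemory(conversation=conversation, model_instance=model_instance)

# Fetch at most the 10 most recent turns, then prune the oldest ones
# until the whole history fits within 1000 tokens.
history = memory.get_history_prompt_messages(max_token_limit=1000, message_limit=10)
```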
Additionally, the `AppRunner` class recalculates `max_tokens` so that the prompt plus the completion always fits within the model's context window:

```python
class AppRunner:
    def recalc_llm_max_tokens(self, model_config: ModelConfigWithCredentialsEntity,
                              prompt_messages: list[PromptMessage]):
        model_instance = ModelInstance(
            provider_model_bundle=model_config.provider_model_bundle,
            model=model_config.model
        )

        model_context_tokens = model_config.model_schema.model_properties.get(ModelPropertyKey.CONTEXT_SIZE)
        max_tokens = model_config.parameters.get('max_tokens', 0)
        prompt_tokens = model_instance.get_llm_num_tokens(prompt_messages)

        # if the prompt plus the requested completion would overflow the
        # context window, shrink max_tokens, keeping a floor of 16 tokens
        if prompt_tokens + max_tokens > model_context_tokens:
            max_tokens = max(model_context_tokens - prompt_tokens, 16)
            model_config.parameters['max_tokens'] = max_tokens
```

These two mechanisms together ensure that the total number of tokens in a session does not exceed the model's context size.
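For example, with a context size of 4096 tokens, a prompt occupying 4000 tokens, and `max_tokens` requested as 512, the sum 4512 exceeds 4096, so `max_tokens` is reduced to max(4096 - 4000, 16) = 96; the floor of 16 keeps a small completion budget even when the prompt nearly fills the context window.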
-
Why does OpenLLMLargeLanguageModel hard-code the `_get_num_tokens_by_gpt2` method for counting tokens? I deployed a local Qwen model and connected it to a self-hosted Dify. For Chinese conversations, `_get_num_tokens_by_gpt2` overestimates the token count by a large margin, which makes the allowed input length far too small.
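To make the discrepancy concrete, here is a minimal sketch (not Dify code) comparing the GPT-2 tokenizer against a Qwen tokenizer on a Chinese sentence; it assumes the `transformers` package is installed, and the `Qwen/Qwen-7B` model name is just an example:

```python
from transformers import AutoTokenizer, GPT2TokenizerFast

text = "你好,请帮我总结一下这篇文章的主要内容。"

# GPT-2's BPE vocabulary has little Chinese coverage, so most Chinese
# characters split into several byte-level tokens.
gpt2_tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
print("gpt2 tokens:", len(gpt2_tokenizer.encode(text)))

# Qwen's tokenizer is trained on Chinese text and typically yields far
# fewer tokens for the same sentence. trust_remote_code is required
# because Qwen ships a custom tokenizer implementation.
qwen_tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
print("qwen tokens:", len(qwen_tokenizer.encode(text)))
```

Since Dify uses the GPT-2 count only as an estimate, the inflated number for Chinese text shrinks the input budget well below what the model could actually accept.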
-
How do you limit the maximum token size of a session so that the token count doesn't grow huge if you keep talking in the same session?