Add multimodality support #2001

Merged 11 commits into master from 558-multi-modal-support on Jul 31, 2024

Conversation

@timothycarambat (Member) commented on Jul 30, 2024

Pull Request Type

  • ✨ feat
  • πŸ› fix
  • ♻️ refactor
  • πŸ’„ style
  • πŸ”¨ chore
  • πŸ“ docs

Relevant Issues

resolves #558

What is in this change?

  • Adds support for multimodal models (cloud and local LLMs).
    Supported models must be all-in-one models; if demand is high enough, we will break vision out from the LLM. A sketch of what a multimodal request looks like follows below.

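As a rough illustration of what "all-in-one" multimodal support means in practice, the sketch below shows an OpenAI-style chat message where image attachments ride along with the user's prompt as content parts. The `Attachment` shape and helper name are illustrative assumptions, not AnythingLLM's exact schema.

```ts
// Sketch only: folding image attachments into a single chat message for an
// all-in-one multimodal model, using OpenAI-style content parts.
// The Attachment type and helper name are assumptions for illustration.
type Attachment = { mime: string; contentString: string }; // contentString = base64 data URL

function toMultiModalMessage(prompt: string, attachments: Attachment[] = []) {
  if (!attachments.length) return { role: "user", content: prompt };
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      ...attachments.map((a) => ({
        type: "image_url",
        image_url: { url: a.contentString }, // e.g. "data:image/png;base64,..."
      })),
    ],
  };
}
```

Local backends may expect a slightly different shape (for example, Ollama's chat API takes a base64 `images` array on the message), so each provider connector does its own conversion.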
Supported

  • OpenAI
  • Anthropic
  • Gemini
  • OpenRouter
  • Kobold
  • TextWebGen
  • LMStudio
  • LocalAI
  • Ollama
  • LiteLLM
  • AWS Bedrock

Unsupported

  • TogetherAI
  • Azure (can be done, but requires a lot more work and package updates)
  • HuggingFace
  • MistralAI
  • Perplexity
  • Cohere
  • Generic OAI
  • GroqAi

Additional Information

  • Using multimodal input with a non-multimodal LLM will simply surface the provider's error message in the UI.

  • The DNDFileUploader was refactored to lift attached files into a higher scope and expose them via a context wrapped around the chat (see the sketch after this list).

  • API support on the backend is not in this PR and will be added later once we know this feature is stable.

  • Local development docker testing

  • Supporting documentation on docs site (PR: 74)

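The DNDFileUploader refactor mentioned above follows a common React pattern; here is a minimal sketch, with hypothetical names rather than the exact AnythingLLM implementation, of lifting pending attachments into a context that wraps the chat:

```tsx
// Sketch of the described pattern: drag-and-dropped files live in a context
// provider wrapped around the chat, so the prompt input, send handler, and
// message list can all read or clear them. All names here are assumptions.
import React, { createContext, useContext, useState } from "react";

type Attachment = { uid: string; file: File; contentString?: string };

const DndUploaderContext = createContext<{
  files: Attachment[];
  setFiles: React.Dispatch<React.SetStateAction<Attachment[]>>;
}>({ files: [], setFiles: () => {} });

export function DnDFileUploaderProvider({ children }: { children: React.ReactNode }) {
  const [files, setFiles] = useState<Attachment[]>([]);
  return (
    <DndUploaderContext.Provider value={{ files, setFiles }}>
      {children}
    </DndUploaderContext.Provider>
  );
}

// Any component inside the provider (prompt input, chat container, etc.) can
// read the pending attachments via this hook.
export const useDndUploader = () => useContext(DndUploaderContext);
```

Keeping the files in context rather than inside the uploader component lets them survive re-renders of the chat UI and be attached to the outgoing prompt at send time.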
Implementation testing

  • OpenAI
  • Anthropic
  • Gemini
  • OpenRouter
  • Kobold
  • TextWebGen
  • LMStudio
  • LocalAI
  • Ollama
  • LiteLLM
  • AWS Bedrock

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@timothycarambat (Member, Author)

RTM

@timothycarambat timothycarambat self-assigned this Jul 31, 2024
@timothycarambat timothycarambat merged commit 38fc181 into master Jul 31, 2024
@timothycarambat timothycarambat deleted the 558-multi-modal-support branch July 31, 2024 17:47
DipFlip pushed a commit to DipFlip/anything-llm that referenced this pull request Aug 4, 2024
* Add multimodality support

* Add Bedrock, KoboldCpp,LocalAI,and TextWebGenUI multi-modal

* temp dev build

* patch bad import

* noscrolls for windows dnd

* noscrolls for windows dnd

* update README

* update README

* add multimodal check
TuanBC pushed a commit to TuanBC/anything-llm that referenced this pull request Aug 26, 2024 (same commit series as above)
Development

Successfully merging this pull request may close these issues.

Add multi-modal support for image generation (Ollama LLava, GPT4V, DALLE, SD)