[backend] llama-cpp, C++ gRPC backend #1154

Closed
mudler opened this issue Oct 9, 2023 · 5 comments · Fixed by #1170

@mudler
Owner

mudler commented Oct 9, 2023

Is your feature request related to a problem? Please describe.
Keeping the golang binding in sync can lag behind for reasons related to the golang toolchain. It would be easier to also have a C++ implementation that requires less maintenance and so can be bumped almost automatically.

Describe the solution you'd like
At this point we can also have a pure C++ llama-cpp gRPC server binding with just the few little things we need to add on top

Describe alternatives you've considered

Additional context
This would allow us to open up bugs upstream more easily as we have a less heavy implementation on top. We could also upstream the gRPC server

@mudler mudler added the enhancement New feature or request label Oct 9, 2023
@mudler mudler self-assigned this Oct 9, 2023
@mudler mudler added the roadmap label Oct 9, 2023
@mudler mudler removed their assignment Oct 9, 2023
@Aisuko
Collaborator

Aisuko commented Oct 10, 2023

Yes, this is a good solution. Sometimes it is so hard to follow the logs from the golang binding packages.

@mudler mudler self-assigned this Oct 10, 2023
@mudler
Owner Author

mudler commented Oct 11, 2023

JFYI I'm currently playing with this

@Aisuko
Collaborator

Aisuko commented Oct 12, 2023

Sounds really good. I believe the C++ backend can serve as an example for other languages, such as Rust. I am planning how to implement the Rust backend. Can I say the basic requirements of our backend are:

  • Same proto file, to keep the API behavior the same
  • async should be the default (streaming responses would be better)
  • Only server-side gRPC services

Is that right?

@mudler
Owner Author

mudler commented Oct 13, 2023

> Sounds really good. I believe the C++ backend can serve as an example for other languages, such as Rust. I am planning how to implement the Rust backend. Can I say the basic requirements of our backend are:
>
> * Same `proto` file, to keep the API behavior the same
> * `async` should be the default (streaming responses would be better)
> * Only server-side gRPC services
>
> Is that right?

Looks reasonable to me! I think the golang gRPC services are good examples as well; C++ is probably going to be a bit more verbose.
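
For readers following along, the requirements discussed above (a shared API shape, a streaming call alongside the unary one, server side only) can be sketched roughly as follows. This is a minimal illustration in plain standard C++, not the code from this project: `PredictRequest`, `Reply`, `Backend`, and `EchoBackend` are hypothetical names, the structs stand in for messages that would normally be generated from the shared proto file, and streaming is mimicked with a synchronous per-chunk callback rather than a real async gRPC stream.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for messages generated from the shared proto file.
struct PredictRequest {
    std::string prompt;
};

struct Reply {
    std::string message;
};

// Hypothetical server-side interface: one unary call and one streaming call.
// Every backend language would implement the same shape.
class Backend {
public:
    virtual ~Backend() = default;

    // Unary variant: returns the whole result at once.
    virtual Reply Predict(const PredictRequest& req) = 0;

    // Streaming variant: calls on_chunk once per generated piece.
    virtual void PredictStream(
        const PredictRequest& req,
        const std::function<void(const Reply&)>& on_chunk) = 0;
};

// Toy implementation that "generates" by echoing the prompt word by word.
class EchoBackend : public Backend {
public:
    Reply Predict(const PredictRequest& req) override {
        // Build the unary reply by collecting the streamed chunks.
        std::string out;
        PredictStream(req, [&out](const Reply& r) {
            if (!out.empty()) out += ' ';
            out += r.message;
        });
        return Reply{out};
    }

    void PredictStream(
        const PredictRequest& req,
        const std::function<void(const Reply&)>& on_chunk) override {
        // The callback must be callable; an empty std::function is a caller bug.
        assert(static_cast<bool>(on_chunk));
        std::istringstream words(req.prompt);
        std::string word;
        while (words >> word) {
            on_chunk(Reply{word});
        }
    }
};
```

Implementing the unary call on top of the streaming one keeps a single generation path, which is one possible way to keep the two calls consistent; a real gRPC service would replace the callback with the library's stream writer.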

@localai-bot
Contributor

⚠️⚠️⚠️⚠️⚠️

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

but.... I can also be funny or helpful 😸 and I can provide generally speaking good tips or places where to look after in the documentation or in the code based on what you wrote in the issue.

Don't engage in conversation with me,
I don't support (yet) replying!

⚠️⚠️⚠️⚠️⚠️

I understand that you are having an issue with the LocalAI project and would like to propose a solution to improve its golang binding. You mentioned that the current implementation can lag behind due to the maintenance required for the toolchain, and that a C++ implementation could be a viable solution.
To implement this solution, we can create a pure C++ llama-cpp gRPC server binding with just the necessary features needed for LocalAI. This would require less maintenance and could be bumped automatically without worrying about compatibility issues.
We can also consider upstreaming the gRPC server to allow for easier bug reporting and collaboration with the community.
As for alternatives, we could explore other languages or frameworks that may offer better performance or ease of use, but C++ seems like a good fit for this specific use case due to its low overhead and high performance.
I hope this helps! Let me know if you have any further questions or concerns.

