-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feat] Add runtime model management api #540
Conversation
this is a little bit confusing, that means the model list should retuned by searching the target |
Since it's job is to manage the metadata, should it return all the models, no matter where it stored? or just apply some meaningful filters. In the last PR, different artifact store may have their own folder. what the |
@@ -120,6 +124,24 @@ async def unload_lora_adapter(request: UnloadLoraAdapterRequest, raw_request: Re | |||
return Response(status_code=200, content=response) | |||
|
|||
|
|||
@router.post("/v1/model/download") | |||
async def download_model(request: DownloadModelRequest): | |||
response = await ModelManager.model_download(request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is async call right? from client perspective, how do I know when it's finished? so I can orchestrate model loading request afterwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is async call right?
I initially planned to implement this async
using coroutines, but later did not follow this approach 🤣 .
However, I am wondering if it is necessary to ensure that all API interfaces are aysnc
? Or it can be partially aysnc
and partially sync
?
how do I know when it's finished?
Keep calling the post
API until the model's status returns downloaded
. Because this API will directly return the model status that implemented in #539 . And if necessary, a new process will be opened in the background for downloading, it will not wait for the download to complete before returning the result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
model status could be sync. model download should be async but it introduces some complexity in orchestration. this is acceptable at this moment. I will get you involved in a meeting. VKE team is integrating this part
This is great! I left few comments not directly with the code, but some interface questions |
|
Overall it looks good to me. Let's merge this PR now. We may need additional changes later when we start to integrate with TOS models . |
* refact: add download extra config into downloader * refact: replace assert with Exception * feat: add model management api * fix: test cases * fix allow_file_suffix * fix style
Pull Request Description
Add runtime model management api
Download model

List model

Related Issues
Resolves: #196
Part of: #521
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
[Bug]
: Corrections to existing functionality[CI]
: Changes to build process or CI pipeline[Docs]
: Updates or additions to documentation[API]
: Modifications to aibrix's API or interface[CLI]
: Changes or additions to the Command Line Interface[Misc]
: For changes not covered above (use sparingly)Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.