Error in GPU Offload Configuration in the example #67

Open
vegax87 opened this issue Sep 7, 2024 · 2 comments
vegax87 commented Sep 7, 2024

I encountered an issue while trying to use the gpuOffload configuration as documented in the README:

const llama3 = await client.llm.load(modelPath, { config: { gpuOffload: "max" } });

However, this resulted in an error indicating that the gpuOffload parameter was expected to be an object, not a string. In fact, gpuOffload requires three parameters:

const llama3 = await client.llm.load(modelPath, { 
  config: { 
    gpuOffload: { 
      ratio: 1.0, 
      mainGpu: 0, 
      tensorSplit: [1.0] 
    } 
  } 
});

ratio: Specifies the proportion of the workload to be offloaded to the GPU. A value of 1.0 means the entire workload will be handled by the GPU.
mainGpu: Indicates which GPU ID to use as the primary one. For example, 0 refers to the first GPU in the system.
tensorSplit: An array that specifies how to split the tensors among the GPUs. [1.0] means the entire workload will be handled by the primary GPU.


Connum commented Feb 11, 2025

Hi!

It also requires a splitStrategy property, with "evenly" or "favorMainGpu" as the value. According to the code comment:

 * - "evenly": Splits model evenly across GPUs
 * - "favorMainGpu": Fill the main GPU first, then fill the rest of the GPUs evenly

Why has this not been updated in the examples?
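Putting the two comments together, a hedged sketch of the full gpuOffload object might look like this. The field names (ratio, mainGpu, tensorSplit, splitStrategy) are as reported in this thread and may change as the SDK stabilizes:

```typescript
// Combined GPU offload config as described by the commenters above.
// All field names are assumptions based on this thread, not verified
// against a current SDK release.
const gpuOffload = {
  ratio: 1.0,               // offload the entire workload to the GPU
  mainGpu: 0,               // use the first GPU as the primary one
  tensorSplit: [1.0],       // the primary GPU handles all tensors
  splitStrategy: "evenly",  // or "favorMainGpu"
};

// Hypothetical usage, assuming a connected LMStudioClient and a
// valid modelPath:
// const llama3 = await client.llm.load(modelPath, { config: { gpuOffload } });
```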

ryan-the-crayon (Contributor) commented Feb 11, 2025

@Connum Hi, we are still rapidly stabilizing the APIs. The example code is outdated and will be replaced once we stabilize the API, which should be very soon. As for the GPU config, it is now much less important to specify the offload ratio, since LM Studio determines it automatically based on your hardware and model combination. You only need to specify the field if you want to override it.
