
[v0.7.1rc1] FAQ & Feedback #19

Open
Yikun opened this issue Feb 8, 2025 · 10 comments

@Yikun
Collaborator

Yikun commented Feb 8, 2025

Please leave comments here about your usage of vLLM Ascend Plugin.

Does it work? Does it not work? Which models do you need? Which features do you need? Any bugs?

For in-depth discussion, please feel free to join #sig-ascend in the vLLM Slack workspace.


FAQ:

1. What devices are currently supported?

Currently, only the Atlas A2 series is supported.

  • Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
  • Atlas 800I A2 Inference series (Atlas 800I A2)

2. How to set up the dev env, build, and test?

Here is a step-by-step guide for building and testing.
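
As a rough outline of what the guide covers (a minimal sketch only; the exact branches, versions, and CANN environment setup are assumptions here, so please follow the guide for the authoritative steps):

# build vLLM from source with device-specific kernels disabled
$ git clone --branch v0.7.1 https://github.com/vllm-project/vllm.git
$ cd vllm && VLLM_TARGET_DEVICE=empty pip install . && cd ..

# install the Ascend plugin from source (editable, for development)
$ git clone https://github.com/vllm-project/vllm-ascend.git
$ cd vllm-ascend && pip install -e . && cd ..

# quick smoke test that the server starts on the NPU
$ vllm serve Qwen/Qwen2.5-0.5B-Instruct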


(Updated on: 2025.02.08)

@Yikun Yikun pinned this issue Feb 8, 2025
@Yikun Yikun changed the title [Alhpa] FAQ & Feedback [v0.7.1rc1] FAQ & Feedback Feb 17, 2025
@shannanyinxiang

Any plans to support qwen2.5-vl?

@Yikun
Collaborator Author

Yikun commented Feb 17, 2025

Any plans to support qwen2.5-vl?

@shannanyinxiang According to our tests, qwen2.5-vl (and qwen2-vl) already works, so you can give it a try. If you encounter any problems, please feel free to raise an issue; contributions to the docs are also welcome (like #53).
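
For reference, a minimal launch sketch (the model ID and flags below are illustrative assumptions, not settings confirmed in this thread):

# serve a Qwen2.5-VL checkpoint via the OpenAI-compatible server
$ vllm serve Qwen/Qwen2.5-VL-7B-Instruct \
    --trust-remote-code \
    --max-model-len 8192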

@shannanyinxiang

Any plans to support qwen2.5-vl?

@shannanyinxiang According to our tests, qwen2.5-vl already works, so you can give it a try. If you encounter any problems, please feel free to raise an issue; contributions to the docs are also welcome (like #53).

Thank you for your prompt reply!

@invokerbyxv

Any plans to support qwen2.5-vl?

@shannanyinxiang According to our tests, qwen2.5-vl (and qwen2-vl) already works, so you can give it a try. If you encounter any problems, please feel free to raise an issue; contributions to the docs are also welcome (like #53).

Could you share the launch parameters you used for qwen2-vl?

@whu-dft

whu-dft commented Feb 21, 2025

HELP!

I installed vLLM with pip install vllm vllm_ascend, then tested it with vllm serve Qwen2.5-0.5B-Instruct, and the following error was reported:

INFO 02-21 10:53:40 __init__.py:30] Available plugins for group vllm.platform_plugins:
INFO 02-21 10:53:40 __init__.py:32] name=ascend, value=vllm_ascend:register
INFO 02-21 10:53:40 __init__.py:34] all available plugins for group vllm.platform_plugins will be loaded.
INFO 02-21 10:53:40 __init__.py:36] set environment variable VLLM_PLUGINS to control which plugins to load.
INFO 02-21 10:53:40 __init__.py:44] plugin ascend loaded.
/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/torch_npu/utils/collect_env.py:58: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/latest owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/torch_npu/utils/collect_env.py:58: UserWarning: Warning: The /usr/local/Ascend/ascend-toolkit/8.0.RC3/aarch64-linux/ascend_toolkit_install.info owner does not match the current owner.
  warnings.warn(f"Warning: The {path} owner does not match the current owner.")
INFO 02-21 10:53:41 __init__.py:30] Available plugins for group vllm.platform_plugins:
INFO 02-21 10:53:41 __init__.py:32] name=ascend, value=vllm_ascend:register
INFO 02-21 10:53:41 __init__.py:34] all available plugins for group vllm.platform_plugins will be loaded.
INFO 02-21 10:53:41 __init__.py:36] set environment variable VLLM_PLUGINS to control which plugins to load.
INFO 02-21 10:53:41 __init__.py:44] plugin ascend loaded.
INFO 02-21 10:53:41 __init__.py:211] No platform detected, vLLM is running on UnspecifiedPlatform
INFO 02-21 10:53:41 __init__.py:211] No platform detected, vLLM is running on UnspecifiedPlatform
ERROR 02-21 10:53:41 engine.py:400] Failed to infer device type
ERROR 02-21 10:53:41 engine.py:400] Traceback (most recent call last):
ERROR 02-21 10:53:41 engine.py:400]   File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine
ERROR 02-21 10:53:41 engine.py:400]     engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
ERROR 02-21 10:53:41 engine.py:400]   File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 119, in from_engine_args
ERROR 02-21 10:53:41 engine.py:400]     engine_config = engine_args.create_engine_config(usage_context)
ERROR 02-21 10:53:41 engine.py:400]   File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 1126, in create_engine_config
ERROR 02-21 10:53:41 engine.py:400]     device_config = DeviceConfig(device=self.device)
ERROR 02-21 10:53:41 engine.py:400]   File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/config.py", line 1660, in __init__
ERROR 02-21 10:53:41 engine.py:400]     raise RuntimeError("Failed to infer device type")
ERROR 02-21 10:53:41 engine.py:400] RuntimeError: Failed to infer device type
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 402, in run_mp_engine
    raise e
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine
    engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 119, in from_engine_args
    engine_config = engine_args.create_engine_config(usage_context)
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 1126, in create_engine_config
    device_config = DeviceConfig(device=self.device)
  File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/config.py", line 1660, in __init__
    raise RuntimeError("Failed to infer device type")
RuntimeError: Failed to infer device type

I checked that the installed version of vllm is v0.7.3 and vllm_ascend is 0.7.1rc1. Then I tried to install vllm==0.7.1, and another error occurred:

$ pip install vllm==0.7.1
Looking in indexes: https://mirrors.bfsu.edu.cn/pypi/web/simple/
Collecting vllm==0.7.1
  Downloading https://mirrors.bfsu.edu.cn/pypi/web/packages/c1/9d/151eba20b6959913d05df917cb53d5adb5d2e3dd8a19fea365d48b2b2bf3/vllm-0.7.1.tar.gz (5.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.3/5.3 MB 2.8 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      /tmp/pip-build-env-3fj_kpho/overlay/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
        cpu = _conversion_method_template(device=torch.device("cpu"))
      Traceback (most recent call last):
        File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
          main()
        File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
          json_out["return_val"] = hook(**hook_input["kwargs"])
        File "/home/fdd/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 143, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-3fj_kpho/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 334, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
        File "/tmp/pip-build-env-3fj_kpho/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 304, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-3fj_kpho/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 320, in run_setup
          exec(code, locals())
        File "<string>", line 631, in <module>
        File "<string>", line 525, in get_vllm_version
      RuntimeError: Unknown runtime environment
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

But I'm sure I have numpy installed in my current environment.

$ pip show numpy | head
Name: numpy
Version: 1.26.0
Summary: Fundamental package for array computing in Python
Home-page: https://numpy.org
Author: Travis E. Oliphant et al.
Author-email:
License: Copyright (c) 2005-2023, NumPy Developers.
        All rights reserved.

        Redistribution and use in source and binary forms, with or without

By the way, my device is 910B3.

@wangxiyuan
Collaborator

@whu-dft please follow the install guide https://vllm-ascend.readthedocs.io/en/v0.7.1rc1/installation.html

pip install vllm vllm-ascend doesn't work currently. We'll make it available in the next release.
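
Once both packages are installed per the guide, one rough way to sanity-check that the Ascend platform is being detected (a hypothetical check, not taken from the install guide) is:

# should print an Ascend/NPU platform rather than UnspecifiedPlatform
$ python -c "from vllm.platforms import current_platform; print(current_platform)"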

@whu-dft

whu-dft commented Feb 21, 2025

Thanks!

@sisrfeng

Is there any table comparing vllm-ascend vs. MindIE in terms of speed, model support, etc.?

@Infinite666

Infinite666 commented Feb 22, 2025

Same as above; we need performance numbers for vllm-ascend on different hardware. We tested both vllm-ascend and MindIE on 910B, and it seems MindIE's performance is better.

@Yikun
Collaborator Author

Yikun commented Feb 22, 2025

@Infinite666 @sisrfeng Thanks for your feedback.

Currently, the performance and accuracy of vLLM on Ascend still need to be improved, and we are working together with the MindIE team on this. The first release will be v0.7.3 in 2025 Q1. Therefore, in the short term we will keep focusing on the performance of vLLM Ascend, and we welcome everyone to join us in improving it.
