Prerequisites
- Docker and docker-compose installation guide
https://blog.csdn.net/u010800804/article/details/141754183
- git-lfs installation guide
https://blog.csdn.net/u010800804/article/details/146534351
Pulling the vLLM image
docker pull crpi-33mr80vehc50lqh8.cn-chengdu.personal.cr.aliyuncs.com/yunxinai/vllm-openai:v0.9.0
- As the pull output shows, the download is quite fast.
Model download
- Download the complete model files with git-lfs
https://modelscope.cn/models/Qwen/Qwen3-30B-A3B/files
git lfs install
git clone https://www.modelscope.cn/Qwen/Qwen3-30B-A3B.git
- The model files total roughly 60 GB.
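Because the download is this large, it can help to check free disk space before cloning and to confirm afterwards that git-lfs fetched the real weights rather than leaving small pointer stubs. The following is a minimal sketch under those assumptions; the function names `has_free_gb` and `list_lfs_pointers` and the 70 GB threshold are illustrative and not part of the original tutorial.

```shell
#!/usr/bin/env bash
# Sketch: sanity checks around a large git-lfs model download.

# Succeeds if the filesystem holding DIR has at least NEED_GB GiB available.
has_free_gb() {
  local dir="$1" need_gb="$2" avail_kb
  # df -P prints one stable line per filesystem; column 4 is available KiB.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

# Prints every file under DIR that is still a git-lfs pointer stub,
# meaning the real content was not downloaded.
list_lfs_pointers() {
  local dir="$1" f
  # Pointer stubs are tiny text files, so only inspect files of at most 1 KiB.
  find "$dir" -type f -size -2k ! -path '*/.git/*' | while IFS= read -r f; do
    if head -n 1 "$f" | grep -q '^version https://git-lfs.github.com/spec/v1'; then
      printf '%s\n' "$f"
    fi
  done
}

# Example usage (paths and threshold are assumptions):
#   has_free_gb /data/vllm 70 || echo "not enough free space" >&2
#   list_lfs_pointers /data/vllm/Qwen3-30B-A3B
```

If `list_lfs_pointers` prints nothing, no pointer stubs remain in the clone; any path it does print can usually be fetched by running `git lfs pull` inside the repository.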
Model deployment
Writing the YAML file
services:
  vllm:
    container_name: vllm
    # Quoted so YAML does not read the value as the boolean false.
    restart: "no"
    image: crpi-33mr80vehc50lqh8.cn-chengdu.personal.cr.aliyuncs.com/yunxinai/vllm-openai:v0.9.0
    ipc: host
    volumes:
      # Host directory that contains the cloned Qwen3-30B-A3B folder.
      - /data/vllm:/models
    command: [
      "--model", "/models/Qwen3-30B-A3B",
      "--served-model-name", "Qwen3_30B_A3B",
      "--gpu-memory-utilization", "0.75",
      "--tensor-parallel-size", "2",
      "--uvicorn-log-level", "debug",
      "--api-key", "EHmTL656TaTBlCnSQbpqbhG6NXDWItpo"
    ]
    ports:
      # Host port 30041 maps to the server port 8000 inside the container.
      - 30041:8000
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
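The compose file sets `--tensor-parallel-size` to 2, so the host needs at least two visible GPUs before the service is started. The sketch below checks this and then brings the stack up with `docker-compose up -d`. The helper names are hypothetical, and it assumes `nvidia-smi` is installed on the host and that the command is run from the directory containing the compose file.

```shell
#!/usr/bin/env bash
# Sketch: check the GPU count against the tensor-parallel size, then start vLLM.

# Succeeds if GPU_COUNT is at least the requested tensor-parallel size.
enough_gpus() {
  local tp_size="$1" gpu_count="$2"
  [ "$gpu_count" -ge "$tp_size" ]
}

# Counts physical GPUs: nvidia-smi -L prints one "GPU N: ..." line per device.
count_gpus() {
  nvidia-smi -L | grep -c '^GPU '
}

# Starts the compose stack in the background if enough GPUs are visible.
start_vllm() {
  local tp_size="${1:-2}"
  if enough_gpus "$tp_size" "$(count_gpus)"; then
    docker-compose up -d
  else
    echo "need at least $tp_size GPUs for --tensor-parallel-size $tp_size" >&2
    return 1
  fi
}

# Example usage:
#   start_vllm 2
```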
Viewing the logs
- Run the following command to view the logs:
docker-compose logs -f --tail=50 vllm
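Loading a model of this size takes a while, so instead of watching the logs by hand, one option is to poll the server until it answers. The sketch below is a generic retry helper; `wait_until` is a hypothetical name, and the usage line assumes the vLLM server exposes a `/health` endpoint on the mapped port 30041 that does not require the API key.

```shell
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or a timeout expires.

# wait_until TIMEOUT INTERVAL CMD [ARGS...]
# Runs CMD every INTERVAL seconds; returns 0 once it succeeds,
# or 1 if TIMEOUT seconds pass without a success.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# Example usage (endpoint and timings are assumptions):
#   wait_until 900 5 curl -fsS http://127.0.0.1:30041/health \
#     && echo "vLLM is ready"
```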
Model verification
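curl -X POST "http://127.0.0.1:30041/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer EHmTL656TaTBlCnSQbpqbhG6NXDWItpo" \
  -d '{
    "model": "Qwen3_30B_A3B",
    "messages": [
      {
        "role": "user",
        "content": "First Uncle took Second Uncle to the home of Third Uncle and said that Fourth Uncle was tricked by Fifth Uncle into going to the home of Sixth Uncle to steal the Ninth Uncle that Seventh Uncle had put in the cabinet. Who is the thief?"
      },
      {
        "role": "system",
        "content": "Please answer the question carefully"
      }
    ],
    "temperature": 0.5,
    "stream": false
  }'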
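If the request above fails, looking at the HTTP status code usually narrows down the cause. The sketch below queries the OpenAI-compatible `/v1/models` endpoint and maps the status to a short diagnosis. `explain_status`, `models_status`, `BASE_URL`, and `API_KEY` are illustrative names, and the listed meanings are typical for this kind of server rather than an exhaustive description of vLLM's behavior.

```shell
#!/usr/bin/env bash
# Sketch: diagnose a failing request by its HTTP status code.

# Maps an HTTP status code to a short, human-readable diagnosis.
explain_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "unauthorized: check the Bearer token against --api-key" ;;
    404) echo "not found: check the URL path and the model name" ;;
    000) echo "no response: is the container up and port 30041 mapped?" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Prints only the HTTP status code of GET /v1/models.
# BASE_URL and API_KEY must be set by the caller (placeholders here).
models_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $API_KEY" \
    "$BASE_URL/v1/models"
}

# Example usage:
#   BASE_URL=http://127.0.0.1:30041 API_KEY=... explain_status "$(models_status)"
```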