华为昇腾MindIE容器运行千问大模型

前言

最近在华为昇腾相关设备接触了MindIE大模型部署,与Nvidia有些区别。

这里以2.3.0版本示例,简单记录一下,并提供可用的一件运行配置文件。

不过大模型请自行在hugginface或魔搭获取。

品牌 管理工具 /dev 设备
英伟达 nvidia-smi /dev/nvidiaX
华为昇腾 npu-smi /dev/davinciX

工作目录: /opt/qwen3/

模型目录: /data/models/Qwen3-8B


踩过的坑

容器运行挂载了start.sh以补充镜像缺失的pytorch、protobuf等lib依赖,其原本就打在了镜像中,但是没在环境变量中加载到,导致直接运行会报出依赖未找到的错误;所以可能根据版本不同,环境变量中缺失的依赖也可能不同。

真不知道出镜像的人在干什么

所以,docker-compose内请注意挂载config.json、start.sh启动脚本,以及mindie-service路径的latest和版本号(这里是2.3.0)的config.json均需要挂载,注意config.json的权限需要是640。

并且,设备路径 /dev/davinci 中这里指定了了设备1, 请根据现场情况调整。

如果有问题,可把command调整为bash方便exec进容器手动调试mindie-service的运行

1
2
3
4
5
6
cd /usr/local/Ascend/mindie/latest/mindie-service
ldd ./bin/mindieservice_daemon | grep not
# 查看ldd有无输出not found
# 然后手动运行一下
./bin/mindieservice_daemon
# 查看有无明显异常日志

具体可用配置如下

docker-compose.yaml

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
version: '3.8'

services:
qwen3-8b-mindie:
image: swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.3.0-800I-A2-py311-openeuler24.03-lts
container_name: qwen3-8b-mindie
network_mode: host
ipc: host
privileged: true
devices:
- /dev/davinci1
- /dev/davinci_manager
- /dev/devmm_svm
- /dev/hisi_hdc
volumes:
- /usr/local/Ascend/driver:/usr/local/Ascend/driver
- /usr/local/dcmi:/usr/local/dcmi
- /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
- /data/models:/data/models
- /opt/qwen3/config.json:/usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
- /opt/qwen3/config.json:/usr/local/Ascend/mindie/2.3.0/mindie-service/conf/config.json
- /opt/qwen3/start.sh:/start.sh
environment:
- MINDIE_LOG_LEVEL=info
- MINDIE_LOG_TO_STDOUT=true
stdin_open: true
tty: true
command: /start.sh

config.json

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
{
"Version": "1.0.0",
"ServerConfig": {
"ipAddress": "0.0.0.0",
"managementIpAddress": "0.0.0.0",
"port": 1025,
"managementPort": 1026,
"metricsPort": 1027,
"allowAllZeroIpListening": true,
"maxLinkNum": 1000,
"httpsEnabled": false,
"fullTextEnabled": false,
"inferMode": "standard",
"interCommTLSEnabled": false,
"interCommPort": 1121,
"openAiSupport": "vllm",
"tokenTimeout": 600,
"e2eTimeout": 600,
"distDPServerEnabled": false
},
"BackendConfig": {
"backendName": "mindieservice_llm_engine",
"modelInstanceNumber": 1,
"npuDeviceIds": [[1]],
"tokenizerProcessNumber": 8,
"multiNodesInferEnabled": false,
"multiNodesInferPort": 1120,
"interNodeTLSEnabled": false,
"ModelDeployConfig": {
"maxSeqLen": 8192,
"maxInputTokenLen": 7168,
"truncation": false,
"ModelConfig": [
{
"modelInstanceType": "Standard",
"modelName": "qwen3-8b",
"modelWeightPath": "/data/models/Qwen3-8B",
"worldSize": 1,
"cpuMemSize": 0,
"npuMemSize": -1,
"backendType": "atb",
"trustRemoteCode": true
}
]
},
"ScheduleConfig": {
"templateType": "Standard",
"templateName": "Standard_LLM",
"cacheBlockSize": 128,
"maxPrefillBatchSize": 50,
"maxPrefillTokens": 8192,
"prefillTimeMsPerReq": 150,
"prefillPolicyType": 0,
"decodeTimeMsPerReq": 50,
"decodePolicyType": 0,
"maxBatchSize": 200,
"maxIterTimes": 512,
"maxPreemptCount": 0,
"supportSelectBatch": false,
"maxQueueDelayMicroseconds": 5000,
"maxFirstTokenWaitTime": 2500
}
},
"LogConfig": {
"dynamicLogLevel": "",
"dynamicLogLevelValidHours": 2,
"dynamicLogLevelValidTime": ""
},
"EnableDynamicAdjustTimeoutConfig": false
}

start.sh

1
2
3
4
5
6
7
8
9
10
#!/bin/bash

source /usr/local/Ascend/mindie/latest/mindie-service/set_env.sh

export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib:/usr/local/Ascend/mindie/3.0.0/mindie-llm/lib/protobuf/:/usr/local/Ascend/mindie/3.0.0/mindie-llm/lib/cares:$LD_LIBRARY_PATH

echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"

cd /usr/local/Ascend/mindie/latest/mindie-service
./bin/mindieservice_daemon

华为昇腾MindIE容器运行千问大模型
https://www.fishingrodd.cn/2026/06/25/MindIE容器运行千问大模型/
作者
FishingRod
发布于
2026年6月25日
更新于
2026年6月25日
许可协议