A Practical Guide to GPU/NPU Compute Slicing on K8s with HAMi
- 11/16/2025
Introduction
In AI model training, inference, and high-performance computing scenarios, heterogeneous compute resources such as GPUs and NPUs face a dual challenge of idle capacity and multi-task scheduling conflicts: a single chip has plenty of compute but is underutilized when it serves only one task, while multiple lightweight tasks requesting it concurrently are easily blocked by resource contention.
HAMi (Heterogeneous AI Computing Virtualization Middleware) is an AI-native resource scheduling and management solution for the Kubernetes ecosystem. Through a unified device plugin and scheduler extension, it delivers fine-grained slicing and efficient scheduling of GPU and NPU resources, substantially raising the utilization of heterogeneous hardware while fitting the resource needs of different workloads.
Drawing on the HAMi official documentation and hands-on experience, this article walks through the full workflow of deploying, configuring, and verifying GPU and NPU compute slicing, providing a directly reusable operations guide for real-world adoption.
Core Components
HAMi Core Framework
HAMi is the AI resource management solution from the Project-HAMi community. Its two core components are the scheduler extender (hami-scheduler) and the device plugin (hami-device-plugin). Its main advantages are:
- Unified management of multiple types of heterogeneous resources, including GPUs (NVIDIA) and NPUs (Huawei Ascend);
- Fine-grained slicing along dimensions such as device memory and compute cores, flexibly splitting physical chips into virtual resources (vGPU/vNPU);
- Deep integration with the Kubernetes scheduling stack, so it can be rolled out quickly without modifying the underlying cluster.
Device Plugin Components
- GPU device plugin: integrates NVIDIA resource management and supports compute slicing for mainstream GPU models such as the Tesla T4. Device memory and compute are requested precisely through resource names such as nvidia.com/gpu and nvidia.com/gpumem, with no additional plugin required;
- NPU device plugin (hami-ascend-device-plugin): built specifically for Huawei Ascend NPUs and adapted to models such as the Ascend 310P. It slices virtual resources along the AI Core and device memory dimensions, requested through the huawei.com/AscendXXX family of resource names (a quick check of how these resources surface on nodes follows this list).
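Once both plugins are running (deployment steps below), the corresponding extended resources appear in each node's allocatable list. A minimal check, assuming <gpu-node-name> and <npu-node-name> are placeholders for your actual node names:
# Extended resources advertised by the labeled nodes
kubectl get node <gpu-node-name> -o jsonpath='{.status.allocatable}' | tr ',' '\n' | grep nvidia.com
kubectl get node <npu-node-name> -o jsonpath='{.status.allocatable}' | tr ',' '\n' | grep huawei.com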
Installation and Deployment
Prerequisites
- Make sure the Kubernetes cluster is v1.23 or later (this article was verified on v1.23.7) and that Helm 3 or later is installed;
- GPU nodes must have the NVIDIA driver and container runtime (nvidia-container-toolkit) installed in advance; see "A Practical Guide to GPU Compute Slicing on K8s with Volcano Priority Scheduling" for details;
- NPU nodes must have the Ascend driver (version ≥ 7.2) and Ascend Docker Runtime installed in advance so the hardware environment is ready; see "A Practical Guide to NPU Compute Slicing on K8s with Volcano Priority Scheduling" for details. A few quick sanity checks follow this list.
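Before moving on, the prerequisites can be confirmed with a few commands, run on the control plane and on the GPU/NPU nodes respectively:
# On the control plane: cluster and Helm versions
kubectl version --short
helm version --short
# On a GPU node: driver and runtime are visible
nvidia-smi
# On an NPU node: Ascend driver is visible
npu-smi info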
Label the Target Nodes
Node labels ensure the device plugins are scheduled onto the right nodes. Add the corresponding label to the GPU and NPU nodes:
# Label the GPU node
kubectl label node <gpu-node-name> gpu=on   # replace with the actual GPU node name
# Label the NPU node
kubectl label node <npu-node-name> ascend=on   # replace with the actual NPU node name
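To confirm the labels took effect, list the nodes that carry them:
# Nodes selected by each label
kubectl get nodes -l gpu=on
kubectl get nodes -l ascend=on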
Deploy the HAMi Core Framework (via Helm)
GPU Compute Slicing Deployment
Add the HAMi Helm repository and install the core components with only GPU support enabled:
helm repo add hami-charts https://project-hami.github.io/HAMi/
helm install hami hami-charts/hami -n kube-system \
--version 2.7.0 \
--set scheduler.patch.imageNew.repository=lomtom-common/kube-webhook-certgen \
--set scheduler.kubeScheduler.image.repository=lomtom-common/kube-scheduler \
--set scheduler.kubeScheduler.image.tag=v1.23.7 \
--set scheduler.extender.image.repository=lomtom-common/hami \
--set devicePlugin.image.repository=lomtom-common/hami \
--set devicePlugin.monitor.image.repository=lomtom-common/hami \
--set global.imageRegistry=swr.cn-east-3.myhuaweicloud.com
Check the status of the hami-device-plugin and hami-scheduler Pods:
# kubectl get pod -n kube-system
hami-device-plugin-qg8qj         2/2   Running   0   32s
hami-scheduler-d846f7b69-9l498   2/2   Running   0   32s
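Once the device plugin has registered, the GPU node should advertise HAMi's extended resources; a quick check, with <gpu-node-name> as a placeholder for the actual node name:
# nvidia.com/* resources now exposed by the GPU node
kubectl describe node <gpu-node-name> | grep -i nvidia.com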
NPU Compute Slicing Deployment
Deploy the HAMi core components with Ascend NPU support enabled, and additionally deploy the dedicated device plugin:
- Deploy the HAMi core components:
helm repo add hami-charts https://project-hami.github.io/HAMi/
helm install hami hami-charts/hami -n kube-system \
--version 2.7.0 \
--set scheduler.patch.imageNew.repository=lomtom-common/kube-webhook-certgen \
--set scheduler.kubeScheduler.image.repository=lomtom-common/kube-scheduler \
--set scheduler.kubeScheduler.image.tag=v1.23.7 \
--set scheduler.extender.image.repository=lomtom-common/hami \
--set devicePlugin.image.repository=lomtom-common/hami \
--set devicePlugin.monitor.image.repository=lomtom-common/hami \
--set global.imageRegistry=swr.cn-east-3.myhuaweicloud.com \
--set devices.ascend.enabled=true
For reference, the upstream images corresponding to the mirrored repositories used above:
projecthami/hami:v2.7.0
jettech/kube-webhook-certgen:v1.5.2
liangjw/kube-webhook-certgen:v1.1.1
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.23.7
Check the status of the hami-scheduler Pod:
kubectl get pod -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
hami-scheduler-59598d4f7d-k7sn7   2/2     Running   0          8m33s
- Create the NPU device configuration ConfigMap:
Define the slicing specifications (device memory, AI Core, and AI CPU allocation) for NPU models such as the Ascend 310P3 and 910B4, adjusting the values to your actual hardware. The YAML is as follows (how to apply it is shown right after):
apiVersion: v1
kind: ConfigMap
metadata:
  name: hami-scheduler-device
  namespace: kube-system
  labels:
    app.kubernetes.io/component: hami-scheduler
    app.kubernetes.io/name: hami
    app.kubernetes.io/instance: hami
data:
  device-config.yaml: |-
    vnpus:
      # Ascend 910B4 NPU configuration
      - chipName: 910B4
        commonWord: Ascend910B4
        resourceName: huawei.com/Ascend910B4              # NPU resource name (used in Pod requests)
        resourceMemoryName: huawei.com/Ascend910B4-memory # NPU memory resource name
        memoryAllocatable: 32768                          # total allocatable memory (MB)
        memoryCapacity: 32768                             # total memory capacity (MB)
        aiCore: 20                                        # total AI Cores
        aiCPU: 7                                          # total AI CPUs
        templates:                                        # vNPU slicing templates
          - name: vir05_1c_8g
            memory: 8192    # memory per vNPU (MB)
            aiCore: 5       # AI Cores per vNPU
            aiCPU: 1        # AI CPUs per vNPU
          - name: vir10_3c_16g
            memory: 16384   # memory per vNPU (MB)
            aiCore: 10      # AI Cores per vNPU
            aiCPU: 3        # AI CPUs per vNPU
      # Ascend 310P3 NPU configuration
      - chipName: 310P3
        commonWord: Ascend310P
        resourceName: huawei.com/Ascend310P              # NPU resource name (used in Pod requests)
        resourceMemoryName: huawei.com/Ascend310P-memory # NPU memory resource name
        memoryAllocatable: 21527                         # total allocatable memory (MB)
        memoryCapacity: 24576                            # total memory capacity (MB)
        aiCore: 8                                        # total AI Cores
        aiCPU: 7                                         # total AI CPUs
        templates:                                       # vNPU slicing templates
          - name: vir01
            memory: 3072    # memory per vNPU (MB)
            aiCore: 1       # AI Cores per vNPU
            aiCPU: 1        # AI CPUs per vNPU
          - name: vir02
            memory: 6144    # memory per vNPU (MB)
            aiCore: 2       # AI Cores per vNPU
            aiCPU: 2        # AI CPUs per vNPU
          - name: vir04
            memory: 12288   # memory per vNPU (MB)
            aiCore: 4       # AI Cores per vNPU
            aiCPU: 4        # AI CPUs per vNPU
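Save the manifest and apply it; the file name below is illustrative. If the Helm chart has already created a ConfigMap with this name, kubectl apply simply updates it in place:
# Apply the device configuration and confirm it exists
kubectl apply -f hami-scheduler-device.yaml
kubectl get configmap hami-scheduler-device -n kube-system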
- Deploy hami-ascend-device-plugin (the NPU-specific device plugin). Create the required RBAC objects and DaemonSet; the full YAML is below, followed by the apply and rollout commands:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hami-ascend
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "update", "watch", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hami-ascend
subjects:
  - kind: ServiceAccount
    name: hami-ascend
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: hami-ascend
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hami-ascend
  namespace: kube-system
  labels:
    app.kubernetes.io/component: "hami-ascend"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hami-ascend-device-plugin
  namespace: kube-system
  labels:
    app.kubernetes.io/component: hami-ascend-device-plugin
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: hami-ascend-device-plugin
      hami.io/webhook: ignore
  template:
    metadata:
      labels:
        app.kubernetes.io/component: hami-ascend-device-plugin
        hami.io/webhook: ignore
    spec:
      priorityClassName: "system-node-critical"
      serviceAccountName: hami-ascend
      containers:
        - image: swr.cn-east-3.myhuaweicloud.com/lomtom-common/ascend-device-plugin:v1.1.0
          imagePullPolicy: IfNotPresent
          name: device-plugin
          resources:
            requests:
              memory: 500Mi
              cpu: 500m
            limits:
              memory: 500Mi
              cpu: 500m
          args:
            - --config_file
            - /device-config.yaml
          securityContext:
            privileged: true
            readOnlyRootFilesystem: false
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
            - name: pod-resource
              mountPath: /var/lib/kubelet/pod-resources
            - name: hiai-driver
              mountPath: /usr/local/Ascend/driver
              readOnly: true
            - name: log-path
              mountPath: /var/log/mindx-dl/devicePlugin
            - name: tmp
              mountPath: /tmp
            - name: ascend-config
              mountPath: /device-config.yaml
              subPath: device-config.yaml
              readOnly: true
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: pod-resource
          hostPath:
            path: /var/lib/kubelet/pod-resources
        - name: hiai-driver
          hostPath:
            path: /usr/local/Ascend/driver
        - name: log-path
          hostPath:
            path: /var/log/mindx-dl/devicePlugin
            type: Directory
        - name: tmp
          hostPath:
            path: /tmp
        - name: ascend-config
          configMap:
            name: hami-scheduler-device
      nodeSelector:
        ascend: "on"
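Apply the manifest and wait for the DaemonSet to become ready on the labeled NPU nodes; the file name is illustrative:
# Deploy the NPU device plugin and watch the rollout
kubectl apply -f hami-ascend-device-plugin.yaml
kubectl rollout status daemonset/hami-ascend-device-plugin -n kube-system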
Verify the HAMi Deployment
After deployment, check that all core component Pods are running normally:
# HAMi core components
kubectl get pod -n kube-system -l app.kubernetes.io/name=hami
# NPU-specific device plugin
kubectl get pod -n kube-system -l app.kubernetes.io/component=hami-ascend-device-plugin
Expected output: hami-scheduler (2/2 Running), hami-device-plugin (2/2 Running), and hami-ascend-device-plugin (1/1 Running).
# GPU environment
NAME                              READY   STATUS    RESTARTS   AGE
hami-device-plugin-qg8qj          2/2     Running   0          33m
hami-scheduler-d846f7b69-9l498    2/2     Running   0          33m
# NPU environment
NAME                              READY   STATUS    RESTARTS   AGE
hami-scheduler-59598d4f7d-k7sn7   2/2     Running   0          43m
hami-ascend-device-plugin-sjmxj   1/1     Running   0          41m
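It is also worth confirming that the nodes expose the sliced resources in their allocatable lists; node names below are placeholders:
# Extended resources registered on the GPU and NPU nodes
kubectl describe node <gpu-node-name> | grep -E 'nvidia.com/gpu|nvidia.com/gpumem'
kubectl describe node <npu-node-name> | grep 'huawei.com/Ascend'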
Verification
GPU Compute Slicing Test (vGPU)
Create a Pod that requests vGPU resources, specifying its device memory and compute needs:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-t4
spec:
  containers:
    - name: gpu-container
      image: swr.cn-east-3.myhuaweicloud.com/lomtom-common/pytorch:2.1.2-cuda12.1-cudnn8-runtime-ubuntu22.04
      command: ["sleep"]
      args: ["100000"]   # keep the Pod running
      resources:
        limits:
          cpu: "1"
          memory: 1000Mi
          nvidia.com/gpu: "1"       # request 1 vGPU
          nvidia.com/gpumem: 3000   # request 3000 MiB of GPU memory
        requests:
          cpu: "1"
          memory: 1000Mi
          nvidia.com/gpu: "1"
          nvidia.com/gpumem: 3000
EOF
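Before running commands inside the Pod, confirm it has been scheduled and is Running, and note the node it landed on:
# Pod status and target node
kubectl get pod gpu-pod-t4 -o wide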
Verify the slicing result:
# Run nvidia-smi inside the Pod to inspect the allocated resources
kubectl exec -it gpu-pod-t4 -- nvidia-smi
Expected output: the Pod sees only the allocated 3000 MiB of device memory (matching the request), confirming that the physical GPU has been sliced into virtual resources and that GPU slicing works.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03 Driver Version: 550.144.03 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla T4 Off | 00000000:00:06.0 Off | 0 |
| N/A 45C P8 14W / 70W | 0MiB / 3000MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
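Since the test image ships with PyTorch, the memory budget visible to CUDA inside the Pod can also be checked; a minimal sketch, noting that the reported value reflects the vGPU limit and may differ slightly from the raw 3000 MiB figure:
# Query the CUDA device visible inside the Pod via PyTorch
kubectl exec -it gpu-pod-t4 -- python -c "import torch; p = torch.cuda.get_device_properties(0); print(p.name, p.total_memory // (1024 * 1024), 'MiB')"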
NPU Slicing Test (vNPU)
Create a Pod that requests vNPU resources:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: npu-pod-310p
spec:
  containers:
    - name: npu-container
      image: swr.cn-south-1.myhuaweicloud.com/ascendhub/ascend-pytorch:24.0.RC1-A2-1.11.0-ubuntu20.04
      command: ["sleep"]
      args: ["100000"]   # keep the Pod running
      resources:
        limits:
          cpu: "1"
          memory: 1000Mi
          huawei.com/Ascend310P: "1"             # request 1 Ascend310P vNPU
          huawei.com/Ascend310P-memory: "3072"   # request 3 GB of NPU memory (matches the vir01 template)
        requests:
          cpu: "1"
          memory: 1000Mi
          huawei.com/Ascend310P: "1"
          huawei.com/Ascend310P-memory: "3072"
EOF
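As with the GPU test, first confirm the Pod is Running and note the node it was scheduled to; the npu-smi checks that follow can then be used to inspect the created vNPU:
# Pod status and target node
kubectl get pod npu-pod-310p -o wide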
Verify the slicing result:
# npu-smi info
+-------------------------------------------------------------------------------------------------------+
| npu-smi 24.1.rc2 Version: 24.1.rc2 |
+-------------------------------+-----------------+-----------------------------------------------------+
| NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page)|
| Chip Device | Bus-Id | AICore(%) Memory-Usage(MB) |
+===============================+=================+=====================================================+
| 32896 310Pvir01 | OK | NA 48 0 / 0 |
| 0 0 | 0000:85:00.0 | 0 225 / 2690 |
+===============================+=================+=====================================================+
+-------------------------------+-----------------+-----------------------------------------------------+
| NPU Chip | Process id | Process name | Process memory(MB) |
+===============================+=================+=====================================================+
| No running processes found in NPU 32896 |
+===============================+=================+=====================================================+
# npu-smi info -t info-vnpu -i 6 -c 0
+-------------------------------------------------------------------------------+
| NPU resource static info as follow: |
| Format:Free/Total NA: Currently, query is not supported. |
| AICORE Memory AICPU VPC VENC VDEC JPEGD JPEGE PNGD |
| GB |
|===============================================================================|
| 7/8 18/21 6/7 11/12 3/3 11/12 14/16 7/8 NA/NA |
+-------------------------------------------------------------------------------+
| Total number of vnpu: 1 |
+-------------------------------------------------------------------------------+
| Vnpu ID | Vgroup ID | Container ID | Status | Template Name |
+-------------------------------------------------------------------------------+
| 132 | 0 | ffffffffffff | 1 | vir01 |
+-------------------------------------------------------------------------------+
# cat /etc/vnpu.cfg
vnpu_config_recover:enable
[vnpu-config start]
2:132:npu-smi set -t create-vnpu -i 6 -c 0 -f vir01 -v 132 -g 0
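Once verification is complete, the test Pods can be deleted so the sliced resources return to the pool:
# Clean up the verification Pods
kubectl delete pod gpu-pod-t4 npu-pod-310p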