[Tutorial] MindSpore Lite device memory sharing

lujiale · January 9, 2026, 14:54

Device memory sharing for ACL models

Note: purely dynamic models do not support activation sharing!

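Device memory sharing comes in two forms: work mem sharing and weight sharing. Work mem sharing lets multiple models use one block of memory for their inputs and outputs; weight sharing lets multiple models use one block of memory for their weights. The rest of this section focuses on work mem sharing.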

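When work mem sharing is enabled, the specified models use a single block of memory as their common work mem. The size of that block equals the largest work mem among the specified models, so the group pays for one work mem instead of one per model, which can save a significant amount of device memory.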

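When models on several cards need to share memory, group the models by the card they belong to and compute the shared size for each card separately. Each card then allocates a block the size of the largest work mem among the models specified for that card.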
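The sizing rule described above can be illustrated in plain Python, independent of MindSpore Lite. This is only a minimal sketch of the arithmetic, not the library's implementation: the function name `shared_workspace_per_device`, the model names, and the sizes in MB are all made up for illustration.

```python
from collections import defaultdict


def shared_workspace_per_device(models):
    """Return {device_id: shared work mem size} for a list of models.

    `models` is a list of (name, device_id, work_mem_mb) tuples.
    All values are hypothetical; real sizes are computed by the runtime.
    """
    sizes = defaultdict(int)
    for _name, device_id, work_mem_mb in models:
        # Each card keeps one buffer as large as its biggest work mem.
        sizes[device_id] = max(sizes[device_id], work_mem_mb)
    return dict(sizes)


# Hypothetical models: two on card 1, two on card 2.
models = [
    ("a", 1, 300),
    ("b", 1, 500),
    ("c", 2, 200),
    ("d", 2, 800),
]

shared = shared_workspace_per_device(models)
unshared_total = sum(size for _, _, size in models)
shared_total = sum(shared.values())

print(shared)  # {1: 500, 2: 800}
print(unshared_total - shared_total)  # 500
```

With these made-up numbers, each card holds one buffer sized for its largest model, and the two cards together need 500 MB less than they would with one work mem per model.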

Work mem sharing Python demo:

import mindspore_lite as mslite

def init_context(device_id, device_type='ascend'):
    # Build a Context bound to the given device.
    context = mslite.Context()
    context.target = [device_type]
    context.ascend.device_id = device_id
    return context

# Single-card work mem sharing demo
# path1 ... path4 are paths to MindIR model files.
model_group = mslite.ModelGroup()
model_group.add_model([path1, path2])
# Compute the shared work mem size for the models on card 1.
model_group.cal_max_size_of_workspace(mslite.ModelType.MINDIR, init_context(1))
m1 = mslite.Model()
m2 = mslite.Model()
m1.build_from_file(path1, mslite.ModelType.MINDIR, init_context(1))
m2.build_from_file(path2, mslite.ModelType.MINDIR, init_context(1))

# Multi-card work mem sharing demo
model_group = mslite.ModelGroup()
# Models on card 1
model_group.add_model([path1, path2])
model_group.cal_max_size_of_workspace(mslite.ModelType.MINDIR, init_context(1))
# Models on card 2
model_group.add_model([path3, path4])
model_group.cal_max_size_of_workspace(mslite.ModelType.MINDIR, init_context(2))
m1 = mslite.Model()
m2 = mslite.Model()
m3 = mslite.Model()
m4 = mslite.Model()
m1.build_from_file(path1, mslite.ModelType.MINDIR, init_context(1))
m2.build_from_file(path2, mslite.ModelType.MINDIR, init_context(1))
m3.build_from_file(path3, mslite.ModelType.MINDIR, init_context(2))
m4.build_from_file(path4, mslite.ModelType.MINDIR, init_context(2))

Weight sharing

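Weight sharing relies on two facts: instances of the same model have identical weights, and weight memory is read-only. So when a single model is initialized in several threads, those instances can use one copy of the weights in device memory, which saves memory.

In terms of API calls, the only difference from work mem sharing is that the model group is initialized with the flag mslite.ModelGroupFlag.SHARE_WEIGHT. The following test case instantiates several model objects from a single model path in different threads and shares their weights: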


import threading

import mindspore_lite as mslite

def test_runtime_general_modelgroup_weightmem_shared_func():
    # thread_infer is a helper that is not shown in this post.
    model_group_flag = mslite.ModelGroupFlag.SHARE_WEIGHT  # set the sharing type to weight sharing
    device_id_list = [0, 0, 0, 0, 0]
    loop = 30
    use_mem = []
    threads = []
    for i in range(5):
        device_id = device_id_list[i]
        t = threading.Thread(target=thread_infer, args=(["xxx/xx.mindir"], ["1, 2, 3, 4:1, 2, 3, 4"], model_group_flag, loop, device_id, use_mem))
        t.start()
        threads.append(t)
    # Wait for every thread to finish before reading use_mem.
    for t in threads:
        t.join()
    print("use_mem: ", use_mem)
    memory_cost = max(use_mem)
    print("memory_cost: ", memory_cost)
    assert 4000 < memory_cost < 4800
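
As a rough illustration of why sharing read-only weights helps, the sketch below compares the two cases with simple arithmetic. It assumes each instance still needs its own work mem and ignores any runtime overhead; the function name `estimate_memory_mb` and all sizes are hypothetical and are not taken from the test above or from MindSpore Lite.

```python
def estimate_memory_mb(num_instances, weight_mb, work_mem_mb, share_weight):
    """Rough device memory estimate for several instances of one model.

    Hypothetical model: each instance needs its own work mem, while the
    read-only weights are stored once when sharing is on.
    """
    weight_copies = 1 if share_weight else num_instances
    return weight_copies * weight_mb + num_instances * work_mem_mb


# Made-up sizes: 5 instances, 1000 MB of weights, 100 MB work mem each.
without_sharing = estimate_memory_mb(5, 1000, 100, share_weight=False)
with_sharing = estimate_memory_mb(5, 1000, 100, share_weight=True)

print(without_sharing)  # 5500
print(with_sharing)  # 1500
```

Under these assumptions the weight term stops growing with the number of threads, which is why the saving grows with the instance count.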

Device memory sharing for GE models

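To enable memory sharing for GE models, you need to change both the code and the model config. So far this has only been tried on a single card; whether it works across multiple cards has not been tested.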

Pass the following configuration to build_from_file:

[ge_graph_options]
ge.externalWeight = 1
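
If you prefer to generate this config file from a script, one way is Python's standard `configparser`. This is a minimal sketch, not part of MindSpore Lite: the helper name `write_ge_config` and the file name `ge_config.ini` are made up, and the option key is written as `ge.externalWeight`, the spelling used in the demo comment in this post.

```python
import configparser
import os
import tempfile


def write_ge_config(path, section="ge_graph_options", options=None):
    """Write an INI-style config file with one section."""
    if options is None:
        options = {"ge.externalWeight": "1"}
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive; the default lowercases them.
    parser.optionxform = str
    parser[section] = options
    with open(path, "w", encoding="utf-8") as f:
        parser.write(f)
    return path


# Write to a temporary directory and read the result back.
config_file = write_ge_config(os.path.join(tempfile.mkdtemp(), "ge_config.ini"))
with open(config_file, encoding="utf-8") as f:
    content = f.read()
print(content)
```

The returned path can then be passed as config_path to build_from_file. Setting `optionxform` matters here because `configparser` would otherwise write the key in lowercase.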

Demo of enabling memory sharing with GE:

# The flag must be mslite.ModelGroupFlag.SHARE_WEIGHT.
model_group = mslite.ModelGroup(flags=mslite.ModelGroupFlag.SHARE_WEIGHT)
model = mslite.Model()
model_vae = mslite.Model()
# Call add_model right after creating the models, not after build_from_file.
model_group.add_model([model, model_vae])
# init_context() stands for your own context setup (the helper shown earlier takes a device_id).
# config_path points to a config file that contains ge.externalWeight=1.
model.build_from_file(model_path, mslite.ModelType.MINDIR, init_context(), config_path=config_file)
model_vae.build_from_file(model_vae_path, mslite.ModelType.MINDIR, init_context(), config_path=config_vae_file)
