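1. System Environment
Hardware environment (Ascend/GPU/CPU): Ascend
MindSpore version: 2.2.0 & CANN 7.0
Execution mode (PyNative/Graph): Any
2. Error Information
2.1 Problem Description
With MindSpore 2.2.0 & CANN 7.0, the wizardcoder model was configured with 4 layers, mainly to quickly verify that inference runs correctly. Running the inference script fails with the error below.
2.2 Error Message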
[WARNING] GE_ADPT(3981562,ffff8dda3af0,python):2023-10-26-15:56:10.535.470 [mindspore/ccsrc/transform/graph_ir/utils.cc:300] CheckAndGetGraphRunner] Can not find init_subgraph.kernel_graph_0 sub graph, don't need data init subgraph in INFER mode.
[WARNING] GE_ADPT(3981562,ffff8dda3af0,python):2023-10-26-15:56:10.577.631 [mindspore/ccsrc/transform/graph_ir/utils.cc:300] CheckAndGetGraphRunner] Can not find init_subgraph.496_1_wizardcoder_WizardCoderLMHeadModel_construct_98 sub graph, don't need data init subgraph in INFER mode.
[TRACE] GE(3981562,python):2023-10-26-15:56:10.579.360 [status:INIT] [ge_api.cc:964]3984989 RunGraphAsync:start to run graph async, session_id: 0, graph_id: 1, input size 1
Traceback (most recent call last):
  File "test_wizardcoder_generate.py", line 67, in <module>
    main(config_path=args.config, max_length=args.max_length, use_past=args.use_past, batch_size=args.batch_size, device_id=args.device_id)
  File "test_wizardcoder_generate.py", line 47, in main
    output = model.generate(input_ids=tokenizer(prompt)["input_ids"], use_past=use_past, max_length=max_length)
  File "/home/wizardcoder/2_wizardcoder-mindformers-1019/mindformers/generation/text_generator.py", line 565, in generate
    **model_kwargs,
  File "/home/wizardcoder/2_wizardcoder-mindformers-1019/mindformers/generation/text_generator.py", line 39, in forward
    res = self(**model_inputs)  # pylint: disable=E1102
  File "/home/miniconda3/lib/python3.7/site-packages/mindspore/nn/cell.py", line 680, in __call__
    out = self.compile_and_run(*args, **kwargs)
  File "/home/miniconda3/lib/python3.7/site-packages/mindspore/nn/cell.py", line 1023, in compile_and_run
    return _cell_graph_executor(self, *new_args, phase=self.phase)
  File "/home/miniconda3/lib/python3.7/site-packages/mindspore/common/api.py", line 1589, in __call__
    return self.run(obj, *args, phase=phase)
  File "/home/miniconda3/lib/python3.7/site-packages/mindspore/common/api.py", line 1628, in run
    return self._exec_pip(obj, *args, phase=phase_real)
  File "/home/miniconda3/lib/python3.7/site-packages/mindspore/common/api.py", line 121, in wrapper
    results = fn(*arg, **kwargs)
  File "/home/miniconda3/lib/python3.7/site-packages/mindspore/common/api.py", line 1608, in _exec_pip
    return self._graph_executor(args, phase)
RuntimeError:
Memory not enough:
Device(id:7) memory isn't enough and alloc failed, kernel name: kernel_graph_0_HostDSActor, alloc size: 8192B.
- C++ Call Stack: (For framework developers)
mindspore/ccsrc/runtime/graph_scheduler/graph_scheduler.cc:679 Run
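3. Solution
Investigation showed that the error was caused by the environment variable MS_DISABLE_REF_MODE=1 being set. After unsetting this variable, inference runs normally.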
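
In a shell, the fix is simply `unset MS_DISABLE_REF_MODE` before launching the script. If the variable is inherited from a launcher you do not control, a minimal Python sketch of the same fix is shown below. The helper name `clear_ref_mode_override` is hypothetical, and the sketch assumes the variable is read when MindSpore initializes, so it should run at the very top of the inference script, before `import mindspore`.

```python
import os


def clear_ref_mode_override(env=os.environ):
    """Remove MS_DISABLE_REF_MODE from `env`.

    Returns the previous value, or None if the variable was not set.
    `env` defaults to the real process environment; a plain dict can be
    passed instead for testing.
    """
    return env.pop("MS_DISABLE_REF_MODE", None)


# Assumption: this runs before MindSpore is imported in the same process.
previous = clear_ref_mode_override()
if previous is not None:
    print(f"Removed MS_DISABLE_REF_MODE={previous!r} from the environment")
else:
    print("MS_DISABLE_REF_MODE was not set")
```

Printing the previous value makes it easy to confirm in the run log whether the variable was actually present on the machine where the memory error appeared.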