Model conversion failed: not support onnx data type IsNaN

➜ ./converter_lite --fmk=ONNX \
    --modelFile=model.onnx \
    --outputFile=qwen
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.847.871 [mindspore-lite/tools/converter/parser/onnx/onnx_model_parser.cc:894] ConvertNodes] not support onnx data type IsNaN
[... 23 more identical "ConvertNodes] not support onnx data type IsNaN" errors omitted ...]
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.848.355 [mindspore-lite/tools/converter/parser/onnx/onnx_model_parser.cc:792] ConvertOnnxGraph] convert nodes failed.
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.848.363 [mindspore-lite/tools/converter/parser/onnx/onnx_model_parser.cc:705] Parse] convert onnx graph failed.
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.902 [mindspore-lite/tools/converter/converter_funcgraph.cc:107] Load3rdModelToFuncgraph] Get funcGraph failed for fmk: 2
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.937 [mindspore-lite/tools/converter/converter_funcgraph.cc:187] Build] Load model file failed!
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.943 [mindspore-lite/tools/converter/converter.cc:1197] HandleGraphCommon] Build func graph failed
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.955 [mindspore-lite/tools/converter/converter.cc:1152] Convert] Handle graph failed: -1 Common error code.
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.962 [mindspore-lite/tools/converter/converter.cc:1345] RunConverter] Convert model failed
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.968 [mindspore-lite/tools/converter/converter_context.h:60] PrintOps] ===========================================
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.972 [mindspore-lite/tools/converter/converter_context.h:61] PrintOps] UNSUPPORTED OP LIST:
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.977 [mindspore-lite/tools/converter/converter_context.h:63] PrintOps] FMKTYPE: ONNX, OP TYPE: IsNaN
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.981 [mindspore-lite/tools/converter/converter_context.h:65] PrintOps] ===========================================
[ERROR] LITE(1024844,7f80c4cdff40,converter_lite):2025-09-28-17:04:04.908.992 [mindspore-lite/tools/converter/cxx_api/converter.cc:361] Convert] Convert model failed, ret=Common error code.
ERROR [mindspore-lite/tools/converter/converter_lite/main.cc:107] main] Convert failed. Ret: Common error code.
Convert failed. Ret: Common error code.

Hello, welcome to MindSpore. We have received your question above; please be patient while we prepare a reply.

We have never encountered an IsNaN operator in an inference model. Could you confirm that the network model was exported correctly?

IsNaN is not currently supported by MindSpore Lite. This operator should not be an inference operator; like dropout, it should have no actual use during inference. Please confirm whether it is really required for inference.

This operator is currently supported on Ascend hardware. If you need it, you can submit an RFC to the MindSpore Lite community and adaptation can be arranged.

Thanks everyone for the replies. Our company is in a technology-selection phase, and the current goal is to get the Qwen 2.5 0.5B model running on MindSpore Lite.
After some searching around, I found there isn't much material on model conversion. The .onnx model converted above was generated with the following script:

from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_id = "/Users/workspace/hf/Qwen2.5-0.5B"
save_directory = "./onnx_model/"

# Load the model from Transformers and export it to ONNX
ort_model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Save the ONNX model and the tokenizer
ort_model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)

Great, thanks. Could you share the community URL?

https://gitee.com/mindspore/mindspore-lite


If anyone familiar with this could point me in the right direction, I'd be grateful. I'm a complete beginner with a lot to learn about model conversion :sob:

Of course, a direct solution would be even better, since model conversion is not actually the focus of this evaluation; I'm just stuck here for now.

You can refer to the code below; the actual input data needs to be configured for your business scenario.
I suggest first getting a torch inference demo of the model running, then exporting with the code below right before the model's forward interface is called. At that point the model's real input data is easy to obtain.

from transformers import Qwen2ForCausalLM
import torch

# Load the model
pretrain_path = 'model_path/'
model = Qwen2ForCausalLM.from_pretrained(pretrain_path)
output_onnx_path = 'outputs/model.onnx'

# Placeholder inputs for torch.onnx.export. In real use, capture the actual business inputs right before invoking the model for inference.
attention_mask = torch.randn(1, 128).to(torch.bool)
position_ids = torch.randn(1, 128).to(torch.int32)
input_embeds = torch.randn(1, 128, 896).to(torch.float32)

# Export the ONNX model. Note: the input tuple must match the parameter order
# declared in the model class's forward method.
# To export an ONNX file with dynamic shapes, declare the variable dimensions in dynamic_axes.
torch.onnx.export(
    model,
    (None, attention_mask, position_ids, None, input_embeds, None, False),
    output_onnx_path,
    input_names=['attention_mask', 'position_ids', 'input_embeds'],
    output_names=['outputs'],
    keep_initializers_as_inputs=True,
    verbose=False,
    do_constant_folding=False,
    dynamic_axes={
        'attention_mask': {0: 'batch_size', 1: 'seq_length'},
        'position_ids': {0: 'batch_size', 1: 'seq_length'},
        'input_embeds': {0: 'batch_size', 1: 'seq_length'},
    }
)
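Since all three inputs above share the same variable dimensions, the dynamic_axes mapping can be generated instead of written out by hand; a small sketch (make_dynamic_axes is a hypothetical helper, not part of torch or optimum):

```python
def make_dynamic_axes(input_names, dims=None):
    """Map each input name to the same set of variable dimensions."""
    if dims is None:
        dims = {0: 'batch_size', 1: 'seq_length'}
    # dict(dims) gives each input its own copy, so later per-input tweaks are safe.
    return {name: dict(dims) for name in input_names}

dynamic_axes = make_dynamic_axes(['attention_mask', 'position_ids', 'input_embeds'])
```

The result can be passed directly as the `dynamic_axes` argument of `torch.onnx.export`.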

@YoungChen Thanks! Have you tried the approach above yourself? On my side the torch.export route throws all kinds of errors, though running inference works fine.
The only approaches that convert successfully for me go through the optimum library. For the record, here is another way that worked for me (optimum-cli):

optimum-cli export onnx --model Qwen2.5-0.5B --task text-generation onnx_model_optimum

But likewise, the generated model contains IsNaN, which you can verify like this:

python -c "import onnx; model = onnx.load('model.onnx'); print([node.op_type for node in model.graph.node if node.op_type == 'IsNaN'])"
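The same idea generalizes to surveying every op type in the graph before attempting conversion; a minimal sketch (count_ops is a made-up helper; only `.graph.node[*].op_type` is accessed, so the onnx package is needed only for actual use):

```python
from collections import Counter

def count_ops(model):
    """Count how many times each op type appears in an ONNX graph."""
    return Counter(node.op_type for node in model.graph.node)

# Usage (assumes the onnx package and an exported model file):
#   import onnx
#   print(count_ops(onnx.load('model.onnx')).most_common())
```

Comparing the resulting op list against the converter's "UNSUPPORTED OP LIST" output makes it easy to see whether IsNaN is the only blocker.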

I tried exporting ONNX with the optimum-cli command you provided:

optimum-cli export onnx --model Qwen2.5-0.5B --task text-generation onnx_model_optimum

During export, the torch.nn.functional.scaled_dot_product_attention call is what introduces the IsNaN operator.

I modified optimum's optimum\exporters\onnx\model_patcher.py, replacing its original_scaled_dot_product_attention with the small-op reference implementation from the official PyTorch documentation. That avoids introducing IsNaN; give it a try.
torch.nn.functional.scaled_dot_product_attention — PyTorch 2.8 documentation

# Original code:
# original_scaled_dot_product_attention = torch.nn.functional.scaled_dot_product_attention
# Replacement (make sure `math` and `torch` are imported at the top of the file):
def original_scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0,
        is_causal=False, scale=None, enable_gqa=False) -> torch.Tensor:
    L, S = query.size(-2), key.size(-2)
    scale_factor = 1 / math.sqrt(query.size(-1)) if scale is None else scale
    if attn_mask is not None:
        attn_bias = torch.zeros_like(attn_mask)
    else:
        attn_bias = torch.zeros(L, S, dtype=query.dtype, device=query.device)
    if is_causal:
        assert attn_mask is None
        temp_mask = torch.ones(L, S, dtype=torch.bool).tril(diagonal=0)
        attn_bias.masked_fill_(temp_mask.logical_not(), float("-inf"))
        attn_bias = attn_bias.to(query.dtype)

    if attn_mask is not None:
        if attn_mask.dtype == torch.bool:
            attn_bias.masked_fill_(attn_mask.logical_not(), float("-inf"))
        else:
            attn_bias = attn_mask + attn_bias

    if enable_gqa:
        key = key.repeat_interleave(query.size(-3)//key.size(-3), -3)
        value = value.repeat_interleave(query.size(-3)//value.size(-3), -3)

    attn_weight = query @ key.transpose(-2, -1) * scale_factor
    attn_weight += attn_bias
    attn_weight = torch.softmax(attn_weight, dim=-1)
    attn_weight = torch.dropout(attn_weight, dropout_p, train=True)
    return attn_weight @ value
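The masking in the replacement works because softmax assigns exactly zero weight to -inf logits, so masked positions contribute nothing to the output; a tiny pure-Python illustration of that property:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; math.exp(-inf) underflows to 0.0.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# The third position is masked with -inf and receives weight 0.0.
weights = softmax([1.0, 2.0, float('-inf')])
```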

When exporting with the optimum command, this approach raises an error in the dynamic_axes_fix stage, but it does not prevent the ONNX export from completing.

Thanks for the detailed solution. I tried the modification above and it fails here:

onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Where(9) node with name '/model/layers.0/self_attn/Where_2'

And although a .onnx file is produced, its size does not look right:

➜ ll -a model.onnx                                                   
-rw-r--r--@ 1 user  staff   1.1M Sep 30 10:42 model.onnx

Is this related to the specific optimum version? My local version is:

➜ pip show optimum
Name: optimum
Version: 1.27.0
Summary: Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality.
Home-page: https://github.com/huggingface/optimum
Author: HuggingFace Inc. Special Ops Team
Author-email: hardware@huggingface.co
License: Apache
Location: /Users/workspace/hf/venv/lib/python3.9/site-packages
Requires: huggingface_hub, numpy, packaging, torch, transformers
Required-by:

Is there a model.onnx_data file in the same directory? When the weights are too large, the model structure and the weights are saved to separate files.
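To spot this situation quickly, you can list the model file together with any external-data companions next to it; a minimal sketch (the helper name is illustrative — optimum typically writes model.onnx_data, while some tooling may use a .data suffix instead):

```python
from pathlib import Path

def onnx_companion_files(model_path):
    """Return the model file plus any external-data files next to it,
    with sizes in bytes, so a suspiciously small model.onnx is easy to spot."""
    model = Path(model_path)
    candidates = [model,
                  model.with_suffix('.onnx_data'),       # e.g. model.onnx_data
                  model.with_name(model.name + '.data')]  # e.g. model.onnx.data
    return {p.name: p.stat().st_size for p in candidates if p.is_file()}
```

A 1.1 MB model.onnx sitting next to a companion file of several hundred MB is the expected layout for a 0.5B-parameter model.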


Ah, right, I forgot to check the other file. I assume the error above doesn't affect the model's usability? I'll go work on the mindspore-lite/tools/converter tool next; feels like victory is in sight :grinning_face_with_smiling_eyes:

@YoungChen Amazing, the conversion succeeded and produced a .ms file! But the log contains a lot of [ERROR] entries; is that normal?

...
[ERROR] LITE(2045748,7fc7fdea7f40,converter_lite):2025-09-30-14:35:12.185.650 [mindspore-lite/src/common/ops/populate/populate_register.h:62] GetParameterCreator] Unsupported parameter type in Create : Dropout
[ERROR] LITE(2045748,7fc7fdea7f40,converter_lite):2025-09-30-14:35:12.186.547 [mindspore-lite/src/common/ops/populate/populate_register.h:62] GetParameterCreator] Unsupported parameter type in Create : Dropout
[WARNING] LITE(2045748,7fc7fdea7f40,converter_lite):2025-09-30-14:35:12.186.988 [mindspore-lite/tools/optimizer/graph/infershape_pass.cc:183] Run] exist op cannot support infer shape.
[WARNING] LITE(2045748,7fc7fdea7f40,converter_lite):2025-09-30-14:35:12.192.210 [mindspore-lite/tools/common/graph_util.cc:149] GetShapeVectorAndIdxFromCNode] Shape is empty /lm_head/MatMul
CONVERT RESULT SUCCESS:0

Since the conversion reports SUCCESS, you can keep going and check whether inference runs normally.


The conversion produced two files, .ms and .msw. When testing with mindspore-lite/examples/quick_start_cpp, execution reaches:

auto build_ret = model->Build(model_buf, size, mindspore::kMindIR, context);

and fails with:

[ERROR] ME(2691857,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.477.160 [le_utils.cc:230] RealPath] file path not exists.
[ERROR] ME(2691857,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.477.189 [hema_tensor_wrapper.cc:70] Init] Read tensor data from msw file failed
[ERROR] ME(2691857,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.477.197 [te_model.cc:505] ConstructModel] PrepareInnerTensors failed.
[ERROR] ME(2691857,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.477.203 [te_model.cc:622] ImportFromBuffer] construct model failed.
[ERROR] ME(2691857,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.477.543 [te_session.cc:2056] LoadModelAndCompileByBuf] Import model failed
[ERROR] ME(2691857,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.477.556 [x_api/model/model_impl.cc:199] Build] Init session failed
[ERROR] ME(2691856,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.488.611 [le_utils.cc:230] RealPath] file path not exists.
[ERROR] ME(2691856,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.488.633 [hema_tensor_wrapper.cc:70] Init] Read tensor data from msw file failed
[ERROR] ME(2691856,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.488.638 [te_model.cc:505] ConstructModel] PrepareInnerTensors failed.
[ERROR] ME(2691856,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.488.642 [te_model.cc:622] ImportFromBuffer] construct model failed.
[ERROR] ME(2691856,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.488.938 [te_session.cc:2056] LoadModelAndCompileByBuf] Import model failed
[ERROR] ME(2691856,7f22780dbe80,mindspore_quick_start_cpp):2025-09-30-17:42:25.488.945 [x_api/model/model_impl.cc:199] Build] Init session failed

Do the .ms and .msw files need to be combined into one here?

I tried changing the corresponding logic in mindspore-lite/examples/quick_start_cpp/main.cc to:

// Build model directly from file path
auto build_ret = model->Build(model_path, mindspore::kMindIR, context);

and the error messages changed to:

[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.235 [heduler.cc:1442] ScheduleNodeToKernel] FindBackendKernel return nullptr, name: /model/rotary_emb/Sin, type: Sin
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.272 [heduler.cc:1534] ScheduleSubGraphToKernels] schedule node return nullptr, name: /model/rotary_emb/Sin, type: Sin
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.280 [heduler.cc:1364] ScheduleMainSubGraphToKernels] Schedule subgraph failed, index: 0
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.307 [heduler.cc:1485] ScheduleGraphToKernels] ScheduleSubGraphToSubGraphKernel failed
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.485 [heduler.cc:390] Schedule] Schedule graph to kernels failed.
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.494 [te_session.cc:604] CompileGraph] Schedule kernels failed: -1
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.106.761 [te_session.cc:2110] LoadModelAndCompileByPath] Compile model failed
[ERROR] ME(3029707,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:03.107.411 [x_api/model/model_impl.cc:237] Build] Init session failed
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.408 [heduler.cc:1442] ScheduleNodeToKernel] FindBackendKernel return nullptr, name: /model/rotary_emb/Sin, type: Sin
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.451 [heduler.cc:1534] ScheduleSubGraphToKernels] schedule node return nullptr, name: /model/rotary_emb/Sin, type: Sin
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.460 [heduler.cc:1364] ScheduleMainSubGraphToKernels] Schedule subgraph failed, index: 0
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.488 [heduler.cc:1485] ScheduleGraphToKernels] ScheduleSubGraphToSubGraphKernel failed
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.660 [heduler.cc:390] Schedule] Schedule graph to kernels failed.
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.666 [te_session.cc:604] CompileGraph] Schedule kernels failed: -1
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.505.923 [te_session.cc:2110] LoadModelAndCompileByPath] Compile model failed
[ERROR] ME(3029706,7f0deed37e80,mindspore_quick_start_cpp):2025-09-30-20:04:04.506.590 [x_api/model/model_impl.cc:237] Build] Init session failed

No need to combine them: the .ms file holds the model structure, and the .msw file stores the weights.
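Given the earlier `RealPath] file path not exists` error, the runtime appears to resolve the .msw weights file relative to the .ms model, so it is worth verifying that both files sit side by side before calling Build; a small sketch with a hypothetical helper (the side-by-side .ms/.msw naming follows what the converter produced above):

```python
from pathlib import Path

def check_ms_model(ms_path):
    """Verify that a converted .ms model and its sibling .msw weights file both exist."""
    ms = Path(ms_path)
    msw = ms.with_suffix('.msw')  # e.g. qwen.ms -> qwen.msw
    missing = [str(p) for p in (ms, msw) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing model files: {missing}")
    return ms, msw
```

If this check passes but Build still fails, the problem is more likely an unsupported kernel (such as the Sin op in the second log) than a missing file.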