MindSpore error "RuntimeError: Exceed function call depth limit 1000"

Skyti · July 24, 2025, 02:22

1 Error Description

1.1 System Environment

Hardware Environment(Ascend/GPU/CPU): Ascend

Software Environment:

-- MindSpore version (source or binary): 1.6.0

-- Python version (e.g., Python 3.7.5): 3.7.6

-- OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 4.15.0-74-generic

-- GCC/Compiler version (if compiled from source):

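1.2 Basic Information

1.2.1 Script

The training script builds a single-operator network around GRU, which updates values through gated recurrent units. The imports are omitted from the listing (np is numpy, nn is mindspore.nn, and Tensor comes from mindspore). The script is as follows: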

01 class Net(nn.Cell):
02   def __init__(self):
03     super(Net, self).__init__()
04     self.gru = nn.GRU(10, 16, 1000, has_bias=True, batch_first=True, bidirectional=False)
05
06   def construct(self, x, h0):
07     output = self.gru(x, h0)
08     return output
09
10 net = Net()
11 x = Tensor(np.ones([3, 5, 10]).astype(np.float32))
12 h0 = Tensor(np.ones([1 * 1000, 3, 16]).astype(np.float32))
13 output, hn = net(x, h0)
14 print('output', output.shape)

1.2.2 Error Message

The error message is as follows:

Traceback (most recent call last):
 File 'demo.py', line 13, in <module>
  output, hn = net(x, h0)
  …
RuntimeError: mindspore/ccsrc/pipeline/jit/static_analysis/evaluator.cc:198 Eval] Exceed function call depth limit 1000, (function call depth: 1001, simulate call depth: 997).
It's always happened with complex construction of code or infinite recursion or loop.
Please check the code if it's has the infinite recursion or call 'context.set_context(max_call_depth=value)' to adjust this value.
If max_call_depth is set larger, the system max stack depth should be set larger too to avoid stack overflow.
For more details, please refer to the FAQ at https://www.mindspore.cn.
The function call stack (See file ' /rank_0/om/analyze_fail.dat' for more details):
# 0 In file demo.py(07)
    output = self.gru(x, h0)

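2 Cause Analysis

The RuntimeError reads "Exceed function call depth limit 1000, (function call depth: 1001, simulate call depth: 997)", which means the function call depth limit was exceeded. Together with the official GRU API description, this shows that the configured num_layers is too large, so the number of nested loops exceeds the threshold. There are two options: reduce num_layers, or, as the message suggests, raise the threshold with context.set_context(max_call_depth=value). Raising the threshold is not recommended in general, because with this many nesting levels execution becomes very slow.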
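To make the relationship between stacked layers and call depth concrete, here is a toy sketch in plain Python. It is only an analogy under the assumption that each stacked layer adds one nested call; evaluate is a hypothetical function and does not reflect how MindSpore's evaluator is actually implemented. Small numbers are used so that Python's own recursion limit does not interfere.

```python
def evaluate(num_layers, limit=1000, depth=1):
    """Toy evaluator: each remaining layer adds one nested call (assumption)."""
    if depth > limit:
        # Mirrors the wording of the real message, but this is not MindSpore code.
        raise RuntimeError(
            f"Exceed function call depth limit {limit}, "
            f"(function call depth: {depth})"
        )
    if num_layers == 0:
        return depth  # deepest level that was reached
    return evaluate(num_layers - 1, limit, depth + 1)


# Three layers fit within a limit of 5.
print(evaluate(3, limit=5))

# Five layers need a depth of 6, which exceeds the limit of 5.
try:
    evaluate(5, limit=5)
except RuntimeError as err:
    print(err)
```

In this toy model the failure disappears either by lowering num_layers or by raising limit, which corresponds to the two options described above.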

Checking the code, line 04 sets num_layers to 1000. To run the script with the network unchanged, the fix applied here raises max_call_depth: context.set_context(max_call_depth=10000)
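The error message also warns that a larger max_call_depth may need a larger system stack. As a rough sketch, the current stack limit can be read with Python's standard resource module (Unix only). The helper name stack_soft_limit_kib is hypothetical, and the snippet only reports the limit; it makes no claim about how much stack MindSpore needs for a given depth.

```python
import resource


def stack_soft_limit_kib():
    """Return the soft stack size limit in KiB, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft // 1024  # getrlimit reports bytes


print("soft stack limit (KiB):", stack_soft_limit_kib())
```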

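3 Solution

Based on the cause above, the script can be modified as follows (context is imported with from mindspore import context):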

01 context.set_context(max_call_depth=10000)
02 class Net(nn.Cell):
03   def __init__(self):
04     super(Net, self).__init__()
05     self.gru = nn.GRU(10, 16, 1000, has_bias=True, batch_first=True, bidirectional=False)
06
07   def construct(self, x, h0):
08     output = self.gru(x, h0)
09     return output
10
11 net = Net()
12 x = Tensor(np.ones([3, 5, 10]).astype(np.float32))
13 h0 = Tensor(np.ones([1 * 1000, 3, 16]).astype(np.float32))
14 output, hn = net(x, h0)
15 print('output', output.shape)

The script now runs successfully and prints:

output (3, 5, 16)
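The printed shape follows from the GRU configuration. Below is a minimal sketch, assuming batch_first=True and the shape convention the script itself uses (h0 is [num_directions * num_layers, batch, hidden_size]). gru_shapes is a hypothetical helper written for illustration, not a MindSpore API.

```python
def gru_shapes(batch, seq_len, hidden_size, num_layers, bidirectional=False):
    """Return (h0_shape, output_shape) for a batch_first GRU (assumed convention)."""
    num_directions = 2 if bidirectional else 1
    h0_shape = (num_directions * num_layers, batch, hidden_size)
    output_shape = (batch, seq_len, num_directions * hidden_size)
    return h0_shape, output_shape


# Values taken from the script: batch=3, seq_len=5, hidden_size=16, num_layers=1000.
h0_shape, output_shape = gru_shapes(3, 5, 16, 1000)
print(h0_shape, output_shape)
```

Note that the output shape does not depend on num_layers, so lowering num_layers would only require changing the first dimension of h0.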

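4 Summary

Steps for locating the problem:

1. Find the line of user code that triggers the error: output = self.gru(x, h0).

2. Use the keywords in the error log to narrow the scope of the analysis: Exceed function call depth limit 1000, (function call depth: 1001, simulate call depth: 997).

3. Pay close attention to the hints in the error message and to whether the network is initialized correctly.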
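Step 2 above can be partly automated. The following sketch pulls the limit and the reported depths out of the error text with a regular expression. The pattern is based only on the message quoted in this post, so it is an assumption that other MindSpore versions word the message the same way; parse_depth_error is a hypothetical helper.

```python
import re

# Pattern derived from the message quoted in this post (assumed format).
DEPTH_PATTERN = re.compile(
    r"Exceed function call depth limit (\d+), "
    r"\(function call depth: (\d+), simulate call depth: (\d+)\)"
)


def parse_depth_error(message):
    """Return the limit and depths found in the message, or None if absent."""
    match = DEPTH_PATTERN.search(message)
    if match is None:
        return None
    limit, call_depth, simulate_depth = (int(g) for g in match.groups())
    return {
        "limit": limit,
        "call_depth": call_depth,
        "simulate_call_depth": simulate_depth,
    }


log_line = (
    "RuntimeError: mindspore/ccsrc/pipeline/jit/static_analysis/evaluator.cc:198 Eval] "
    "Exceed function call depth limit 1000, "
    "(function call depth: 1001, simulate call depth: 997)."
)
print(parse_depth_error(log_line))
```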
