MindSpore runtime error: C_in of the x shape divided by group should equal C_in of the weight

1 Error Description

1.1 System Environment

Hardware Environment(Ascend/GPU/CPU): Ascend
Software Environment:
-- MindSpore version (source or binary): 1.6.0
-- Python version (e.g., Python 3.7.5): 3.7.6
-- OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 4.15.0-74-generic
-- GCC/Compiler version (if compiled from source):

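1.2 Basic Information

1.2.1 Script

The training script builds a single-operator network around Conv2d and computes a 2-D convolution of the input tensor. The script is as follows (import statements omitted):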

01 class Net(nn.Cell):
02   def __init__(self,in_channels,out_channels,kernel_size):
03     super(Net, self).__init__()
04     self.in_channels = in_channels
05     self.out_channels = out_channels
06     self.kernel_size = kernel_size
07     self.conv2d = nn.Conv2d(self.in_channels,self.out_channels,self.kernel_size)
08
09   def construct(self, x):
10     result = self.conv2d(x)
11     return result
12
13 net = Net(in_channels=1, out_channels =240,kernel_size =4)
14 x = Tensor(np.ones([3, 3, 1024, 640]), mindspore.float32)
15 out = net(x)
16 print('out',out.shape)

1.2.2 Error

The error message is as follows:

[CRITICAL] CORE(117119,ffff837a2010,python):2022-04-07-09:46:50.529.443 [build/mindspore/merge/mindspore/core/ops_merge.cc:6648] Conv2dInferShape] For 'Conv2D', 'C_in' of input 'x' shape divide by parameter 'group' should be equal to 'C_in' of input 'weight' shape: 1, but got 'C_in' of input 'x' shape: 3, and 'group': 1
Traceback (most recent call last):
  File "demo.py", line 15, in <module>
    out = net(x)
  File "../lib/python3.7/site-packages/mindspore/nn/cell.py", line 576, in __call__
    out = self.compile_and_run(*args)
  File "../lib/python3.7/site-packages/mindspore/nn/cell.py", line 942, in compile_and_run
    self.compile(*inputs)
  File "../lib/python3.7/site-packages/mindspore/nn/cell.py", line 915, in compile
    _cell_graph_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
  File "../lib/python3.7/site-packages/mindspore/common/api.py", line 791, in compile
    result = self._graph_executor.compile(obj, args_list, phase, self._use_vm_mode())
RuntimeError: build/mindspore/merge/mindspore/core/ops_merge.cc:6648 Conv2dInferShape] For 'Conv2D', 'C_in' of input 'x' shape divide by parameter 'group' should be equal to 'C_in' of input 'weight' shape: 1, but got 'C_in' of input 'x' shape: 3, and 'group': 1
The function call stack (See file 'demo/rank_0/om/analyze_fail.dat' for more details):
# 0 In file demo.py(10)
        result = self.conv2d (x)
                 ^
# 1 In file ../lib/python3.7/site-packages/mindspore/nn/layer/conv.py(286)
        if self.has_bias:
# 2 In file ../lib/python3.7/site-packages/mindspore/nn/layer/conv.py(285)
        output = self.conv2d(x, self.weight)

2 Cause Analysis

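Start with the error message. The RuntimeError states: *'C_in' of input 'x' shape divide by parameter 'group' should be equal to 'C_in' of input 'weight' shape: 1, but got 'C_in' of input 'x' shape: 3, and 'group': 1*. In other words, C_in of the input x shape divided by group must equal C_in of the weight shape, that is, x_shape[C_in] / group == w_shape[C_in]. Here w_shape[C_in] is 1, whereas x_shape[C_in] / group is 3. w_shape[C_in] is the size of the weight's channel dimension, which (with group = 1) is simply the in_channels value passed when nn.Conv2d is initialized, so check whether in_channels was set to 1 there. The official documentation describes C_in and in_channels in nearly identical terms.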


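Checking the code confirms it: in_channels on line 13 (1) does not match the C_in of the input on line 14 (3). Setting in_channels to the C_in of the data resolves the mismatch.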
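The condition described above can be sketched in plain Python. This is only an illustration of the rule quoted in the log, not MindSpore's actual implementation: `check_conv2d_channels` is a hypothetical helper, and it assumes the weight is laid out as (C_out, C_in // group, kH, kW).

```python
def check_conv2d_channels(x_shape, w_shape, group=1):
    """Raise ValueError if an NCHW input and a Conv2d weight disagree on channels.

    x_shape: (N, C_in, H, W); w_shape: (C_out, C_in // group, kH, kW).
    Illustrative helper only; it is not part of MindSpore.
    """
    x_c_in = x_shape[1]
    w_c_in = w_shape[1]
    if x_c_in % group != 0 or x_c_in // group != w_c_in:
        raise ValueError(
            f"C_in of x ({x_c_in}) divided by group ({group}) "
            f"should equal C_in of weight ({w_c_in})"
        )


# Shapes from the failing script: in_channels=1 gives a weight of (240, 1, 4, 4).
x_shape = (3, 3, 1024, 640)
try:
    check_conv2d_channels(x_shape, (240, 1, 4, 4))
except ValueError as err:
    print("mismatch:", err)

# With in_channels=3 the weight is (240, 3, 4, 4) and the check passes silently.
check_conv2d_channels(x_shape, (240, 3, 4, 4))
```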

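3 Solution

Given the cause identified above, the fix is straightforward: change in_channels on line 13 from 1 to 3 so that it matches the C_in of the input x.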
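A minimal sketch of the corrected script follows. It reuses the network from section 1.2.1 and reads in_channels from the input shape instead of hard-coding it. The helper `in_channels_of` and the wrapper `run_fixed_demo` are names introduced here for illustration, and the import lines are an assumption about what the original script used, since it does not show them.

```python
import numpy as np


def in_channels_of(x_shape):
    """Return C_in of an NCHW shape (hypothetical helper, not a MindSpore API)."""
    if len(x_shape) != 4:
        raise ValueError(f"expected a 4-D NCHW shape, got {tuple(x_shape)}")
    return x_shape[1]


def run_fixed_demo():
    # MindSpore is imported lazily so the helper above can be used without it.
    import mindspore
    from mindspore import Tensor, nn

    class Net(nn.Cell):
        def __init__(self, in_channels, out_channels, kernel_size):
            super(Net, self).__init__()
            self.conv2d = nn.Conv2d(in_channels, out_channels, kernel_size)

        def construct(self, x):
            return self.conv2d(x)

    x = Tensor(np.ones([3, 3, 1024, 640]), mindspore.float32)
    # in_channels now matches C_in of x (3) instead of the original value 1.
    net = Net(in_channels=in_channels_of(x.shape), out_channels=240, kernel_size=4)
    out = net(x)
    print('out', out.shape)
    return out.shape
```

Call `run_fixed_demo()` in an environment where MindSpore is installed. Deriving in_channels from the data is one design choice; writing `in_channels=3` directly on line 13 of the original script is equivalent for this input.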


The script now runs successfully, with the following output: out: (3, 240, 1024, 640)
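The reported shape can be checked by hand. The sketch below assumes nn.Conv2d is left at its defaults, as in the script (pad_mode 'same' and stride 1), in which case the spatial size is ceil(input / stride) and the kernel size does not affect it. `same_pad_output_shape` is a hypothetical helper, not a MindSpore function.

```python
import math


def same_pad_output_shape(x_shape, out_channels, stride=1):
    """Output shape of a 2-D convolution with 'same' padding on an NCHW input."""
    n, _, h, w = x_shape
    return (n, out_channels, math.ceil(h / stride), math.ceil(w / stride))


print(same_pad_output_shape((3, 3, 1024, 640), out_channels=240))
# -> (3, 240, 1024, 640)
```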

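4 Summary

Steps for locating this kind of error:
1. Find the line of user code associated with the error: result = self.conv2d (x)
2. Use the keywords in the error log to narrow the scope of the analysis: 'C_in' of input 'x' shape divide by parameter 'group' should be equal to 'C_in' of input 'weight' shape: 1, but got 'C_in' of input 'x' shape: 3, and 'group': 1
3. Pay particular attention to whether variables are defined and initialized correctly.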

5 References

5.1 Conv2d operator API