MindSpore error: ValueError: For xx, the x shape: xx must be equal to xxx

1 Error description

1.1 System environment

Hardware Environment(Ascend/GPU/CPU): Ascend

Software Environment:

-- MindSpore version (source or binary): 1.6.0

-- Python version (e.g., Python 3.7.5): 3.7.6

-- OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 4.15.0-74-generic

-- GCC/Compiler version (if compiled from source):

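1.2 Basic information

1.2.1 Script

The script builds a single-operator network around BinaryCrossEntropy and computes the binary cross-entropy between two tensors:

import numpy as np
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.binary_cross_entropy = ops.BinaryCrossEntropy()
        # Optional per-element weight; None means unweighted.
        self.weight = None

    def construct(self, logits, labels):
        # Same call as the one reported in the traceback below.
        result = self.binary_cross_entropy(logits, labels, self.weight)
        return result

# logits has 3 channels but labels has only 2: this mismatch triggers the error.
logits = Tensor(np.random.uniform(0, 1, (4, 3, 388, 388)).astype(np.float32))
labels = Tensor(np.random.randint(0, 2, (4, 2, 388, 388)).astype(np.float32))
net = Net()
out = net(logits, labels)
print('out:', out)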

1.2.2 Error

The error message is as follows:

Traceback (most recent call last):
  File "demo.py", line 14, in <module>
    out = net(logits, labels)
  File "../lib/python3.7/site-packages/mindspore/nn/cell.py", line 569, in __call__
    out = self.compile_and_run(*args)
  File "../lib/python3.7/site-packages/mindspore/nn/cell.py", line 899, in compile_and_run
    self.compile(*inputs)
  File "../lib/python3.7/site-packages/mindspore/nn/cell.py", line 884, in compile
    _cell_graph_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
  File "../lib/python3.7/site-packages/mindspore/common/api.py", line 784, in compile
    result = self._graph_executor.compile(obj, args_list, phase, self._use_vm_mode())
ValueError: mindspore/core/utils/check_convert_utils.h:216 Check] For primitive[BinaryCrossEntropy], the x shape: [4,3,388,388,] must be equal to [4,2,388,388,]
The function call stack (See file 'demo/rank_0/om/analyze_fail.dat' for more details):
# 0 In file demo.py(08)
        result = self.binary_cross_entropy(logits, labels, self.weight)

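Root cause analysis

The ValueError reads: For primitive[BinaryCrossEntropy], the x shape: [4,3,388,388,] must be equal to [4,2,388,388,]. In other words, the labels passed in have shape [4, 2, 388, 388], while the logits have shape [4, 3, 388, 388]. BinaryCrossEntropy requires logits and labels to have the same shape; the official API documentation states this constraint on the two inputs.

Checking the code confirms it: on line 12, labels is created with a shape that differs from that of logits (the second dimension is 2 instead of 3). Check whether the values being passed in are correct, and make the shapes of logits and labels equal.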
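A check like the one below can surface the mismatch before the operator is ever called. It is only a sketch: check_same_shape is a hypothetical helper, not a MindSpore API, and NumPy arrays stand in for the tensors so that it runs without MindSpore installed. It relies only on the inputs exposing a .shape attribute.

```python
import numpy as np

def check_same_shape(logits, labels):
    """Hypothetical helper: raise a readable error if the two shapes differ."""
    if tuple(logits.shape) != tuple(labels.shape):
        raise ValueError(
            f"logits shape {tuple(logits.shape)} != labels shape {tuple(labels.shape)}"
        )

logits = np.zeros((4, 3, 388, 388), dtype=np.float32)
bad_labels = np.zeros((4, 2, 388, 388), dtype=np.float32)   # shape from the failing script
good_labels = np.zeros((4, 3, 388, 388), dtype=np.float32)  # shape that matches logits

check_same_shape(logits, good_labels)  # passes silently

try:
    check_same_shape(logits, bad_labels)
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(mismatch_message)
```

Running such a check right where the data is built points at the offending line directly, instead of at the operator call inside construct.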

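2 Solution

Given the cause identified above, the fix is to create labels with the same shape as logits:

import numpy as np
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.binary_cross_entropy = ops.BinaryCrossEntropy()
        # Optional per-element weight; None means unweighted.
        self.weight = None

    def construct(self, logits, labels):
        result = self.binary_cross_entropy(logits, labels, self.weight)
        return result

logits = Tensor(np.random.uniform(0, 1, (4, 3, 388, 388)).astype(np.float32))
# labels now has the same shape as logits: (4, 3, 388, 388).
labels = Tensor(np.random.randint(0, 2, (4, 3, 388, 388)).astype(np.float32))
net = Net()
out = net(logits, labels)
print('out:', out)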

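The script now runs successfully. The inputs are random, so the exact value varies between runs; one run printed: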

out: 1.0000657
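A value close to 1 is what one would expect here. With mean reduction, binary cross-entropy averages -(y*ln(x) + (1-y)*ln(1-x)) over all elements, and for x drawn uniformly from (0, 1) the expected value of -ln(x) is 1. The NumPy sketch below reproduces this as a sanity check. It is a reference computation under stated assumptions, not MindSpore's implementation: bce_mean is a hypothetical name, and the clipping with eps is added here only to avoid log(0).

```python
import numpy as np

def bce_mean(logits, labels, eps=1e-12):
    """Reference binary cross-entropy with mean reduction (hypothetical helper)."""
    # Compute in float64 and clip away from 0 and 1 so the logarithms stay finite.
    x = np.clip(logits.astype(np.float64), eps, 1.0 - eps)
    y = labels.astype(np.float64)
    return float(np.mean(-(y * np.log(x) + (1.0 - y) * np.log(1.0 - x))))

rng = np.random.default_rng(0)
shape = (4, 3, 388, 388)  # same shape as in the corrected script
logits = rng.uniform(0, 1, shape).astype(np.float32)
labels = rng.integers(0, 2, shape).astype(np.float32)

loss = bce_mean(logits, labels)
print("reference loss:", loss)
```

If the operator returned something far from 1 for inputs generated this way, that would suggest the inputs are not what the script intends, for example logits outside (0, 1).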

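3 Summary

Steps for locating the problem:

1. Find the line of user code that raised the error: result = self.binary_cross_entropy(logits, labels, self.weight);

2. Use the keywords in the error log to narrow down the scope of the analysis: the x shape: [4,3,388,388,] must be equal to [4,2,388,388,];

3. Pay particular attention to whether the variables are defined and initialized correctly.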
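Step 2 can also be done mechanically. The sketch below pulls the two shapes out of the message text and reports which axes disagree. It assumes the message format shown in this post; mismatched_axes is a hypothetical helper, and other MindSpore versions may word the message differently.

```python
import re

def mismatched_axes(message):
    """Return (axis, actual, expected) for each differing dimension, or None (hypothetical helper)."""
    # Match bracketed groups made only of digits, commas and spaces, e.g. "[4,3,388,388,]".
    shapes = [
        [int(n) for n in group.split(",") if n.strip()]
        for group in re.findall(r"\[([\d,\s]+)\]", message)
    ]
    if len(shapes) != 2 or len(shapes[0]) != len(shapes[1]):
        return None  # not the format this sketch assumes
    actual, expected = shapes
    return [(axis, a, e) for axis, (a, e) in enumerate(zip(actual, expected)) if a != e]

message = (
    "For primitive[BinaryCrossEntropy], the x shape: [4,3,388,388,] "
    "must be equal to [4,2,388,388,]"
)
diff = mismatched_axes(message)
print(diff)
```

For the message in this post the only differing axis is axis 1 (3 versus 2), which is the channel dimension set when labels is created.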

4 References

4.1 BinaryCrossEntropy operator API