Tensor shape mismatch causes a runtime error: ValueError: x.shape and y.shape cannot broadcast

1. System environment

Hardware environment (Ascend/GPU/CPU): GPU
Software environment:
– MindSpore version: 1.7.0
– Execution mode: dynamic graph (PYNATIVE_MODE)
– Python version: 3.7.6
– Operating system platform: Linux

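2. Error information

2.1 Problem description

The channel layout of one tensor does not match that of the other tensors, which causes a runtime error.

2.2 Error message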

ValueError: For ‘Sub’, x.shape and y.shape are supposed to broadcast, where broadcast means that x.shape[i] = 1 or -1 or y.shape[i] = 1 or -1 or x.shape[i] = y.shape[i], but now x.shape and y.shape can not broadcast, got i: -3, x.shape: [112, 112, 3], y.shape: [3, 1, 1].

2.3 Script code

class MyWithLossCell(nn.Cell):
  def __init__(self,net,loss_fn,input_tensor):
      super(MyWithLossCell, self).__init__(auto_prefix=False)
      self.net = net
      self._loss_fn = loss_fn
      self.STD = Tensor([0.229, 0.224, 0.225])
      self.MEAN = Tensor([0.485, 0.456, 0.406])
      self.expand_dims = mindspore.ops.ExpandDims()
      self.normalize = P.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
      self.tensorize = ToTensor()
      self.input_tensor = input_tensor
      self.input_emb = self.net(self.expand_dims(self.input_tensor, 0))
    
  def construct(self,mask_tensor):
      ref = mask_tensor
      # MEAN[:, None, None] and STD[:, None, None] have shape (3, 1, 1),
      # so mask_tensor must be channel-first, i.e. (3, H, W), to broadcast.
      adversarial_tensor = mindspore.numpy.where(
          ref == 0,
          self.input_tensor,
          (mask_tensor - self.MEAN[:, None, None]) / self.STD[:, None, None])
      adversarial_emb = self.net(self.expand_dims(adversarial_tensor, 0))
      loss = self._loss_fn(adversarial_emb, self.input_emb, mask_tensor)
      return loss
    
  @property
  def backbone_network(self):
      return self.net

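3. Root cause analysis

The error message states the broadcast rule: for every axis i, either x.shape[i] is 1 or -1, or y.shape[i] is 1 or -1, or x.shape[i] equals y.shape[i]. Here the check fails at i: -3 with x.shape: [112, 112, 3] and y.shape: [3, 1, 1], because 112 and 3 are neither equal nor 1. In other words, the shapes of x and y do not match.

Debugging shows that input_tensor has shape (3, 112, 112), while mask_tensor has shape (112, 112, 3). The y operand in the message is self.MEAN[:, None, None], whose shape (3, 1, 1) only broadcasts against a channel-first tensor, so the subtraction on mask_tensor fails.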
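The broadcast rule quoted in the error message can be checked by hand. The following is a minimal pure-Python sketch of that rule, not MindSpore's implementation: the helper name broadcast_shape is hypothetical, and the dynamic-shape value -1 mentioned in the message is left out for simplicity.

```python
def broadcast_shape(x_shape, y_shape):
    """Return the broadcast result shape, or raise ValueError naming the failing axis.

    Hypothetical helper for illustration; it ignores dynamic (-1) dimensions.
    """
    ndim = max(len(x_shape), len(y_shape))
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    x = (1,) * (ndim - len(x_shape)) + tuple(x_shape)
    y = (1,) * (ndim - len(y_shape)) + tuple(y_shape)
    result = []
    for i in range(ndim):
        a, b = x[i], y[i]
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            # Report the axis as a negative index, like the MindSpore message does.
            raise ValueError(f"cannot broadcast at axis {i - ndim}: {a} vs {b}")
    return tuple(result)


mean_shape = (3, 1, 1)  # shape of self.MEAN[:, None, None]

# Channel-first mask (C, H, W): broadcasts against the mean.
print(broadcast_shape((3, 112, 112), mean_shape))

# Channel-last mask (H, W, C): fails at axis -3, as in the reported error.
try:
    broadcast_shape((112, 112, 3), mean_shape)
except ValueError as err:
    print(err)
```

The second call fails on the leading axis, where 112 meets 3, which corresponds to the i: -3 reported by the Sub operator.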

4. Solution

Solution: make the shapes of mask_tensor and input_tensor consistent.

According to the MindSpore documentation, swapaxes exchanges two axes of a tensor.

Modified code:

def _create_mask(self, face_image):
       mask = Image.new('RGB', face_image.size, color=(0, 0, 0))
       d = ImageDraw.Draw(mask)
       landmarks = fr.face_landmarks(np.array(face_image))
       area = [landmark
               for landmark in landmarks[0]['chin']
               if landmark[1] > max(landmarks[0]['nose_tip'])[1]]
       area.append(landmarks[0]['nose_bridge'][1])
       d.polygon(area, fill=(255, 255, 255))
       mask_array = np.array(mask)
       mask_array = mask_array.astype(np.float32)

       # Map white pixels (255) to 0.5 and all other pixels to 0 in one vectorized step.
       mask_array = np.where(mask_array == 255., 0.5, 0.).astype(np.float32)

       mask_tensor = Tensor(mask_array)
       mask_tensor = mask_tensor.swapaxes(0, 2).swapaxes(1, 2)
       mask_tensor.requires_grad = True
       return mask_tensor

The fix is applied in the function that generates mask_tensor, so the shape is corrected at its source.
With this change the code runs normally.
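To see why the fix chains two swapaxes calls, it helps to track axis labels instead of real data. The sketch below is an illustration in plain Python, assuming the mask starts in (H, W, C) order as produced by np.array on a PIL image; the helper swap_labels is hypothetical and only mimics how swapaxes reorders axes.

```python
def swap_labels(axes, a, b):
    """Return a copy of `axes` with the entries at positions a and b exchanged."""
    out = list(axes)
    out[a], out[b] = out[b], out[a]
    return tuple(out)


layout = ("H", "W", "C")            # channel-last mask from the image array
step1 = swap_labels(layout, 0, 2)   # after swapaxes(0, 2)
step2 = swap_labels(step1, 1, 2)    # after swapaxes(1, 2)
print(step1)
print(step2)

# The two swaps together equal the single axis permutation (2, 0, 1).
perm = (2, 0, 1)
assert step2 == tuple(layout[p] for p in perm)
```

For a square 112 x 112 image, the first swap alone already yields shape (3, 112, 112), but with height and width exchanged, so the mask would be transposed relative to input_tensor. The second swap restores the (C, H, W) order. A single transpose with the permutation (2, 0, 1) should give the same layout.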