ImageFolderDataset raises an error during PIL conversion when reading images

1 System environment

Hardware environment (Ascend/GPU/CPU): Ascend/GPU/CPU
MindSpore version: mindspore=2.0.0
Execution mode (PyNative/Graph): any
Python version: Python=3.7.5
Operating system platform: any

2 Error information

2.1 Problem description

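A set of transforms is defined, and the image data is read through ImageFolderDataset. transforms_list contains the ToPIL and Resize operations. Running the script produces the error shown below, which reports that the image has the wrong number of dimensions.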

2.2 Script

import mindspore.dataset as ds
import mindspore.dataset.vision as vision
import numpy as np
from mindspore.dataset.transforms import Compose
from PIL import Image

image_path = "/data2/shenwei/Minddata/test/mindspore/tests/ut/data/dataset/testPK/data"

transforms_list = Compose([vision.ToPIL(), vision.Resize((200, 200))])

# pipeline mode (the images are read without decoding, which triggers the error)
dataset = ds.ImageFolderDataset(image_path, shuffle=False)
dataset = dataset.map(operations=transforms_list, input_columns="image")
for item in dataset.create_dict_iterator():
    print(item["image"].shape)
    print(item["image"].dtype)
    break

# eager mode (np.fromfile returns the raw, undecoded bytes of the file)
img = np.fromfile("/data2/shenwei/Minddata/test/mindspore/tests/ut/data/dataset/testPK/data/class1/0.jpg", np.uint8)
img = transforms_list(img)[0]
assert isinstance(img, Image.Image)
print(type(img))

2.3 Error message

Traceback (most recent call last):
  File "tset_numpy.py", line 15, in <module>
    for item in dataset.create_dict_iterator():
  File "/home/shenwei/.conda/envs/lib/python3.7/site-packages/mindspore/dataset/engine/iterators.py", line 152, in __next__
    data = self._get_next()
  File "/home/shenwei/.conda/envs/lib/python3.7/site-packages/mindspore/dataset/engine/iterators.py", line 277, in _get_next
    raise err
  File "/home/shenwei/.conda/envs/lib/python3.7/site-packages/mindspore/dataset/engine/iterators.py", line 260, in _get_next
- Python Call Stack:
map operation: [PyFunc] failed. The corresponding data file is: /data2/shenwei/Minddata/test/mindspore/tests/ut/data/dataset/testPK/data/class1/2.jpg. Error description: ValueError: Traceback (most recent call last):
  File "/home/shenwei/.conda/envs/shenweb/python3.7/site-packages/mindspore/dataset/transforms/py_transforms_util.py", line 63, in compose
    args = transform(*args)
  File "/home/shenwei/.conda/envs/lib/python3.7/site-packages/mindspore/dataset/transforms/transforms.py", line 133, in __call__
    return self._execute_py(img)
  File "/home/shenwei/.conda/envs/lib/python3.7/site-packages/mindspore/dataset/vision/transforms.py", line 4739, in _execute_py
    return util.to_pil(img)
  File "/home/shenwei/.conda/envs/lib/python3.7/site-packages/mindspore/dataset/vision/py_transforms_util.py", line 178, in to_pil
    raise ValueError("The dimension of input image should be 2 or 3. Got {}.".format(img.ndim))
ValueError: The dimension of input image should be 2 or 3. Got 1.

- Dataset Pipeline Error Message:
ERROR Execute user Python code failed. Check 'Python Call Stack' above.
- C++ Call Stack: (For framework developers)
mindspore/ccsrc/minddata/dataset/engine/datasetops/map_op/map_job.h(57)

3 Root cause analysis

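The traceback shows that the actual error is "ValueError: The dimension of input image should be 2 or 3. Got 1." In other words, the image passed to to_pil must have 2 or 3 dimensions, but the image passed here has only 1, so the validation fails. The script shows that in both pipeline and eager mode the image is read without being decoded. One point to note about ImageFolderDataset: if its decode parameter is not set to True when reading images, the data it returns is one-dimensional (the raw encoded bytes of the file). That is the root cause of the problem.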
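The difference between undecoded and decoded image data can be illustrated without MindSpore. The following is a minimal sketch that uses PIL as a stand-in decoder (an assumption made for illustration; it is not the decoder MindSpore uses internally) and a tiny in-memory JPEG instead of the files from the script.

```python
import io

import numpy as np
from PIL import Image

# Build a tiny JPEG in memory (width 4, height 3) so no file on disk is needed.
buffer = io.BytesIO()
Image.new("RGB", (4, 3), color=(255, 0, 0)).save(buffer, format="JPEG")
encoded = buffer.getvalue()

# Reading the encoded bytes as uint8 (what np.fromfile does) yields a 1-D array.
raw = np.frombuffer(encoded, dtype=np.uint8)

# Decoding turns those bytes into a height x width x channel array.
decoded = np.asarray(Image.open(io.BytesIO(encoded)))

print(raw.ndim)       # 1
print(decoded.shape)  # (3, 4, 3)
```

A 1-D array like raw is what reaches ToPIL in the failing script, which is why the dimension check reports "Got 1".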

4 Solution

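There are two ways to solve this problem:
1. When loading the dataset with ImageFolderDataset, set its decode parameter to True, then build the pipeline with map as before.
2. Leave decode unset when constructing ImageFolderDataset and add a vision.Decode() operation to the map operations, placed before ToPIL. This also runs successfully. Eager mode is fixed in the same way.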