Generating mindrecord on NFS fails with "Failed to write mindrecord meta files"

1. System environment

Hardware environment (Ascend/GPU/CPU): Ascend910
MindSpore version: 2.2.0
Execution mode (PyNative/Graph): any
Python version: any
Operating system platform: Linux

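2. Error message

2.1 Problem description and error

Calling writer.commit() while generating a mindrecord file on an NFS path reports the following error: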

start commit !!  
[ERROR] MD(68811,7f2382c85700,python):2023-11-27-16:03:22.406.216 [mindspore/ccsrc/minddata/mindrecord/io/shard_index_generator.cc:587] DatabaseWriter] Exception thrown from dataset pipeline. Refer to 'Dataset Pipeline Error Message'. [Internal ERROR] Failed to execute the sql [ CREATE TABLE INDEXES( ROW_ID INT NOT NULL, PAGE_ID_RAW INT NOT NULL, PAGE_OFFSET_RAW INT NOT NULL, PAGE_OFFSET_RAW_END INT NOT NULL, ROW_GROUP_ID INT NOT NULL, PAGE_ID_BLOB INT NOT NULL, PAGE_OFFSET_BLOB INT NOT NULL, PAGE_OFFSET_BLOB_END INT NOT NULL, PRIMARY KEY(ROW_ID)); ], database is locked  
Line of code : 131  
File         : mindspore/ccsrc/minddata/mindrecord/io/shard_index_generator.cc  
Traceback (most recent call last):  
    File "preprocess_data.py", line 252, in <module>  
        main()  
    File "preprocess_data.py", line 221, in main  
        writer.commit()  
    File "/root/miniconda3/envs/lcj_new/lib/python3.7/site-packages/mindspore/mindrecord/filewriter.py", line 455, in commit  
        self._generator.write_to_db()  
    File "/root/miniconda3/envs/lcj_new/lib/python3.7/site-packages/mindspore/mindrecord/shardindexgenerator.py", line 70, in write_to_db  
        ret = self._generator.write_to_db()  
RuntimeError: Exception thrown from dataset pipeline. Refer to 'Dataset Pipeline Error Message'.  
- Framework Unexpected Exception Raised:  
This exception is caused by framework's unexpected error. Please create an issue at https://gitee.com/mindspore/mindspore/issues to get help.  
- Dataset Pipeline Error Message:  
[ERROR] [Internal ERROR] Failed to write mindrecord meta files.  
- C++ Call Stack: (For framework developers)  
mindspore/ccsrc/minddata/mindrecord/io/shard_index_generator.cc(576).

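3. Root cause analysis

When writer.commit() runs, mindrecord writes its index metadata through SQLite (the CREATE TABLE INDEXES statement in the log above). SQLite relies on file locking that many NFS implementations do not provide reliably, so the statement fails with "database is locked". SQLite does not currently support operating on files that live on an NFS network filesystem; use a local disk for reading and writing mindrecord. The SQLite documentation describes the risk: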

6.0 How To Corrupt Your Database Files  
The pager module is very robust but it can be subverted. This section attempts to identify and explain the risks. (See also the Things That Can Go Wrong section of the article on Atomic Commit.)  
Clearly, a hardware or operating system fault that introduces incorrect data into the middle of the database file or journal will cause problems. Likewise, if a rogue process opens a database file or journal and writes malformed data into the middle of it, then the database will become corrupt. There is not much that can be done about these kinds of problems so they are given no further attention.  
SQLite uses POSIX advisory locks to implement locking on Unix. On Windows it uses the LockFile(), LockFileEx(), and UnlockFile() system calls. SQLite assumes that these system calls all work as advertised. If that is not the case, then database corruption can result. One should note that POSIX advisory locking is known to be buggy or even unimplemented on many NFS implementations (including recent versions of Mac OS X) and that there are reports of locking problems for network filesystems under Windows. Your best defense is to not use SQLite for files on a network filesystem.
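
To show where the "database is locked" text in the log comes from, here is a minimal sketch that uses only Python's standard sqlite3 module. It does not reproduce the NFS behaviour: it provokes the same SQLite error on a local disk by letting one connection hold an exclusive lock while a second connection tries to create a table. On NFS, unreliable locking may produce the same message even without a second connection. The function name, file name, and shortened table definition are illustrative assumptions, not MindSpore code.

```python
import os
import sqlite3
import tempfile


def provoke_locked_error(db_path):
    """Return the SQLite error text seen when the database lock cannot be taken."""
    # First connection: autocommit mode, then explicitly take an exclusive lock.
    holder = sqlite3.connect(db_path, isolation_level=None)
    holder.execute("BEGIN EXCLUSIVE")
    # Second connection: timeout=0 means "do not wait for the lock".
    other = sqlite3.connect(db_path, timeout=0)
    try:
        # Shortened, hypothetical version of the statement shown in the log.
        other.execute(
            "CREATE TABLE INDEXES(ROW_ID INT NOT NULL, PRIMARY KEY(ROW_ID))"
        )
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        other.close()
        holder.execute("ROLLBACK")
        holder.close()


with tempfile.TemporaryDirectory() as tmp_dir:
    message = provoke_locked_error(os.path.join(tmp_dir, "demo.db"))

print(message)
```

The second connection fails because SQLite cannot obtain the file lock it needs, which is the same failure class that the mindrecord index writer hits when the lock calls misbehave on NFS.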

4. Solution

Do not store mindrecord files on NFS; point the mindrecord output path to a local disk instead.
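
One way to catch this early is to check the output path before creating the writer. The sketch below is an assumption-laden helper, not part of MindSpore: it parses a Linux /proc/mounts-style table and reports whether a path sits on a mount of type nfs or nfs4. The function names, the sample mount table, and the paths are hypothetical.

```python
import os

# Hypothetical mount table in /proc/mounts format, for illustration only.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "server:/export /demo_nfs_share nfs4 rw,relatime,vers=4.1 0 0\n"
)

NFS_TYPES = {"nfs", "nfs4"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # /proc/mounts escapes spaces inside paths as \040.
            entries.append((fields[1].replace("\\040", " "), fields[2]))
    return entries


def filesystem_type(path, mounts_text):
    """Return the fs type of the longest mount point containing path."""
    resolved = os.path.realpath(path) + "/"
    best_point, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point win.
        if resolved.startswith(prefix) and len(mount_point) >= len(best_point):
            best_point, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path, mounts_text):
    """True if path lives on a mount whose type is nfs or nfs4."""
    return filesystem_type(path, mounts_text) in NFS_TYPES


def require_local_path(path, mounts_file="/proc/mounts"):
    """Raise ValueError if path is on NFS according to the given mounts file."""
    with open(mounts_file, encoding="utf-8") as handle:
        mounts_text = handle.read()
    if is_on_nfs(path, mounts_text):
        raise ValueError(f"{path} is on NFS; choose a local disk for mindrecord")
    return path
```

In a preprocessing script, require_local_path(output_file) could be called just before the mindspore.mindrecord.FileWriter is constructed, so the job stops with a clear message instead of failing at writer.commit(). The check only covers Linux systems that expose /proc/mounts.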