The checkpoint file of rank1 is needed for converting rank1's checkpoint, but it is missing.
To resolve this, check the YAML configuration file used for training. After setting auto_trans_ckpt and enable_parallel_optimizer to False in that file, model fine-tuning ran successfully.
yaml
auto_trans_ckpt: False
enable_parallel_optimizer: False
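
Before relaunching fine-tuning, it can help to confirm that both settings actually took effect in the file. The following is a minimal sketch using only the Python standard library; the helper name `check_flags`, the sample text, and the nesting of `enable_parallel_optimizer` under a `parallel:` section are assumptions for illustration, so adjust them to match your own config.

```python
import re

# The two settings the fix above relies on; both are expected to be False.
REQUIRED = {"auto_trans_ckpt": "False", "enable_parallel_optimizer": "False"}

def check_flags(yaml_text, required=REQUIRED):
    """Return {key: found_value} for each required key that is missing or not as expected.

    Hypothetical helper: a plain text scan, not a full YAML parser.
    """
    problems = {}
    for key, expected in required.items():
        # Match "key: value" at any indentation; commented-out lines are skipped.
        match = re.search(
            rf"^\s*{re.escape(key)}\s*:\s*([^#\s]+)", yaml_text, re.MULTILINE
        )
        found = match.group(1) if match else None
        if found is None or found.lower() != expected.lower():
            problems[key] = found
    return problems

# Sample config text (assumed layout, for illustration only).
sample = """
auto_trans_ckpt: True
parallel:
  enable_parallel_optimizer: False
"""
print(check_flags(sample))  # → {'auto_trans_ckpt': 'True'}
```

An empty dict means both settings are False. To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in.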