Model loading fails because the ckpt file is missing: ckpt does not exist, please check whether the 'ckpt_file_name' is correct.

1. System environment

Hardware environment (Ascend/GPU/CPU): GPU

Software environment:

– MindSpore version: 1.7.0

– Execution mode: dynamic graph (PYNATIVE_MODE)

– Python version: 3.7.6

– Operating system: Linux

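2. Error information

2.1 Problem description

The ckpt file for the model had not been downloaded, so loading the model raised an error.

2.2 Error message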

ValueError: For 'load_checkpoint', the checkpoint file: /data0/BigPlatform/FL/limingjun/华为/mindspore-face -github/example/facerecognition_ascend_v170_ms1mv2_research_cv_acc90.ckpt does not exist, please check whether the 'ckpt_file_name' is correct.

2.3 Script code

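import os

from mindspore import load_checkpoint, load_param_into_net
from mindspore.common import dtype as mstype

# config, get_logger and get_backbone are provided by the project's own scripts.

def get_net():
    log_path = os.path.join(config.ckpt_path, 'logs')
    config.logger = get_logger(log_path, config.local_rank)

    config.device_target = 'GPU'
    # Build the backbone network.
    net = get_backbone(config)
    if config.fp16:
        net.add_flags_recursive(fp16=True)
    # Relative path: resolved against the working directory, where no such file exists.
    config.weight = "./facerecognition_ascend_v170_ms1mv2_research_cv_acc90.ckpt"
    param_dict = load_checkpoint(config.weight)  # raises the ValueError shown in 2.2
    # Skip 'moments.' entries and strip the 'network.' prefix from the remaining keys.
    param_dict_new = {}
    for key, value in param_dict.items():
        if key.startswith('moments.'):
            continue
        elif key.startswith('network.'):
            param_dict_new[key[8:]] = value
        else:
            param_dict_new[key] = value
    load_param_into_net(net, param_dict_new)
    config.logger.info('INFO, ------------- load model success--------------')

    if config.device_target == 'GPU':
        print("GPU")
        net.to_float(mstype.float32)
    net.set_train(False)

    return net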

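3. Root cause analysis

The error message states that the ckpt file does not exist and asks you to check whether 'ckpt_file_name' is correct. The script passes a relative path ("./facerecognition_ascend_v170_ms1mv2_research_cv_acc90.ckpt"), which resolves to the example directory shown in the error, and no ckpt file had been placed there.

MindSpore does not ship the model parameters with the code; the ckpt file for the model has to be downloaded separately. The official site provides download links for the available models: https://www.mindspore.cn/resources/hub/

Besides downloading the ckpt file directly, the model can also be obtained through the mindspore_hub library.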
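Because the failing path is relative, it helps to see where it actually resolves before calling load_checkpoint. Below is a minimal sketch of such a check using only the Python standard library. The helper name check_ckpt_path is hypothetical and not part of MindSpore; it simply resolves the path, and if the file is missing, reports which .ckpt files (if any) exist in that directory.

```python
from pathlib import Path


def check_ckpt_path(ckpt_file_name):
    """Return the absolute checkpoint path, or raise FileNotFoundError.

    Hypothetical helper for illustration; not a MindSpore API.
    """
    # Relative paths are resolved against the current working directory.
    path = Path(ckpt_file_name).expanduser().resolve()
    if path.is_file():
        return str(path)

    # Report the .ckpt files that do exist next to the expected location.
    parent = path.parent
    found = sorted(p.name for p in parent.glob("*.ckpt")) if parent.is_dir() else []
    raise FileNotFoundError(
        "checkpoint not found: {} (ckpt files in {}: {})".format(
            path, parent, found or "none"
        )
    )
```

One possible use is load_checkpoint(check_ckpt_path(config.weight)), so that a wrong path is reported with its fully resolved location before MindSpore is involved.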

4. Solution

Solution: download the ckpt file and point config.weight at the location where it was saved.

Modified code:

def get_net():
    log_path = os.path.join(config.ckpt_path, 'logs')
    config.logger = get_logger(log_path, config.local_rank)

    config.device_target = 'GPU'
    # Build the backbone network.
    net = get_backbone(config)
    if config.fp16:
        net.add_flags_recursive(fp16=True)
    # Path to the downloaded ckpt file.
    config.weight = "../FaceRecognition/facerecognition_ascend_v170_ms1mv2_research_cv_acc90.ckpt"
    param_dict = load_checkpoint(config.weight)
    # Skip 'moments.' entries and strip the 'network.' prefix from the remaining keys.
    param_dict_new = {}
    for key, value in param_dict.items():
        if key.startswith('moments.'):
            continue
        elif key.startswith('network.'):
            param_dict_new[key[8:]] = value
        else:
            param_dict_new[key] = value
    load_param_into_net(net, param_dict_new)
    config.logger.info('INFO, ------------- load model success--------------')

    if config.device_target == 'GPU':
        print("GPU")
        net.to_float(mstype.float32)
    net.set_train(False)

    return net

Once the ckpt file is in place and the path points to it, the model loads normally.
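The key-renaming loop inside get_net can be tried on its own, without MindSpore or a real checkpoint. The sketch below applies the same rules to a plain dict. The function name strip_checkpoint_prefixes and the demo keys are made up for illustration, and the reading of 'moments.' entries as optimizer state and 'network.' as a training-wrapper prefix is an assumption based on how the script treats them.

```python
def strip_checkpoint_prefixes(param_dict):
    """Apply the same key filtering as get_net to a plain dict.

    Hypothetical helper for illustration; not a MindSpore API.
    """
    prefix = "network."
    cleaned = {}
    for key, value in param_dict.items():
        if key.startswith("moments."):
            # Assumed to be optimizer state; not needed for inference.
            continue
        if key.startswith(prefix):
            # Drop the wrapper prefix so the key matches the bare backbone.
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Made-up keys and values, standing in for a loaded checkpoint.
demo = {"moments.conv1.weight": 0, "network.conv1.weight": 1, "fc.bias": 2}
print(strip_checkpoint_prefixes(demo))
# {'conv1.weight': 1, 'fc.bias': 2}
```

If load_param_into_net reports parameters that were not loaded, printing a few keys before and after this step is a quick way to check whether the prefixes match the network's parameter names.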