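Paper overview:
The paper, "Leveraging Pre-trained Checkpoints for Sequence Generation Tasks," studies how publicly available pre-trained checkpoints can be applied to sequence generation (seq2seq) tasks in natural language processing (NLP): machine translation, text summarization, sentence splitting, and sentence fusion. The authors build a Transformer-based sequence-to-sequence model and validate it through extensive experiments. Compared with randomly initialized models, warm-starting from pre-trained checkpoints substantially improves performance on these tasks. The experiments also indicate that a pre-trained encoder is an essential component for sequence generation, and that sharing weights between the encoder and decoder can yield further gains.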
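Background:
In recent years, unsupervised pre-training of large neural models such as BERT, GPT-2, and RoBERTa has markedly improved performance on many natural language understanding tasks. By warm-starting from publicly released checkpoints, researchers have saved substantial compute and advanced results on numerous benchmarks. However, these models have mostly been applied to natural language understanding; their use in sequence generation has received comparatively little study.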
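Method:
1. A Transformer-based sequence-to-sequence model:
The authors propose a Transformer-based sequence-to-sequence (seq2seq) model that is compatible with the pre-trained checkpoints of BERT, GPT-2, and RoBERTa. Initializing the encoder, the decoder, or both from these checkpoints substantially improves performance.
2. Multiple combinations of encoder-decoder initialization:
These include an architecture in which all weights are randomly initialized, a variant in which the encoder and decoder share parameters, and a randomly initialized encoder paired with a BERT-initialized decoder, among others.
3. Application to multiple seq2seq tasks:
The models are evaluated on sentence fusion, sentence splitting, machine translation, and text summarization, and compared against randomly initialized models trained from scratch.
4. The effect of different initialization strategies and configurations on model performance:
Variants include combining different pre-trained checkpoints, initializing only the embedding layer, and initializing only a subset of the layers.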
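The initialization variants listed above can be illustrated with a small, framework-free sketch. It is not the authors' code: parameter groups are plain dictionary entries, the "checkpoint" is a toy encoder-only dictionary, and the helper names (`component_names`, `init_model`, `warm_start`, `build`) are hypothetical. Setup names such as `BERT2RND` are read here as "encoder source, `2`, decoder source". The sketch assumes that an encoder-only checkpoint has no encoder-decoder attention weights, so those decoder groups stay randomly initialized.

```python
import random

def component_names(num_layers, with_cross_attention):
    """Parameter-group names for a toy Transformer stack."""
    names = ["embeddings"]
    for i in range(num_layers):
        names.append(f"layer.{i}.self_attention")
        if with_cross_attention:
            names.append(f"layer.{i}.cross_attention")
        names.append(f"layer.{i}.feed_forward")
    return names

def init_model(num_layers, seed=0):
    """Randomly initialise encoder and decoder groups; values lie in [0, 1)."""
    rng = random.Random(seed)
    params = {}
    for name in component_names(num_layers, with_cross_attention=False):
        params["encoder." + name] = rng.random()
    for name in component_names(num_layers, with_cross_attention=True):
        params["decoder." + name] = rng.random()
    return params

def warm_start(params, checkpoint, prefix, keep=lambda name: True):
    """Copy matching checkpoint entries into params; return the names loaded."""
    loaded = []
    for name, value in checkpoint.items():
        target = prefix + name
        if target in params and keep(name):
            params[target] = value
            loaded.append(target)
    return loaded

def build(setup, checkpoint, num_layers=2, keep=lambda name: True):
    """Build a toy model for a setup such as 'BERT2RND'."""
    encoder_source, decoder_source = setup.split("2")
    params = init_model(num_layers)
    loaded = []
    if encoder_source != "RND":
        loaded += warm_start(params, checkpoint, "encoder.", keep)
    if decoder_source != "RND":
        loaded += warm_start(params, checkpoint, "decoder.", keep)
    return params, loaded

# Toy encoder-only checkpoint; the marker value 1.0 identifies warm-started groups.
checkpoint = {name: 1.0 for name in component_names(2, with_cross_attention=False)}

for setup in ["RND2RND", "BERT2RND", "RND2BERT", "BERT2BERT"]:
    params, loaded = build(setup, checkpoint)
    print(setup, f"{len(loaded)}/{len(params)} groups warm-started")

# Partial initialisation: embeddings only, or the first layer only.
_, emb_only = build("BERT2BERT", checkpoint, keep=lambda n: n == "embeddings")
_, first_layer = build("BERT2BERT", checkpoint, keep=lambda n: n.startswith("layer.0."))
print(emb_only)
print(first_layer)
```

The `keep` predicate is the only thing that changes between full warm-starting, embeddings-only, and partial-layer initialization, which is why the variants are cheap to compare in practice.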
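Key contributions:
**1. Extending pre-trained models to sequence generation tasks,** rather than only traditional natural language understanding tasks. This greatly broadens the range of applications for large pre-trained models and offers a new approach for other NLP tasks.
**2. A Transformer-based sequence-to-sequence architecture** that is compatible with popular pre-trained models. This compatibility makes it possible to use pre-trained models for both encoding and decoding, which substantially improves performance on sequence generation tasks.
**3. Weight sharing between the pre-trained encoder and decoder.** This not only improves model performance but also substantially reduces the model's memory footprint. Weight sharing performs well on most tasks and is especially valuable when compute resources are limited.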
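To make the memory argument for weight sharing concrete, the following back-of-the-envelope sketch counts parameters for a standard Transformer encoder-decoder with and without a shared stack. It is an illustration under stated assumptions, not the paper's accounting: the configuration values are hypothetical, the count ignores details such as token-type embeddings and any separate output projection, and the function names are made up for this example.

```python
def attention_params(hidden):
    """Query, key, value and output projections, each with a bias vector."""
    return 4 * (hidden * hidden + hidden)

def feed_forward_params(hidden, inner):
    """Two linear layers (hidden -> inner -> hidden) with biases."""
    return hidden * inner + inner + inner * hidden + hidden

def layer_norm_params(hidden):
    """One scale vector and one shift vector."""
    return 2 * hidden

def stack_params(layers, hidden, inner, vocab, max_positions):
    """Embeddings plus self-attention and feed-forward blocks.

    This is the part an encoder-only checkpoint can initialise and the part
    that encoder and decoder can share."""
    embeddings = vocab * hidden + max_positions * hidden + layer_norm_params(hidden)
    per_layer = (attention_params(hidden) + layer_norm_params(hidden)
                 + feed_forward_params(hidden, inner) + layer_norm_params(hidden))
    return embeddings + layers * per_layer

def cross_attention_params(layers, hidden):
    """Encoder-decoder attention, present only in the decoder."""
    return layers * (attention_params(hidden) + layer_norm_params(hidden))

def model_params(config, share):
    """Total parameter count with (share=True) or without a shared stack."""
    stack = stack_params(**config)
    cross = cross_attention_params(config["layers"], config["hidden"])
    copies = 1 if share else 2  # separate encoder and decoder keep two copies
    return copies * stack + cross

# Hypothetical base-sized configuration, chosen only for illustration.
config = dict(layers=12, hidden=768, inner=3072, vocab=30000, max_positions=512)

separate = model_params(config, share=False)
shared = model_params(config, share=True)
print(f"separate: {separate / 1e6:.1f}M parameters")
print(f"shared:   {shared / 1e6:.1f}M parameters")
print(f"saved:    {(separate - shared) / 1e6:.1f}M parameters")
```

Under these assumptions the saving equals exactly one copy of the shared stack, while the encoder-decoder attention remains an unshared cost in both variants.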
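Results:
The authors evaluated Transformer models built from a variety of encoder-decoder combinations on the sentence-level fusion and splitting tasks. Scores differ across combinations and across datasets.
On machine translation, the 24-layer BERT-RND combination (a BERT-initialized encoder with a randomly initialized decoder) achieved the best performance.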
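Model evaluation with MindSpore NLP:
We loaded the pre-trained google/bert_for_seq_generation_L-24_bbc_encoder model and ran it on a few examples from the test split of the BBC dataset, using ROUGE as the evaluation metric. The test split contains 1,000 samples; because of device and practical constraints, we show three of them here.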
| | Sample 1 | Sample 2 | Sample 3 |
|----|----|----|----|
| Generated text | carry on star patsy rowlands dies actress patsy rowlands known to millions for her roles in the carry on films has died at the age of 71. rowlands starred in nine of the popular carry on films alongside fellow regulars sid james kenneth williams and barbara windsor. she also carved out a successful television career appearing for many years in itv s well - loved comedy bless this house. rowlands died in hove on saturday morning her agent said. born in january 1934 rowlands won a scholarship to the guildhall school of speech and drama scholarship when she was just 15. after spending several years at the players theatre in london she made her film debut in 1963 in tom jones directed by tony richardson. she made her first carry on film in 1969 where she appeared in carry on again doctor. rowlands played the hard - done - by wife or the put - upon employee as a regular carry on star. she also appeared in carry on at your convenience carry on matron and carry on loving as well as others. in recent years she appeared in bbc mini - series the cazalets and played mrs potts in the london stage version of beauty and the beast. agent simon beresford said : she was just an absolutely favourite client she never complained about anything particularly when she was ill she was an old trouper. she was of the old school - she had skills from musical theatre and high drama that is why she worked with the great and the good of directors. she didn t mind always being recognised for the carry on films because she thoroughly enjoyed making them. she was a really lovely person and she will be much missed. her last appearance on stage was as mrs pearce in the award - winning production of my fair lady at the national theatre. previously married she leaves one son alan. her funeral will be a private family occasion with a memorial service at a later date. aires aires aires ¢ aires ¢ [UNK] [UNK] [UNK] [UNK] [UNK] [UNK] [UNK] [UNK] revolutionary [UNK] hollow radcliffe sleeves sleeves | sydney to host north v south game sydney will host a northern versus southern hemisphere charity match in june or july the australian rugby union ( aru ) said on wednesday. the match will include players from the lions tour of new zealand. the australian rugby union has thrown its support behind a proposed north - south match to raise funds for the tsunami appeals the aru said. the date is yet to be decided but the most likely venue is sydney s olympic stadium. aru chief executive gary flowers said the world cricket charity match in melbourne earlier this month had inspired the aru. we still need to discuss the options with the irb ( international rugby board ) the lions and our sanzar ( south africa new zealand and australia rugby ) partners but june or july is seen as a better option than march to ensure we have the cream of southern hemisphere rugby available he said. wallabies captain george gregan said the charity match was a great initiative. tri - nations rivals australia new zealand and south africa would feature prominently in a southern team against a northern side comprised of six nations teams france ireland england wales italy and scotland. coach clive woodward s lions squad will tour new zealand in june and july including tests on 25 june 2 and 9 july. almost 80 000 fans packed into melbourne cricket ground on 10 january for a charity match that raised £5. 9m for victims of the asian tsunami. feed feed 690 [UNK] feed balcony revolutionary revolutionary everyday assent aires aires [UNK] 690 690 [UNK] [UNK] [UNK] robertson aires | uk coal plunges into deeper loss shares in uk coal have fallen after the mining group reported losses had deepened to £51. 6m in 2004 from £1. 2m. the uk s biggest coal producer blamed geological problems industrial action and operating flaws at its deep mines for its worsening fortunes. the south yorkshire company led by new chief executive gerry spindler said it hoped to return to profit in 2006. in early trade on thursday its shares were down 10 % at 119 pence. uk coal said it was making significant progress in shaking up the business. it had introduced new wage structures a new daily maintenance regime for machinery at its mines and methods to continue mining in adverse conditions. the company said these actions should significantly uplift earnings. it expected 2005 to be a transitional year and to return to profitability in 2006. the recent rise in coal prices has failed to benefit the company as most of its output had already been sold it said. total production costs were £1. 30 per gigajoule uk coal said but the average selling price was just £1. 18 per gigajoule. we have a long journey ahead to fix these issues. we continue to make progress and great strides have already been made said mr spindler. uk coal operates 15 deep and surface mines across nottinghamshire derbyshire leicestershire yorkshire the west midlands northumberland and durham. aires aires washington [UNK] revolutionary [UNK] [UNK] robertson und und 新 und [UNK]eta mal aires aires aires aires [UNK] |
| Score | [{'rouge-1': {'r': 0.9679144385026738, 'p': 0.9141414141414141, 'f': 0.9402597352638219}, 'rouge-2': {'r': 0.958904109589041, 'p': 0.8832807570977917, 'f': 0.9195402248934834}, 'rouge-l': {'r': 0.9679144385026738, 'p': 0.9141414141414141, 'f': 0.9402597352638219}}] | [{'rouge-1': {'r': 0.9530201342281879, 'p': 0.9044585987261147, 'f': 0.9281045701668162}, 'rouge-2': {'r': 0.9372197309417041, 'p': 0.836, 'f': 0.8837209252488502}, 'rouge-l': {'r': 0.9530201342281879, 'p': 0.9044585987261147, 'f': 0.9281045701668162}}] | [{'rouge-1': {'r': 0.9931506849315068, 'p': 0.9294871794871795, 'f': 0.9602648956677339}, 'rouge-2': {'r': 0.9908675799086758, 'p': 0.9194915254237288, 'f': 0.9538461488531337}, 'rouge-l': {'r': 0.9931506849315068, 'p': 0.9294871794871795, 'f': 0.9602648956677339}}] |
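For readers unfamiliar with the `r`, `p`, and `f` fields in the scores, here is a minimal sketch of how ROUGE-N and ROUGE-L recall, precision, and F1 can be computed on whitespace-separated tokens. It is a textbook-style illustration with hypothetical helper names (`rouge_n`, `rouge_l`), not the scoring library used for the evaluation above, so its numbers may differ slightly from that library's (for example in how repeated n-grams are counted).

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def prf(overlap, hyp_total, ref_total):
    """Recall, precision and F1 from an overlap count."""
    p = overlap / hyp_total if hyp_total else 0.0
    r = overlap / ref_total if ref_total else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"r": r, "p": p, "f": f}

def rouge_n(hypothesis, reference, n):
    """ROUGE-N with clipped n-gram counts."""
    hyp = ngrams(hypothesis.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((hyp & ref).values())  # Counter & keeps the minimum count
    return prf(overlap, sum(hyp.values()), sum(ref.values()))

def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def rouge_l(hypothesis, reference):
    """ROUGE-L based on the longest common subsequence of tokens."""
    hyp, ref = hypothesis.split(), reference.split()
    return prf(lcs_length(hyp, ref), len(hyp), len(ref))

reference = "the cat lay on the mat"
print(rouge_n("the cat sat on the mat", reference, 1))
print(rouge_n("the cat sat on the mat", reference, 2))
print(rouge_l(reference + " aires aires", reference))
```

In the last call the hypothesis is the reference followed by two extra tokens, so recall stays at 1.0 while precision falls to 6/8 = 0.75. In general, extra trailing tokens in a hypothesis lower precision without affecting recall.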
The complete code has been uploaded to a GitHub repository; you can view and run it via the following link:
https://github.com/Mr-Shay/BERT_generation-mindnlp.git
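Conclusions:
1. A pre-trained encoder is essential: using a pre-trained encoder substantially improves performance on sequence generation tasks, particularly in understanding the input text.
2. Sharing encoder and decoder weights can improve performance: sharing parameters between the encoder and decoder improves performance on most tasks while reducing memory usage, which makes it more efficient for large-scale NLP tasks.
3. Larger models perform better: larger models outperform smaller ones, but they are prone to overfitting on small datasets and need careful tuning.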
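We recommend that developers use MindSpore NLP to load and reproduce this model. MindSpore NLP provides a concise interface similar to PyTorch's, which lowers the learning curve, so developers can load and reproduce models quickly and save time and effort. Detailed documentation and example code help developers understand and apply its features. MindSpore NLP supports a range of natural language processing tasks, such as text classification and named entity recognition, and has a supportive community for technical exchange and sharing experience.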


