
Towards A Unified Training for Levenshtein Transformer

Kangjie Zheng (Peking University); Longyue Wang (Tencent AI Lab); Zhihao Wang (Xiamen University); Binqi Chen (Peking University); Ming Zhang (Peking University); Zhaopeng Tu (Tencent AI Lab)

06 Jun 2023

Levenshtein Transformer (LevT) is a widely used text-editing model that generates a sequence through editing operations (deletion and insertion) in a non-autoregressive manner. However, its key refinement components are challenging to train due to the training-inference discrepancy. Through carefully designed experiments, our work reveals that the deletion module is under-trained while the insertion module is over-trained, owing to imbalanced training signals between the two refinement modules. Based on these observations, we further propose a dual learning approach that remedies the imbalanced training by feeding an initial input to both refinement modules, consistent with the inference process. Experimental results on three representative NLP tasks demonstrate the effectiveness and universality of the proposed approach.
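To make the edit-based decoding concrete, below is a minimal Python sketch of a LevT-style refinement loop. All names here (DeletionModule, InsertionModule, refine) are hypothetical stand-ins for illustration, not the authors' implementation: each pass deletes tokens, inserts placeholders, and fills them, repeating until the sequence stops changing.

from typing import List

class DeletionModule:
    """Stub deletion policy; a real model scores each token for deletion."""
    def predict_keep(self, tokens: List[str]) -> List[bool]:
        return [t != "<bad>" for t in tokens]  # toy rule for illustration

class InsertionModule:
    """Stub insertion policy; a real model predicts placeholder counts, then fills them."""
    def predict_placeholders(self, tokens: List[str]) -> List[int]:
        return [0] * len(tokens)  # toy rule: insert no placeholders
    def fill_tokens(self, tokens: List[str]) -> List[str]:
        return [("<unk>" if t == "<plh>" else t) for t in tokens]

def refine(tokens: List[str], deleter: DeletionModule,
           inserter: InsertionModule, max_iters: int = 10) -> List[str]:
    """Iteratively apply delete -> insert-placeholder -> fill until a fixed point."""
    for _ in range(max_iters):
        prev = tokens
        # 1) Deletion: keep only tokens the deletion module marks as "keep".
        keep = deleter.predict_keep(tokens)
        tokens = [t for t, k in zip(tokens, keep) if k]
        # 2) Insertion: add predicted placeholders after each token, then fill them.
        with_plh: List[str] = []
        for t, n in zip(tokens, inserter.predict_placeholders(tokens)):
            with_plh.append(t)
            with_plh.extend(["<plh>"] * n)
        tokens = inserter.fill_tokens(with_plh)
        if tokens == prev:  # no edits made this pass: converged
            break
    return tokens

print(refine(["a", "<bad>", "b"], DeletionModule(), InsertionModule()))  # ['a', 'b']

In inference, each refinement pass starts from the previous pass's output; the dual learning approach described in the abstract trains both modules on such initial inputs, so that the training signal matches this loop.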
