P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang


Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training. However, in the context of NLU, prior work reveals that prompt tuning does not perform well for normal-sized pretrained models. We also find that existing methods of prompt tuning cannot handle hard sequence labeling tasks, indicating a lack of universality. We present a novel empirical finding that properly optimized prompt tuning can be universally effective across a wide range of model scales and NLU tasks. It matches the performance of fine-tuning while having only 0.1%-3% tuned parameters. Our method P-Tuning v2 is an implementation of Deep Prompt Tuning (Li and Liang, 2021; Qin and Eisner, 2021) optimized and adapted for NLU. Given the universality and simplicity of P-Tuning v2, we believe it can serve as an alternative to fine-tuning and a strong baseline for future research.
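The "0.1%-3% tuned parameters" figure follows directly from the deep prompt design: the backbone stays frozen, and only a short prefix of key and value vectors per transformer layer is trained. The sketch below is not from the paper; it is a minimal back-of-the-envelope calculation assuming a prefix-tuning-style parameterization (one trainable key vector and one value vector per prefix position, per layer) and illustrative BERT-large-like dimensions.

```python
def deep_prompt_param_count(num_layers: int, hidden_size: int, prefix_len: int) -> int:
    """Trainable parameters in a deep prompt: prompts are inserted at every
    layer, each prefix position contributing one key and one value vector."""
    return num_layers * 2 * prefix_len * hidden_size

# Illustrative, BERT-large-like configuration (hypothetical numbers).
num_layers, hidden_size = 24, 1024
backbone_params = 335_000_000  # frozen backbone; never updated
prefix_len = 100               # assumed prefix length for this sketch

tuned = deep_prompt_param_count(num_layers, hidden_size, prefix_len)
print(tuned, f"{tuned / backbone_params:.2%}")  # 4915200, about 1.47%
```

Under these assumptions the tuned fraction lands around 1.5%, comfortably inside the 0.1%-3% range quoted in the abstract; shorter prefixes or larger backbones push it toward the lower end.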


Liu, Xiao, et al. P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-Tuning Universally Across Scales and Tasks. arXiv:2110.07602, arXiv, 20 Mar. 2022. arXiv.org, http://arxiv.org/abs/2110.07602.




Amanda Morton


This paper explains how the authors optimized continuous prompts via prompt tuning as an alternative to fine-tuning the full model.