Should We Still Pretrain Encoders with Masked Language Modeling?
Paper
• 2507.00994 • Published
Research materials on pre-training encoders, with an extensive comparison of the masked language modeling paradigm against causal language modeling.