SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization Paper • 2511.12982 • Published Nov 17 • 3
Backdoor Cleaning without External Guidance in MLLM Fine-tuning Paper • 2505.16916 • Published May 22 • 17
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model Paper • 2503.04543 • Published Mar 6 • 1
MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models Paper • 2510.16641 • Published Oct 18 • 4