FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos Paper • 2512.10927 • Published Dec 11, 2025 • 6
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published Nov 14, 2025 • 45
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation Paper • 2511.01163 • Published Nov 3, 2025 • 32
LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation Paper • 2510.22946 • Published Oct 27, 2025 • 18