Context-Picker: Dynamic context selection using multi-stage reinforcement learning Paper • 2512.14465 • Published Dec 2025 • 1
IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Generation Paper • 2512.10730 • Published Dec 2025 • 3
LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning Paper • 2509.24786 • Published Sep 29, 2025 • 6
Temporal Memory Attention for Video Semantic Segmentation Paper • 2102.08643 • Published Feb 17, 2021
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion Paper • 2407.07844 • Published Jul 10, 2024 • 1
FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models Paper • 2508.20586 • Published Aug 28, 2025 • 3
WebNovelBench: Placing LLM Novelists on the Web Novel Distribution Paper • 2505.14818 • Published May 20, 2025 • 4
ViSpeak: Visual Instruction Feedback in Streaming Videos Paper • 2503.12769 • Published Mar 17, 2025 • 8
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation Paper • 2501.11325 • Published Jan 20, 2025 • 5
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding Paper • 2406.08877 • Published Jun 13, 2024
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models Paper • 2407.15886 • Published Jul 21, 2024 • 3