SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations Paper • 2512.05905 • Published 25 days ago • 19
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning Paper • 2511.06805 • Published Nov 10 • 12
WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation Paper • 2511.06251 • Published Nov 9 • 13
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation Paper • 2511.08195 • Published Nov 11 • 31
The Smol Training Playbook 📚: The secrets to building world-class LLMs • Featured • 2.74k
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published Oct 20 • 67
CogView: Mastering Text-to-Image Generation via Transformers Paper • 2105.13290 • Published May 26, 2021
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations Paper • 2402.04236 • Published Feb 6, 2024 • 9
Relay Diffusion: Unifying diffusion process across resolutions for image synthesis Paper • 2309.03350 • Published Sep 4, 2023
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers Paper • 2204.14217 • Published Apr 28, 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers Paper • 2205.15868 • Published May 29, 2022 • 1