TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published about 1 month ago • 69
V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models Paper • 2511.16668 • Published Nov 20, 2025 • 53
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13, 2025 • 95
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22, 2025 • 114
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20, 2025 • 122
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29, 2025 • 77
Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published Oct 28, 2025 • 16
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 89
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 176
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 82
Article: Efficient LLM Pretraining: Packed Sequences and Masked Attention • Published Oct 7, 2024 • 64
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22, 2025 • 34
Model Merging in Pre-training of Large Language Models Paper • 2505.12082 • Published May 17, 2025 • 40