Shiyu Huang

ShiyuHuang

https://huangshiyu13.github.io

huangshiyu13
shiyu-huang-841b92106

AI & ML interests

VLM, LLM, RL, AIGC, Robotics

ShiyuHuang's collections (5)

streaming_model

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 98

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31, 2025 • 124

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published Feb 5, 2025 • 62

S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published Feb 20, 2025 • 63

video_benchmark

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21, 2025 • 84

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published Jan 9, 2025 • 44

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 98

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published Feb 18, 2025 • 41
