arxiv:2507.16815
Fu-En Yang
FuEnYang
AI & ML interests
Computer Vision, Deep Learning, Vision-Language Models (VLMs), Vision-Language-Action Models (VLAs), Reasoning Models, Embodied AI
Recent Activity
upvoted a paper about 12 hours ago
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
upvoted a paper about 12 hours ago
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation
upvoted a paper about 12 hours ago
LTX-2: Efficient Joint Audio-Visual Foundation Model