This repository contains the ViLaVT-sft-7B model, as presented in the paper Chatting with Images for Introspective Visual Thinking. The code is available at https://github.com/AntResearchNLP/ViLaVT.
If you find our work helpful, please consider citing our paper:
@misc{wu2026chattingimagesintrospectivevisual,
  title={Chatting with Images for Introspective Visual Thinking},
  author={Junfei Wu and Jian Guan and Qiang Liu and Shu Wu and Liang Wang and Wei Wu and Tieniu Tan},
  year={2026},
  eprint={2602.11073},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2602.11073},
}