Meta VL-JEPA - Vision-Language Prediction Models • Collection • Vision-Language Joint Embedding Predictive Architecture for video understanding • 6 items • Updated Jan 16 • 7
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16 • Text Generation • 32B • Updated 16 days ago • 72k • 107
nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 • Image-Text-to-Text • 13B • Updated Dec 2, 2025 • 94.5k • 74