Nikolai Mozgovoi

vonexel

AI & ML interests

None yet

Organizations

None yet

upvoted 12 papers 26 days ago

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published Nov 12, 2025 • 118

LuxDiT: Lighting Estimation with Video Diffusion Transformer

Paper • 2509.03680 • Published Sep 3, 2025 • 17

2D Gaussian Splatting with Semantic Alignment for Image Inpainting

Paper • 2509.01964 • Published Sep 2, 2025 • 7

Lost in Embeddings: Information Loss in Vision-Language Models

Paper • 2509.11986 • Published Sep 15, 2025 • 28

Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models

Paper • 2509.12132 • Published Sep 15, 2025 • 6

Human3R: Everyone Everywhere All at Once

Paper • 2510.06219 • Published Oct 7, 2025 • 10

ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

Paper • 2510.08551 • Published Oct 9, 2025 • 33

BLIP3o-NEXT: Next Frontier of Native Image Generation

Paper • 2510.15857 • Published Oct 17, 2025 • 24

LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal

Paper • 2510.15868 • Published Oct 17, 2025 • 26

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published Oct 18, 2025 • 48

Accelerating Vision Transformers with Adaptive Patch Sizes

Paper • 2510.18091 • Published Oct 20, 2025 • 6

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23, 2025 • 40

upvoted 2 papers about 1 month ago

Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story

Paper • 2511.15210 • Published Nov 19, 2025 • 89

RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models

Paper • 2510.10390 • Published Oct 12, 2025 • 4

upvoted a paper 3 months ago

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

Paper • 2311.02772 • Published Nov 5, 2023 • 8

upvoted 5 papers 4 months ago

Durian: Dual Reference-guided Portrait Animation with Attribute Transfer

Paper • 2509.04434 • Published Sep 4, 2025 • 10

Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis

Paper • 2509.09595 • Published Sep 11, 2025 • 48

Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published Aug 30, 2025 • 17

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published Sep 1, 2025 • 33

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Paper • 2509.08519 • Published Sep 10, 2025 • 128