General - a Kopachelli Collection

General

Updated Aug 10, 2025

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 326
Qwen/Qwen3-14B-GGUF

Text Generation • 15B • Updated May 9, 2025 • 38.8k • 69
Qwen/Qwen3-8B-GGUF

Text Generation • 8B • Updated May 21, 2025 • 93.6k • 134
Qwen/Qwen3-4B-GGUF

Text Generation • 4B • Updated May 21, 2025 • 35.4k • 67
Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 153
Qwen/Qwen2.5-Coder-7B-Instruct

Text Generation • 8B • Updated Jan 12, 2025 • 1.31M • 626
Qwen/Qwen2.5-Coder-14B

Text Generation • 15B • Updated Nov 18, 2024 • 10.8k • 60
Qwen/Qwen2.5-Coder-14B-Instruct

Text Generation • 15B • Updated Jan 12, 2025 • 260k • 139
Qwen/Qwen2.5-Coder-7B

Text Generation • 8B • Updated Nov 18, 2024 • 246k • 134
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 75
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 68
nvidia/Llama-3.1-Nemotron-Nano-8B-v1

Text Generation • 8B • Updated Oct 15, 2025 • 14k • 216
Llama-Nemotron: Efficient Reasoning Models

Paper • 2505.00949 • Published May 2, 2025 • 41
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset

Paper • 2504.16891 • Published Apr 23, 2025 • 25
OpenCodeReasoning-II: A Simple Test Time Scaling Approach via Self-Critique

Paper • 2507.09075 • Published Jul 11, 2025 • 16
tencent/Hunyuan-7B-Instruct

Text Generation • 8B • Updated Sep 2, 2025 • 5.74k • 86
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1, 2025 • 251
zai-org/GLM-4.1V-9B-Thinking

Image-Text-to-Text • 10B • Updated Oct 25, 2025 • 218k • 770
zai-org/GLM-4.1V-9B-Base

Image-Text-to-Text • 10B • Updated Oct 25, 2025 • 557 • 65
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Paper • 2407.01906 • Published Jul 2, 2024 • 46
deepseek-ai/deepseek-moe-16b-base

Text Generation • 16B • Updated Jan 12, 2024 • 22.3k • 138
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11, 2024 • 59
deepseek-ai/deepseek-moe-16b-chat

Text Generation • 16B • Updated Feb 5, 2024 • 8.63k • 155
Skywork/Skywork-VL-Reward-7B

Image-Text-to-Text • 8B • Updated Jun 10, 2025 • 1.22k • 47
Mungert/xLAM-2-32b-fc-r-GGUF

Text Generation • 33B • Updated Sep 24, 2025 • 292 • 5
zai-org/SWE-Dev-7B

8B • Updated Jul 9, 2025 • 241 • 6
Mungert/Skywork-VL-Reward-7B-GGUF

Image-Text-to-Text • 8B • Updated Sep 24, 2025 • 159
Skywork/Skywork-o1-Open-PRM-Qwen-2.5-1.5B

Text Classification • Updated Aug 29, 2025 • 3.7k • 33
jnorthrup/Skywork-o1-Open-PRM-Qwen-2.5-7B

Text Classification • 8B • Updated Jan 1, 2025 • 2
mistralai/Mixtral-8x7B-Instruct-v0.1

47B • Updated Jul 24, 2025 • 480k • 4.64k
Mungert/granite-guardian-3.1-8b-GGUF

Text Generation • 8B • Updated Sep 24, 2025 • 77
ariels/pest_twitter_geoparsing

Viewer • Updated Oct 9, 2024 • 678 • 7