πŸ‡°πŸ‡·β†”οΈπŸ‡ΊπŸ‡Έ LFM2-v8-rl-10k-merged

A SOTA Korean↔English translation model based on LiquidAI LFM2-1.2B.

Final release, trained for 400 steps with GRPO (Group Relative Policy Optimization) reinforcement learning.
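In GRPO, the model samples a group of candidate translations per prompt and is updated toward the candidates that score above the group average. The sketch below shows what such a run can look like with TRL's GRPOTrainer; apart from the 400 steps, everything here (reward function, data, hyperparameters) is an illustrative assumption, not the actual training recipe.

# Illustrative GRPO setup with TRL (assumed recipe; only the 400 steps come from this card).
from datasets import Dataset
from sacrebleu.metrics import CHRF
from trl import GRPOConfig, GRPOTrainer

chrf = CHRF(word_order=2)  # word_order=2 turns chrF into chrF++

# Toy parallel data; GRPOTrainer reads the "prompt" column and forwards
# extra columns (here "reference") to the reward function as lists.
train_dataset = Dataset.from_dict({
    "prompt": ["Translate the following text to Korean.\nHello, how are you today?"],
    "reference": ["μ•ˆλ…•ν•˜μ„Έμš”, 였늘 기뢄이 μ–΄λ– μ„Έμš”?"],
})

def chrf_reward(completions, reference, **kwargs):
    # Reward each sampled completion by its chrF++ score against the reference.
    return [chrf.sentence_score(c, [r]).score for c, r in zip(completions, reference)]

trainer = GRPOTrainer(
    model="LiquidAI/LFM2-1.2B",
    reward_funcs=chrf_reward,
    args=GRPOConfig(output_dir="lfm2-grpo", max_steps=400, num_generations=8),
    train_dataset=train_dataset,
)
trainer.train()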

πŸ“Š Performance (Flores-200 Enβ†’Ko, 1012 samples)

| Model | chrF++ | BLEU | Params |
|---|---|---|---|
| LFM2-v8-rl-10k-merged πŸ† | 34.61 | 13.21 | 1.2B |
| LFM2-v6.1-curriculum | 33.80 | 12.60 | 1.2B |
| Gemma-3-4B-it | 32.83 | 11.36 | 4B |

βœ… A 1.2B model that outperforms a 4B model!
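Both metrics come from sacrebleu; passing word_order=2 to chrF yields chrF++. A minimal reproduction sketch, assuming the 1012 hypotheses and Flores-200 Korean references are already loaded as parallel lists (toy values below; the tokenizer settings behind the reported scores are not stated in this card):

import sacrebleu

# Toy stand-ins; replace with the 1012 model outputs and Flores-200 references.
hypotheses = ["였늘 날씨가 정말 μ’‹λ„€μš”."]
references = [["였늘 날씨가 정말 μ’‹μŠ΅λ‹ˆλ‹€."]]  # one reference stream, aligned 1:1 with hypotheses

chrf = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)  # chrF++
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"chrF++ {chrf.score:.2f} | BLEU {bleu.score:.2f}")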

πŸ”¬ μ–‘μžν™” μ‹€ν—˜ κ²°κ³Ό (1012개 μˆ˜λ™ 뢄석)

κ²°λ‘ : 4/5/8/32λΉ„νŠΈ λͺ¨λ‘ 사싀상 차이 μ—†μŒ!

| Quantization | chrF++ | BLEU | Size | Notes |
|---|---|---|---|---|
| fp32 (original) | 34.32 | 13.10 | 4.68 GB | occasional repetition bug |
| Q8_0 πŸ† | 34.39 | 12.93 | 1.25 GB | most stable |
| Q5_K_M | 34.08 | 12.78 | 843 MB | balanced |
| Q4_K_M | 33.97 | 12.56 | 731 MB | smallest footprint |

μˆ˜λ™ 뢄석 핡심 발견

1012개 예제 μˆ˜λ™ κ²€ν†  κ²°κ³Ό:

  • 90% 이상: λͺ¨λ“  λ²„μ „μ—μ„œ 의미적으둜 λ™μΌν•œ λ²ˆμ—­
  • 차이점: 단어 선택 차이만 쑴재 (예: "μ œμ•ˆν–ˆλ‹€" vs "λ§ν–ˆλ‹€")
  • ν™˜κ° νŒ¨ν„΄: λͺ¨λ“  λ²„μ „μ—μ„œ λ™μΌν•˜κ²Œ λ°œμƒ
    • "George W. Bush" β†’ "μ‘°μ§€ μ›Œμ‹±ν„΄" (μ—­λŒ€ λŒ€ν†΅λ Ή ν˜Όλ™)
    • "cheetahs" β†’ "κΈ°λ¦°" λ˜λŠ” "ν˜Έλž‘μ΄" (동물λͺ… ν˜Όλ™)

Version-specific notes:

ν˜„μƒ Q4 Q5 Q8 fp32 adapter
반볡 버그 ❌ ❌ ❌ ⚠️ ❌
λ²ˆμ—­ ν’ˆμ§ˆ βœ… βœ… βœ… βœ… βœ…
μ•ˆμ •μ„± βœ… βœ… βœ…βœ… ⚠️ βœ…

⚠️ The fp32 merged model exhibits a repeated-output bug in fewer than 0.1% of generations (e.g., "파고 파고 파고..." repeated endlessly).
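Because the failure mode is a long run of one repeated word, it can be screened for cheaply. A minimal heuristic sketch (the run-length threshold is an assumption):

def looks_degenerate(text: str, max_run: int = 4) -> bool:
    """Flag outputs containing a run of more than max_run identical consecutive words."""
    words = text.split()
    run = 1
    for prev, cur in zip(words, words[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False

print(looks_degenerate("파고 " * 20))          # True: the failure mode described above
print(looks_degenerate("였늘 날씨가 μ’‹λ„€μš”."))  # False

The repetition_penalty=1.05 in the usage example below targets the same failure mode.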

Recommended Usage Scenarios

| Scenario | Recommended version | Reason |
|---|---|---|
| Production serving | Q8_0 GGUF | stable, 3.7Γ— smaller than fp32 |
| Mobile / edge | Q5_K_M GGUF | best quality for its size |
| Further fine-tuning | fp32 merged | full-precision parameters required |

πŸš€ Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged",
    trust_remote_code=True
)

def translate(text, direction="en2ko"):
    if direction == "en2ko":
        prompt = "Translate the following text to Korean."
    else:
        prompt = "Translate the following text to English."
    
    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": text}
    ]
    
    inputs = tokenizer.apply_chat_template(
        messages, return_tensors="pt", add_generation_prompt=True
    ).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.3,
            min_p=0.15,
            repetition_penalty=1.05,  # prevent repetition
            pad_token_id=tokenizer.eos_token_id
        )
    
    return tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True).strip()

# Usage examples
print(translate("Hello, how are you today?"))
# β†’ μ•ˆλ…•ν•˜μ„Έμš”, 였늘 기뢄이 μ–΄λ– μ„Έμš”?

print(translate("였늘 날씨가 정말 μ’‹λ„€μš”.", "ko2en"))
# β†’ The weather is really nice today.

πŸ“¦ GGUF Versions (Recommended)

The GGUF versions are recommended for their smaller footprint and better stability:

πŸ‘‰ gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged-GGUF
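For GGUF inference, a minimal sketch with llama-cpp-python follows; the filename glob is an assumption, so check the GGUF repo for the exact file name:

from llama_cpp import Llama

# Download and load the Q8_0 quantization straight from the Hub.
llm = Llama.from_pretrained(
    repo_id="gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged-GGUF",
    filename="*Q8_0.gguf",  # glob pattern; the exact filename may differ
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Translate the following text to Korean."},
        {"role": "user", "content": "Hello, how are you today?"},
    ],
    temperature=0.3,
    min_p=0.15,
    repeat_penalty=1.05,
)
print(out["choices"][0]["message"]["content"])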

πŸ”— Related Links

πŸ“œ License
