zBotta/SmolLM2-360M-AccidentReports-distilled-kd1.7B

A compact 360M-parameter instruction model specialized for single-paragraph accident/incident reports.
This student model was distilled (logit KD) from SmolLM2-1.7B-Instruct using QLoRA on zBotta/traffic-accidents-reports-5k.
It converts structured 5W1H inputs (What/When/Where/Who/How/Why + ContingencyActions) into concise, factual narratives.

Performance note: In quick spot-checks, this distilled model did not outperform the baseline zBotta/smollm2-accident-reporter-360m-800 on cross-encoder similarity under the same settings.


✨ Intended Use

  • Turn structured 5W1H fields into a single, factual paragraph describing an accident/incident.
  • Suitable for demos, prototypes, or lightweight backends needing small-footprint text generation.

🚫 Out of Scope / Limitations

  • Not a general-purpose chat or reasoning model.
  • May omit facts not present in the input by design; may still hallucinate with ambiguous prompts.
  • Not a legal/safety authority; human validation is required before operational use.

🧾 Prompt Format

Training used a simple instruction schema with an explicit response marker:

Instruction:

You are a reporting agent.
Your task is to create a report when provided with the what, when, why, who, how and where questions about the event.
You are also given information about the contingency actions regarding the event.

Guidelines:

Generate only one report given the information about the event

Generate the report as text in one paragraph

It is important to focus on accuracy and coherence when generating the report so that the description content matches the information provided (what, when, where, who, how, why, contingency actions).
If a piece of information is not provided in (what, when, where, who, how, why, contingency actions), it must not be part of the generated text description.

Input:

What: ...
When: ...
Where: ...
Who: ...
How: ...
Why: ...
ContingencyActions: ...

Response:
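
For programmatic use, a small helper along these lines (illustrative only, not part of the released code) can assemble the prompt from a dict of fields, using the "### " section markers shown in the usage example below. Fields that are not provided are simply left out, matching the guideline above.

def build_prompt(instruction: str, fields: dict) -> str:
    # Canonical field order used by the training schema
    keys = ["What", "When", "Where", "Who", "How", "Why", "ContingencyActions"]
    # Keep only the fields that are actually provided
    lines = [f"{k}: {fields[k]}" for k in keys if fields.get(k)]
    return (
        "### Instruction:\n" + instruction.strip() + "\n\n"
        "### Input:\n" + "\n".join(lines) + "\n\n"
        "### Response:\n"
    )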

🚀 How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "zBotta/SmolLM2-360M-AccidentReports-distilled-kd1.7B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # merged FP16 weights

prompt = """### Instruction:
You are a reporting agent...
[full instruction text as above]

### Input:
What: rear-end collision between car A and van B at a traffic light
When: 2025-05-17 08:25 (occurrence)
Where: Main St & 3rd Ave, downtown
Who: Driver A (car), Driver B (van); police on scene
How: A failed to stop in time at red
Why: suspected distraction; investigation pending
ContingencyActions: police report filed; medical check for minor neck pain; vehicles towed

### Response:
"""

inputs = tok(prompt, return_tensors="pt").to(model.device)
gen = model.generate(
    **inputs,
    do_sample=False,              # deterministic decoding
    max_new_tokens=256,
    eos_token_id=tok.eos_token_id,
    pad_token_id=tok.pad_token_id,
    no_repeat_ngram_size=4,
    repetition_penalty=1.05,
    renormalize_logits=True,
)
print(tok.decode(gen[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip())

🧰 Training Details

Distillation (Advanced KD)

  • Teacher: HuggingFaceTB/SmolLM2-1.7B-Instruct
  • Student: HuggingFaceTB/SmolLM2-360M-Instruct
  • KD temperature (KD_T): 3
  • Top-K KD (KD_TOP_K): 64 (partial KL on the teacher's top-K tokens)
  • CE weight schedule (KD_ALPHA): start 0.6 → end 0.3 (the KL term is weighted 1 − KD_ALPHA); see the loss sketch below
  • Soft KD weighting: gamma=1.0, w_min=0.15
  • Unlikelihood (anti-repeat): β_UL = 0.05
  • Demo prompting: DEMO_PROB = 1.0 (always one demonstration)
  • KD schedule: trainer defaults (warmup 2 epochs, ramp 2 epochs)
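
The exact loss implementation is not reproduced here; the snippet below is only a rough sketch, assuming standard temperature-scaled logit KD with the partial KL computed on the teacher's top-K tokens and a CE weight given by KD_ALPHA. The soft KD weighting, the KD schedule, and the unlikelihood term are omitted, and all names are illustrative rather than the actual training code.

import torch
import torch.nn.functional as F

def kd_step_loss(student_logits, teacher_logits, labels,
                 kd_t=3.0, kd_alpha=0.6, top_k=64):
    """Sketch: kd_alpha * CE(hard labels) + (1 - kd_alpha) * top-K KL(teacher || student)."""
    vocab = student_logits.size(-1)
    # Hard-label cross-entropy against the ground-truth report tokens
    ce = F.cross_entropy(student_logits.view(-1, vocab), labels.view(-1),
                         ignore_index=-100)
    # Temperature-scaled distributions restricted to the teacher's top-K tokens
    topk_vals, topk_idx = teacher_logits.topk(top_k, dim=-1)
    teacher_probs = F.softmax(topk_vals / kd_t, dim=-1)
    student_logprobs = F.log_softmax(student_logits.gather(-1, topk_idx) / kd_t, dim=-1)
    # KL on the top-K slice, rescaled by T^2 as usual (padding masks omitted for brevity)
    kl = F.kl_div(student_logprobs, teacher_probs, reduction="batchmean") * kd_t ** 2
    return kd_alpha * ce + (1.0 - kd_alpha) * kl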

QLoRA

  • Quantization: 4-bit NF4 with double quantization; compute in fp16
  • LoRA: rank r=8, lora_dropout=0.05, lora_alpha ≈ 2r (see the config sketch below)
  • Targets: typical attention/MLP projections (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
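
The configs below are a sketch consistent with the values listed above, using the standard transformers BitsAndBytesConfig and peft LoraConfig APIs; they are illustrative and may differ from the exact training script.

import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization and fp16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on the attention/MLP projections; lora_alpha ≈ 2 * r
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)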

SFT / Trainer Config

SFTConfig(
    output_dir=OUT_DIR,
    num_train_epochs=20,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,        # effective batch ≈ 64
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.05,
    label_smoothing_factor=0.05,
    max_grad_norm=0.5,
    logging_steps=50,
    eval_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=2,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    fp16=False, bf16=False,                # training precision handled by bnb/device
    optim="adamw_bnb_8bit",
    packing=False,
    max_length=1024,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    remove_unused_columns=False,
    dataloader_num_workers=4,
    dataloader_pin_memory=True,
    report_to="none",
    seed=42,
)

⚠️ Limitations & Bias

  • English-focused; short outputs only.
  • Domain-narrow: optimized for accident/incident narratives only.
  • Susceptible to input ambiguity; ensure clear, complete 5W1H fields.
  • May inherit biases from the training data; do not rely on demographic attributes.

🛡️ Responsible / Safety

  • Treat outputs as drafts; human review is mandatory.
  • Avoid use in contexts where factual errors could cause harm without oversight.
  • Do not include PII unless you have a legal basis and consent.

⚙️ Hardware & Inference

  • Designed for small-footprint serving; works well on consumer GPUs.
  • CPU inference is possible with the merged FP16 weights; transformers will run them in float32 on CPU (see the sketch below).
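
A minimal CPU setup might look like this (a sketch; reuse the prompt and generation arguments from the usage section above):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "zBotta/SmolLM2-360M-AccidentReports-distilled-kd1.7B"
tok = AutoTokenizer.from_pretrained(model_id)
# Load in float32 for CPU (this is also the default when torch_dtype is not set)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
model.to("cpu")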

Citation

If you use this model, please cite:

  • The source dataset: DSTI/traffic-accidents-reports-5k
  • Relevant distillation literature, e.g. Hinton, Vinyals, and Dean, "Distilling the Knowledge in a Neural Network" (2015).

@misc{accident_reporter_360m_distilled_kd1.7B,
  title  = {Accident Reporting distilled kd model (One-Paragraph)},
  author = {zBotta, SamdGuizani},
  year   = {2025}
}