# qwen3-4b-struct-lora-v11

This repository provides a LoRA adapter fine-tuned on top of
`azuki-digital/qwen3-4b-struct-lora-v4-merged`.

> ⚠️ This repository contains LoRA adapter weights only.
> The base model must be loaded separately.
## Training Objective

This adapter is trained to improve structured-output accuracy across JSON, YAML, XML, TOML, and CSV. During fine-tuning, the loss is applied only to the final assistant output; intermediate reasoning (Chain-of-Thought) tokens are masked out of the loss.
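The masking described above can be sketched in plain Python. This is an illustration only, not the repository's training code: the marker token IDs below are made up, and a real implementation would operate on tokenizer output inside the data collator.

```python
IGNORE_INDEX = -100  # PyTorch cross-entropy ignores positions labeled -100

def mask_cot(input_ids, marker_ids):
    """Copy input_ids into labels, replacing every position up to and
    including the last occurrence of marker_ids with IGNORE_INDEX, so
    only tokens after the marker contribute to the loss."""
    labels = list(input_ids)
    n, m = len(input_ids), len(marker_ids)
    end = -1
    for i in range(n - m + 1):
        if input_ids[i:i + m] == list(marker_ids):
            end = i + m  # mask through the end of this marker occurrence
    for i in range(end):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: tokens 1..4 stand in for CoT, [9, 9] for the end-of-reasoning
# marker, and 5..6 for the final structured answer.
ids = [1, 2, 3, 4, 9, 9, 5, 6]
print(mask_cot(ids, [9, 9]))  # [-100, -100, -100, -100, -100, -100, 5, 6]
```

If no marker is present, the labels are left untouched and the whole sequence is trained on, which is one reasonable fallback; the actual training pipeline may handle that case differently.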
## Training Configuration
| Item | Value |
|---|---|
| Base model | azuki-digital/qwen3-4b-struct-lora-v4-merged |
| Method | LoRA SFT (no quantization, bf16) |
| Max sequence length | 4096 |
| Epochs | 2 |
| Learning rate | 1e-5 |
| Warmup ratio | 0.05 |
| Weight decay | 0.05 |
| LoRA r | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.05 |
| Target modules | q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj |
| Mask CoT | Yes (after_marker) |
| Dataset | daichira/structured-3k-mix-sft |
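The LoRA hyperparameters in the table map directly onto a PEFT `LoraConfig`. The sketch below mirrors the table using the standard `peft` API; it is not the repository's actual training script, and the `bias` setting is an assumption (it is not listed in the table).

```python
from peft import LoraConfig

# Hyperparameters taken from the training-configuration table above.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",  # assumption: not specified in the table
    task_type="CAUSAL_LM",
)
```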
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "azuki-digital/qwen3-4b-struct-lora-v4-merged"
adapter = "azuki-digital/qwen3-4b-struct-lora-v11"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)
```
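Since the adapter targets structured output, downstream code typically validates the generated text before using it. A minimal sketch in plain Python (independent of the model call above, with a made-up response string) that extracts and parses a JSON object from generated text:

```python
import json
import re

def extract_json(text):
    """Find the first '{' ... last '}' span in generated text and parse it.
    Returns the parsed object, or None if nothing parseable is found."""
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

# Hypothetical model response that wraps JSON in prose.
response = 'Here is the result:\n{"name": "qwen3", "params_b": 4}'
print(extract_json(response))  # {'name': 'qwen3', 'params_b': 4}
```

The greedy regex is a deliberately simple heuristic; for nested or multiple JSON objects a real pipeline would use a proper incremental parser or schema validation.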
## Sources & License (IMPORTANT)

### Training Data

- Dataset: daichira/structured-3k-mix-sft
- License: CC-BY-4.0

The dataset is used under the terms of CC-BY-4.0.
### Compliance

Users must comply with:

1. Dataset attribution requirements (CC-BY-4.0)
2. Base model license (Apache-2.0)
3. This repository license (Apache-2.0)
## Model Tree

This adapter's base model, azuki-digital/qwen3-4b-struct-lora-v4-merged, is itself derived from Qwen/Qwen3-4B-Instruct-2507.