# qwen3-4b-struct-lora-v11

This repository provides a LoRA adapter fine-tuned on top of `azuki-digital/qwen3-4b-struct-lora-v4-merged`.

> ⚠️ This repository contains **LoRA adapter weights only**. The base model must be loaded separately.


## Training Objective

This adapter is trained to improve structured-output accuracy (JSON / YAML / XML / TOML / CSV). The loss is applied only to the final assistant output, while intermediate reasoning (chain-of-thought) tokens are masked.
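The exact masking implementation is not included in this repo; the following is a minimal sketch of how "after_marker" loss masking is commonly done for SFT, assuming the marker is the token sequence that opens the final answer. The function name and the toy marker ids are illustrative, not taken from this repository.

```python
# Minimal sketch of "after_marker" loss masking for SFT.
# Tokens up to and including the marker get label -100 (the label
# ignored by the cross-entropy loss in transformers); only the final
# answer after the marker contributes to the loss.

IGNORE_INDEX = -100  # transformers' default ignored label id

def mask_labels_after_marker(input_ids, marker_ids):
    """Return labels that keep loss only on tokens after the marker."""
    labels = list(input_ids)
    n, m = len(input_ids), len(marker_ids)
    # Find the last occurrence of the marker subsequence.
    start = -1
    for i in range(n - m, -1, -1):
        if input_ids[i:i + m] == marker_ids:
            start = i + m
            break
    if start == -1:
        # No marker found: mask everything (sample contributes no loss).
        return [IGNORE_INDEX] * n
    for i in range(start):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: ids [9, 9] act as the marker; loss falls only on [4, 5].
print(mask_labels_after_marker([1, 2, 9, 9, 4, 5], [9, 9]))
# [-100, -100, -100, -100, 4, 5]
```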


## Training Configuration

| Item | Value |
|---|---|
| Base model | `azuki-digital/qwen3-4b-struct-lora-v4-merged` |
| Method | LoRA SFT (no quantization, bf16) |
| Max sequence length | 4096 |
| Epochs | 2 |
| Learning rate | 1e-5 |
| Warmup ratio | 0.05 |
| Weight decay | 0.05 |
| LoRA r | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.05 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Mask CoT | Yes (`after_marker`) |
| Dataset | `daichira/structured-3k-mix-sft` |
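The training script itself is not included in this repository; as a sketch, a `peft` `LoraConfig` matching the hyperparameters in the table would look like this (a configuration fragment under those assumptions, not the actual script):

```python
from peft import LoraConfig

# LoRA configuration mirroring the table above.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```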

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "azuki-digital/qwen3-4b-struct-lora-v4-merged"
adapter = "azuki-digital/qwen3-4b-struct-lora-v11"

tokenizer = AutoTokenizer.from_pretrained(base)

# Load the base model in bf16 (the precision used during training).
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, adapter)
```
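Since the adapter targets structured output, you will typically want to validate the generated text before using it. A minimal best-effort check for the JSON case might look like this (the `extract_json` helper and the sample output are illustrative, not part of this repository):

```python
import json

def extract_json(text):
    """Best-effort: parse the first {...} span found in model output."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in output")
    return json.loads(text[start:end + 1])

# Hypothetical model output wrapped in surrounding prose:
sample = 'Here is the result:\n{"name": "Alice", "age": 30}'
print(extract_json(sample))
# {'name': 'Alice', 'age': 30}
```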

## Sources & License (IMPORTANT)

**Training Data**

- `daichira/structured-3k-mix-sft` (License: CC-BY-4.0)

This dataset is used under the terms of CC-BY-4.0.

### Compliance

Users must comply with:

1. Dataset attribution requirements (CC-BY-4.0)
2. Base model license (Apache-2.0)
3. This repository license (Apache-2.0)
