A Qwen3-30B-A3B-Instruct-2507 finetune designed to reduce model refusals. Produced using P-E-W's heretic (v1.2.0).

In testing, the model comes close to refusing nothing, but it still needs a little careful prompting to open up. Even then, it tends to reassure itself before responding, e.g., "Let's be direct, precise, and unflinching, as you asked". I'm unsure whether a better system prompt could remove this initial reluctance.

You can bootstrap a system prompt by asking the model to generate one that permits everything covered by the safety policy here, then iterating on it until you get the right tone.

Heretic config and trial checkpoints are included in the files. Trial 139 was chosen:

Parameter	Value
direction_index	27.04
attn.o_proj.max_weight	1.34
attn.o_proj.max_weight_position	28.37
attn.o_proj.min_weight	0.11
attn.o_proj.min_weight_distance	24.07
mlp.down_proj.max_weight	1.42
mlp.down_proj.max_weight_position	35.97
mlp.down_proj.min_weight	1.28
mlp.down_proj.min_weight_distance	21.66
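
To make the parameter table easier to read, here is a minimal sketch of how the four per-component values could translate into a per-layer ablation weight. It assumes a linear falloff from `max_weight` at `max_weight_position` down to `min_weight` at `min_weight_distance` layers away, with no ablation beyond that distance, which is my understanding of Heretic's weight kernel. The helper name `ablation_weight` and the 48-layer count are assumptions for illustration, so check the Heretic source before relying on this.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Hypothetical helper: assumed linear ablation-weight kernel."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        return 0.0  # assumed: layers outside the kernel are left untouched
    # Linear interpolation from max_weight (distance 0) to min_weight (kernel edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Trial 139 values for attn.o_proj, copied from the table.
ATTN_O_PROJ = dict(max_weight=1.34, max_weight_position=28.37,
                   min_weight=0.11, min_weight_distance=24.07)

NUM_LAYERS = 48  # assumption: layer count of the base model

for layer in range(NUM_LAYERS):
    print(layer, round(ablation_weight(layer, **ATTN_O_PROJ), 3))
```

Under this reading, the fractional positions (e.g. 28.37) simply place the peak between two layers, and the same function can be called with the `mlp.down_proj` row to compare the two profiles.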

Metric	Value
Initial Refusals	99/100
Refusals	11/100
KL Divergence	0.0679
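
For a quick sense of scale, the refusal counts in the metrics table can be turned into absolute and relative reductions. This is plain arithmetic on the two reported figures; the variable names are mine.

```python
from fractions import Fraction

initial_refusals = Fraction(99, 100)  # "Initial Refusals" from the table
final_refusals = Fraction(11, 100)    # "Refusals" from the table

absolute_drop = initial_refusals - final_refusals
relative_drop = absolute_drop / initial_refusals

print(f"Absolute drop: {float(absolute_drop) * 100:.0f} percentage points")
print(f"Relative drop: {float(relative_drop) * 100:.1f}%")
```

That works out to 88 percentage points, or 8/9 of the original refusals removed.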

It cost roughly $14.68 USD to finetune this model on a rented A100 PCIe, and the run took about 9 hours. If anyone with better hardware wants to take a crack at it, go ahead. I feel there's still a lot of room for improvement.
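
As a rough budgeting aid, the implied hourly rental rate follows from the two figures quoted for this run. Since the duration is only given as "about 9 hours", treat the result as approximate.

```python
total_cost_usd = 14.68  # from the text
hours = 9               # "about 9 hours", so the rate is approximate

hourly_rate = total_cost_usd / hours
print(f"Approximate rental rate: ${hourly_rate:.2f}/hour")
```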

Some refusal markers I've encountered are "sensitive topic" and "approach with respect". You may want to include these in the next hereticisation attempt.
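
If you want to check your own sample outputs for these two phrases, a minimal sketch follows. It assumes case-insensitive substring matching; Heretic's own refusal detection may work differently, and the helper names `contains_marker` and `count_flagged` are hypothetical.

```python
NEW_MARKERS = ["sensitive topic", "approach with respect"]  # from the text


def contains_marker(response, markers=NEW_MARKERS):
    """Hypothetical helper: case-insensitive substring check."""
    lowered = response.lower()
    return any(marker in lowered for marker in markers)


def count_flagged(responses, markers=NEW_MARKERS):
    """Count responses that contain at least one marker."""
    return sum(contains_marker(r, markers) for r in responses)


# Made-up sample responses, for illustration only.
samples = [
    "This is a Sensitive Topic, so let's proceed carefully.",
    "Here is the summary you asked for.",
]
print(count_flagged(samples))
```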

Safetensors. Model size: 31B params. Tensor type: BF16.
