A Qwen3-30B-A3B-Instruct-2507 finetune designed to reduce model refusals. Produced using P-E-W's heretic (v1.2.0).

From testing, the model is on the verge of not refusing any prompts; a little bit of smart prompting is needed to get it to open up. Again from testing, it seems that even if you do this, it must reassure itself before responding, e.g., "Let's be direct, precise, and unflinching, as you asked". I'm unsure if a better system prompt could remove this initial reluctance.

You can bootstrap a system prompt by asking the model to generate one in which it allows for everything included in the safety policy here then progressively iterate until you get the right tone.

Heretic config and trial checkpoints are included in the files. Trial 139 was chosen:

Parameter Value
direction_index 27.04
attn.o_proj.max_weight 1.34
attn.o_proj.max_weight_position 28.37
attn.o_proj.min_weight 0.11
attn.o_proj.min_weight_distance 24.07
mlp.down_proj.max_weight 1.42
mlp.down_proj.max_weight_position 35.97
mlp.down_proj.min_weight 1.28
mlp.down_proj.min_weight_distance 21.66
Metric Value
Initial Refusals 99/100
Refusals 11/100
KL Divergence 0.0679

It costed roughly $14.68 USD to finetune this model using a rented A100 PCIe and took about 9 hours to complete. If anyone with better hardware than wants to give a crack at it, go ahead. I feel there's still a lot of room for improvement.

Some refusal markers I've encountered are "sensitive topic" and "approach with respect". You may want to include these in the next hereticisation attempt.

Downloads last month
51
Safetensors
Model size
31B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for dEcQEGdeTshDWbOJQcYp/Qwen3-30B-A3B-Instruct-2507-Heretic

Finetuned
(47)
this model
Quantizations
3 models