---
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-35B-A3B/blob/main/LICENSE
language:
- en
- zh
base_model:
- brayniac/Qwen3.5-35B-A3B-heretic
library_name: mlx
tags:
- heretic
- uncensored
- unrestricted
- decensored
- abliterated
- mxfp8
pipeline_tag: image-text-to-text
---
# Qwen3.5-35B-A3B Heretic MLX mxfp8

This is an abliterated (uncensored) version of [Qwen/Qwen3.5-35B-A3B](https://huggingface.co/Qwen/Qwen3.5-35B-A3B), made using [Heretic](https://github.com/p-e-w/heretic) v1.2.0.

### Performance

| Metric | This model | Original model |
| :----- | :--------: | :------------: |
| **KL divergence** | 0.0825 | 0 *(by definition)* |
| **Refusals** | 5/100 | 92/100 |

### Abliteration parameters

| Parameter | Value |
| :-------- | :---: |
| **direction_index** | 20.06 |
| **linear_attn.out_proj.max_weight** | 0.93 |
| **linear_attn.out_proj.max_weight_position** | 37.63 |
| **linear_attn.out_proj.min_weight** | 0.71 |
| **linear_attn.out_proj.min_weight_distance** | 4.94 |
| **moe_experts.down_proj.max_weight** | 1.18 |
| **moe_experts.down_proj.max_weight_position** | 23.68 |
| **moe_experts.down_proj.min_weight** | 0.86 |
| **moe_experts.down_proj.min_weight_distance** | 5.48 |
| **shared_expert.down_proj.max_weight** | 1.33 |
| **shared_expert.down_proj.max_weight_position** | 36.08 |
| **shared_expert.down_proj.min_weight** | 0.70 |
| **shared_expert.down_proj.min_weight_distance** | 17.75 |
| **attn.o_proj.max_weight** | 0.83 |
| **attn.o_proj.max_weight_position** | 35.00 |
| **attn.o_proj.min_weight** | 0.06 |
| **attn.o_proj.min_weight_distance** | 15.22 |

#### Sampling parameters

We suggest the following sets of sampling parameters depending on the mode and task type:

- **Thinking mode for general tasks**: `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
- **Thinking mode for precise coding tasks (e.g., WebDev)**: `temperature=0.6`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=0.0`, `repetition_penalty=1.0`
- **Instruct (non-thinking) mode for general tasks**: `temperature=0.7`, `top_p=0.8`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
- **Instruct (non-thinking) mode for reasoning tasks**: `temperature=1.0`, `top_p=1.0`, `top_k=40`, `min_p=0.0`, `presence_penalty=2.0`, `repetition_penalty=1.0`

For supported frameworks, you can adjust the `presence_penalty` parameter between 0 and 2 to reduce endless repetitions. However, a higher value may occasionally cause language mixing and a slight decrease in model performance.

---

### Source

This model was converted to MLX format from [`brayniac/Qwen3.5-35B-A3B-heretic`](https://huggingface.co/brayniac/Qwen3.5-35B-A3B-heretic) using mlx-vlm version **0.3.12**.
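For convenience, the recommended presets above can be kept in a small lookup table and passed as keyword arguments to whatever sampler your framework exposes. This is only a sketch: the names `SAMPLING_PRESETS` and `sampler_kwargs` are illustrative, not part of mlx-vlm or any other library.

```python
# Recommended sampling presets from this model card, keyed by mode/task.
# The preset names are illustrative; the parameter values come from the card.
SAMPLING_PRESETS = {
    "thinking_general": dict(temperature=1.0, top_p=0.95, top_k=20,
                             min_p=0.0, presence_penalty=1.5,
                             repetition_penalty=1.0),
    "thinking_coding": dict(temperature=0.6, top_p=0.95, top_k=20,
                            min_p=0.0, presence_penalty=0.0,
                            repetition_penalty=1.0),
    "instruct_general": dict(temperature=0.7, top_p=0.8, top_k=20,
                             min_p=0.0, presence_penalty=1.5,
                             repetition_penalty=1.0),
    "instruct_reasoning": dict(temperature=1.0, top_p=1.0, top_k=40,
                               min_p=0.0, presence_penalty=2.0,
                               repetition_penalty=1.0),
}

def sampler_kwargs(mode: str) -> dict:
    """Return a fresh copy of the preset for `mode`, ready to unpack
    into a generation call, e.g. generate(..., **sampler_kwargs(mode))."""
    return dict(SAMPLING_PRESETS[mode])
```

Returning a copy keeps callers from accidentally mutating the shared preset table when they tweak one parameter for a single request.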