---
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-35B-A3B/blob/main/LICENSE
language:
- en
- zh
base_model:
- brayniac/Qwen3.5-35B-A3B-heretic
library_name: mlx
tags:
- heretic
- uncensored
- unrestricted
- decensored
- abliterated
- mxfp8
pipeline_tag: image-text-to-text
---
# Qwen3.5-35B-A3B Heretic MLX mxfp8

This is an abliterated (uncensored) version of [Qwen/Qwen3.5-35B-A3B](https://huggingface.co/Qwen/Qwen3.5-35B-A3B), made using [Heretic](https://github.com/p-e-w/heretic) v1.2.0.

### Performance

| Metric | This model | Original model |
| :----- | :--------: | :------------: |
| **KL divergence** | 0.0825 | 0 *(by definition)* |
| **Refusals** | 5/100 | 92/100 |

### Abliteration parameters

| Parameter | Value |
| :-------- | :---: |
| **direction_index** | 20.06 |
| **linear_attn.out_proj.max_weight** | 0.93 |
| **linear_attn.out_proj.max_weight_position** | 37.63 |
| **linear_attn.out_proj.min_weight** | 0.71 |
| **linear_attn.out_proj.min_weight_distance** | 4.94 |
| **moe_experts.down_proj.max_weight** | 1.18 |
| **moe_experts.down_proj.max_weight_position** | 23.68 |
| **moe_experts.down_proj.min_weight** | 0.86 |
| **moe_experts.down_proj.min_weight_distance** | 5.48 |
| **shared_expert.down_proj.max_weight** | 1.33 |
| **shared_expert.down_proj.max_weight_position** | 36.08 |
| **shared_expert.down_proj.min_weight** | 0.70 |
| **shared_expert.down_proj.min_weight_distance** | 17.75 |
| **attn.o_proj.max_weight** | 0.83 |
| **attn.o_proj.max_weight_position** | 35.00 |
| **attn.o_proj.min_weight** | 0.06 |
| **attn.o_proj.min_weight_distance** | 15.22 |

#### Sampling parameters

We suggest the following sets of sampling parameters depending on the mode and task type:

- **Thinking mode for general tasks**: `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
- **Thinking mode for precise coding tasks (e.g., WebDev)**: `temperature=0.6`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=0.0`, `repetition_penalty=1.0`
- **Instruct (non-thinking) mode for general tasks**: `temperature=0.7`, `top_p=0.8`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
- **Instruct (non-thinking) mode for reasoning tasks**: `temperature=1.0`, `top_p=1.0`, `top_k=40`, `min_p=0.0`, `presence_penalty=2.0`, `repetition_penalty=1.0`

For supported frameworks, you can adjust the `presence_penalty` parameter between 0 and 2 to reduce endless repetitions. However, a higher value may occasionally cause language mixing and a slight decrease in model performance.

---

### Source

This model was converted to MLX format from [`brayniac/Qwen3.5-35B-A3B-heretic`](https://huggingface.co/brayniac/Qwen3.5-35B-A3B-heretic) using mlx-vlm version **0.3.12**.
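For convenience, the recommended presets above can be kept in a small lookup table and passed as keyword arguments to whatever sampler your framework exposes. This is only a sketch: the names `SAMPLING_PRESETS` and `sampler_kwargs` are illustrative, not part of mlx-vlm or any other library.

```python
# Recommended sampling presets from this model card, keyed by mode/task.
# The preset names are illustrative; the parameter values come from the card.
SAMPLING_PRESETS = {
    "thinking_general": dict(temperature=1.0, top_p=0.95, top_k=20,
                             min_p=0.0, presence_penalty=1.5,
                             repetition_penalty=1.0),
    "thinking_coding": dict(temperature=0.6, top_p=0.95, top_k=20,
                            min_p=0.0, presence_penalty=0.0,
                            repetition_penalty=1.0),
    "instruct_general": dict(temperature=0.7, top_p=0.8, top_k=20,
                             min_p=0.0, presence_penalty=1.5,
                             repetition_penalty=1.0),
    "instruct_reasoning": dict(temperature=1.0, top_p=1.0, top_k=40,
                               min_p=0.0, presence_penalty=2.0,
                               repetition_penalty=1.0),
}

def sampler_kwargs(mode: str) -> dict:
    """Return a fresh copy of the preset for `mode`, ready to unpack
    into a generation call, e.g. generate(..., **sampler_kwargs(mode))."""
    return dict(SAMPLING_PRESETS[mode])
```

Returning a copy keeps callers from accidentally mutating the shared preset table when they tweak one parameter for a single request.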