---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
widget:
- text: score_9, 2boys, male focus, multiple boys, yaoi, couple, princess carry, carrying, collared shirt, shirt, pants, jacket, looking at another, smile, wedding, absurdres, highres, year 2025
  parameters:
    negative_prompt: score_1, score_2, score_3, lowres, artistic error, film grain, scan artifacts, jpeg artifacts, chromatic aberration, dithering, halftone, screentones, multiple views, logo, too many watermarks, negative space, blank page
  output:
    url: images/sample01.png
  example_title: sample01
- text: score_9, A handsome anime boy playing acoustic guitar in living room at home, absurdres, highres
  parameters:
    negative_prompt: score_1, score_2, score_3, lowres, artistic error, film grain, scan artifacts, jpeg artifacts, chromatic aberration, dithering, halftone, screentones, multiple views, logo, too many watermarks, negative space, blank page
  output:
    url: images/sample02.png
  example_title: sample02
- text: score_9, tachibana makoto, free!, 1boy, male focus, solo, lying, on bed, bed, pillow, bedroom, shirt, pants, looking at viewer, one eye closed, sleepy, open mouth, absurdres, highres, year 2025
  parameters:
    negative_prompt: score_1, score_2, score_3, lowres, artistic error, film grain, scan artifacts, jpeg artifacts, chromatic aberration, dithering, halftone, screentones, multiple views, logo, too many watermarks, negative space, blank page
  output:
    url: images/sample03.png
  example_title: sample03
- text: score_9, 2boys, male focus, multiple boys, rating:general, arm around shoulder, tank top, shorts, bara, muscular male, muscular, baseball cap, hat, looking at viewer, absurdres, highres, year 2025
  parameters:
    negative_prompt: score_1, score_2, score_3, lowres, artistic error, film grain, scan artifacts, jpeg artifacts, chromatic aberration, dithering, halftone, screentones, multiple views, logo, too many watermarks, negative space, blank page
  output:
    url: images/sample04.png
  example_title: sample04
---

# AnimeBoysZeroXL

**Creating models is a labor of love, but it takes a significant amount of time and compute power to get them just right. If you’re enjoying my models, consider fueling my next project with a coffee on [Ko-fi](https://ko-fi.com/koolchh) ☕. Thank you for keeping this project going!**

<Gallery />

A dedicated model for high-quality anime-style male characters. It is optimized specifically for male-only content and offers a wide range of aesthetic styles with high versatility.

## 🚀 Inference Guide
- **⚠️ Important**: This model uses Zero Terminal SNR with V-prediction. Please ensure you are using the correct settings during inference.
  - **ComfyUI Users**: Add the `ModelSamplingDiscrete` node to your workflow. Set `sampling` to `v_prediction` and `zsnr` to `true`.
  - **Automatic1111 Users**: Place the `.yaml` config file into the model folder. The .yaml file must have the exact same name as the model file, only with the `.yaml` extension instead of `.safetensors`. Set `Noise schedule for sampling` in settings to `Zero Terminal SNR`.
- **Prompting**: Always begin your prompt with a score tag (e.g. `score_9`). You can use any of these styles:
  - Tag soup: `score_X, tag1, tag2, tag3, ...`
  - Natural language: `score_X, [your description here]`
  - Mixed approach: `score_X, [description], tag1, tag2, ...`
  - *Tip*: If you find the style of the score tags is too strong, you could try dropping them from the prompt.
- **Negative Prompt**: Choose one of these three presets depending on your needs:
  - **Minimal**: `score_1`
  - **Light**: `score_1, lowres, artistic error, scan artifacts, jpeg artifacts, multiple views, too many watermarks, negative space, blank page`
  - **Heavy**: `score_1, score_2, score_3, lowres, artistic error, film grain, scan artifacts, jpeg artifacts, chromatic aberration, dithering, halftone, screentones, multiple views, logo, too many watermarks, negative space, blank page`
- **CFG Scale**: A CFG scale of **3 to 5** is recommended. For finer control, I suggest using [dynamic thresholding](https://github.com/mcmonkeyprojects/sd-dynamic-thresholding).
  - *Pro-tip*: I set `mimic_scale` to match the CFG scale and set both minimum scales to the same lower value. I use `Half Cosine Up` for both modes.
- **Resolution**: To get started, try these dimensions:
  - **Portrait**: 832 × 1216
  - **Square**: 1024 × 1024
  - **Landscape**: 1216 × 832
  - *Some other supported sizes*: 768×1344, 768×1280, 896×1152, 960×1088, 1344×768, 1280×768, 1152×896, 1088×960.
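
The negative-prompt presets and resolutions above can be collected into a small lookup helper. This is only a sketch: the names `NEGATIVE_PRESETS`, `SUPPORTED_SIZES`, `closest_size`, and `generation_settings` are hypothetical and not part of any tool, while the preset strings and sizes are copied from the lists above. Picking the size by nearest aspect ratio is an illustrative choice, not something the model requires.

```python
# Hypothetical helper names; preset strings and sizes are copied from the guide above.
NEGATIVE_PRESETS = {
    "minimal": "score_1",
    "light": (
        "score_1, lowres, artistic error, scan artifacts, jpeg artifacts, "
        "multiple views, too many watermarks, negative space, blank page"
    ),
    "heavy": (
        "score_1, score_2, score_3, lowres, artistic error, film grain, "
        "scan artifacts, jpeg artifacts, chromatic aberration, dithering, "
        "halftone, screentones, multiple views, logo, too many watermarks, "
        "negative space, blank page"
    ),
}

# (width, height) pairs listed in the Resolution bullet.
SUPPORTED_SIZES = [
    (832, 1216), (1024, 1024), (1216, 832),
    (768, 1344), (768, 1280), (896, 1152), (960, 1088),
    (1344, 768), (1280, 768), (1152, 896), (1088, 960),
]


def closest_size(aspect_ratio):
    """Return the listed (width, height) whose width/height is nearest to aspect_ratio."""
    return min(SUPPORTED_SIZES, key=lambda wh: abs(wh[0] / wh[1] - aspect_ratio))


def generation_settings(preset="minimal", aspect_ratio=1.0, cfg_scale=5):
    """Bundle a negative prompt, a size, and a CFG scale into one dict."""
    width, height = closest_size(aspect_ratio)
    return {
        "negative_prompt": NEGATIVE_PRESETS[preset],
        "width": width,
        "height": height,
        "guidance_scale": cfg_scale,  # 3 to 5 is the recommended range
    }
```

For example, `generation_settings("light", aspect_ratio=2 / 3)` selects the Light preset together with the 832 × 1216 portrait size.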

## 🧨 Diffusers Example Usage

```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 weights from the Hugging Face Hub.
pipe = DiffusionPipeline.from_pretrained(
    "Koolchh/AnimeBoysZeroXL",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16"
)
pipe.to("cuda")

# Begin the prompt with a score tag; "score_1" is the minimal negative preset.
prompt = "score_9, 1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair"
negative_prompt = "score_1"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=5,
    num_inference_steps=28
).images[0]

# Save the result; the filename is arbitrary.
image.save("output.png")
```

## 🧪 Training Details

AnimeBoysZeroXL was fine-tuned from [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) using approximately 950k images. The knowledge cutoff is November 2025.

The following tags were used during training. Include them in your prompt to steer the results toward your desired style.

### Score tags

- Each image is tagged with `score_X`, where `X` is an integer from **1 to 9**.
  - `score_9` represents the highest aesthetic quality based on my personal preferences.

### Rating tags

| tag                   | rating       |
|-----------------------|--------------|
| `rating:general`      | general      |
| `rating:sensitive`    | sensitive    |
| `rating:questionable` | questionable |
| `rating:explicit`     | explicit     |

### Year tags

Use `year YYYY` (ranging from 2005 to 2025) to target specific era styles.
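
As an illustration, the score, rating, and year tags can be assembled and range-checked in one place. This is a minimal sketch: `tag_prompt` and `RATINGS` are hypothetical names, and apart from putting the score tag first, the placement of the rating and year tags is an arbitrary choice made here.

```python
# Hypothetical helper; tag formats and valid ranges are taken from the sections above.
RATINGS = ("general", "sensitive", "questionable", "explicit")


def tag_prompt(body, score=9, rating=None, year=None):
    """Wrap a tag list or description with optional score, rating, and year tags."""
    parts = []
    if score is not None:
        if not 1 <= score <= 9:
            raise ValueError("score must be between 1 and 9")
        parts.append(f"score_{score}")  # the score tag goes first
    parts.append(body.strip())
    if rating is not None:
        if rating not in RATINGS:
            raise ValueError(f"rating must be one of {RATINGS}")
        parts.append(f"rating:{rating}")
    if year is not None:
        if not 2005 <= year <= 2025:
            raise ValueError("year must be between 2005 and 2025")
        parts.append(f"year {year}")
    return ", ".join(parts)


print(tag_prompt("1boy, male focus, solo", score=9, rating="general", year=2025))
```

Passing `score=None` omits the score tag, which matches the tip about dropping it when its style is too strong.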

### Training configurations
- **Hardware**: 4 × NVIDIA A100 SXM 80GB
- **Optimizer**: AdaFactor
- **Gradient Accumulation Steps**: 8
- **Effective Batch Size**: 128 (4 × 8 × 4)
- **Learning Rates**:
  - **U-Net**: 2e-5
  - **Text Encoders**: 1e-5
- **LR Schedule**: Constant with 250 warmup steps
- **Precision**: FP16 Mixed Precision
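
The effective batch size can be reproduced with a few lines of arithmetic. In this sketch, reading the factors `4 × 8 × 4` as GPUs × gradient accumulation steps × per-device batch size is an assumption; the card states 4 GPUs and 8 accumulation steps, so the remaining factor of 4 is taken to be the per-device batch size. The steps-per-pass figure is only a rough estimate from the approximate image count and ignores the extra repeats given to some images.

```python
def effective_batch_size(num_gpus, grad_accum_steps, per_device_batch):
    """Number of samples contributing to each optimizer step."""
    return num_gpus * grad_accum_steps * per_device_batch


def steps_per_pass(num_images, batch_size):
    """Full optimizer steps in one pass over the data (the remainder is dropped)."""
    return num_images // batch_size


# per_device_batch=4 is inferred from "4 × 8 × 4", not stated explicitly.
batch = effective_batch_size(num_gpus=4, grad_accum_steps=8, per_device_batch=4)
print(batch)

# 950_000 is the approximate dataset size quoted above.
print(steps_per_pass(950_000, batch))
```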

### 🔄 Changes from AnimeBoysXL v3.0
- **Tag Overhaul**: Quality tags have been removed. The 5-category aesthetic tags have been replaced with a more granular 9-category score tag system. Rating tags have been renamed for clarity, and the tag ordering scheme has been abolished.
- **Captions**: A subset of highly aesthetic images was trained using natural language prompts for better comprehension.
- **Emphasis**: Highly aesthetic images now have more "repeats" in the training data.
- **Optimization**:
  - 5% caption dropout for unconditional guidance.
  - Trained with Zero Terminal SNR and V-prediction.
  - Implemented adaptive loss weighting.
  - No multi-resolution noise or debiased estimation loss.
  - Trained with input perturbation noise (gamma=0.1).
  - Trained with Huber loss.
- **Merging**: This model is a merge across several iterations of the same training run for better stability.
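
Three of the optimization items above (caption dropout, input perturbation, and Huber loss) can be sketched in plain Python. This is an illustration of the general techniques, not the actual training code: the function names are hypothetical, the values are plain lists rather than tensors, and `delta=1.0` is an assumed threshold because the card does not state the Huber variant or its parameter.

```python
import random


def maybe_drop_caption(caption, rng, p=0.05):
    """Replace the caption with an empty string with probability p (5% above)."""
    return "" if rng.random() < p else caption


def perturb_noise(noise, rng, gamma=0.1):
    """Input perturbation: add gamma-scaled Gaussian noise to the sampled noise.

    The perturbed noise is used to build the noisy input, while the
    training target is still computed from the unperturbed noise.
    """
    return [n + gamma * rng.gauss(0.0, 1.0) for n in noise]


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(pred)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
noisy_input_noise = perturb_noise(noise, rng)
caption = maybe_drop_caption("score_9, 1boy, solo", rng)
```

Compared with a squared-error loss, the linear tail of the Huber loss reduces the influence of large residuals.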

## License

AnimeBoysZeroXL is a derivative model of [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) by PurpleSmartAI. Please read their license before using the model.