Doctor Shotgun
Doctor-Shotgun
AI & ML interests
Local ML enthusiast, LLM and diffusion finetuner, hobbyist developer
Recent Activity
Updated a model about 17 hours ago: CPU-Hybrid-MoE/GLM-5-CPU-NUMA4-AMXINT8
Updated a model about 17 hours ago: CPU-Hybrid-MoE/DeepSeek-V3.2-CPU-NUMA4-AMXINT8
Updated a model about 17 hours ago: CPU-Hybrid-MoE/DeepSeek-R1-0528-CPU-NUMA4-AMXINT8
Doc's Diffusion
Models/LoRAs for image diffusion.
LLM Speculative Decoding Experiments
Tiny language models meant to serve as draft models for speculative decoding.
- Doctor-Shotgun/TinyLlama-1.1B-32k
  Text Generation • 1B • Updated • 210 • 30
- Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct
  Text Generation • 1B • Updated • 157 • 13
- Doctor-Shotgun/smol_llama-220M-GQA-32k-theta
  Text Generation • Updated • 4 • 1
- Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft
  Text Generation • Updated • 5 • 2
Magnum Diamond (24B/70B/123B)
Focusing on applying enough heat and pressure to dry, assistant-tuned models until they turn into creative writing gems!
- Doctor-Shotgun/ML2-123B-Magnum-Diamond
  Text Generation • 123B • Updated • 12 • 11
- Doctor-Shotgun/L3.3-70B-Magnum-Diamond
  Text Generation • 71B • Updated • 29 • 5
- Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
  Text Generation • 24B • Updated • 191 • 55
- Doctor-Shotgun/ML2-123B-Magnum-Diamond-GGUF
  Text Generation • 123B • Updated • 385 • 6
Qwen 3 ScatterMoE
Drop-in implementation of https://github.com/shawntan/scattermoe for efficient training of Qwen 3 MoE.
- chargoddard/Qwen3-30B-A3B-Base-ScatterMoE
  31B • Updated • 4
- Doctor-Shotgun/Qwen3-30B-A3B-Instruct-2507-ScatterMoE
  Text Generation • 31B • Updated • 11 • 1
- Doctor-Shotgun/Qwen3-30B-A3B-Thinking-2507-ScatterMoE
  Text Generation • 31B • Updated • 13
- Doctor-Shotgun/Qwen3-Coder-30B-A3B-Instruct-ScatterMoE
  Text Generation • 31B • Updated • 10 • 1
Doc's Choice
Models that I personally recommend, periodically updated.