Tiny models used for testing
Inference Optimization
community
AI & ML interests
None defined yet.
Qwen3.6-35B-A3B mixed-precision HIGGS model variants, plus base FP16/FP8/NVFP4 references.
inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-heuristic
Image-Text-to-Text • 24B • Updated • 98
inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-hybrid
Image-Text-to-Text • 24B • Updated • 99
inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-noise
Image-Text-to-Text • 24B • Updated • 62
inference-optimization/Qwen3.6-35B-A3B-5.5-bits-mode-heuristic
Image-Text-to-Text • 26B • Updated • 45
models 378
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-step21k
2B • Updated
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt2
0.6B • Updated
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt1-20260609-0052
0.6B • Updated • 5
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-ep0p11
2B • Updated • 95
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt1
0.6B • Updated • 136
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt0.5
0.6B • Updated • 11
inference-optimization/Qwen3-8B-speculator.dflash.swa.unified-ep0p28
2B • Updated
inference-optimization/Qwen3-8B-speculator.dflash.swa.unified-ep0p19
2B • Updated
inference-optimization/DFlash-SWA-Causal-Qwen3-8B-Magpie-Ultrachat
2B • Updated • 183
inference-optimization/DFlash-SWA-Causal-Qwen3-8B-PerfectBlend
2B • Updated • 51
datasets 23
inference-optimization/Qwen3.5-4B-responses
Viewer • Updated • 7.47k
inference-optimization/Qwen3.5-0.8B-responses
Viewer • Updated • 7.47k • 45
inference-optimization/Qwen3.5-9B-responses
Viewer • Updated • 7.67k • 38
inference-optimization/Qwen3-8B-Regenerated-Collection
Preview • Updated • 182
inference-optimization/Qwen3-30B-A3B-responses
Preview • Updated • 60
inference-optimization/Qwen3-32B-responses
Preview • Updated • 38
inference-optimization/ctest-Qwen3.6-27B-speculator-dataset
Viewer • Updated • 5.61k • 32
inference-optimization/Gemma4-Responses-Nemotron
Viewer • Updated • 762k • 59 • 1
inference-optimization/Longbench_Samples_Specdec
Viewer • Updated • 160 • 65
inference-optimization/ctest-subset-Qwen3.5-397B-A17B-FP8-dynamic-speculator-dataset
Viewer • Updated • 10k • 74