Inference Optimization
community
AI & ML interests
None defined yet.
Recent Activity
inference-optimization/test_tencentbac_fastmtp
Updated • 38
inference-optimization/test_qwen3_next_mtp
Updated • 41
inference-optimization/Qwen3-Next-80B-A3B-Instruct_mtp_speculator
Text Generation • 2B • Updated • 58
inference-optimization/Qwen3-Next-80B-A3B-Instruct-MTP-ultrachat-epoch3
2B • Updated • 19
FP8-block, FP8-dynamic, NVFP4, w4a16, and w8a8 quantized versions of ibm-granite/granite-4.0-h-small and ibm-granite/granite-4.0-h-tiny
models 215
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_heuristic
27B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_noise
26B • Updated • 16
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_hybrid
26B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.5_bits_mode_heuristic
25B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.5_bits_mode_noise
25B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.5_bits_mode_hybrid
25B • Updated • 14
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.0_bits_mode_heuristic
23B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.0_bits_mode_noise
23B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_6.0_bits_mode_hybrid
23B • Updated • 15
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_heuristic
22B • Updated • 14
datasets 6
inference-optimization/speculators-qwen3-30b-a3b-instruct
Preview • Updated • 28
inference-optimization/speculators-qwen3-32b-instruct
Preview • Updated • 41
inference-optimization/gpt-oss-20b-nan-hidden-states-repro
Updated • 29
inference-optimization/SWE-bench_Multilingual
Viewer • Updated • 300 • 13
inference-optimization/SWE-bench_Verified
Viewer • Updated • 500 • 82
inference-optimization/SWE-bench_Lite
Viewer • Updated • 323 • 57