rl__24GPU_base__swe_rebench_patched_oracle__r2egym-nl2bash-stack

RL-trained Qwen3-8B (81 steps, GRPO/RLOO-N)
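The card names GRPO and RLOO but does not say how advantages were computed for this run. As a rough illustration only, the sketch below shows the two baselines as they are commonly described: a leave-one-out mean for RLOO, and group mean/std normalization for GRPO. The function names, the `eps` value, and the example rewards are hypothetical and are not taken from this model's training code.

```python
from statistics import mean, pstdev


def rloo_advantages(rewards):
    """Leave-one-out baseline: each reward minus the mean of the OTHER rewards."""
    n = len(rewards)
    if n < 2:
        raise ValueError("need at least two samples per prompt")
    total = sum(rewards)
    # (total - r) / (n - 1) is the mean reward of the other n - 1 samples.
    return [r - (total - r) / (n - 1) for r in rewards]


def grpo_advantages(rewards, eps=1e-6):
    """Group-normalized baseline: (reward - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps (an assumed value) avoids division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four sampled completions of one prompt.
    group = [1.0, 0.0, 0.0, 1.0]
    print(rloo_advantages(group))
    print(grpo_advantages(group))
```

Both variants score a completion relative to the other samples drawn for the same prompt, so no learned value function is needed.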

Format: Safetensors
Model size: 8B params
Tensor type: BF16