Sangsang/ci-feedback_weighted_asym_bi_kl_fixed_ema_Qwen2.5-7B-Instruct_bw1p6_fw0p4_ema0p999_ep30 Text Generation • 8B • Updated 1 day ago • 24
Sangsang/ci-feedback_weighted_asym_bi_kl_fixed_ema_Qwen2.5-7B-Instruct_bw1p0_fw1p0_ema0p999_ep30 Text Generation • 8B • Updated 1 day ago • 32
Sangsang/ci-feedback_both_ema_plus_interp_Qwen2.5-7B-Instruct_jsd_b0p8_ema0p999_stw0p3_ep30 Text Generation • 8B • Updated 1 day ago • 25
Sangsang/ci-feedback_both_interp_Qwen2.5-7B-Instruct_from_Qwen2.5-7B-Instruct_jsd_b0p8_stw0p3_ep30 Text Generation • 8B • Updated 1 day ago • 33
Sangsang/ci-feedback_both_ema_Qwen2.5-7B-Instruct_jsd_b0p8_ema0p999_ep30 Text Generation • 8B • Updated 2 days ago • 37
Sangsang/ci-feedback_both_ema_Qwen2.5-7B-Instruct_reverse_kl_ema0p999_ep30 Text Generation • 8B • Updated 2 days ago • 44
Sangsang/ci-feedback_disallowed_ema_Qwen2.5-7B-Instruct_jsd_b0p8_ema0p999_ep30 Text Generation • 8B • Updated 2 days ago • 31
Sangsang/ci-feedback_disallowed_ema_Qwen2.5-7B-Instruct_reverse_kl_ema0p999_ep30 Text Generation • 8B • Updated 2 days ago • 39