Lower acceptance rate on tool-calling prompts compared to EAGLE-3

#6
by laixiaohang - opened

Hi, I've tested DFlash on my own dataset and found its acceptance rate to be comparable to, or slightly lower than, EAGLE-3's. My prompts are mainly tool-calling / function-calling related.
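
For reference, by acceptance rate I mean the fraction of drafted tokens the target model accepts per decode step. A minimal sketch of how that can be computed; the per-step counters here are hypothetical stand-ins for whatever the serving framework actually reports:

```python
# Minimal sketch: speculative-decoding acceptance metrics from
# per-step counts of proposed vs. accepted draft tokens.
# The input format is hypothetical; real frameworks expose these
# counters under framework-specific names.

def acceptance_stats(steps):
    """steps: list of (proposed, accepted) draft-token counts per decode step."""
    proposed = sum(p for p, _ in steps)
    accepted = sum(a for _, a in steps)
    rate = accepted / proposed if proposed else 0.0      # token-level acceptance rate
    mean_len = accepted / len(steps) if steps else 0.0   # mean accepted tokens per step
    return rate, mean_len

# Example: 3 decode steps, drafting 4 tokens each
rate, mean_len = acceptance_stats([(4, 3), (4, 1), (4, 2)])
print(f"acceptance rate: {rate:.2f}, mean accepted length: {mean_len:.2f}")
```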

Is tool-calling a known weak spot for the current checkpoint? Are there plans to improve this scenario (e.g., training on agent/tool-use data)?
Thanks!

Yeah, this model wasn't trained on tool-calling data. Collecting tool-calling data is somewhat difficult and slow, since we need to run in a real environment and collect multi-round interactions. We will try to collect tool-calling and agent data for the Kimi-K2.6 training, which should help improve performance on agentic tasks.

Got it, looking forward to it!

laixiaohang changed discussion status to closed
