Models used in CHARM: Calibrating Reward Models With Chatbot Arena Scores.
shawnxzhu
Datasets: 10
shawnxzhu/DSAA6000Q-Mistral-7B-Instruct-v0.2-lima-dpo • 1.03k • 2
shawnxzhu/CHARM-preference20K • 20k • 2
shawnxzhu/CHARM-preference20K-Qwen2.5-72B-Instruct • 20k • 2
shawnxzhu/CHARM-preference20K-Llama-3.1-70B-Instruct • 20k • 2
shawnxzhu/CHARM-preference20K-Llama-3.1-8B-Instruct • 20k • 2
shawnxzhu/CHARM-preference20K-GPT-4o-mini-2024-07-18 • 20k • 4
shawnxzhu/CHARM-preference20K-gemma-2-27b-it • 20k • 1
shawnxzhu/CHARM-preference20K-gemma-2-9b-it • 20k • 4
shawnxzhu/CHARM-preference20K-gemma-2-9b-it-SimPO • 20k • 4
shawnxzhu/backward-curation