---
title: Wiki Live Challenge Leaderboard
emoji: 🏆
colorFrom: yellow
colorTo: red
sdk: gradio
sdk_version: 5.9.1
python_version: "3.11"
app_file: app.py
pinned: true
license: apache-2.0
short_description: Leaderboard for Wiki Live Challenge benchmark
---

# Wiki Live Challenge Leaderboard

A live benchmark leaderboard for evaluating Deep Research Agents on Wikipedia-quality article generation.

## Features

- **Wiki Writing Evaluation**: 39 criteria from Wikipedia's Manual of Style
  - Overall, Well-written, Neutral, Broad coverage
- **Wiki Fact Evaluation**: Factual accuracy metrics
  - Coverage against Wikipedia
  - Reference accuracy

## Links

- 🌐 **Website**: [Wiki Live Challenge](http://agentresearchlab.org/benchmarks/wiki-live-challenge/index.html#home)
- 💻 **Code**: [github.com/WangShao2000/Wiki_Live_Challenge](https://github.com/WangShao2000/Wiki_Live_Challenge)
- 📊 **Dataset**: [huggingface.co/datasets/muset-ai/Wiki_Live_Challenge](https://huggingface.co/datasets/muset-ai/Wiki_Live_Challenge)

## Citation

```bibtex
@article{wang2026wikilive,
  author  = {Shaohan Wang and Benfeng Xu and Licheng Zhang and Mingxuan Du and Chiwei Zhu and Xiaorui Wang and Zhendong Mao and Yongdong Zhang},
  title   = {Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles},
  journal = {arXiv preprint},
  year    = {2026},
}
```