Catherine Arnett
catherinearnett
AI & ML interests
multilingual NLP, tokenization
Recent Activity
- Updated a dataset 1 day ago: catherinearnett/bilingual_tokenizers
- Updated a dataset 1 day ago: catherinearnett/low_german
- Updated a dataset 1 day ago: catherinearnett/komi_permyak
Organizations
Multilingual Leaderboards
Leaderboards for languages other than English
- La Leaderboard 🌸 (76 likes, running on CPU Upgrade): Evaluate open LLMs in the languages of LATAM and Spain.
- Open Chinese LLM Leaderboard 🏆 (124 likes, running on CPU Upgrade): Explore LLM benchmark scores and submit your model.
- Open Arabic LLM Leaderboard 🏆 (179 likes, running on CPU Upgrade): Track, rank, and evaluate open Arabic LLMs and chatbots.
- OpenLLM French leaderboard 🇫🇷 🥇 (40 likes, build error): Explore and submit LLM benchmarks.
Low Resource Language Datasets
B-GPT
Bilingual GPT-2 models with checkpoints
- catherinearnett/B-GPT_en_nl_simultaneous: Text Generation • 0.1B • 1.44k downloads
- catherinearnett/B-GPT_nl_en_simultaneous: Text Generation • 0.1B • 1.22k downloads
- catherinearnett/B-GPT_en_nl_sequential: Text Generation • 0.1B • 903 downloads
- catherinearnett/B-GPT_nl_en_sequential: Text Generation • 0.1B • 838 downloads
Monolingual Models with Checkpoints
Global PIQA
A physical commonsense reasoning benchmark for 100+ languages, written in collaboration with 300+ researchers from 65 countries.