Hiring 💼

Quentin Gallouédec PRO

qgallouedec

huggingface

AI & ML interests

None yet

Recent Activity

upvoted an article about 2 hours ago

Article

Preference Tuning LLMs with Direct Preference Optimization Methods

Jan 18, 2024 • 77

updated a dataset about 5 hours ago

hf-doc-build/doc-build-dev

Updated 13 minutes ago • 106k • 8

updated a Space about 20 hours ago

Benchmark Dpo Refactor

published a Space about 20 hours ago

Benchmark Dpo Refactor

updated 2 datasets 1 day ago

hf-doc-build/doc-build

Updated about 1 hour ago • 1.38M • 18

trl-internal-testing/zen-multi-image

Viewer • Updated 1 day ago • 76 • 4.09k • 1

upvoted a collection 2 days ago

AlphaGenome

Collection of AlphaGenome models. • 5 items • Updated 2 days ago • 22

New activity in google/functiongemma-tuning-lab 2 days ago

Update engine.py

#1 opened 2 days ago by

reacted to sergiopaniego's post with 🔥 2 days ago

Post

2393

New TRL + OpenEnv example! 💥

Fine-tune an LLM to play Sudoku using an RL environment via OpenEnv.

Includes a script that runs on 1 or multiple GPUs with vLLM, plus a Colab-ready notebook.

Enjoy!

Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_sudoku_grpo.ipynb

Script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/sudoku.py

updated a dataset 2 days ago

trl-lib/trackio-dataset

Viewer • Updated less than a minute ago • 3.83k • 17.6k

upvoted an article 3 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

Dec 1, 2025 • 288

updated a dataset 3 days ago

trl-lib/documentation-images

Viewer • Updated 3 days ago • 9 • 63.6k

updated a Space 4 days ago

Trackio

Track and visualize data streams in real-time

New activity in trl-lib/trackio 4 days ago

i'm really liking the GPU usage and perf tracking here

#1 opened 5 months ago by

updated a dataset 6 days ago

trl-internal-testing/toolcall

Viewer • Updated 6 days ago • 24 • 2.87k

updated a dataset 8 days ago

qgallouedec/deepmath-completions-logs2

Viewer • Updated 8 days ago • 48 • 16

published a dataset 8 days ago

qgallouedec/deepmath-completions-logs2

Viewer • Updated 8 days ago • 48 • 16

upvoted a paper 10 days ago

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published 17 days ago • 146

upvoted a paper 11 days ago

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 90

upvoted a paper 13 days ago

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 18