Here are the resources I have gathered so far (detailed version).
Executive take for your setup (simulation-only G1, chess setup, GR00T + LeRobot + Isaac Lab-Arena)
Your plan is plausible, but the hard part is not “can I run GR00T on G1 in sim.” You already did that with the Arena Docker. The hard parts are:
- Dexterity stack: chess pieces are small, need stable pinch grasps, and need mm-level placement without tipping. That is mostly controller + contacts + action interface, not “does LeRobot list G1 as supported.”
- Contracts staying consistent across: Isaac Lab observations and actions → (HDF5) → LeRobot dataset schema → GR00T data_config and embodiment_tag → evaluation runner.
- Long-horizon sequencing: full board setup is 32 placements plus regrasp and collision avoidance. You will almost certainly want a staged curriculum.
If you treat “set up a board” as a pipeline of measurable pick-and-place subtasks, and you keep your action/observation contracts boring and consistent, this becomes a solid thesis.
1) Does current LeRobot “G1 support” include chess-level dexterity, or is it locomotion-focused?
It is primarily bring-up + remote control + locomotion controller examples, not a dexterous tabletop manipulation guarantee.
On the Unitree G1 page, LeRobot’s example control step is explicitly “Run the locomotion policy,” including a GR00T WholeBodyControl locomotion controller and Holosoma locomotion controller. It also calls out a MuJoCo simulation mode toggle (is_simulation=True). (Hugging Face)
So what you should infer:
- What it gives you: connectivity patterns, robot server/client control, and reference examples that make the robot move reliably in a supported control mode. (Hugging Face)
- What it does not magically give you: a ready-made precision manipulation controller, calibrated grasp affordances, stable fingertip contact parameters for tiny pieces, or a “chess-ready” action space.
For chess, the required dexterity depends on things LeRobot cannot promise at the integration-doc level:
- End-effector model: are you using a gripper, a simplified pinch hand, or a full multi-DOF hand in simulation?
- Contact and friction realism: small pieces will slip or tip if friction/compliance is off.
- Action interface: joint position targets vs end-effector deltas. For tiny objects, you usually want a higher-level servo target (EE pose deltas + grasp) executed by IK/WBC.
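If it helps to make that concrete, here is a minimal sketch of what such a mid-level action interface could look like; the class, field names, and the clamping heuristic are illustrative assumptions, not an existing API:

```python
# Sketch of a mid-level action interface: EE pose delta + grasp command.
# All names here are illustrative assumptions, not an existing API.
from dataclasses import dataclass
import numpy as np

@dataclass
class MidLevelAction:
    ee_delta_pos: np.ndarray  # (3,) translation delta for the end effector, meters
    ee_delta_rot: np.ndarray  # (3,) axis-angle rotation delta, radians
    grasp: float              # 0.0 = fully open, 1.0 = fully closed

def clamp_action(a: MidLevelAction, max_step_m: float = 0.01) -> MidLevelAction:
    """Clamp per-step EE translation so a noisy policy cannot knock pieces over."""
    norm = float(np.linalg.norm(a.ee_delta_pos))
    if norm > max_step_m:
        a.ee_delta_pos = a.ee_delta_pos * (max_step_m / norm)
    a.grasp = float(np.clip(a.grasp, 0.0, 1.0))
    return a
```

The IK/WBC layer then turns the clamped target into joint commands, which keeps the learned policy's output space small and bounded.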
Bottom line: treat LeRobot’s G1 support as infrastructure, and plan to supply the manipulation-specific pieces yourself.
2) Are there reference configs for fine-tuning GR00T for precise tabletop manipulation on G1?
You have two kinds of “reference config,” and they matter differently:
A. Arena’s G1 loco-manipulation GR00T training and evaluation configs (best “G1 embodiment wiring” reference)
Arena’s G1 loco-manip workflow shows exactly how NVIDIA wires G1 observations/actions into LeRobot-format data and GR00T:
- Conversion uses convert_hdf5_to_lerobot.py with a YAML like g1_locomanip_config.yaml, mapping fields such as:
  - state_name_sim: "robot_joint_pos"
  - action_name_sim: "processed_actions"
  - pov_cam_name_sim: "robot_head_cam"
  - fps: 50 (Isaac Sim)
- Post-training uses gr00t_finetune.py with a G1 data config class: --data_config=isaaclab_arena_gr00t.embodiments.g1.g1_sim_wbc_data_config:UnitreeG1SimWBCDataConfig (Isaac Sim)
- Closed-loop evaluation uses a GR00T config file g1_locomanip_gr00t_closedloop_config.yaml specifying:
  - action_horizon: 16
  - data_config: unitree_g1_sim_wbc
  - joint config YAMLs for policy and action joints (Isaac Sim)
- The environment name used in LeRobot EnvHub is shown as g1_locomanip_pnp on the nvidia/isaaclab-arena-envs hub page. (Hugging Face)
This is not tabletop chess, but it is the most valuable “known-good” reference for G1 embodiment tagging + modality mapping + horizons + joints config.
B. Arena’s static manipulation GR00T training workflow (best “tabletop manipulation pipeline” reference)
Arena’s static manipulation workflow (GR1 microwave example) is the cleanest reference for the dataset → conversion → GR00T post-train path:
- It states GR00T N1.5 requires LeRobot-format data and provides the converter + YAML-driven mappings. (Isaac Sim)
- It explicitly says the converter creates a lerobot/ folder with parquet (states/actions), MP4 camera recordings, and metadata. (Isaac Sim)
- It shows example training calls and even a LoRA-style lower-hardware option in the workflow. (Isaac Sim)
For chess, you are essentially building a new “static manipulation” task, but with a G1 upper body. So you combine:
- the static manipulation pipeline structure (Isaac Sim)
- the G1 embodiment wiring patterns from the loco-manip configs (Isaac Sim)
That combination is the closest thing to “reference configs for precise tabletop manipulation on G1” that exists publicly right now.
3) Documentation you can follow to recreate GR00T VLA integration, training, fine-tuning in sim
Here is the shortest reproducible documentation chain that matches what you are trying to rebuild.
Step 0. Decide which GR00T “spine” you are using
You effectively have two mainstream options:
Option 1: GR00T N1.5 via LeRobot (LeRobot-native CLI and processors)
- GR00T N1.5 is integrated into LeRobot, including a documented lerobot-train flow and explicit dependency constraints (notably FlashAttention on CUDA). (Hugging Face)
Option 2: Arena’s Isaac-GR00T submodule scripts (Arena-documented, works with their Docker)
- Arena docs show python scripts/gr00t_finetune.py ... and the exact knobs they use for post-training and embodiment tags. (Isaac Sim)
Given you already run Arena’s Docker successfully, Option 2 is usually the least friction for a thesis timeline. You can still export to LeRobot later for broader comparisons.
Step 1. Collect demonstrations in Isaac Lab simulation
A practical reference for “teleop imitation data collection” in Isaac Lab is the IsaacLab teleop imitation demo. It shows an end-to-end imitation flow and uses an environment named Isaac-PickPlace-Locomanipulation-G1-Abs-v0 as one of its examples. (Isaac Sim)
For chess, you replicate the pattern:
- Make Isaac-ChessPlace-G1-Abs-v0 (or similar)
- Teleop episodes that do: approach → grasp → lift → place → retreat
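A cheap way to keep those episodes consistent is to make the phases explicit so each demonstration can be labeled and checked for completeness. A minimal sketch, with purely illustrative structure:

```python
# Sketch: the five teleop/scripted phases as an explicit state machine.
# Purely illustrative; adapt to however you segment demonstrations.
from enum import Enum, auto
from typing import Optional

class Phase(Enum):
    APPROACH = auto()
    GRASP = auto()
    LIFT = auto()
    PLACE = auto()
    RETREAT = auto()

PHASE_ORDER = list(Phase)  # definition order: approach -> ... -> retreat

def next_phase(p: Phase) -> Optional[Phase]:
    """Return the next phase, or None once the episode has retreated."""
    i = PHASE_ORDER.index(p)
    return PHASE_ORDER[i + 1] if i + 1 < len(PHASE_ORDER) else None
```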
Step 2. Convert Isaac Lab trajectories to LeRobot dataset format
Arena documents this explicitly:
- “GR00T N1.5 requires the dataset to be in LeRobot format”
- Conversion is done by convert_hdf5_to_lerobot.py --yaml_file ...
- Output is parquet + MP4 + metadata under a lerobot/ folder (Isaac Sim)
The YAML is where most people break things. Your chess YAML should be boring and explicit:
- state_name_sim: what you recorded as state (joint pos, EE pose, etc.)
- action_name_sim: what you recorded as action (processed_actions, joint targets, etc.)
- pov_cam_name_sim: the camera key you will always use
- fps: match your control loop (50 Hz is a common choice in Arena examples) (Isaac Sim)
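One low-effort safeguard is to generate that YAML from code, so the same dict can be checked against your recorded HDF5 keys before a large collection run. A sketch, assuming PyYAML; the key names mirror the Arena example quoted earlier, and the values are placeholders for a hypothetical chess task:

```python
import yaml  # PyYAML

# Keys mirror the Arena loco-manip example; the values are assumptions for a
# hypothetical chess task and must match exactly what your environment records.
chess_config = {
    "state_name_sim": "robot_joint_pos",     # HDF5 key holding the state stream
    "action_name_sim": "processed_actions",  # HDF5 key holding the action stream
    "pov_cam_name_sim": "robot_head_cam",    # the single camera key you standardize on
    "fps": 50,                               # must match your control-loop rate
}

with open("g1_chess_config.yaml", "w") as f:
    yaml.safe_dump(chess_config, f, sort_keys=False)
```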
Step 3. Post-train GR00T
If you do N1.5 in LeRobot, the docs give a canonical training pattern with lerobot-train, and they warn FlashAttention is required (currently) and CUDA is required. (Hugging Face)
If you do Arena’s script path, Arena shows the exact gr00t_finetune.py usage and flags. (Isaac Sim)
Step 4. Evaluate in Isaac Lab-Arena at scale (and/or through LeRobot EnvHub)
LeRobot has explicit docs for “IsaacLab Arena & LeRobot” and how evaluation uses an env, then remaps keys via a rename_map and selects state_keys and camera_keys. (Hugging Face)
This is crucial for chess, because camera naming mismatches are one of the most common failure modes.
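A hedged sketch of the kind of remapping involved; the key names are illustrative, and you should confirm the exact rename_map direction and spellings against the docs and your own dataset:

```python
# Illustrative key names only; confirm the real rename_map direction and
# spellings against the LeRobot/Arena evaluation docs and your dataset.
rename_map = {
    "robot_head_cam": "observation.images.head",  # env camera key -> policy camera key
    "robot_joint_pos": "observation.state",       # env state key  -> policy state key
}
state_keys = ["observation.state"]
camera_keys = ["observation.images.head"]

def remap_obs(env_obs: dict) -> dict:
    """Rename environment observation keys to the keys the policy was trained on."""
    return {rename_map.get(k, k): v for k, v in env_obs.items()}
```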
Also note the broader architecture context: NVIDIA’s Isaac Lab-Arena is positioned as evaluation-centric and integrates with Isaac Lab-Teleop, Isaac Lab-Mimic, and GR00T post-training. (NVIDIA Developer)
4) Thesis advice that actually moves the needle for chess setup
A. Don’t start with full-board setup
Full setup is the “boss level.” Treat it as a curriculum:
1. Single pawn placement into an empty board square
2. Single move (pick pawn from A2 to A3) with the board populated sparsely
3. Row of pawns (8 placements) with repeated grasp/place
4. Full setup (32 placements), only after (1)-(3) are stable
This makes your evaluation publishable: you can plot “sequential success vs N placements” and “success vs tolerance (mm).”
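One useful framing for why the curriculum matters: under an independence assumption, full-sequence success is roughly p^N for per-placement success p, so even 0.95 per placement gives about 0.95^32 ≈ 0.19 for a full 32-piece setup. Per-placement reliability dominates everything else.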
B. Make your environment metric-heavy
Chess is easy to score. Use that advantage.
Track at least:
- Position error: distance from piece base center to square center
- Yaw error: piece orientation error (optional for pawns, important for knights/rooks if you care)
- Uprightness: dot(up_vector, world_up), plus a “tipped” threshold
- Collisions: count collisions with other pieces or board lip
- Grasp quality: slip events, regrasp count
Arena has explicit “Metrics Design” as a first-class concept in its docs navigation, and it is aligned with the evaluation-first philosophy. (Isaac Sim)
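A minimal sketch of those per-placement metrics, assuming you can read the piece's world-frame position and rotation matrix from the sim:

```python
import numpy as np

def placement_metrics(piece_pos, piece_rot, square_center, tip_cos_threshold=0.9):
    """piece_pos: (3,) world position of the piece base; piece_rot: (3,3) world
    rotation matrix; square_center: (3,) world position of the target square."""
    position_error = float(np.linalg.norm(piece_pos[:2] - square_center[:2]))  # XY plane
    piece_up = piece_rot[:, 2]  # the piece's local +Z axis expressed in world frame
    uprightness = float(np.dot(piece_up, np.array([0.0, 0.0, 1.0])))
    return {
        "position_error_m": position_error,
        "uprightness": uprightness,          # 1.0 = perfectly upright
        "tipped": bool(uprightness < tip_cos_threshold),
    }
```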
C. Use a hierarchical control interface early
For mm placement, raw joint control from a VLA is usually fragile.
A stable pattern is:
- Policy outputs: EE delta pose + grasp open/close (mid-level)
- Controller executes: IK/WBC to produce joint targets safely
Arena’s G1 examples clearly operate with joint-space configs and a WBC-style data_config (UnitreeG1SimWBCDataConfig). (Isaac Sim)
You can keep that structure, while changing the task.
D. Budget time for “sim contact realism”
Chess pieces are contact nightmares:
- Narrow bases tip.
- Small clearances create lots of near-collisions.
- Friction and restitution matter.
A common thesis-grade contribution is: “policy + controller + domain randomization choices that make precision placement robust.” Even if you never go real-world, the ablations are meaningful.
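A sketch of the kind of randomization ranges you might ablate; the ranges are illustrative assumptions, not validated values:

```python
import numpy as np

def sample_piece_physics(rng: np.random.Generator) -> dict:
    """Draw one randomized contact-parameter set for a chess piece."""
    return {
        "static_friction": rng.uniform(0.6, 1.2),
        "dynamic_friction": rng.uniform(0.4, 1.0),
        "restitution": rng.uniform(0.0, 0.1),  # pieces should barely bounce
        "mass_scale": rng.uniform(0.8, 1.2),   # +/-20% around the nominal mass
    }
```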
Similar cases and common issues people hit online (and what they imply for your project)
These show up repeatedly in GR00T + G1 integration threads and are directly relevant to your chess pipeline.
1) “Moves once then stops” due to action chunking
GR00T often predicts an action chunk (horizon). Arena’s closed-loop config uses action_horizon: 16. (Isaac Sim)
There are issues where users effectively apply only the first action and the robot “moves once then stops.” (GitHub)
For chess, if your placement looks like “twitch then freeze,” check you are consuming the full chunk correctly.
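A minimal sketch of "consume the whole chunk"; the policy.predict and env.step interfaces here are generic assumptions, not the actual GR00T or Arena APIs:

```python
def rollout_chunked(policy, env, obs, action_horizon=16, max_steps=500):
    """Replay every action in each predicted chunk instead of only the first."""
    steps = 0
    while steps < max_steps:
        chunk = policy.predict(obs)  # assumed shape: (action_horizon, action_dim)
        assert len(chunk) == action_horizon, "policy did not return a full chunk"
        for action in chunk:
            obs, done = env.step(action)  # assumed (observation, done) return
            steps += 1
            if done or steps >= max_steps:
                return obs
    return obs
```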
2) Modality / embodiment mismatches (KeyError and missing configs)
There are reports of KeyError: 'unitree_g1' in Isaac-GR00T when modality configs do not align with the embodiment tag or expected naming. (GitHub)
Implication: lock versions, and keep a tiny “smoke test” that loads your processor/config before you generate 1000 demos.
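A day-one smoke test can be as small as parsing the closed-loop config and asserting the keys your pipeline depends on; a sketch, assuming PyYAML and the Arena file name quoted above:

```python
import yaml  # PyYAML

REQUIRED_KEYS = ["action_horizon", "data_config"]

with open("g1_locomanip_gr00t_closedloop_config.yaml") as f:
    cfg = yaml.safe_load(f)

missing = [k for k in REQUIRED_KEYS if k not in cfg]
assert not missing, f"closed-loop config is missing keys: {missing}"
```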
3) Model artifact and processor config load failures
There are issues around loading certain G1 checkpoints where the tooling complains about missing model artifacts or processor configs and falls back. (GitHub)
Implication: test-load your chosen base checkpoint on day one, not after you collect data.
4) Joint indexing mismatches between Isaac Sim and Unitree conventions
Users report joint index mismatches and needing to reorder joints for G1. (GitHub)
Implication: for chess, a subtle joint mapping bug can look like “the hand is shaky” or “grasp fails randomly.” Verify joint ordering and limits early.
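The cheap defense is to always remap by joint name rather than trusting index order. A sketch with illustrative name lists:

```python
import numpy as np

# Illustrative name lists; use the orderings your Isaac Sim asset and the
# Unitree-side consumer actually report.
ISAAC_JOINT_NAMES = ["left_shoulder_pitch", "left_shoulder_roll", "left_elbow"]
UNITREE_JOINT_NAMES = ["left_shoulder_roll", "left_shoulder_pitch", "left_elbow"]

# Permutation such that unitree_vec[i] == isaac_vec[perm[i]]
perm = [ISAAC_JOINT_NAMES.index(name) for name in UNITREE_JOINT_NAMES]

def isaac_to_unitree(q_isaac: np.ndarray) -> np.ndarray:
    """Reorder a joint vector from Isaac Sim ordering to Unitree ordering."""
    return q_isaac[perm]
```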
5) Dataset format footguns
Missing metadata files (episodes.jsonl, tasks.jsonl, modality.json) or mismatched camera keys break training. NVIDIA’s own datasets describe these files explicitly, and Arena docs emphasize the conversion step creates a full LeRobot dataset bundle. (Hugging Face)
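A pre-flight check that fails fast on these footguns; the meta/ location is an assumption about your converter's output layout, so adjust the path to what it actually produces:

```python
from pathlib import Path

REQUIRED_META = ["episodes.jsonl", "tasks.jsonl", "modality.json", "info.json"]

def missing_metadata(dataset_root: str) -> list:
    """Return the metadata files missing from a converted LeRobot bundle."""
    meta_dir = Path(dataset_root) / "meta"  # assumed location; verify for your converter
    return [name for name in REQUIRED_META if not (meta_dir / name).exists()]
```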
Benchmarks, leaderboards, comparisons that are actually useful to reference in your writeup
LIBERO
LIBERO is a widely used manipulation benchmark with multiple suites and task generation. (GitHub)
LeRobot has a dedicated “Evaluating with LIBERO” doc and supports multi-suite evaluation. (Hugging Face)
There is also a public LIBERO VLA leaderboard space. (Hugging Face)
Use LIBERO in your thesis as:
- A sanity-check baseline environment for your training stack
- A reference for reporting success rates and multi-task evaluation
Isaac Lab-Arena ecosystem benchmarks
NVIDIA explicitly states partnerships to build task suites and future benchmarks on Arena, including Lightwheel RoboCasa tasks and Lightwheel LIBERO tasks. (NVIDIA Developer)
This matters because your chess environment can be framed as “a new high-precision arrangement benchmark task in Arena style.”
VLA leaderboard aggregators
There is a general VLA leaderboard site that tracks VLA benchmarks across sim environments. (VLA Leaderboard)
Treat this as a discovery tool, then cite primary papers for anything you rely on.
“Chess-like” long-horizon arrangement benchmarks
RoboCAS is explicitly about complex object arrangement scenarios and long-horizon planning in simulation, which is conceptually similar to chess setup. (arXiv)
“Good models on Hugging Face” that are relevant (and current as of late 2025)
These are concrete, Arena-aligned models you can use as baselines or sanity checks.
GR00T family
- nvidia/GR00T-N1.5-3B (foundation base) (Hugging Face)
- nvidia/GN1x-Tuned-Arena-G1-Loco-Manipulation (a tuned N1.5 checkpoint aligned with the Arena G1 loco-manip task) (Hugging Face)
- N1.6 G1 checkpoint listings exist on HF model index pages (including GR00T-N1.6-G1-PnPAppleToPlate), but expect more "moving parts" around modality configs and artifacts based on public issues. (GitHub)
Strong non-GR00T baselines in the same Arena ecosystem (useful for comparisons)
- nvidia/pi05-arena-gr1-microwave (Pi05 policy, updated Dec 2025) (Hugging Face)
- nvidia/smolvla-arena-gr1-microwave (SmolVLA policy) (Hugging Face)
Even if your thesis focuses on GR00T, having a second policy family as a baseline makes your evaluation section stronger.
“Good datasets on Hugging Face” that fit your preference (no custom builder script)
Two practical rules if you want “downloadable datasets you can read directly”:
- Prefer datasets that already ship in LeRobot-format folders (parquet + MP4 + metadata).
- Prefer datasets that explicitly list the metadata files and modalities.
Arena datasets with pre-converted LeRobot data
Arena docs explicitly show you can download --include lerobot/* to get the pre-converted LeRobot dataset and skip conversion. (Isaac Sim)
Examples:
- nvidia/Arena-GR1-Manipulation-Task (GR1 microwave task) (Isaac Sim)
- nvidia/Arena-G1-Loco-Manipulation-Task (G1 pick-and-place loco-manip) (Isaac Sim)
NVIDIA PhysicalAI GR00T tuned tasks dataset (explicit LeRobot-format description)
nvidia/PhysicalAI-GR00T-Tuned-Tasks explicitly describes providing both HDF5 and “GR00T-Lerobot formatted datasets,” including the metadata files (episodes.jsonl, tasks.jsonl, modality.json, info.json) and MP4 videos. (Hugging Face)
LeRobotDataset v3.0 as your target publishing format
If you plan to publish your chess dataset, LeRobotDataset v3.0 is designed for parquet + MP4 with scalable metadata and streaming. (Hugging Face)
This aligns with your “no builder script” preference because the format is explicitly file-structured and Hub-native.
A concrete “do this next” plan (optimized for thesis execution)
Week 1: Contracts and smoke tests
- Implement a minimal ChessPlace environment with one piece and one target square.
- Record 2 episodes and run end-to-end: HDF5 → LeRobot conversion → one training step → one evaluation rollout.
- Lock your observation keys and camera names.
Week 2: Controller stability before scale
- Get reliable grasps and placements for a single pawn.
- Tune contact parameters until you can place 10 pawns in a row with scripted IK. Only then scale demos.
Weeks 3 to 5: Data and curriculum
- Scale teleop demonstrations through the curriculum stages from section 4A (single placement → single move → row of pawns).
Weeks 6+: Long-horizon and reporting
- Evaluate sequential placement success vs N.
- Write a failure taxonomy section with videos.
Summary (key points)
- LeRobot “G1 support” is mostly bring-up + locomotion examples, not chess-level dexterity.
- The best public “G1 + GR00T wiring” references are Arena’s G1 loco-manip configs and workflows.
- The best “tabletop manipulation pipeline” reference is Arena’s static manipulation workflow (conversion + post-training).
- Biggest recurring pitfalls: action chunking, embodiment/modality mismatches, model artifact loading, and dataset key/camera naming. (GitHub)
- For a thesis, win by: curriculum, metrics, controller stability, and reproducible end-to-end smoke tests.