## Choose a Policy Architecture

### ACT (Action Chunking with Transformers)

Fast training (1–2 h on a 16 GB GPU). Excellent for pick-and-place and other short-horizon tasks. The default choice for O6 tabletop manipulation.
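The "chunking" in ACT refers to predicting a block of future actions per inference rather than one action at a time. A minimal sketch of how a chunk is consumed at control time (illustrative structure only, not the actual LeRobot rollout loop):

```python
from collections import deque

CHUNK_SIZE = 100  # matches --policy.chunk_size below


def predict_chunk(observation):
    # Stand-in for the ACT forward pass: returns CHUNK_SIZE future actions.
    return [observation + i for i in range(CHUNK_SIZE)]


def rollout(observations):
    """Re-query the policy only when the current chunk is exhausted."""
    queue = deque()
    executed = []
    for obs in observations:
        if not queue:                      # chunk exhausted -> new inference
            queue.extend(predict_chunk(obs))
        executed.append(queue.popleft())   # execute one action per control tick
    return executed


actions = rollout(range(250))
# 250 control ticks with chunk_size=100 -> only 3 policy inferences
```

Larger chunks mean fewer inference calls and smoother trajectories, at the cost of reacting more slowly to disturbances mid-chunk.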
### Diffusion Policy

Slower training (3–5 h), but handles multi-modal action distributions better. Use it for tasks with multiple valid completion paths.
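To see why multi-modality matters, consider a hypothetical 1-D case where demonstrators go around an obstacle on the left (-1) or the right (+1) with equal frequency. A mean-squared-error regressor converges to the average action, which is a path no demonstrator ever took:

```python
# Two equally valid modes in the demonstrations (left vs. right detour).
demo_actions = [-1.0, +1.0, -1.0, +1.0]

# An MSE-trained regressor converges to the mean of its targets...
mse_prediction = sum(demo_actions) / len(demo_actions)
print(mse_prediction)  # 0.0 -- straight into the obstacle

# A diffusion policy instead samples from the learned action distribution,
# so each rollout commits to one mode rather than averaging them.
```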
## Step 1: Train with ACT

```bash
python -m lerobot.scripts.train \
  --dataset_repo_id=your-username/o6-pick-place \
  --policy.type=act \
  --policy.chunk_size=100 \
  --training.num_epochs=200 \
  --training.batch_size=32 \
  --training.lr=1e-4 \
  --output_dir=./checkpoints/o6-act-v1

# Monitor training loss
tensorboard --logdir=./checkpoints/o6-act-v1/logs/
```
ACT training typically stabilizes within 100–150 epochs on a 50-episode dataset. Watch for the validation loss to plateau before stopping. Do not stop early based on training time alone.
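The plateau rule above can be made mechanical: stop once the best validation loss has not improved by some minimum margin over a window of recent evaluations. A sketch (function and parameter names are illustrative, not LeRobot API):

```python
def should_stop(val_losses, patience=10, min_delta=1e-3):
    """Return True when the last `patience` evals show no improvement of
    at least `min_delta` over the best loss seen before that window."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta


losses = [1.0, 0.6, 0.4, 0.35, 0.34] + [0.335] * 12
print(should_stop(losses))  # True: no meaningful improvement in 10 evals
```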
## Step 2: Offline Evaluation

Before deploying to hardware, evaluate the checkpoint on held-out episodes from your dataset:

```bash
python -m lerobot.scripts.eval \
  --pretrained_policy_name_or_path=./checkpoints/o6-act-v1 \
  --env.type=linker_bot_o6_sim \
  --eval.n_episodes=20 \
  --eval.batch_size=10
```
If your environment does not have a sim model, use the dataset replay evaluator instead:

```bash
python -m lerobot.scripts.visualize_dataset \
  --repo-id your-username/o6-pick-place \
  --episode-index 0 \
  --policy-checkpoint ./checkpoints/o6-act-v1
```
## Step 3: Live Deployment on the O6

```bash
python -m lerobot.scripts.control_robot \
  --robot.type=linker_bot_o6 \
  --control.type=evaluate \
  --pretrained_policy_name_or_path=./checkpoints/o6-act-v1 \
  --control.fps=30 \
  --control.num_episodes=20 \
  --control.episode_time_s=30
```
For each trial:
- Place the object in the standard starting position.
- Confirm the arm is in home position.
- Start the episode. Do not intervene unless the arm is about to contact itself or the mount.
- Record success (task completed) or failure, and note the failure mode.
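The per-trial bookkeeping above can be as simple as a list of records plus a failure-mode tally. A sketch (the record fields are assumptions for illustration; log whatever your evaluation setup actually emits):

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Trial:
    success: bool
    failure_mode: str = ""  # e.g. "missed grasp", "dropped in transit"


trials = [
    Trial(True),
    Trial(False, "missed grasp"),
    Trial(True),
    Trial(False, "missed grasp"),
    Trial(True),
]

success_rate = sum(t.success for t in trials) / len(trials)
print(f"{success_rate:.0%}")  # 60%

# Tally failure modes -- this feeds the data flywheel below.
modes = Counter(t.failure_mode for t in trials if not t.success)
print(modes.most_common(1))  # [('missed grasp', 2)]
```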
## Interpreting Results
| Success Rate | Action |
|---|---|
| ≥70% | Path complete. Share your results and contribute your dataset. |
| 50–69% | Collect 20–30 more episodes targeting your top failure mode. Retrain. |
| Below 50% | Review dataset quality. Check camera alignment and starting position consistency. Consider a simpler task variant. |
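Keep in mind that these thresholds are point estimates, and with only 20 trials the uncertainty is large. The standard Wilson score interval (here at 95% confidence, z = 1.96) makes that explicit:

```python
import math


def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half


lo, hi = wilson_interval(14, 20)   # 14/20 = 70% observed
print(f"{lo:.0%} - {hi:.0%}")      # roughly 48% - 85%
```

A 14/20 result clears the 70% bar nominally, but the interval shows the true rate could plausibly be near 50%, which is another reason to keep logging trials after "passing."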
## The Data Flywheel
Continuous improvement after your first deployment:
- Identify your top 2–3 failure modes from the evaluation log.
- Collect 20–30 targeted demonstrations covering those failures.
- Mix the new episodes with your original dataset (50/50, or weighted toward failures).
- Retrain from scratch (not from checkpoint — fresh training avoids catastrophic forgetting on small datasets).
- Re-evaluate. Repeat until target success rate is reached.
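The "weight toward failures" mixing in step 3 can be sketched as oversampling the targeted episodes when building each training epoch (assuming plain episode lists here; a real LeRobot dataset would be merged with its own dataset tooling):

```python
import random

random.seed(0)
original = [f"orig_ep_{i}" for i in range(50)]
targeted = [f"fail_ep_{i}" for i in range(25)]   # failure-mode demos


def mixed_epoch(original, targeted, failure_weight=2):
    """One epoch's episode order: targeted episodes repeated
    `failure_weight` times, shuffled in with the originals."""
    epoch = original + targeted * failure_weight
    random.shuffle(epoch)
    return epoch


epoch = mixed_epoch(original, targeted)
print(len(epoch))  # 100 episodes: 50 original + 25 targeted x 2
```

With `failure_weight=2`, targeted demonstrations make up half of each epoch even though they are only a third of the episodes on disk.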
## Sharing Your Results

```bash
# Push your trained policy to HuggingFace Hub
python -m lerobot.scripts.push_policy_to_hub \
  --pretrained_policy_path=./checkpoints/o6-act-v1 \
  --repo-id=your-username/o6-pick-place-act
```
## Unit 5 Complete When...
Your ACT (or Diffusion Policy) checkpoint achieves ≥70% success rate across 20 live evaluation trials on the O6. You have logged all trial results and identified any remaining failure modes. Your dataset and policy checkpoint are backed up and optionally shared on HuggingFace Hub.
## Path Complete
You have gone from unboxing the LinkerBot O6 to a working imitation learning policy in production. Share your results in the SVRC Forum and contribute your dataset to the dataset registry.