Data Collection
Humanoid data collection is fundamentally different from arm-only workflows. The K1 has 22+ degrees of freedom, must maintain balance during teleoperation, and requires synchronized multi-modal capture. This page covers the challenges, methods, dataset format, and safety protocol.
Humanoid Data Collection Challenges
Collecting high-quality demonstrations on a full-size humanoid requires addressing challenges that don't exist on desktop arms.
Balance During Teleoperation
The K1 must maintain whole-body balance while the operator controls the arms. Arm movements shift the center of mass, requiring the locomotion controller to compensate continuously. Rapid arm commands can destabilize the robot.
High-Dimensional State
Full-body joint state includes 22 DOF plus IMU, head pose, and optional hand state — 30+ dimensions per timestep. Dataset files are significantly larger than arm-only datasets. Storage planning is essential.
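To make "storage planning" concrete, a back-of-envelope estimate for one 60-second episode can be sketched as follows. The telemetry rate, float width, and video bitrate below are illustrative assumptions, not K1 specifications:

```python
# Rough storage estimate for one 60-second episode.
# Rates and bitrates are illustrative assumptions, not K1 specs.
T = 60 * 40                    # 60 s of joint telemetry at 40 Hz
joint_bytes = T * 44 * 8       # [T, 44] float64 joint_states.npy
imu_bytes = T * 6 * 8          # [T, 6] float64 IMU samples
video_mbps = 4                 # assumed H.264 bitrate per 720p stream
video_bytes = 4 * (video_mbps * 1e6 / 8) * 60  # four streams, 60 s
total_mb = (joint_bytes + imu_bytes + video_bytes) / 1e6
print(round(total_mb))         # ~121 MB, dominated by video
```

Even with these conservative numbers, a few hundred episodes run into tens of gigabytes, which is why storage planning matters before a collection campaign.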
Multi-Camera Synchronization
Humanoid tasks typically require egocentric (head-mounted) and exocentric (external) cameras. Synchronizing multiple video streams with joint telemetry at 50 Hz+ requires careful pipeline design.
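One common way to align streams recorded at different rates is nearest-timestamp matching after the fact. The sketch below uses synthetic timestamps and is not the actual capture pipeline; it shows why the worst-case alignment error is half the joint-telemetry period:

```python
import numpy as np

# Align each camera frame (30 fps) to the nearest joint-state sample (50 Hz).
# Timestamps here are synthetic; a real pipeline uses recorded unix timestamps.
joint_ts = np.arange(0.0, 2.0, 1 / 50)   # 50 Hz joint telemetry
frame_ts = np.arange(0.0, 2.0, 1 / 30)   # 30 fps camera frames

idx = np.searchsorted(joint_ts, frame_ts)        # first joint sample >= frame
idx = np.clip(idx, 1, len(joint_ts) - 1)
left_closer = (frame_ts - joint_ts[idx - 1]) < (joint_ts[idx] - frame_ts)
nearest = np.where(left_closer, idx - 1, idx)

# Worst-case error is half the joint period: 10 ms at 50 Hz.
err_ms = np.abs(joint_ts[nearest] - frame_ts) * 1000
print(err_ms.max() <= 10.0 + 1e-6)
```

Hardware-triggered capture avoids this error entirely, but nearest-timestamp matching is a reasonable fallback when cameras free-run.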
Operator Fatigue
VR-based whole-body teleoperation is physically demanding. Sessions longer than 30 minutes per operator significantly degrade demonstration quality. Plan for operator rotation in extended collection campaigns.
Teleoperation Methods for Humanoids
Two primary methods are supported for upper-body teleoperation. Locomotion is controlled separately, either via velocity commands from a gamepad or autonomously.
VR Whole-Body Teleoperation (Recommended)
Uses Meta Quest 3 or similar VR headset to track operator head and hand pose. The K1's head and arm joints mirror the operator's movements in real time. Provides the most natural and expressive demonstrations.
Setup: Quest 3 + SteamVR, k1_vr_teleop ROS2 node, operator wears gloves for hand tracking.
Latency: ~20 ms head, ~40 ms arm, end-to-end.
Best for: Manipulation tasks, pick-and-place, whole-body loco-manipulation.
Leader-Follower Upper Body (Advanced)
A second human-scale exoskeleton or leader arm system mirrors the follower K1's upper body. Joint angles are mapped directly from leader to follower. Does not require VR hardware.
Setup: Requires a compatible leader arm system (e.g., OpenArm bimanual kit or custom exoskeleton). Contact SVRC for partner configurations.
Best for: Precise bimanual manipulation where tracking accuracy is critical.
Locomotion during teleoperation
Upper-body teleoperation is typically combined with gamepad-controlled locomotion. The operator uses a wireless gamepad to command walking velocity while the VR system controls the arms and head:
# Launch combined teleop: VR for upper body + gamepad for locomotion
ros2 launch k1_teleop k1_combined_teleop.launch.py \
vr_device:=quest3 \
gamepad:=xbox \
robot_ip:=192.168.10.102
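The launch file above wires the gamepad to the locomotion controller. Because rapid velocity changes can destabilize the robot (see "Balance During Teleoperation"), gamepad input is typically clamped and slew-rate limited. The following is an illustrative sketch of that idea, not the `k1_teleop` implementation; all constants and the `step_velocity` helper are assumptions:

```python
# Illustrative velocity shaping for gamepad locomotion commands:
# clamp the stick, then slew-rate limit so the commanded velocity
# cannot jump fast enough to destabilize the balance controller.
MAX_VX = 0.5      # assumed max forward velocity, m/s
MAX_ACCEL = 0.5   # assumed max velocity change, m/s^2
DT = 0.02         # 50 Hz command loop

def step_velocity(prev_vx: float, stick: float) -> float:
    """Advance the commanded velocity one tick toward the stick target."""
    target = max(-1.0, min(1.0, stick)) * MAX_VX
    max_delta = MAX_ACCEL * DT
    delta = max(-max_delta, min(max_delta, target - prev_vx))
    return prev_vx + delta

vx = 0.0
for _ in range(10):               # stick held fully forward for 10 ticks
    vx = step_velocity(vx, 1.0)
print(round(vx, 3))               # ramps at 0.01 m/s per tick -> 0.1
```

With these assumed limits, a full-stick command takes half a second to reach maximum velocity, giving the whole-body controller time to shift the center of mass.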
Whole-Body Dataset Format (30+ DoF)
Each episode records synchronized joint state, camera frames, and metadata. The format is compatible with LeRobot and HuggingFace datasets.
Episode structure
episode_000001/
joint_states.npy # [T, 44] — interleaved positions and velocities for 22 joints
imu.npy # [T, 6] — accel (3) + gyro (3) from torso IMU
head_pose.npy # [T, 2] — yaw and pitch in radians
head_cam.mp4 # 1280x720 @ 30 fps, head-mounted egocentric
left_cam.mp4 # 1280x720 @ 30 fps, left wrist
right_cam.mp4 # 1280x720 @ 30 fps, right wrist
external_cam.mp4 # 1920x1080 @ 30 fps, fixed external view
timestamps.npy # [T] unix timestamps for joint_states
metadata.json # task name, operator, duration, success label
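A loader that validates an episode directory against this layout might look like the sketch below. `validate_episode` is a hypothetical helper, and the demo writes a synthetic episode to a temp directory since real recordings come from `k1_agent.py`:

```python
import json
import tempfile
from pathlib import Path

import numpy as np

def validate_episode(ep_dir: Path) -> dict:
    """Check one episode directory against the documented base-K1 schema."""
    js = np.load(ep_dir / "joint_states.npy")
    imu = np.load(ep_dir / "imu.npy")
    head = np.load(ep_dir / "head_pose.npy")
    ts = np.load(ep_dir / "timestamps.npy")
    meta = json.loads((ep_dir / "metadata.json").read_text())
    T = js.shape[0]
    assert js.shape == (T, 44), "expected 22 joints x (pos, vel)"
    assert imu.shape == (T, 6) and head.shape == (T, 2) and ts.shape == (T,)
    assert np.all(np.diff(ts) > 0), "timestamps must be strictly increasing"
    return {"timesteps": T, "task": meta.get("task")}

# Synthetic episode for demonstration only.
tmp = Path(tempfile.mkdtemp())
T = 100
np.save(tmp / "joint_states.npy", np.zeros((T, 44)))
np.save(tmp / "imu.npy", np.zeros((T, 6)))
np.save(tmp / "head_pose.npy", np.zeros((T, 2)))
np.save(tmp / "timestamps.npy", np.arange(T) / 40.0)
(tmp / "metadata.json").write_text(json.dumps({"task": "pick up red block"}))
info = validate_episode(tmp)
print(info["timesteps"])
```

Running a check like this at upload time catches truncated or mismatched files before they reach the training set.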
Joint state schema (22 joints × 2 values each)
# joint_states.npy shape: [timesteps, 44]
# Columns: [q0_pos, q0_vel, q1_pos, q1_vel, ..., q21_pos, q21_vel]
# Joint index mapping:
# 0-5: Left leg (hip_pitch, hip_roll, hip_yaw, knee, ankle_pitch, ankle_roll)
# 6-11: Right leg (same order)
# 12: Waist (yaw)
# 13: Head yaw
# 14: Head pitch
# 15-21: Left arm (shoulder_pitch, shoulder_roll, shoulder_yaw,
# elbow_pitch, wrist_pitch, wrist_roll, wrist_yaw)
# 22-28: Right arm (same order)
# Note: the mapping above covers the extended 29-joint K1 config;
# the base K1 exposes 22 joints, which gives the [T, 44] shape above
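Given the interleaved layout (column 2·i is position and 2·i+1 is velocity for joint index i), a single joint's trajectory can be sliced out as in this sketch. `JOINT_INDEX` lists only a few names from the mapping above, and the demo uses synthetic data:

```python
import numpy as np

# Column 2*i holds position, column 2*i + 1 holds velocity for joint i.
JOINT_INDEX = {                 # subset of the documented index mapping
    "left_hip_pitch": 0,
    "waist_yaw": 12,
    "head_yaw": 13,
    "left_shoulder_pitch": 15,
}

def joint_trajectory(joint_states: np.ndarray, name: str):
    """Return (position, velocity) columns for the named joint."""
    i = JOINT_INDEX[name]
    return joint_states[:, 2 * i], joint_states[:, 2 * i + 1]

# Demo on synthetic data: put a ramp in the head-yaw position column.
T = 50
js = np.zeros((T, 44))
js[:, 2 * 13] = np.linspace(0.0, 0.5, T)
pos, vel = joint_trajectory(js, "head_yaw")
print(round(float(pos[-1]), 2))   # 0.5
```

Keeping the name-to-index mapping in one place avoids off-by-one errors when the interleaved array is consumed by training code.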
Recording a session with k1_agent.py
# Start the platform agent (streams telemetry to RoboticsCenter)
python k1_agent.py \
--robot-ip 192.168.10.102 \
--platform-url https://fearless-backend-533466225971.us-central1.run.app \
--record \
--task "pick up red block" \
--cameras head_cam,left_wrist,right_wrist,external
# Episodes auto-numbered and saved to ./recordings/
Convert to LeRobot format
python convert_k1_to_lerobot.py \
--input-dir ./recordings/ \
--output-dir ./dataset/ \
--repo-id your-username/k1-pick-place
Safety Protocol During Data Collection
- ✓ Spotter required at all times — one dedicated person monitors the robot and holds the e-stop. The teleoperator cannot simultaneously monitor safety.
- ✓ 3 m × 3 m clear perimeter — no bystanders, no cables, no equipment in the operational area during any live session.
- ✓ Episode duration limit: 60 seconds — keep episodes short. Shorter episodes are easier to quality-filter and reduce risk from prolonged operation.
- ✓ 30-minute operator rotation — rotate teleoperators every 30 minutes in VR sessions. Fatigue degrades demonstration quality and increases error rates.
- ✓ Immediately abort and enter DAMP on any instability — if the K1 shows any unexpected oscillation or drift, hit the e-stop and restart from DAMP. Do not try to stabilize manually.
- ✓ Log all incidents — document any falls, near-falls, or aborted episodes. This data is useful for dataset quality filtering and for improving safety procedures.
Episode Quality Checklist
Review each episode before adding it to your training dataset. Poor-quality demonstrations will degrade your policy.
- ✓ The task was completed successfully end-to-end (no partial completions in training data)
- ✓ Robot maintained stable balance throughout — no stumbles, oscillations, or compensatory jerks
- ✓ All camera streams have complete frames with no dropped segments
- ✓ Joint state timestamps are continuous (no gaps > 25 ms at 40 Hz recording)
- ✓ Demonstration is smooth and deliberate — not rushed, not over-corrected
- ✓ The object and task scene are visible in at least two camera streams throughout
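The timestamp-continuity check above is easy to automate. A minimal sketch with synthetic timestamps and a hypothetical `has_gaps` helper:

```python
import numpy as np

def has_gaps(timestamps: np.ndarray, max_gap_s: float = 0.025) -> bool:
    """True if any inter-sample interval exceeds the allowed gap
    (one period at the 40 Hz recording rate, plus float tolerance)."""
    return bool(np.any(np.diff(timestamps) > max_gap_s + 1e-9))

good = np.arange(40) / 40.0                    # clean 40 Hz second
bad = np.concatenate([good[:20], good[25:]])   # 5 dropped samples -> 150 ms gap
print(has_gaps(good), has_gaps(bad))           # False True
```

Filtering episodes with a check like this before conversion keeps dropped-telemetry segments out of the training dataset.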