OpenPIE-0.6: Open-source Pi0.6 Implementation

The first fully open-source PyTorch implementation of Physical Intelligence's pi0.6 robot policy model, trained with RECAP.

Quick Start

Install the dependencies:

pip install huggingface_hub safetensors torch

Then download and load the weights:
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
import torch

# Download model files
policy_path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename="policy.safetensors")
value_path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename="value_fn.safetensors")
config_path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename="config.json")

# Load weights
policy_weights = load_file(policy_path)
value_weights = load_file(value_path)

print(f"Policy model: {len(policy_weights)} tensors, {sum(t.numel() for t in policy_weights.values())/1e9:.2f}B params")
print(f"Value function: {len(value_weights)} tensors, {sum(t.numel() for t in value_weights.values())/1e9:.2f}B params")

Output:

Policy model: 812 tensors, 5.91B params
Value function: 638 tensors, 1.31B params

Complete Working Example

Here's a full example showing how to load and use the model weights:

import torch
import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
from safetensors import safe_open

# ============================================================
# Step 1: Download model from HuggingFace
# ============================================================
repo_id = "exla-ai/openpie-0.6"

policy_path = hf_hub_download(repo_id=repo_id, filename="policy.safetensors")
value_path = hf_hub_download(repo_id=repo_id, filename="value_fn.safetensors")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")

# ============================================================
# Step 2: Load configuration
# ============================================================
with open(config_path) as f:
    config = json.load(f)

print(f"Action dim: {config['action_dim']}")      # 14 (dual 7-DOF arms)
print(f"Action horizon: {config['action_horizon']}")  # 50 steps
print(f"State dim: {config['state_dim']}")        # 14

# ============================================================
# Step 3: Inspect model structure
# ============================================================
with safe_open(policy_path, framework="pt") as f:
    keys = list(f.keys())

# Group tensors by component
components = {}
for key in keys:
    component = key.split(".")[0]
    if component not in components:
        components[component] = []
    components[component].append(key)

print("\nPolicy model components:")
for comp, comp_keys in sorted(components.items()):
    print(f"  - {comp}: {len(comp_keys)} tensors")

# Output:
#   - action_in_proj: 2 tensors
#   - action_out_proj: 2 tensors
#   - paligemma_with_expert: 804 tensors
#   - time_mlp_in: 2 tensors
#   - time_mlp_out: 2 tensors

# ============================================================
# Step 4: Load weights
# ============================================================
policy_weights = load_file(policy_path)
value_weights = load_file(value_path)

# Key tensor shapes:
print("\nKey tensor shapes:")
print(f"  action_in_proj.weight: {policy_weights['action_in_proj.weight'].shape}")   # [2048, 14]
print(f"  action_out_proj.weight: {policy_weights['action_out_proj.weight'].shape}") # [14, 2048]

# ============================================================
# Step 5: Use the weights (example with action projection)
# ============================================================
device = "cuda" if torch.cuda.is_available() else "cpu"

# Get the action projection layers. Each projection has a weight and a
# bias, matching the "2 tensors" per component shown in Step 3.
action_in = policy_weights["action_in_proj.weight"].to(device).to(torch.bfloat16)
action_in_bias = policy_weights["action_in_proj.bias"].to(device).to(torch.bfloat16)
action_out = policy_weights["action_out_proj.weight"].to(device).to(torch.bfloat16)
action_out_bias = policy_weights["action_out_proj.bias"].to(device).to(torch.bfloat16)

# Example: process a robot state through the projection layers.
# Note: this exercises only the input/output projections; in the full
# policy, the PaliGemma backbone and action expert sit between them.
robot_state = torch.randn(1, 14, device=device, dtype=torch.bfloat16)  # current joint positions

hidden = torch.nn.functional.linear(robot_state, action_in, action_in_bias)
hidden = torch.nn.functional.gelu(hidden)
actions = torch.nn.functional.linear(hidden, action_out, action_out_bias)

print(f"\nInput robot state: {robot_state.shape}")   # [1, 14]
print(f"Output actions: {actions.shape}")             # [1, 14]
print(f"  Left arm (7D):  {actions[0, :7].cpu().float().numpy().round(3)}")
print(f"  Right arm (7D): {actions[0, 7:].cpu().float().numpy().round(3)}")

Model Components

The model consists of:

| Component | Tensors | Parameters | Description |
|---|---|---|---|
| paligemma_with_expert | 804 | ~5.9B | PaliGemma VLM + Gemma Action Expert |
| action_in_proj | 2 | 28K | Robot state input projection |
| action_out_proj | 2 | 28K | Action output projection |
| time_mlp_in/out | 4 | 8M | Timestep embedding |
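
You can verify this breakdown directly from the checkpoint. A minimal sketch, reusing the policy_weights dictionary loaded in the Quick Start:

from collections import defaultdict

# Sum parameter counts per top-level component (policy_weights comes
# from load_file(policy_path) above).
param_counts = defaultdict(int)
for name, tensor in policy_weights.items():
    param_counts[name.split(".")[0]] += tensor.numel()

for component, count in sorted(param_counts.items()):
    print(f"  - {component}: {count/1e6:.2f}M params")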

What is OpenPIE-0.6?

OpenPIE-0.6 is a fully open-source reimplementation of Physical Intelligence's pi0.6 model. Unlike the original closed-source model, OpenPIE-0.6 provides:

  • Full PyTorch implementation (no JAX/Flax dependencies)
  • Pre-trained weights you can use immediately
  • Training code to reproduce or fine-tune on your own data
  • Apache 2.0 license for commercial use

Comparison: OpenPIE-0.6 vs Original pi0.6

| Feature | Original pi0.6 | OpenPIE-0.6 |
|---|---|---|
| Open Source | No (closed) | Yes (Apache 2.0) |
| Framework | JAX/Flax | PyTorch |
| Pre-trained Weights | Not released | Available |
| Training Code | Not released | Available |
| Fine-tuning | Not possible | Fully supported |
| Commercial Use | Restricted | Allowed |

Performance Comparison

| Metric | OpenPIE-0.6 | pi0.6 Paper Reference | Status |
|---|---|---|---|
| Action MSE | 0.010 | ~0.01 | Match |
| Value Correlation | 0.986 | >0.8 | Exceeds |
| Advantage Gap | 0.070 | >0.05 | Exceeds |
| Throughput | 22 act/s | ~20 act/s | Exceeds |

Model Architecture

OpenPIE-0.6 (5.91B policy + 1.31B value = 7.22B total)
β”œβ”€β”€ Vision Encoder: SigLIP (384x384 images)
β”œβ”€β”€ Base VLM: PaliGemma (Gemma 2B backbone)
β”œβ”€β”€ Action Expert: Gemma 2B (cross-attention with VLM)
β”œβ”€β”€ Value Function: 1.31B params (distributional, 1024 bins)
└── Action Space: 14D continuous (7 DOF left arm + 7 DOF right arm)
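
The value function is distributional: instead of a single scalar, it predicts logits over 1024 return bins. A minimal sketch of how such a head is typically decoded to a scalar value; the [0, 1] bin range here is an illustrative assumption, not the checkpoint's actual calibration:

import torch

# Hypothetical decoding of a distributional value head: softmax the
# 1024 bin logits and reduce to the expected return.
num_bins = 1024
bin_centers = torch.linspace(0.0, 1.0, num_bins)  # assumed bin range

def decode_value(logits: torch.Tensor) -> torch.Tensor:
    """logits: [batch, 1024] -> expected value per example."""
    probs = torch.softmax(logits.float(), dim=-1)
    return (probs * bin_centers).sum(dim=-1)

value = decode_value(torch.randn(1, num_bins))
print(value.shape)  # torch.Size([1])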

Training Details

OpenPIE-0.6 was trained using the RECAP algorithm (RL with Experience and Corrections via Advantage-conditioned Policies):

| Phase | Steps | Description |
|---|---|---|
| Value Function | 5,000 | Train distributional value predictor |
| Policy Warmup | 10,000 | Standard behavior cloning |
| RECAP Training | 20,000 | Advantage-conditioned policy learning |
| Total | 35,000 | ~6 hours on 8x A100 80GB |
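
At the core of the RECAP phase, the value function turns each trajectory step into an advantage signal, and the policy is conditioned on that signal during training. A minimal sketch of the idea; the binary indicator and the example numbers are illustrative assumptions, and the actual conditioning scheme follows the pi0.6 paper:

import torch

# Illustrative advantage computation, assuming per-step returns and
# value-function estimates V(s) are available for a trajectory.
returns = torch.tensor([0.9, 0.2, 0.7])   # observed returns (example values)
values = torch.tensor([0.5, 0.5, 0.5])    # value-function estimates V(s)

advantages = returns - values              # A(s, a) = R - V(s)

# Condition the policy on whether each action beat the value baseline;
# at inference time the indicator would be pinned to "good" so the
# policy imitates only the advantageous behavior.
advantage_token = (advantages > 0).long()  # tensor([1, 0, 1])
print(advantage_token)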

Key Hyperparameters

batch_size: 4 (per GPU) x 8 GPUs x 4 accumulation = 128 effective
learning_rate: 1e-4
action_horizon: 50 steps
value_bins: 1024 (distributional)
dtype: bfloat16
dataset: lerobot/aloha_sim_transfer_cube_human
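
The effective batch size of 128 comes from 4 samples per GPU, 8 data-parallel GPUs, and 4 gradient-accumulation steps. A minimal single-GPU sketch of the accumulation pattern; the model and loss here are placeholders, not the actual policy:

import torch

accumulation_steps = 4

# Placeholder model and optimizer; substitute the real policy here.
model = torch.nn.Linear(14, 14)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step, batch in enumerate([torch.randn(4, 14) for _ in range(8)]):
    loss = model(batch).pow(2).mean()       # placeholder loss
    (loss / accumulation_steps).backward()  # scale so gradients average
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()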

Files Included

| File | Size | Description |
|---|---|---|
| policy.safetensors | 12 GB | Main policy model (VLM + Action Expert) |
| value_fn.safetensors | 2.5 GB | Distributional value function |
| config.json | 1 KB | Model configuration |
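
Since hf_hub_download returns the local cache path, you can sanity-check the downloads against the sizes above:

import os
from huggingface_hub import hf_hub_download

# Print the on-disk size of each cached file.
for filename in ["policy.safetensors", "value_fn.safetensors", "config.json"]:
    path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename=filename)
    print(f"{filename}: {os.path.getsize(path)/1e9:.2f} GB")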

Integration with Your Robot

# Pseudo-code for robot integration
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

class OpenPIEPolicy:
    def __init__(self):
        # Load model weights
        self.policy_weights = load_file(
            hf_hub_download("exla-ai/openpie-0.6", "policy.safetensors")
        )
        # ... initialize your model architecture with these weights

    def get_action(self, image, robot_state, instruction):
        """
        Args:
            image: Camera image (384x384 RGB)
            robot_state: Current joint positions (14D for dual arm)
            instruction: Text instruction like "pick up the cube"

        Returns:
            actions: Joint position targets (14D)
        """
        # Your inference code here
        pass

# Usage
policy = OpenPIEPolicy()
action = policy.get_action(
    image=camera.get_frame(),
    robot_state=robot.get_joint_positions(),
    instruction="pick up the red cube and place it on the plate"
)
robot.execute(action)
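
Before get_action can call the model, the inputs must match what the architecture expects: a 384x384 RGB image for the SigLIP encoder and a 14-D state vector. A minimal preprocessing sketch; the preprocess helper is hypothetical and the [-1, 1] normalization is an assumption, so use the statistics your checkpoint was trained with:

import numpy as np
import torch

def preprocess(image: np.ndarray, robot_state: np.ndarray, device: str = "cpu"):
    """Resize/normalize a camera frame and pack the 14-D robot state.

    Scaling pixels to [-1, 1] is an assumption; match your training stats.
    """
    img = torch.from_numpy(image).float()        # [H, W, 3] uint8 frame
    img = img.permute(2, 0, 1).unsqueeze(0)      # -> [1, 3, H, W]
    img = torch.nn.functional.interpolate(
        img, size=(384, 384), mode="bilinear", align_corners=False
    )
    img = img / 127.5 - 1.0                      # assumed [-1, 1] pixel range
    state = torch.from_numpy(robot_state).float().view(1, 14)
    return img.to(device), state.to(device)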

Why OpenPIE-0.6?

  1. Fully Open: Unlike the original pi0.6, all weights and code are available
  2. PyTorch Native: No JAX dependencies, works with standard PyTorch ecosystem
  3. Production Ready: Optimized for inference with safetensors format
  4. Extensible: Easy to fine-tune on your own robotics data
  5. Well Documented: Clear examples and integration guides

Citation

If you use OpenPIE-0.6 in your research, please cite:

@software{openpie_0_6,
  title={OpenPIE-0.6: Open-source Pi0.6 Implementation},
  author={EXLA AI},
  year={2025},
  url={https://huggingface.co/exla-ai/openpie-0.6}
}

@article{pi0_6_paper,
  title={pi0.6: Scaling Robot Policy Learning with RECAP},
  author={Physical Intelligence},
  year={2024}
}

License

Apache 2.0 - Free for commercial and research use.
