OpenPIE-0.6: Open-source Pi0.6 Implementation

The first fully open-source PyTorch implementation of Physical Intelligence's pi0.6 robot policy model, trained with RECAP.

Quick Start

Install the dependencies:

pip install huggingface_hub safetensors torch

Then download and load the weights:
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
import torch

# Download model files
policy_path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename="policy.safetensors")
value_path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename="value_fn.safetensors")
config_path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename="config.json")

# Load weights
policy_weights = load_file(policy_path)
value_weights = load_file(value_path)

print(f"Policy model: {len(policy_weights)} tensors, {sum(t.numel() for t in policy_weights.values())/1e9:.2f}B params")
print(f"Value function: {len(value_weights)} tensors, {sum(t.numel() for t in value_weights.values())/1e9:.2f}B params")

Output:

Policy model: 812 tensors, 5.91B params
Value function: 638 tensors, 1.31B params

Complete Working Example

Here's a full example showing how to load and use the model weights:

import torch
import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
from safetensors import safe_open

# ============================================================
# Step 1: Download model from HuggingFace
# ============================================================
repo_id = "exla-ai/openpie-0.6"

policy_path = hf_hub_download(repo_id=repo_id, filename="policy.safetensors")
value_path = hf_hub_download(repo_id=repo_id, filename="value_fn.safetensors")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")

# ============================================================
# Step 2: Load configuration
# ============================================================
with open(config_path) as f:
    config = json.load(f)

print(f"Action dim: {config['action_dim']}")      # 14 (dual 7-DOF arms)
print(f"Action horizon: {config['action_horizon']}")  # 50 steps
print(f"State dim: {config['state_dim']}")        # 14

# ============================================================
# Step 3: Inspect model structure
# ============================================================
with safe_open(policy_path, framework="pt") as f:
    keys = list(f.keys())

# Group tensors by component
components = {}
for key in keys:
    component = key.split(".")[0]
    if component not in components:
        components[component] = []
    components[component].append(key)

print("\nPolicy model components:")
for comp, comp_keys in sorted(components.items()):
    print(f"  - {comp}: {len(comp_keys)} tensors")

# Output:
#   - action_in_proj: 2 tensors
#   - action_out_proj: 2 tensors
#   - paligemma_with_expert: 804 tensors
#   - time_mlp_in: 2 tensors
#   - time_mlp_out: 2 tensors

# ============================================================
# Step 4: Load weights
# ============================================================
policy_weights = load_file(policy_path)
value_weights = load_file(value_path)

# Key tensor shapes:
print("\nKey tensor shapes:")
print(f"  action_in_proj.weight: {policy_weights['action_in_proj.weight'].shape}")   # [2048, 14]
print(f"  action_out_proj.weight: {policy_weights['action_out_proj.weight'].shape}") # [14, 2048]

# ============================================================
# Step 5: Use the weights (example with action projection)
# ============================================================
device = "cuda" if torch.cuda.is_available() else "cpu"

# Get the action projection layers. Each projection has a weight and a
# bias, matching the "2 tensors" per component shown in Step 3.
action_in = policy_weights["action_in_proj.weight"].to(device).to(torch.bfloat16)
action_in_bias = policy_weights["action_in_proj.bias"].to(device).to(torch.bfloat16)
action_out = policy_weights["action_out_proj.weight"].to(device).to(torch.bfloat16)
action_out_bias = policy_weights["action_out_proj.bias"].to(device).to(torch.bfloat16)

# Example: process a robot state through the projection layers.
# Note: this exercises only the input/output projections; in the full
# policy, the PaliGemma backbone and action expert sit between them.
robot_state = torch.randn(1, 14, device=device, dtype=torch.bfloat16)  # current joint positions

hidden = torch.nn.functional.linear(robot_state, action_in, action_in_bias)
hidden = torch.nn.functional.gelu(hidden)
actions = torch.nn.functional.linear(hidden, action_out, action_out_bias)

print(f"\nInput robot state: {robot_state.shape}")   # [1, 14]
print(f"Output actions: {actions.shape}")             # [1, 14]
print(f"  Left arm (7D):  {actions[0, :7].cpu().float().numpy().round(3)}")
print(f"  Right arm (7D): {actions[0, 7:].cpu().float().numpy().round(3)}")

Model Components

The model consists of:

| Component | Tensors | Parameters | Description |
|---|---|---|---|
| paligemma_with_expert | 804 | ~5.9B | PaliGemma VLM + Gemma Action Expert |
| action_in_proj | 2 | 28K | Robot state input projection |
| action_out_proj | 2 | 28K | Action output projection |
| time_mlp_in/out | 4 | 8M | Timestep embedding |
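
You can verify this breakdown directly from the checkpoint. A minimal sketch, reusing the policy_weights dictionary loaded in the Quick Start:

from collections import defaultdict

# Sum parameter counts per top-level component (policy_weights comes
# from load_file(policy_path) above).
param_counts = defaultdict(int)
for name, tensor in policy_weights.items():
    param_counts[name.split(".")[0]] += tensor.numel()

for component, count in sorted(param_counts.items()):
    print(f"  - {component}: {count/1e6:.2f}M params")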

What is OpenPIE-0.6?

OpenPIE-0.6 is a fully open-source reimplementation of Physical Intelligence's pi0.6 model. Unlike the original closed-source model, OpenPIE-0.6 provides:

  • Full PyTorch implementation (no JAX/Flax dependencies)
  • Pre-trained weights you can use immediately
  • Training code to reproduce or fine-tune on your own data
  • Apache 2.0 license for commercial use

Comparison: OpenPIE-0.6 vs Original pi0.6

| Feature | Original pi0.6 | OpenPIE-0.6 |
|---|---|---|
| Open Source | No (closed) | Yes (Apache 2.0) |
| Framework | JAX/Flax | PyTorch |
| Pre-trained Weights | Not released | Available |
| Training Code | Not released | Available |
| Fine-tuning | Not possible | Fully supported |
| Commercial Use | Restricted | Allowed |

Performance Comparison

| Metric | OpenPIE-0.6 | pi0.6 Paper Reference | Status |
|---|---|---|---|
| Action MSE | 0.010 | ~0.01 | Match |
| Value Correlation | 0.986 | >0.8 | Exceeds |
| Advantage Gap | 0.070 | >0.05 | Exceeds |
| Throughput | 22 act/s | ~20 act/s | Exceeds |

Model Architecture

OpenPIE-0.6 (5.91B policy + 1.31B value = 7.22B total)
β”œβ”€β”€ Vision Encoder: SigLIP (384x384 images)
β”œβ”€β”€ Base VLM: PaliGemma (Gemma 2B backbone)
β”œβ”€β”€ Action Expert: Gemma 2B (cross-attention with VLM)
β”œβ”€β”€ Value Function: 1.31B params (distributional, 1024 bins)
└── Action Space: 14D continuous (7 DOF left arm + 7 DOF right arm)
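
The value function is distributional: instead of a single scalar, it predicts logits over 1024 return bins. A minimal sketch of how such a head is typically decoded to a scalar value; the [0, 1] bin range here is an illustrative assumption, not the checkpoint's actual calibration:

import torch

# Hypothetical decoding of a distributional value head: softmax the
# 1024 bin logits and reduce to the expected return.
num_bins = 1024
bin_centers = torch.linspace(0.0, 1.0, num_bins)  # assumed bin range

def decode_value(logits: torch.Tensor) -> torch.Tensor:
    """logits: [batch, 1024] -> expected value per example."""
    probs = torch.softmax(logits.float(), dim=-1)
    return (probs * bin_centers).sum(dim=-1)

value = decode_value(torch.randn(1, num_bins))
print(value.shape)  # torch.Size([1])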

Training Details

OpenPIE-0.6 was trained using the RECAP algorithm (RL with Experience and Corrections via Advantage-conditioned Policies):

| Phase | Steps | Description |
|---|---|---|
| Value Function | 5,000 | Train distributional value predictor |
| Policy Warmup | 10,000 | Standard behavior cloning |
| RECAP Training | 20,000 | Advantage-conditioned policy learning |
| Total | 35,000 | ~6 hours on 8x A100 80GB |
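
At the core of the RECAP phase, the value function turns each trajectory step into an advantage signal, and the policy is conditioned on that signal during training. A minimal sketch of the idea; the binary indicator and the example numbers are illustrative assumptions, and the actual conditioning scheme follows the pi0.6 paper:

import torch

# Illustrative advantage computation, assuming per-step returns and
# value-function estimates V(s) are available for a trajectory.
returns = torch.tensor([0.9, 0.2, 0.7])   # observed returns (example values)
values = torch.tensor([0.5, 0.5, 0.5])    # value-function estimates V(s)

advantages = returns - values              # A(s, a) = R - V(s)

# Condition the policy on whether each action beat the value baseline;
# at inference time the indicator would be pinned to "good" so the
# policy imitates only the advantageous behavior.
advantage_token = (advantages > 0).long()  # tensor([1, 0, 1])
print(advantage_token)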

Key Hyperparameters

batch_size: 4 (per GPU) x 8 GPUs x 4 accumulation = 128 effective
learning_rate: 1e-4
action_horizon: 50 steps
value_bins: 1024 (distributional)
dtype: bfloat16
dataset: lerobot/aloha_sim_transfer_cube_human
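
The effective batch size of 128 comes from 4 samples per GPU, 8 data-parallel GPUs, and 4 gradient-accumulation steps. A minimal single-GPU sketch of the accumulation pattern; the model and loss here are placeholders, not the actual policy:

import torch

accumulation_steps = 4

# Placeholder model and optimizer; substitute the real policy here.
model = torch.nn.Linear(14, 14)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step, batch in enumerate([torch.randn(4, 14) for _ in range(8)]):
    loss = model(batch).pow(2).mean()       # placeholder loss
    (loss / accumulation_steps).backward()  # scale so gradients average
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()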

Files Included

| File | Size | Description |
|---|---|---|
| policy.safetensors | 12 GB | Main policy model (VLM + Action Expert) |
| value_fn.safetensors | 2.5 GB | Distributional value function |
| config.json | 1 KB | Model configuration |
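
Since hf_hub_download returns the local cache path, you can sanity-check the downloads against the sizes above:

import os
from huggingface_hub import hf_hub_download

# Print the on-disk size of each cached file.
for filename in ["policy.safetensors", "value_fn.safetensors", "config.json"]:
    path = hf_hub_download(repo_id="exla-ai/openpie-0.6", filename=filename)
    print(f"{filename}: {os.path.getsize(path)/1e9:.2f} GB")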

Integration with Your Robot

# Pseudo-code for robot integration
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

class OpenPIEPolicy:
    def __init__(self):
        # Load model weights
        self.policy_weights = load_file(
            hf_hub_download("exla-ai/openpie-0.6", "policy.safetensors")
        )
        # ... initialize your model architecture with these weights

    def get_action(self, image, robot_state, instruction):
        """
        Args:
            image: Camera image (384x384 RGB)
            robot_state: Current joint positions (14D for dual arm)
            instruction: Text instruction like "pick up the cube"

        Returns:
            actions: Joint position targets (14D)
        """
        # Your inference code here
        pass

# Usage
policy = OpenPIEPolicy()
action = policy.get_action(
    image=camera.get_frame(),
    robot_state=robot.get_joint_positions(),
    instruction="pick up the red cube and place it on the plate"
)
robot.execute(action)
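
Before get_action can call the model, the inputs must match what the architecture expects: a 384x384 RGB image for the SigLIP encoder and a 14-D state vector. A minimal preprocessing sketch; the preprocess helper is hypothetical and the [-1, 1] normalization is an assumption, so use the statistics your checkpoint was trained with:

import numpy as np
import torch

def preprocess(image: np.ndarray, robot_state: np.ndarray, device: str = "cpu"):
    """Resize/normalize a camera frame and pack the 14-D robot state.

    Scaling pixels to [-1, 1] is an assumption; match your training stats.
    """
    img = torch.from_numpy(image).float()        # [H, W, 3] uint8 frame
    img = img.permute(2, 0, 1).unsqueeze(0)      # -> [1, 3, H, W]
    img = torch.nn.functional.interpolate(
        img, size=(384, 384), mode="bilinear", align_corners=False
    )
    img = img / 127.5 - 1.0                      # assumed [-1, 1] pixel range
    state = torch.from_numpy(robot_state).float().view(1, 14)
    return img.to(device), state.to(device)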

Why OpenPIE-0.6?

  1. Fully Open: Unlike the original pi0.6, all weights and code are available
  2. PyTorch Native: No JAX dependencies, works with standard PyTorch ecosystem
  3. Production Ready: Optimized for inference with safetensors format
  4. Extensible: Easy to fine-tune on your own robotics data
  5. Well Documented: Clear examples and integration guides

Citation

If you use OpenPIE-0.6 in your research, please cite:

@software{openpie_0_6,
  title={OpenPIE-0.6: Open-source Pi0.6 Implementation},
  author={EXLA AI},
  year={2025},
  url={https://huggingface.co/exla-ai/openpie-0.6}
}

@article{pi0_6_paper,
  title={pi0.6: Scaling Robot Policy Learning with RECAP},
  author={Physical Intelligence},
  year={2024}
}

License

Apache 2.0 - Free for commercial and research use.
