arxiv:2605.26368

Unified Panoramic Geometry Estimation via Multi-View Foundation Models

Published on May 25 · Submitted by Vukasin Bozic on May 28

Abstract

PaGeR is a framework that adapts 3D foundation models for perspective imagery to reconstruct 360-degree scenes from panoramic images, enabling simultaneous prediction of depth, normals, and sky masks with high performance.

AI-generated summary

Geometry estimation from perspective images has greatly advanced, maturing to the point where off-the-shelf foundation models are able to reconstruct 3D scene structure not only from multi-view imagery, but even from a single view. A natural extension is 3D reconstruction from panoramas, with the exciting prospect of recovering a full 360-degree scene from a single panoramic image. In this work, we introduce PaGeR (Panoramic Geometry Reconstruction), a framework to lift powerful 3D foundation models designed for perspective imagery to the panorama domain. Our strategy is to start from a pre-trained transformer for 3D reconstruction and turn it into a unified high-performance model that predicts scale-invariant depth, metric depth, surface normals, and sky masks from both perspective and omnidirectional images, in a single forward pass. By keeping architectural changes to a minimum and mixing perspective and panoramic images during training, PaGeR retains the rich 3D prior of the underlying foundation model while learning to also estimate geometrically consistent 360-degree scenes from single panoramas. We extensively test our method in both indoor and outdoor environments and find that it delivers state-of-the-art performance and excellent zero-shot performance across a wide range of scenes.

Community

Paper author Paper submitter

TL;DR: PaGeR turns a perspective 3D foundation model into a single-pass 360° geometry estimator — from one equirectangular image it predicts scale-invariant depth, metric depth (in metres), surface normals, and sky segmentation at full panoramic resolution.
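To make concrete what a metric panoramic depth map gives you, here is a minimal sketch that back-projects an equirectangular depth map into a 3D point cloud. It is not taken from the PaGeR code. The function name `equirect_depth_to_points` is hypothetical, and the sketch assumes that depth is stored as radial distance from the camera centre, with x pointing right, y up, and z forward at the image centre. The released models may use different conventions.

```python
import numpy as np


def equirect_depth_to_points(depth):
    """Back-project an (H, W) radial depth map to (H, W, 3) points.

    Assumptions (not from the source): depth is distance along the ray,
    longitude 0 is the image centre, latitude +pi/2 is the top row.
    """
    h, w = depth.shape
    # Angles at pixel centres: longitude in (-pi, pi), latitude in (-pi/2, pi/2).
    lon = ((np.arange(w) + 0.5) / w - 0.5) * 2.0 * np.pi
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit ray directions on the sphere (x right, y up, z forward).
    dirs = np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )
    # Scale each unit ray by its depth to get a point in the same units.
    return dirs * depth[..., None]
```

If the depth values are in metres, the returned coordinates are in metres too. For example, `equirect_depth_to_points(np.full((256, 512), 2.0))` yields an array of shape `(256, 512, 3)` whose points all lie on a sphere of radius 2.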

We introduce PaGeR (Panoramic Geometry Reconstruction), which lifts a multi-view perspective foundation model (Depth Anything 3) to the panoramic domain via a fixed 6×504×504 cubemap, so VRAM and runtime stay constant regardless of input resolution. A single forward pass returns scale-invariant and metric depth, world-frame normals, and a sky mask. We also release two new datasets: ZüriPano (real-world evaluation) and PanoInfinigen (synthetic training).
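The fixed-size cubemap idea can be illustrated with a small resampling sketch. This is not PaGeR's implementation. It is a generic equirectangular-to-cubemap conversion with nearest-neighbour sampling, and the function names, face order, and axis convention (x right, y up, z forward) are all assumptions made for illustration. Only the 504-pixel face size comes from the comment above. The point it shows is that the network input has the same shape whatever the panorama resolution.

```python
import numpy as np

FACE_SIZE = 504  # face resolution quoted in the comment above


def cube_face_directions(face_size=FACE_SIZE):
    """Unit view directions for six cube faces, shape (6, F, F, 3)."""
    # Pixel centres mapped to (-1, 1) across each face.
    t = (np.arange(face_size) + 0.5) / face_size * 2.0 - 1.0
    u, v = np.meshgrid(t, t)  # u: left to right, v: top to bottom
    one = np.ones_like(u)
    # Assumed face order: front, right, back, left, up, down.
    faces = [
        (u, -v, one),
        (one, -v, -u),
        (-u, -v, -one),
        (-one, -v, u),
        (u, one, v),
        (u, -one, -v),
    ]
    dirs = np.stack([np.stack(f, axis=-1) for f in faces])
    return dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)


def equirect_to_cubemap(pano, face_size=FACE_SIZE):
    """Resample an (H, W, C) panorama into (6, F, F, C) cube faces."""
    h, w = pano.shape[:2]
    d = cube_face_directions(face_size)
    # Direction -> spherical angles -> panorama pixel indices.
    lon = np.arctan2(d[..., 0], d[..., 2])
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    cols = np.floor((lon / (2.0 * np.pi) + 0.5) * w).astype(int) % w
    rows = np.clip(np.floor((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
    # Nearest-neighbour lookup; a real pipeline would likely interpolate.
    return pano[rows, cols]
```

Calling `equirect_to_cubemap` on a 512×1024 panorama and on a 2048×4096 panorama returns an array of shape `(6, 504, 504, 3)` in both cases, which is why the cost of the downstream model would not grow with input resolution under this scheme.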

🔗 Project page: https://pager360.github.io · 🤗 Demo: https://huggingface.co/spaces/prs-eth/PaGeR · Collection (models + datasets): https://huggingface.co/collections/prs-eth/pager-697241d06b3733a6f18e4d39 · Code: https://github.com/prs-eth/PaGeR

Happy to answer any questions!
