---
license: mit
tags:
- 3d-reconstruction
- affordance-grounding
- flow-matching
- rgbd
arxiv: 2601.09211
---

# Affostruction

**Affostruction: 3D Affordance Grounding with Generative Reconstruction** (CVPR 2026, Denver)

[Chunghyun Park](https://chrockey.github.io/)<sup>1</sup>,
[Seunghyeon Lee](https://llishyun.github.io/)<sup>1</sup>,
[Minsu Cho](http://cvlab.postech.ac.kr/~mcho/)<sup>1,2</sup>

<sup>1</sup>POSTECH <sup>2</sup>RLWRLD

- Paper: https://arxiv.org/abs/2601.09211
- Project page: https://chrockey.github.io/Affostruction/
- Code: https://github.com/chrockey/Affostruction

## Pretrained checkpoints

Each subfolder contains a flat `config.json` and `model.safetensors`. The SLAT and mesh/gaussian decoders come from [`microsoft/TRELLIS-image-large`](https://huggingface.co/microsoft/TRELLIS-image-large).

| Model | Description | # Params |
|-------|-------------|---------:|
| `reconstruction` | Multi-view RGBD sparse-structure flow | 164.6 M |
| `affordance` | Text-conditioned affordance heatmap flow | 185.4 M |

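To fetch a single checkpoint without going through the full pipeline, a minimal sketch with `huggingface_hub` might look like the following. It assumes the subfolder names match the model names in the table (`reconstruction`, `affordance`) and reuses the repo id `chrockey/Affostruction` from the Usage snippet; `checkpoint_files` and `download_checkpoint` are hypothetical helpers, not part of the `affostruction` package.

```python
REPO_ID = "chrockey/Affostruction"  # repo id as used in the Usage snippet
CHECKPOINTS = ("reconstruction", "affordance")  # assumed subfolder names


def checkpoint_files(name: str) -> list[str]:
    """Return the repo-relative paths of the two files in one checkpoint subfolder."""
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {name!r}; expected one of {CHECKPOINTS}")
    return [f"{name}/config.json", f"{name}/model.safetensors"]


def download_checkpoint(name: str) -> list[str]:
    """Download both files of one checkpoint and return their local cache paths."""
    # Imported lazily so the path helper above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=path)
        for path in checkpoint_files(name)
    ]


# Listing the files needs no network access; download_checkpoint() does.
print(checkpoint_files("reconstruction"))
```

Calling `download_checkpoint("affordance")` would then place both files in the local Hugging Face cache, provided the assumed layout holds.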

## Usage

```python
from affostruction import AffostructionPipeline

pipeline = AffostructionPipeline.from_pretrained("chrockey/Affostruction").cuda()

# `input_dict`: the input observations (see `examples/affostruction.py` in the repo for how to build one)
outputs = pipeline.run(input_dict, queries=["Point to the part you would sit on."])

coords = outputs["coords"]                 # (N, 4) sparse voxel coords
probs = outputs["affordance"][0]["probs"]  # (N,) per-voxel heatmap in [0, 1], paired with coords
```

> **Need mesh / gaussian?** Pass `formats=["mesh", "gaussian"]`; decoding is opt-in.
>
> **Reconstruction only?** Drop `queries`.

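Once `coords` and `probs` are available, a common next step is to keep only the high-probability voxels. The sketch below is one way to do that with NumPy and is not part of the `affostruction` API: `select_affordance_voxels` and the `0.5` threshold are hypothetical, the arrays are stand-in data, and it assumes column 0 of `coords` is a batch index followed by `(x, y, z)`, as in TRELLIS-style sparse tensors. If the pipeline returns CUDA tensors, move them to the CPU before converting.

```python
import numpy as np


def select_affordance_voxels(coords, probs, threshold=0.5):
    """Return (xyz, scores) for voxels whose affordance probability >= threshold."""
    coords = np.asarray(coords)
    probs = np.asarray(probs)
    if coords.ndim != 2 or coords.shape[1] != 4:
        raise ValueError("coords must have shape (N, 4)")
    if probs.shape != (coords.shape[0],):
        raise ValueError("probs must have shape (N,) and be paired with coords")
    mask = probs >= threshold
    # Assumption: column 0 is the batch index, columns 1..3 are voxel x, y, z.
    return coords[mask, 1:], probs[mask]


# Stand-in data for illustration; real values come from pipeline.run(...).
coords = np.array([[0, 1, 2, 3], [0, 4, 5, 6], [0, 7, 8, 9]])
probs = np.array([0.9, 0.2, 0.6])

xyz, scores = select_affordance_voxels(coords, probs, threshold=0.5)
print(xyz)
print(scores)
```

A fixed threshold is only one option; keeping the top-k voxels by probability is an equally reasonable choice when the heatmap scale varies between queries.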

End-to-end examples are in the [GitHub repo](https://github.com/chrockey/Affostruction): `examples/affostruction.py` (full pipeline), `examples/reconstruction.py` (reconstruction only), and `examples/reconstruction_unposed.py` (experimental unposed single-view).