Figure 2. Overview of the HyperPose framework. All components are jointly optimised end-to-end.
Multi-view videos generated by HyperPose across four datasets.
Figure 3. HyperPose generates high-fidelity, geometrically consistent multi-view videos without any pose supervision.
Evaluated on four challenging benchmarks. Metrics: FID ↓, Recall/Precision ↑, NFS ↑ (3D geometry), Depth FID ↓ (Bedroom only).
| Method | Depth FID ↓ | FID ↓ | Recall ↑ | NFS ↑ |
|---|---|---|---|---|
| GRAF | 97.4 | 70.7 | 0.00 | 19.4 |
| π-GAN | 124.1 | 56.3 | 0.11 | 9.7 |
| GIRAFFE | 145.6 | 42.8 | 0.02 | 16.9 |
| GIRAFFE-HD | – | 27.7 | 0.13 | – |
| HyperPose | 49.5 | 12.5 | 0.23 | 28.2 |

| Method | FID ↓ | Recall ↑ | Precision ↑ | NFS ↑ |
|---|---|---|---|---|
| GRAF | 91.1 | 0.00 | 0.53 | 9.3 |
| π-GAN | 56.8 | 0.18 | 0.49 | 24.4 |
| GIRAFFE | 38.4 | 0.02 | 0.51 | 13.5 |
| GIRAFFE-HD | 10.3 | – | – | – |
| HyperPose | 5.8 | 0.37 | 0.60 | 29.9 |

| Method | FID ↓ | Recall ↑ | Precision ↑ | NFS ↑ |
|---|---|---|---|---|
| GRAF | 107.0 | 0.00 | 0.35 | 8.5 |
| π-GAN | 48.4 | 0.12 | 0.41 | 21.4 |
| GIRAFFE | 31.3 | 0.04 | 0.51 | 14.2 |
| GIRAFFE-HD | 14.2 | 0.10 | 0.55 | – |
| StyleNeRF | 14.0 | – | – | – |
| HyperPose | 7.5 | 0.30 | 0.53 | 19.2 |

| Method | FID ↓ | Recall ↑ | Precision ↑ | NFS ↑ |
|---|---|---|---|---|
| GRAF | 46.3 | 0.09 | 0.67 | 21.3 |
| π-GAN | 48.8 | 0.10 | 0.64 | 22.1 |
| GIRAFFE | 49.3 | 0.04 | 0.68 | 30.6 |
| GIRAFFE-HD | 24.3 | 0.17 | 0.67 | – |
| HyperPose | 10.8 | 0.39 | 0.62 | 44.5 |
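HyperPose achieves the best FID and Recall on every dataset, the best Depth FID on Bedroom, and the best NFS on three of the four datasets (π-GAN scores higher in the third table, 21.4 vs. 19.2). Precision is the exception: HyperPose leads on only one of the three datasets where it is reported. The CUB gain (FID 10.8 vs. 24.3 for GIRAFFE-HD) highlights the strength of continuous pose modeling under large geometric variation.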
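As a quick sanity check on per-metric leaders, the sketch below re-enters the CUB rows from the table above and picks the best method for each metric, respecting the arrow direction. The names `CUB`, `LOWER_IS_BETTER`, and `best_method` are illustrative choices for this example and are not part of HyperPose.

```python
# Rows copied from the CUB table above; None marks a value the table leaves blank.
CUB = {
    "GRAF":       {"FID": 46.3, "Recall": 0.09, "Precision": 0.67, "NFS": 21.3},
    "π-GAN":      {"FID": 48.8, "Recall": 0.10, "Precision": 0.64, "NFS": 22.1},
    "GIRAFFE":    {"FID": 49.3, "Recall": 0.04, "Precision": 0.68, "NFS": 30.6},
    "GIRAFFE-HD": {"FID": 24.3, "Recall": 0.17, "Precision": 0.67, "NFS": None},
    "HyperPose":  {"FID": 10.8, "Recall": 0.39, "Precision": 0.62, "NFS": 44.5},
}

# Metrics marked with a down arrow in the table header: smaller is better.
LOWER_IS_BETTER = {"FID"}


def best_method(table, metric):
    """Return the method with the best reported value for `metric`."""
    # Skip methods whose value is missing for this metric.
    scored = {name: row[metric] for name, row in table.items()
              if row.get(metric) is not None}
    pick = min if metric in LOWER_IS_BETTER else max
    return pick(scored, key=scored.get)


winners = {metric: best_method(CUB, metric)
           for metric in ("FID", "Recall", "Precision", "NFS")}

for metric, name in winners.items():
    print(f"{metric}: {name}")
```

The same helper can be pointed at any of the other tables by swapping in its rows and adding `"Depth FID"` to `LOWER_IS_BETTER` where that column is present.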
LSUN Bedroom 128². Components are enabled one at a time, starting from a baseline with neither.
| Pose Disentangle. | Lnon-match | FID ↓ | Precision ↑ | NFS ↑ |
|---|---|---|---|---|
| ✗ | ✗ | 13.4 | 0.51 | 28.4 |
| ✓ | ✗ | 12.6 | 0.54 | 28.0 |
| ✓ | ✓ | 12.5 | 0.56 | 28.2 |
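Pose disentanglement improves FID (13.4 → 12.6) and Precision (0.51 → 0.54); adding Lnon-match gives a further, smaller gain (FID 12.5, Precision 0.56). NFS stays essentially flat, between 28.0 and 28.4.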
| Method | FID ↓ | Recall ↑ | Depth FID ↓ |
|---|---|---|---|
| w/ Lregression | 12.8 | 0.21 | 138.4 |
| HyperPose | 12.5 | 0.23 | 49.5 |
Compared with the Lregression variant, our contrastive approach substantially improves 3D geometry, cutting Depth FID by roughly 64% (49.5 vs. 138.4).
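The size of this gap can be made concrete with a small direction-aware calculation. The sketch below uses only the Depth FID and Recall values from the table above; the helper name `relative_improvement` is a hypothetical choice for illustration.

```python
def relative_improvement(baseline: float, ours: float, lower_is_better: bool) -> float:
    """Fractional improvement over the baseline; positive means `ours` is better."""
    change = (baseline - ours) if lower_is_better else (ours - baseline)
    return change / baseline


# Values from the table above: "w/ Lregression" row vs. "HyperPose" row.
depth_fid_gain = relative_improvement(138.4, 49.5, lower_is_better=True)
recall_gain = relative_improvement(0.21, 0.23, lower_is_better=False)

print(f"Depth FID reduced by {depth_fid_gain:.1%}")
print(f"Recall increased by {recall_gain:.1%}")
```

Expressing both metrics as signed fractions of the baseline puts a lower-is-better score and a higher-is-better score on the same footing, which makes it clear that the geometry metric moves far more than Recall does.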