Abstract
This paper proposes an end-to-end method for estimating camera poses using ray regression, a diffusion model-based ray inference approach. The conventional ray regression model outputs ray moments and directions, which are then converted into the final pose by traditional methods; this conversion step can introduce errors. In this work, we replace the conversion step with a deep learning network to achieve more stable pose estimation. The proposed model also incorporates an additional rendering network for image reconstruction, so it demonstrates not only camera pose estimation but also scalability to scene reconstruction: leveraging the learned features, the model can render images from novel viewpoints. Experimental results show that, under the same training conditions, the proposed end-to-end method outperforms the conventional ray regression approach, with an improvement of approximately 16% in camera pose estimation and a gain of nearly 30% in translation accuracy.
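To make the conventional conversion step concrete, the sketch below recovers a camera center from a set of rays given as directions and moments. It is an illustration, not the paper's method: it assumes the rays use the Plücker convention m = c × d (the abstract does not state the convention), it recovers only the camera center and not the rotation, and every function name is hypothetical. Under that assumption, the least-squares center satisfies Σ(|d|² I − d dᵀ) c = Σ d × m.

```python
# Assumption: each ray is a Plücker pair (d, m) with m = c x d, where c is the
# camera center. All names here are hypothetical and chosen for illustration.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("rays are (nearly) parallel; center is not determined")
    x = []
    for col in range(3):
        # Replace one column of a with b, as Cramer's rule requires.
        replaced = [[b[r] if c == col else a[r][c] for c in range(3)]
                    for r in range(3)]
        x.append(det3(replaced) / d)
    return tuple(x)


def camera_center_from_rays(directions, moments):
    """Least-squares camera center from rays (d_i, m_i) with m_i = c x d_i.

    Accumulates the normal equations
        sum_i (|d_i|^2 I - d_i d_i^T) c = sum_i d_i x m_i
    and solves them for c.
    """
    lhs = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for d, m in zip(directions, moments):
        dd = sum(x * x for x in d)
        dxm = cross(d, m)
        for r in range(3):
            rhs[r] += dxm[r]
            for c in range(3):
                lhs[r][c] += (dd if r == c else 0.0) - d[r] * d[c]
    return solve3(lhs, rhs)


# Synthetic check: build noise-free rays through a known center, then recover it.
true_center = (1.0, -2.0, 0.5)
ray_directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
                  (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
ray_moments = [cross(true_center, d) for d in ray_directions]
estimated_center = camera_center_from_rays(ray_directions, ray_moments)
```

With noise-free rays the estimate matches the true center up to floating-point error. With noisy predicted rays, the residual of this fit is one place where a hand-written conversion can introduce error, which is the motivation the abstract gives for learning the conversion instead.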
| Field | Value |
|---|---|
| Original language | English |
| Article number | 4624 |
| Journal | Electronics (Switzerland) |
| Volume | 14 |
| Issue number | 23 |
| State | Published - Dec 2025 |
Keywords
- camera pose estimation
- end-to-end
- point cloud
- ray
Fingerprint
Dive into the research topics of 'End-to-End Camera Pose Estimation with Camera Ray Token'. Together they form a unique fingerprint.