End-to-End Camera Pose Estimation with Camera Ray Token

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes an end-to-end method for estimating camera poses using ray regression, a diffusion model-based ray inference approach. The conventional ray regression model outputs ray moments and directions, which are then converted into the final pose by traditional methods; this conversion step can introduce errors. In this work, we replace the conversion step with a deep learning network to achieve more stable pose estimation. The proposed model also incorporates an additional rendering network for image reconstruction, which shows that the approach extends beyond camera pose estimation to scene reconstruction: leveraging the learned features, the model can render images from novel viewpoints. Experimental results show that the proposed end-to-end method outperforms the conventional ray regression approach under the same training conditions, with approximately a 16% improvement in camera pose estimation and a nearly 30% gain in translation accuracy.
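To make the "moments and directions" representation concrete, the following is a minimal sketch of the kind of traditional conversion the abstract refers to, not the paper's own implementation. It assumes each camera ray is stored in Plücker form (unit direction d and moment m = c × d, where c is the camera center) and recovers only the camera center by linear least squares; rotation recovery is omitted. The function names `rays_to_plucker` and `center_from_plucker` are hypothetical and chosen for illustration.

```python
import numpy as np


def skew(v):
    """Return the 3x3 matrix [v]_x such that skew(v) @ u == np.cross(v, u)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def rays_to_plucker(center, directions):
    """Encode rays through `center` as (unit direction, moment) pairs.

    Assumption: the moment is defined as m = c x d, with c the camera center.
    """
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    m = np.cross(center, d)  # center (3,) broadcasts against d (N, 3)
    return d, m


def center_from_plucker(d, m):
    """Recover the camera center from Plucker rays by least squares.

    Each ray gives m_i = c x d_i = -[d_i]_x c, which is linear in c.
    Stacking all rays yields an overdetermined system A c = b.
    """
    A = np.concatenate([-skew(di) for di in d], axis=0)  # shape (3N, 3)
    b = m.reshape(-1)                                    # shape (3N,)
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return c


# Small demonstration with synthetic rays (values are illustrative only).
rng = np.random.default_rng(0)
true_center = np.array([0.5, -1.0, 2.0])
d, m = rays_to_plucker(true_center, rng.normal(size=(16, 3)))
estimated_center = center_from_plucker(d, m)
```

With exact rays the least-squares solution matches the true center; when the predicted directions or moments are noisy, the same solve returns only a best-fit point, which is one way a hand-crafted conversion step can propagate errors into the final pose.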

Original language: English
Article number: 4624
Journal: Electronics (Switzerland)
Volume: 14
Issue number: 23
State: Published - Dec 2025

Keywords

  • camera pose estimation
  • end-to-end
  • point cloud
  • ray
