‘Equivariant Single View Pose Prediction via Induced and Restricted Representations’

“Learning about the three-dimensional world from two-dimensional images is a fundamental problem in computer vision. An ideal neural network architecture for such tasks would leverage the fact that objects can be rotated and translated in three dimensions to make predictions about novel images. However, imposing SO(3)-equivariance on two-dimensional inputs is difficult because the group of three-dimensional rotations does not have a natural action on the two-dimensional plane. … We show that an algorithm that learns a three-dimensional representation of the world from two-dimensional images must satisfy certain geometric consistency properties.”

Find the paper and the full list of authors on arXiv.
