This thesis was partially developed at Verizon Connect Research Italy, in Florence.
Student: Simone Magistri.

Estimating the pose of vehicles from on-board cameras is a crucial component of road scene understanding. In this thesis we propose a light-weight deep learning method to predict vehicle pose from a single RGB dash-cam image.
To this end, we customize and adapt state-of-the-art deep learning techniques for general object pose estimation to the vehicle pose estimation task. Furthermore, we define a novel objective function that takes into account errors at different granularities to improve neural network training. To keep the model light-weight and fast, we rely on MobileNetV2 as the backbone.
Tested both on benchmark pose estimation data (Pascal3D+) and on actual vehicle camera data (nuScenes), our method is shown to outperform the state of the art in vehicle pose estimation, in terms of both accuracy and memory footprint.