Abstract
Saliency maps play a major role in understanding the decision-making process of 3D models by illustrating the importance of individual input points to model predictions. However, saliency maps typically suffer from inaccuracies because they do not classify the contribution of each point as positive or negative. In this paper, a two-stage explainability method for 3D object tracking is proposed to generate a refined saliency map (RSM), which separates point contributions into positive and negative according to their actual effects on tracking performance. Specifically, in stage I, a point-wise growing downsampling algorithm is developed to generate subsets of the search area, on which the model's behavior is evaluated to precisely identify the points with negative contributions. In stage II, a voxel-wise downsampling algorithm is performed together with a deviation metric to select the points with positive contributions. Experiments demonstrate that RSM generates high-quality explanations for popular 3D trackers.
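The following is a minimal Python sketch of the two-stage refinement outlined above, under loose assumptions: the helper names (`tracking_score`, `voxel_downsample`), the growth step, the voxel size, and the deviation threshold are illustrative placeholders, not the paper's actual implementation or parameters.

```python
# Hypothetical sketch of the two-stage RSM refinement; helper names,
# thresholds, and scoring are assumptions for illustration only.
import numpy as np


def tracking_score(points: np.ndarray) -> float:
    """Placeholder: run the tracker on this subset of the search area
    and return a tracking-quality score (e.g., success or precision)."""
    return float(np.random.rand())


def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Keep one representative index per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return np.sort(idx)


def refined_saliency_map(search_area: np.ndarray,
                         growth_step: int = 64,
                         voxel_size: float = 0.3,
                         deviation_eps: float = 0.05) -> np.ndarray:
    n = len(search_area)
    baseline = tracking_score(search_area)
    saliency = np.zeros(n)

    # Stage I: point-wise growing downsampling. Evaluate the tracker on
    # progressively grown subsets; points whose inclusion lowers the score
    # are marked as negative contributors.
    order = np.random.permutation(n)
    grown, prev_score = [], baseline
    for start in range(0, n, growth_step):
        batch = order[start:start + growth_step]
        grown.extend(batch)
        score = tracking_score(search_area[grown])
        if score < prev_score:          # performance dropped after adding batch
            saliency[batch] = -1.0
        prev_score = score

    # Stage II: voxel-wise downsampling plus a deviation metric. Points whose
    # removal makes the score deviate noticeably from the baseline are taken
    # as positive contributors.
    for i in voxel_downsample(search_area, voxel_size):
        subset = np.delete(search_area, i, axis=0)
        deviation = abs(tracking_score(subset) - baseline)
        if deviation > deviation_eps and saliency[i] >= 0:
            saliency[i] = deviation     # positive contribution strength

    return saliency
```

In this sketch the deviation metric is simplified to the absolute change in tracking score when a point is removed; the paper's metric and downsampling schedules may differ.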