GCN-Based Multi-Modal Multi-Label Attribute Classification in Anime Illustration Using Domain-Specific Semantic Features
Ziwen Lan, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:07:30
Explainable deep learning models have recently attracted considerable attention. In this paper, we take a further step in this direction. We introduce a time-efficient method, called Ablation-CAM++, that generates smooth visual explanations of CNN model predictions. Like Ablation-CAM, our approach uses ablation analysis to determine the importance of activation maps with respect to the target class. However, instead of measuring the importance of each activation map individually, we group activation maps using a clustering technique. We then construct a binary tree for each group by recursively splitting the group, ablating each subgroup, and applying tree pruning. We evaluate our visual explanations qualitatively and quantitatively against Ablation-CAM and Grad-CAM. Our approach produces visual explanations in less than half the time of Ablation-CAM. Using the average drop and average increase metrics on 2000 images from the ImageNet validation set, we also compare the effect of different clustering techniques in our method.
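The grouped-ablation idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the clustering technique, the splitting rule, the pruning criterion, or how a group's importance is shared among its maps, so every one of those choices below is an assumption. The names `ablation_drop`, `group_ablation_weights`, `toy_score`, and the `min_drop` threshold are hypothetical, the "clustering" is a simple quantile binning of mean activations used as a stand-in, and the CNN is replaced by a toy linear score function.

```python
import numpy as np


def ablation_drop(score_fn, maps, idx):
    """Relative drop in the class score when the maps in `idx` are zeroed."""
    base = score_fn(maps)
    ablated = maps.copy()
    ablated[idx] = 0.0
    return (base - score_fn(ablated)) / base


def group_ablation_weights(score_fn, maps, labels, min_drop=0.05):
    """Assign one weight per activation map via grouped, recursive ablation.

    Assumed scheme (not taken from the paper): each group is ablated as a
    whole; if the score barely changes, the subtree is pruned and the drop
    is shared equally among its maps; otherwise the group is split in half
    and each half is examined in turn.
    """
    weights = np.zeros(maps.shape[0])

    def visit(idx):
        drop = ablation_drop(score_fn, maps, idx)
        if abs(drop) < min_drop or len(idx) <= 1:
            weights[idx] = drop / len(idx)  # leaf or pruned subtree
            return
        mid = len(idx) // 2
        visit(idx[:mid])
        visit(idx[mid:])

    for g in np.unique(labels):
        visit(np.flatnonzero(labels == g))
    return weights


# Toy stand-in for a CNN: the class score is a fixed linear function of the
# spatial mean of each activation map (hypothetical, for illustration only).
rng = np.random.default_rng(0)
maps = rng.random((8, 4, 4))  # K=8 activation maps of size 4x4
w = np.linspace(0.1, 0.8, 8)


def toy_score(m):
    return float((w * m.mean(axis=(1, 2))).sum())


# Stand-in "clustering": bin maps into three groups by mean activation.
means = maps.mean(axis=(1, 2))
labels = np.digitize(means, np.quantile(means, [1 / 3, 2 / 3]))

weights = group_ablation_weights(toy_score, maps, labels)

# Saliency map: ReLU of the weighted sum of activation maps.
cam = np.maximum((weights[:, None, None] * maps).sum(axis=0), 0.0)
```

In this sketch the saving comes from pruning: a group whose joint ablation changes the score by less than `min_drop` costs one forward pass instead of one pass per map. Setting `min_drop=0.0` disables pruning, so the recursion reaches every individual map and the weights reduce to per-map ablation drops.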