Visual Decision Rules for Segmentation

Alvin Wan*, Daniel Ho*, Younjin Song, Henk Tillman, Sarah Adel Bargal, Joseph E. Gonzalez

Our models, Segmentation Neural-Backed Decision Trees (SegNBDT), are competitive with state-of-the-art neural networks on segmentation and feature visual decision rules.


*denotes equal contribution


University of California, Berkeley

Boston University

Published (Preprint)

June 11, 2020


The black-box nature of neural networks limits model decision interpretability, in particular for high-dimensional inputs in computer vision and for dense pixel prediction tasks like segmentation. To address this, prior work combines neural networks with decision trees. However, such models

  1. perform poorly when compared to state-of-the-art segmentation models
  2. fail to produce decision rules with spatially-grounded semantic meaning

In this work, we build a hybrid neural-network and decision-tree model for segmentation that (1) attains neural network segmentation accuracy and (2) provides semi-automatically constructed visual decision rules such as "Is there a window?". We obtain semantic visual meaning by extending saliency methods to segmentation, and we attain accuracy by leveraging insights from neural-backed decision trees, a deep learning analog of decision trees for image classification. Our model, SegNBDT, attains accuracy within ~2-4% of the state-of-the-art HRNetV2 segmentation model while retaining explainability; we achieve state-of-the-art performance for explainable models on three benchmark datasets: Pascal-Context (49.12%), Cityscapes (79.01%), and Look Into Person (51.64%). Furthermore, user studies suggest visual decision rules are more interpretable, particularly for incorrect predictions. Code and pretrained models can be found on GitHub.
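To make the inference procedure concrete, here is a minimal sketch of how a neural-backed decision tree routes a prediction: a backbone network produces a feature vector, and each tree node forwards it to the child whose representative weight vector has the highest inner product with the features. The `Node` class, `nbdt_predict` function, and the toy "window"/"wall" tree below are hypothetical illustrations, not the authors' released implementation.

```python
import numpy as np

class Node:
    """One node of a (hypothetical) neural-backed decision tree."""
    def __init__(self, weight, children=None, label=None):
        self.weight = np.asarray(weight, dtype=float)  # representative vector
        self.children = children or []                 # empty list for leaves
        self.label = label                             # class name at a leaf

def nbdt_predict(features, root):
    """Greedily route a feature vector from the root to a leaf.

    At each inner node, pick the child whose weight vector gives the
    largest inner product with the features; return the leaf label and
    the path taken (the path is what visual decision rules explain).
    """
    node, path = root, []
    while node.children:
        scores = [child.weight @ features for child in node.children]
        node = node.children[int(np.argmax(scores))]
        path.append(node)
    return node.label, path

# Toy 2-D example: "window" vs. "wall" leaves under a single root.
leaves = [Node([1.0, 0.0], label="window"), Node([0.0, 1.0], label="wall")]
root = Node([0.5, 0.5], children=leaves)
label, path = nbdt_predict(np.array([0.9, 0.1]), root)
# label == "window": the features align with the "window" node's weight.
```

In the full model, each routing decision at an inner node is what gets paired with a spatially-grounded visual rule (e.g., "Is there a window?"), so the path through the tree doubles as an explanation of the prediction.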


Our work culminates in three key contributions that you can take away for future research: