Banana🍌: Banach Fixed-Point Network for Pointcloud Segmentation with Inter-Part Equivariance

  • 🌲Stanford University

  • 🍄University of Pennsylvania



Summary

  • To the best of our knowledge, we are the first to provide a strict definition of inter-part equivariance for pointcloud segmentation and introduce a learning framework with such equivariance by construction.
  • We propose a fixed-point framework with one-step training and iterative inference, and show that per-step equivariance induces an overall equivariance upon convergence (see the sketch after this list).
  • We design a part-aware equivariant message-passing network with stable convergence.
  • Experiments show strong generalization under inter-part configuration changes, even when these changes alter the resulting pointcloud geometry or topology.
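
A minimal sketch of the equivariance argument, in our own notation (not necessarily the paper's): write f(·, P) for the per-step label update on a point cloud P, and g for an inter-part transformation acting on both the labels and the points.

```latex
% Assumed per-step equivariance of the update f:
%   f(g.Y, g.P) = g.f(Y, P)
% With Lipschitz constant L < 1, f(., P) is a contraction, so the
% Banach fixed-point theorem gives a unique fixed point Y*(P).
\begin{align*}
  f\bigl(g \cdot Y^*(P),\; g \cdot P\bigr)
    &= g \cdot f\bigl(Y^*(P),\; P\bigr) = g \cdot Y^*(P) \\
  \Rightarrow\; Y^*(g \cdot P)
    &= g \cdot Y^*(P) \quad \text{(uniqueness of the fixed point)}
\end{align*}
```

In words: the transformed labels are a fixed point for the transformed input, so by uniqueness the converged output transforms along with the input.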

Video

Method Overview

We train the network with the ground-truth segmentation as both the network input and the network output. To avoid the trivial solution of learning the identity function on the segmentation labels, we limit the network expressivity with a small Lipschitz constant. A Lipschitz constant below 1 also makes the update a contraction, so by the Banach fixed-point theorem the iterations converge to a unique fixed point. At inference time, given an input point cloud, we apply Banach fixed-point iterations on the segmentation labels, starting from a random initialization. We show that per-step equivariance during the iteration process induces an overall inter-part equivariance at the final convergent state.
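
As a concrete illustration, below is a minimal PyTorch sketch of this scheme, assuming a hypothetical `net(points, labels)` that implements the Lipschitz-constrained label update (function and argument names are ours, not the paper's):

```python
import torch
import torch.nn.functional as F

def train_step(net, points, gt_labels, optimizer):
    """One-step training: the ground-truth segmentation is both the
    network input and the regression target. The identity shortcut is
    ruled out by the small Lipschitz constant, not by the loss."""
    pred = net(points, gt_labels)          # (N, K) soft part labels
    loss = F.mse_loss(pred, gt_labels)     # loss choice is illustrative
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def banach_inference(net, points, num_parts, max_iters=100, tol=1e-4):
    """Banach fixed-point iteration on the labels from a random init.
    A Lipschitz constant < 1 makes the update a contraction in the
    labels, guaranteeing convergence to a unique fixed point."""
    labels = torch.softmax(
        torch.randn(points.shape[0], num_parts, device=points.device),
        dim=-1)
    for _ in range(max_iters):
        new_labels = net(points, labels)
        if (new_labels - labels).abs().max() < tol:   # converged
            break
        labels = new_labels
    return new_labels
```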



We further bring our formulation to concrete model designs by proposing a novel part-aware equivariant network. Key to our network is a message-passing module weighted by the input segmentation, which allows information to propagate only within each part. The module is plugged into a pointcloud convolution network for segmentation label updates. Here we employ Vector Neurons, an SE(3)-equivariant backbone, to extract the per-part features, and we also enable global information exchange through invariant features.
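
A minimal sketch of such a segmentation-weighted message-passing step, simplified to scalar per-point features (the actual module operates on SE(3)-equivariant Vector Neuron features; all names here are ours):

```python
import torch

def part_aware_message_passing(feats, seg, neighbors):
    """Aggregate neighbor features with weights given by soft-label
    agreement, so messages propagate only within each part.

    feats:     (N, C) per-point features
    seg:       (N, K) soft segmentation, rows summing to 1
    neighbors: (N, M) long tensor of M neighbor indices per point
    """
    nb_feats = feats[neighbors]                       # (N, M, C)
    nb_seg = seg[neighbors]                           # (N, M, K)
    # Label agreement between each point and its neighbors: (N, M).
    # It is ~1 for same-part neighbors, ~0 across part boundaries.
    w = torch.einsum('nk,nmk->nm', seg, nb_seg)
    w = w / (w.sum(dim=-1, keepdim=True) + 1e-8)      # normalize per point
    return torch.einsum('nm,nmc->nc', w, nb_feats)    # weighted aggregation
```

Because the weights depend only on label agreement rather than on coordinates, they are unchanged when each part moves rigidly, which is the property that, combined with an equivariant backbone, keeps the update inter-part equivariant.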



Results on Shape2Motion

To demonstrate the generalizability of our approach, we train our model on objects in a single rest state, which aligns with many synthetic datasets of static shapes. We then test the model on articulated states with both global and inter-part pose transformations applied.

Training on rest state

Inference on novel states: VNN

Inference on novel states: Ours

Results on chair scans

We train our model using a synthetic dataset constructed from clean ShapeNet chair models with all instances lined up and facing the same direction.

We then test our model on real chair scans with diverse scene configurations.

Inference: Z

Inference: SO(3)

Inference: Pile

If you have any questions, please contact Congyue Deng (congyue@stanford.edu).