Our part segmentation method leverages a disentangled representation for shape and appearance to discover semantic part segmentations without supervision.

Abstract

We address the problem of discovering part segmentations of articulated objects without supervision. In contrast to keypoints, part segmentations localize parts at the level of individual pixels. Capturing both locations and semantics, they are an attractive target for supervised learning. However, large annotation costs limit the scalability of supervised algorithms beyond the human object category. Unsupervised approaches can potentially exploit much more data at a lower cost. Most existing unsupervised approaches learn abstract representations that must be refined with supervision into the final segmentation. Our approach instead leverages a generative model consisting of two disentangled representations for an object's shape and appearance, together with a latent variable for the part segmentation. From a single image, the trained model infers a semantic part segmentation map. In experiments, we compare our approach to previous state-of-the-art methods and observe significant gains in segmentation accuracy and shape consistency. Our work demonstrates the feasibility of discovering semantic part segmentations without supervision.
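To make the model structure concrete, below is a minimal PyTorch sketch, not the authors' exact architecture: a shape branch predicts soft part-assignment maps (the latent segmentation), an appearance branch is pooled into one feature vector per part, and a decoder reconstructs the image from both. All layer sizes and names (PartSegmentationModel, shape_net, app_net) are illustrative assumptions.

```python
# A minimal sketch (not the authors' exact architecture) of a model that
# disentangles shape (soft part-assignment maps) from appearance (one
# feature vector pooled per part) and reconstructs the image from both.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartSegmentationModel(nn.Module):
    def __init__(self, num_parts: int = 16, feat_dim: int = 64):
        super().__init__()
        # Shape branch: predicts K soft part maps (the latent segmentation).
        self.shape_net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, num_parts, 3, padding=1),
        )
        # Appearance branch: dense features, pooled per part below.
        self.app_net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, padding=1),
        )
        # Decoder: reconstructs the image from part maps + part appearances.
        self.decoder = nn.Sequential(
            nn.Conv2d(num_parts + feat_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, img):
        parts = F.softmax(self.shape_net(img), dim=1)   # B x K x H x W
        feats = self.app_net(img)                       # B x C x H x W
        # Pool one appearance vector per part (mask-weighted average).
        weights = parts / (parts.sum(dim=(2, 3), keepdim=True) + 1e-6)
        part_app = torch.einsum('bkhw,bchw->bkc', weights, feats)  # B x K x C
        # Broadcast each part's appearance back onto its mask.
        app_map = torch.einsum('bkhw,bkc->bchw', parts, part_app)
        recon = self.decoder(torch.cat([parts, app_map], dim=1))
        return parts, part_app, recon
```

At test time, the inferred segmentation map is simply the argmax over the K part maps, e.g. seg = parts.argmax(dim=1).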

Results

Below, we show results and applications of our model.

Oral Talk at GCPR 2020

Click the image to open the YouTube recording.

Oral Slides

Click the image to view the oral slide deck from GCPR 2020.

Segmentation on Human Object Class

Our method learns part segmentations without supervision, with pixel-accurate precision.

Segmentation on Bird Object Class

Our method generalizes to new object categories, such as birds.

Segmentation IOU on Human Object Class

We evaluate segmentation accuracy on the human object class, represented by the DeepFashion, Exercise, and Penn Action datasets. Our method outperforms existing methods on all three datasets.
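For reference, here is a minimal sketch of the per-part IoU metric typically used in such evaluations; the label convention (0 = background, parts 1..K) and the function name mean_part_iou are assumptions, not the paper's evaluation code.

```python
# A minimal sketch of mean per-part intersection-over-union (IoU).
# Assumes integer label maps with 0 = background and parts 1..num_parts.
import numpy as np

def mean_part_iou(pred: np.ndarray, gt: np.ndarray, num_parts: int) -> float:
    """Mean IoU across part labels 1..num_parts for one image pair."""
    ious = []
    for part in range(1, num_parts + 1):
        pred_mask = pred == part
        gt_mask = gt == part
        union = np.logical_or(pred_mask, gt_mask).sum()
        if union == 0:
            continue  # part absent in both prediction and ground truth
        inter = np.logical_and(pred_mask, gt_mask).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```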

Segmentation IOU on Bird Object Class

We evaluate segmentation accuracy on the bird object class, represented by the CUB dataset. Our method outperforms all existing methods.

Part-based Appearance Transfer: No Appearance Swapped

Our method enables fine-grained part-based appearance transfer, which is much more accurate than keypoint-based methods.

Part-based Appearance Transfer: Swapping Chest Appearance

Our method enables fine-grained part-based appearance transfer, which is much more accurate than keypoint-based methods.

Part-based Appearance Transfer: Swapping Chest and Arm Appearance

Our method enables fine-grained part-based appearance transfer, which is much more accurate than keypoint-based methods.

Part-based Appearance Transfer: Swapping Chest, Arm, Hip and Leg Appearance

Our method enables fine-grained part-based appearance transfer, which is much more accurate than keypoint-based methods.
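Under the hypothetical model sketched above, part-based appearance transfer reduces to swapping the pooled appearance vectors of selected parts between two images before decoding; the function swap_part_appearance and the part-index convention are illustrative assumptions, not the authors' code.

```python
# A minimal sketch of part-based appearance transfer with the hypothetical
# model above: appearance vectors of the chosen parts come from a source
# image, while shape (the part maps) comes from the target image.
import torch

def swap_part_appearance(model, target_img, source_img, part_ids):
    parts_t, app_t, _ = model(target_img)   # shape from the target
    _, app_s, _ = model(source_img)         # appearance from the source
    app_mixed = app_t.clone()
    app_mixed[:, part_ids] = app_s[:, part_ids]  # swap selected parts only
    # Re-render: broadcast mixed appearances onto the target's part maps.
    app_map = torch.einsum('bkhw,bkc->bchw', parts_t, app_mixed)
    return model.decoder(torch.cat([parts_t, app_map], dim=1))
```

Swapping more part indices (e.g. chest, arms, hips, and legs) transfers progressively larger regions of appearance while the target's shape stays fixed.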

Analysis of Shape Consistency

We evaluate the shape consistency of our method against supervised and keypoint-based methods by computing the Percentage of Correct Keypoints (PCK). Our method significantly improves shape consistency compared to the state-of-the-art unsupervised method.
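For reference, here is a minimal sketch of the PCK metric: a predicted keypoint counts as correct if it lies within a fraction alpha of the image size from the ground truth. Normalizing by max(H, W) with alpha = 0.1 follows a common convention and is an assumption here, not necessarily the paper's exact protocol.

```python
# A minimal sketch of the Percentage of Correct Keypoints (PCK) metric.
# A keypoint is correct if its predicted location lies within
# alpha * max(H, W) pixels of the ground-truth location.
import numpy as np

def pck(pred_kp: np.ndarray, gt_kp: np.ndarray,
        img_size: tuple, alpha: float = 0.1) -> float:
    """pred_kp, gt_kp: (N, 2) arrays of (x, y) keypoint locations."""
    thresh = alpha * max(img_size)
    dists = np.linalg.norm(pred_kp - gt_kp, axis=1)
    return float((dists <= thresh).mean())
```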

Acknowledgement

This work has been supported in part by the BW Stiftung project "MULT!nano", the German Research Foundation (DFG) project 421703927, and the German federal ministry BMWi within the project "KI Absicherung". This page is based on a design by TEMPLATED.