Visual Synthesis and Interpretable AI with Disentangled Representations

Deep learning has significantly improved the expressiveness of learned representations. However, current research still cannot explain why and how these representations work, nor reliably predict when they fail. Moreover, the different characteristics of our physical world are commonly intermingled, making it impossible to study them individually. We introduce novel paradigms for disentangling multiple object characteristics and present interpretable models that translate arbitrary network representations into semantically meaningful concepts. We also obtain disentangled generative models that explain their latent representations by synthesis and allow different object characteristics to be altered individually.

Talk given in August 2021

Selected Publications

2021

ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

Esser, Patrick; Rombach, Robin; Blattmann, Andreas; Ommer, Björn

Neural Information Processing Systems (NeurIPS), 2021.

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis

Blattmann, Andreas; Milbich, Timo; Dorkenwald, Michael; Ommer, Björn

Proceedings of the International Conference on Computer Vision (ICCV), 2021.

Geometry-Free View Synthesis: Transformers and no 3D Priors

Rombach, Robin; Esser, Patrick; Ommer, Björn

Proceedings of the International Conference on Computer Vision (ICCV), 2021.

Taming Transformers for High-Resolution Image Synthesis

Esser, Patrick; Rombach, Robin; Ommer, Björn

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

2020

Network-to-Network Translation with Conditional Invertible Neural Networks

Rombach, Robin; Esser, Patrick; Ommer, Björn

Neural Information Processing Systems (NeurIPS) (Oral), 2020.

Unsupervised Part Discovery by Unsupervised Disentanglement

Braun, Sandro; Esser, Patrick; Ommer, Björn

Proceedings of the German Conference on Pattern Recognition (GCPR) (Oral), Tübingen, 2020.

Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs

Rombach, Robin; Esser, Patrick; Ommer, Björn

Proceedings of the European Conference on Computer Vision (ECCV), 2020.

Network Fusion for Content Creation with Conditional INNs

Rombach, Robin; Esser, Patrick; Ommer, Björn

CVPR Workshop on AI for Content Creation (CVPRW), 2020.

A Disentangling Invertible Interpretation Network for Explaining Latent Representations

Esser, Patrick; Rombach, Robin; Ommer, Björn

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

2019

Content and Style Disentanglement for Artistic Style Transfer

Kotovenko, Dmytro; Sanakoyeu, Artsiom; Lang, Sabine; Ommer, Björn

Proceedings of the International Conference on Computer Vision (ICCV), 2019.

Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis

Esser, Patrick; Haux, Johannes; Ommer, Björn

Proceedings of the International Conference on Computer Vision (ICCV), 2019.

Unsupervised Part-Based Disentangling of Object Shape and Appearance

Lorenz, Dominik; Bereska, Leonard; Milbich, Timo; Ommer, Björn

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Oral + Best paper finalist: top 45 / 5160 submissions), 2019.

2018

A Variational U-Net for Conditional Appearance and Shape Generation

Esser, Patrick; Sutter, Ekaterina; Ommer, Björn

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (short Oral), 2018.
