Kernel Transformer Networks for Compact Spherical Convolution


In our prior work, we proposed the spherical convolutional neural network, which transfers off-the-shelf CNNs to 360° images for visual recognition. However, the spherical convolutional neural network increases the model size significantly, which makes the model hard to train and deploy. In this work, we propose the Kernel Transformer Network (KTN), which learns a function that transforms a source kernel to account for the distortion introduced by the equirectangular projection of 360° images. This transformation formulation greatly reduces the model size needed for spherical convolution, and the learned transformation can be applied to multiple source CNNs for multiple recognition tasks.


Kernel Transformer Network


Idea

Instead of learning separate spherical convolution kernels for the distorted visual content, we learn a kernel transformation function that accounts for the distortion and generates the desired spherical convolution kernels from the source kernel.

[Figure: KTN idea]

It can be considered a generalization of the convolution operation, where the kernel is a function of both the image location and the source kernel.

[Figure: KTN definition]
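In symbols (our own notation, not necessarily the paper's): let K be a source kernel, θ the polar angle of an image row, g the KTN with parameters Ω, and F^l the layer-l feature map. A minimal sketch of the operation is

    K_\theta = g(K, \theta; \Omega)
    F^{l+1}(\theta, \phi) = \sigma\big( (K_\theta * F^{l})(\theta, \phi) \big)

Because the distortion of the equirectangular projection depends only on the polar angle, one transformed kernel K_\theta can be shared by all locations in the same image row.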

Transferability

Because the KTN takes the source model as input, a single KTN can transfer multiple source CNNs that share the same architecture to 360° images, each for a different visual recognition task.

[Figure: KTN transferability]
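As a rough illustration of this reuse, here is a minimal PyTorch-style sketch (the names spherical_conv and ktn are ours, not the released code, and the row loop is written for clarity rather than speed):

    import torch
    import torch.nn.functional as F

    def spherical_conv(feat, source_kernel, ktn):
        """Convolve equirectangular features with row-specific KTN-generated kernels.

        feat:          (N, C_in, H, W) feature map on the equirectangular grid
        source_kernel: (C_out, C_in, kh, kw) kernel taken from the source CNN
        ktn:           callable (source_kernel, row, H) -> transformed kernel for that row
        """
        n, c, h, w = feat.shape
        rows = []
        for row in range(h):
            k_row = ktn(source_kernel, row, h)                   # distortion depends only on the row
            pad_h, pad_w = k_row.shape[-2] // 2, k_row.shape[-1] // 2
            padded = F.pad(feat, (pad_w, pad_w, pad_h, pad_h))   # zero pad; a real ERP wraps in width
            window = padded[..., row:row + 2 * pad_h + 1, :]     # vertical neighborhood of this row
            rows.append(F.conv2d(window, k_row))                 # -> (N, C_out, 1, W), odd kernel sizes
        return torch.cat(rows, dim=-2)

Since only source_kernel changes between calls, the same ktn could, for example, be applied to kernels from a scene classifier and from an object detector that share the backbone architecture.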

Architecture

[Figure: KTN architecture]
[Figure: training loss]
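The figures above give the exact architecture and training loss; as a hedged paraphrase (an assumption on our part, not a quotation of the formulation), the KTN parameters Ω are trained with a feature-matching objective of roughly the form

    L(\Omega) = \sum_{\theta, \phi} \big\| F^{l+1}_{KTN}(\theta, \phi; \Omega) - F^{l+1}_{target}(\theta, \phi) \big\|_2^2

where F_{target} denotes the responses that the source kernels are meant to reproduce on the sphere.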

Results


[Figures: qualitative results on example 360° frames]

Publication

