Making 360° Video Watchable in 2D: Learning Videography for Click Free Viewing

Concept figure

In our prior work, we proposed the Pano2Vid problem: given a 360° video, generate normal-field-of-view (NFOV) videos that look as if they were captured by a human videographer. We also proposed the AutoCam algorithm, which solves Pano2Vid by learning from human-captured NFOV videos how to control a virtual camera within the 360° video.

In this work, we propose three improvements over the AutoCam algorithm. First, we generalize the Pano2Vid task to allow changes in the field of view (FOV), i.e., zooming, a commonly used technique in videography. Second, we present a coarse-to-fine trajectory search algorithm that iteratively refines the camera control while shrinking the search space, improving computational efficiency. Finally, we generate a diverse set of output videos for each input 360° video, accounting for the fact that valid Pano2Vid solutions are often multimodal.
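For concreteness, the operation that both AutoCam and this work build on is rendering an NFOV glimpse from the 360° frame at a chosen camera pose. Below is a minimal sketch of that projection, assuming an equirectangular input; the function name, default sizes, and nearest-neighbor sampling are illustrative choices, not the released implementation.

    import numpy as np

    def extract_glimpse(equi, yaw, pitch, fov_deg, out_w=640, out_h=360):
        """Render a rectilinear NFOV view from an equirectangular frame.
        yaw/pitch are in radians; varying fov_deg implements zooming."""
        H, W = equi.shape[:2]
        f = (out_w / 2) / np.tan(np.deg2rad(fov_deg) / 2)  # focal length in pixels

        # Ray directions through each output pixel (camera looks down +z).
        x = np.arange(out_w) - out_w / 2
        y = np.arange(out_h) - out_h / 2
        xx, yy = np.meshgrid(x, y)
        dirs = np.stack([xx, yy, np.full(xx.shape, f)], axis=-1)
        dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

        # Rotate the rays by pitch (about x) then yaw (about y).
        cp, sp = np.cos(pitch), np.sin(pitch)
        cy, sy = np.cos(yaw), np.sin(yaw)
        Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
        dirs = dirs @ (Ry @ Rx).T

        # Map ray directions to equirectangular pixel coordinates.
        lon = np.arctan2(dirs[..., 0], dirs[..., 2])       # [-pi, pi]
        lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))  # [-pi/2, pi/2]
        u = np.round((lon / np.pi + 1) / 2 * (W - 1)).astype(int)
        v = np.round((lat / (np.pi / 2) + 1) / 2 * (H - 1)).astype(int)
        return equi[v, u]  # nearest-neighbor sampling, for brevity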


Improvements over AutoCam


Zoom Lens

The new algorithm enables zooming in the virtual camera control. Zooming not only makes the camera control more natural but also improves the quality of the learned capture-worthiness model.

Zoom Lens
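A hedged sketch of how zooming can enter the trajectory search: the camera state is extended from a viewing direction to a (direction, FOV) pair, and a dynamic program selects the state sequence that maximizes accumulated capture-worthiness minus smoothness penalties on both motion and zoom changes. The grids, weights, and scoring below are illustrative placeholders, not the paper's learned model.

    import itertools
    import numpy as np

    DIRECTIONS = [(yaw, pitch) for yaw in range(0, 360, 20) for pitch in (-15, 0, 15)]
    FOVS = [32, 48, 65, 104]  # candidate horizontal FOVs, i.e. zoom levels
    STATES = list(itertools.product(range(len(DIRECTIONS)), range(len(FOVS))))

    def transition_cost(s, t, motion_w=0.05, zoom_w=0.02):
        """Penalize large camera motion and abrupt zoom changes."""
        (d1, f1), (d2, f2) = s, t
        (yaw1, pitch1), (yaw2, pitch2) = DIRECTIONS[d1], DIRECTIONS[d2]
        dyaw = min(abs(yaw1 - yaw2), 360 - abs(yaw1 - yaw2))
        return motion_w * (dyaw + abs(pitch1 - pitch2)) + zoom_w * abs(FOVS[f1] - FOVS[f2])

    def search_trajectory(scores):
        """scores: T x len(STATES) capture-worthiness per frame and state.
        Returns the state index chosen for each frame (Viterbi-style DP)."""
        T = scores.shape[0]
        back = np.zeros((T, len(STATES)), dtype=int)
        best = scores[0].astype(float).copy()
        for t in range(1, T):
            prev = best.copy()
            for j, sj in enumerate(STATES):
                cand = [prev[i] - transition_cost(si, sj) for i, si in enumerate(STATES)]
                back[t, j] = int(np.argmax(cand))
                best[j] = cand[back[t, j]] + scores[t, j]
        # Backtrack the highest-scoring state sequence.
        path = [int(np.argmax(best))]
        for t in range(T - 1, 0, -1):
            path.append(back[t, path[-1]])
        return path[::-1]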

Coarse-to-Fine Trajectory Search

The AutoCam algorithm searches for camera trajectories over all candidate glimpses, which is computationally intensive because every glimpse must be processed. The new algorithm improves computational efficiency with a two-stage trajectory search that avoids processing all glimpses.

Coarse to Fine Search
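A minimal sketch of the coarse-to-fine idea, under assumed grid sizes: run the search on a coarse grid of directions first, then re-search only a window around the coarse solution at the finer resolution, so most fine-grid glimpses never have to be scored. For brevity the sketch picks the best yaw per frame; the full method would run the trajectory search at each stage.

    import numpy as np

    def coarse_to_fine_search(score_fn, T, coarse_step=40, fine_step=10, window=40):
        """score_fn(t, yaw) -> capture-worthiness of the glimpse at frame t."""
        # Stage 1: coarse grid over all yaw angles.
        coarse_yaws = np.arange(0, 360, coarse_step)
        coarse_traj = [max(coarse_yaws, key=lambda y: score_fn(t, y)) for t in range(T)]
        # Stage 2: fine grid, restricted to a window around the coarse solution.
        fine_traj = []
        for t, cy in enumerate(coarse_traj):
            fine_yaws = np.arange(cy - window, cy + window + 1, fine_step) % 360
            fine_traj.append(max(fine_yaws, key=lambda y: score_fn(t, y)))
        return fine_traj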

Diverse Trajectory Search

The original AutoCam algorithm may generate redundant outputs whose camera trajectories are nearly identical. The new algorithm searches for trajectories iteratively and encourages diversity among the outputs.

Diverse Trajectory Search
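A hedged sketch of diversity-encouraging search: extract trajectories one at a time, and after each extraction suppress the scores of glimpses that overlap the chosen trajectory, so the next search is pushed toward a different part of the viewing sphere. The overlap radius and penalty below are assumptions for illustration, not the paper's exact formulation.

    import numpy as np

    def angular_dist(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)

    def diverse_trajectories(scores, yaws, search_fn, k=3, radius=60, penalty=0.5):
        """scores: T x num_glimpses; yaws[i]: yaw angle of glimpse column i;
        search_fn(scores) -> one column index per frame."""
        scores = scores.astype(float).copy()
        outputs = []
        for _ in range(k):
            traj = search_fn(scores)
            outputs.append(traj)
            # Down-weight glimpses near the selected trajectory at every frame.
            for t, j in enumerate(traj):
                for i, y in enumerate(yaws):
                    if angular_dist(y, yaws[j]) < radius:
                        scores[t, i] -= penalty
        return outputs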

Video Examples


Zooming allows the algorithm to emphasize particular objects and moments in the video.

This example shows why it is important to generate diverse trajectories from the same input video: the two outputs demonstrate different ways to capture the same scene.

These two examples show that zooming helps learn a better capture-worthiness model; the content captured by the new algorithm is more interesting in both cases.

Failure Cases

In this example, the algorithm focuses on the videographers, although the players on the field should be more important. The algorithm does not reason about the relative importance of different objects in the scene, and further information is necessary to resolve such cases.

Annotation Interface

These are two trajectories annotated by the same editor. Note that the orientation of the 360° video is shifted by 180° in the second example. This encourages the editor to annotate different trajectories and avoids the bias introduced by the interface.
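For reference, the 180° shift amounts to a circular shift of the equirectangular frame by half its width; a minimal sketch, assuming equirectangular input:

    import numpy as np

    def shift_orientation(equi_frame):
        """Rotate an equirectangular frame 180° about the vertical axis:
        a circular shift by half the frame width."""
        return np.roll(equi_frame, equi_frame.shape[1] // 2, axis=1)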


Publication

