Leaving Some Stones Unturned:

Dynamic Feature Prioritization for Activity Detection in Streaming Video

Yu-Chuan Su and Kristen Grauman
The University of Texas at Austin

Abstract

Current approaches for activity recognition often ignore constraints on computational resources: 1) they rely on extensive feature computation to obtain rich descriptors on all frames, and 2) they assume batch-mode access to the entire test video at once. We propose a new active approach to activity recognition that prioritizes "what to compute when" in order to make timely predictions. The main idea is to learn a policy that dynamically schedules the sequence of features to compute on selected frames of a given test video. In contrast to traditional static feature selection, our approach continually re-prioritizes computation based on the accumulated history of observations and accounts for the transience of those observations in ongoing video. We develop variants to handle both the batch and streaming settings. On two challenging datasets, our method provides significantly better accuracy than alternative techniques for a wide range of computational budgets.


Approach


[Figure: overview of the approach]

We formulate the problem as a Markov decision process (MDP) and learn the policy using reinforcement learning. To do so, we define the standard MDP components for feature scheduling: states that summarize the history of features computed on the video so far, actions that choose which feature to compute on which frame next, and rewards that reflect progress toward an accurate, timely prediction.
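As a rough illustration of these components, the following is a minimal Python sketch. All names here (`State`, `Action`, `reward`) are hypothetical and chosen for illustration; the paper's exact state features and reward definition differ in detail.

```python
# Hypothetical sketch of the MDP components for dynamic feature
# prioritization. Names and fields are illustrative, not from the paper.
from dataclasses import dataclass, field

@dataclass
class State:
    """Accumulated history of observations on the test video."""
    class_posterior: list                        # current belief over activity classes
    computed: set = field(default_factory=set)   # (frame, feature) pairs extracted so far
    budget_used: float = 0.0                     # computation spent so far

@dataclass
class Action:
    """Choose which feature to compute on which frame next."""
    frame: int
    feature: str                                 # e.g. "bag-of-object" or "cnn"

def reward(prev_posterior, new_posterior, true_label):
    """Illustrative reward: the gain in probability assigned to the
    correct activity class after the newly computed observation."""
    return new_posterior[true_label] - prev_posterior[true_label]
```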

We apply standard Q-learning with linear function approximation for the action-value function Q. Please see the paper for details.
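To make the learning step concrete, here is a minimal, self-contained sketch of Q-learning with linear function approximation. The class name, hyperparameters, and epsilon-greedy exploration are illustrative assumptions, not the paper's implementation; it only shows the generic update the paper's policy learning builds on.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearQ:
    """Q-learning with linear function approximation:
    Q(s, a) = w_a . phi(s), with one weight vector per action."""

    def __init__(self, n_features, n_actions, lr=0.01, gamma=0.9, eps=0.1):
        self.w = np.zeros((n_actions, n_features))
        self.lr, self.gamma, self.eps = lr, gamma, eps

    def q_values(self, phi):
        # Q-value of every action for state features phi
        return self.w @ phi

    def act(self, phi):
        # epsilon-greedy action selection
        if rng.random() < self.eps:
            return int(rng.integers(self.w.shape[0]))
        return int(np.argmax(self.q_values(phi)))

    def update(self, phi, a, r, phi_next, done):
        # standard one-step TD target under linear approximation
        target = r if done else r + self.gamma * np.max(self.q_values(phi_next))
        td_error = target - self.w[a] @ phi
        self.w[a] += self.lr * td_error * phi
```

At training time the scheduler would featurize the current state into `phi`, pick an action with `act`, observe the reward, and call `update`; at test time it simply acts greedily on the learned weights.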


Experimental Results


We report quantitative results under the streaming and untrimmed detection settings with different video representations, and we visualize the policies learned by the algorithm. Please refer to the paper for experimental details and additional results.


Example Recognition Episodes


We visualize recognition episodes under the streaming setting. These videos show how the learned policy operates at test time.

• ADL — Bag-of-Object

• UCF-101 — Bag-of-Object

• UCF-101 — CNN


Publication

Yu-Chuan Su and Kristen Grauman. "Leaving Some Stones Unturned: Dynamic Feature Prioritization for Activity Detection in Streaming Video." In European Conference on Computer Vision (ECCV), 2016.