In our paper, we present the following contributions for action recognition:
Motion compensation: We decompose visual motion into dominant and residual motions.
We estimate the dominant motion by an affine motion model, which is a good trade-off between accuracy and efficiency.
The residual motion field, which we call w-flow, is obtained by canceling the dominant motion (predominantly camera motion) and is therefore more closely related to the actions.
This w-flow is used both for extracting space-time trajectories and for computing descriptors.
Here is an example of trajectories obtained by using optical flow, affine flow and w-flow:
Trajectories from optical flow
Trajectories from affine flow
Trajectories from w-flow
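The decomposition into dominant and residual motion can be sketched as follows. This is only an illustration under stated assumptions: it fits a 6-parameter affine model to a precomputed dense optical flow field by ordinary least squares, which is a stand-in and not necessarily the estimator used in the paper. The function names `fit_affine_flow` and `compensate_dominant_motion` are hypothetical.

```python
import numpy as np

def fit_affine_flow(flow):
    """Least-squares fit of an affine motion model to a dense flow field.

    flow: array of shape (H, W, 2) with the (u, v) displacement per pixel.
    Returns the affine (dominant) flow field, same shape as the input.
    Assumption: plain least squares, with no robustness to outliers.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # One row [1, x, y] per pixel, so that
    # u = c1 + a1*x + a2*y and v = c2 + a3*x + a4*y.
    design = np.stack([np.ones(h * w), xs.ravel(), ys.ravel()], axis=1)
    params, *_ = np.linalg.lstsq(design, flow.reshape(-1, 2), rcond=None)
    return (design @ params).reshape(h, w, 2)

def compensate_dominant_motion(flow):
    """Return (w_flow, affine_flow), where w_flow = flow - affine_flow."""
    affine_flow = fit_affine_flow(flow)
    return flow - affine_flow, affine_flow
```

With a purely affine input (for example a synthetic pan plus zoom), the residual returned by this sketch is zero up to numerical precision; motion that the affine model cannot explain remains in the residual.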
Kinematic features: We propose a motion descriptor based on differential scalar quantities of the motion field: divergence, curl, and shear. This descriptor is named
the DCS (divergence-curl-shear) descriptor.
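As a rough illustration of the three kinematic quantities, the sketch below computes them per pixel from a dense flow field using the usual first-order definitions: divergence is du/dx + dv/dy, curl is dv/dx - du/dy, and shear is the magnitude of the two hyperbolic terms (du/dx - dv/dy, du/dy + dv/dx). The finite-difference derivatives and the function name `dcs_features` are assumptions made for this example; how the quantities are aggregated into the final descriptor is not covered here.

```python
import numpy as np

def dcs_features(flow):
    """Per-pixel divergence, curl, and shear of a flow field.

    flow: array of shape (H, W, 2) with the (u, v) displacement per pixel.
    Returns three arrays of shape (H, W).
    """
    u, v = flow[..., 0], flow[..., 1]
    # np.gradient returns derivatives along axis 0 (y) first, then axis 1 (x).
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)

    div = du_dx + dv_dy
    curl = dv_dx - du_dy
    hyp1 = du_dx - dv_dy   # first hyperbolic term
    hyp2 = du_dy + dv_dx   # second hyperbolic term
    shear = np.sqrt(hyp1 ** 2 + hyp2 ** 2)
    return div, curl, shear
```

For intuition: a pure expansion (u = x, v = y) has non-zero divergence only, and a pure rotation (u = -y, v = x) has non-zero curl only.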
VLAD in actions: The VLAD coding technique, originally proposed for image retrieval, provides a substantial improvement for action recognition.
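For readers unfamiliar with VLAD, a minimal sketch of the encoding step is given below: each local descriptor is assigned to its nearest codebook centroid, the residuals (descriptor minus centroid) are summed per centroid, and the result is flattened and normalized. The codebook is assumed to be learned beforehand (for example with k-means), the final L2 normalization is one common choice rather than a confirmed detail of our pipeline, and the name `vlad_encode` is hypothetical.

```python
import numpy as np

def vlad_encode(descriptors, centroids):
    """Encode a set of local descriptors as a single VLAD vector.

    descriptors: array of shape (n, d), local descriptors of one video.
    centroids:   array of shape (k, d), a precomputed codebook.
    Returns a vector of length k * d.
    """
    k, d = centroids.shape
    # Squared Euclidean distance from every descriptor to every centroid.
    diffs = descriptors[:, None, :] - centroids[None, :, :]
    nearest = (diffs ** 2).sum(axis=2).argmin(axis=1)

    vlad = np.zeros((k, d))
    # np.add.at accumulates correctly when an index occurs several times.
    np.add.at(vlad, nearest, descriptors - centroids[nearest])

    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    # Assumption: plain L2 normalization of the final vector.
    return vlad / norm if norm > 0 else vlad
```

The output dimension is k * d regardless of how many local descriptors a video contains, which makes the vectors directly comparable across videos.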