Optical flow with the Lucas-Kanade algorithm

Optical flow is the apparent motion of pixels between consecutive frames of a video. Ideally, we'd like to find a vector that describes this motion: a displacement vector that indicates the relative position of a particular pixel in another frame.

Trying to do this for every pixel in an image is known as dense optical flow. However, this can be quite computationally heavy, so we turn our attention to a different class of algorithms, called sparse optical flow algorithms, which track only a specific set of points in the image.

The Lucas-Kanade algorithm

The Lucas-Kanade algorithm is popular in problems concerning sparse optical flow because it relies only on local information from a small window around each point of interest.

The Lucas-Kanade method makes two assumptions:

1. Consistent brightness:
A pixel's brightness remains the same as it moves from frame to frame.

2. Temporal persistence:
The motion of an image is coherent. This means that the motion of our window of interest changes relatively slowly from one frame to the next.

Let’s first look at the motion in one dimension:

Mathematically, the equation for constant brightness can be expressed as:

I(x(t),t)=I(x(t+dt),t+dt)

Since the brightness of a pixel is assumed constant over time, the intensity of a pixel at time t is equal to its intensity at time t + dt. It follows that the derivative of brightness with respect to time is zero:

∂f(x)/∂t = 0

where f(x) ≡ I(x(t), t)

For our second assumption, we can note that the motion is very small from frame to frame, so the change between frames is differentially small. For now, we're only looking at one dimension.

The brightness f(x, t) depends on t both directly and through the trajectory x(t), since f = I(x(t), t). Substituting this definition and applying the chain rule, we obtain:

Ix·v + It = 0

Here, Ix is the spatial derivative across the first image, It is the temporal derivative between the frames, and v is the velocity we need to find.

By switching the terms around, we get:

v = -It / Ix

However, the assumptions we made at the beginning, i.e. that the brightness is stable and that the motion of the pixel from frame to frame is small, are not always true. Still, if we are reasonably close to a solution for v, we can iterate toward an answer.

We can do this by keeping our spatial derivative the same and recomputing the temporal derivative at each step. Eventually the iterations should converge.
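To make this concrete, here is a small numerical sketch of the iteration (the signal, the shift, and the window size are all invented for illustration, not from the post): a 1D "frame" is displaced slightly, and we refine v with the spatial derivative held fixed.

```python
import numpy as np

# Illustrative sketch (assumed setup): a 1D Gaussian bump shifts by a
# small amount between two frames; we refine the estimate v iteratively,
# keeping the spatial derivative Ix fixed.
x = np.linspace(0.0, 10.0, 1000)
true_v = 0.05                               # true displacement along x
frame1 = np.exp(-(x - 5.0) ** 2)            # brightness at time t
frame2 = np.exp(-(x - 5.0 - true_v) ** 2)   # brightness at time t + dt

Ix = np.gradient(frame1, x)                 # spatial derivative (kept fixed)

v = 0.0
for _ in range(10):
    # Warp frame1 by the current estimate; the residual plays the role
    # of the temporal derivative It.
    warped = np.interp(x - v, x, frame1)
    It = frame2 - warped
    # Least-squares form of v = -It/Ix, summed over the whole window:
    v += -np.sum(Ix * It) / np.sum(Ix ** 2)

print(v)  # converges toward true_v
```

Each pass warps the first frame by the current estimate and corrects v with the remaining residual, which is exactly the "fix Ix, update It" scheme described above.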

Let's try this in two dimensions:

I(x,y,t)= I(x + Δx,y+Δy,t+Δt)

Expanding the right-hand side as a first-order Taylor series and dividing through by Δt, we can reduce this expression to:

[u(x,y,t) ∂I(x,y,t)/∂x] + [v(x,y,t) ∂I(x,y,t)/∂y] + [∂I(x,y,t)/∂t] = 0

Here, I'm calling the velocity component in the x direction u and the velocity component in the y direction v.

However, now we have two unknowns, u and v, but only one equation. The system is underconstrained!

To get around this, a third assumption is made:

3. Pixels surrounding the pixel of interest have motion similar to the pixel's own.

This implies that we can use the surrounding pixels, i.e. take a patch of pixels, to set up more equations and solve for the two unknowns.

A little bit of matrix math, stacking one equation per pixel p1, …, pn of the patch, gives the system:

A d = b

where A is the n×2 matrix whose i-th row is [Ix(pi) Iy(pi)], d = (u, v)ᵀ, and b is the vector whose i-th entry is -It(pi). The least-squares solution is:

d = (AᵀA)⁻¹ Aᵀ b

where our matrix of spatial derivatives is A and the vector of temporal derivatives is b.

From this equation, we see that AᵀA is invertible when it has two large eigenvalues. This occurs when the point we use has gradients in two directions. This property is strongest at corners, which is why corners are good features to track!
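As a sketch of that least-squares step (the pattern and the displacement below are made up for illustration), we can stack one equation per pixel and solve the normal equations to recover (u, v):

```python
import numpy as np

# Sketch (assumed setup): a smooth 2D pattern translated by
# (u, v) = (0.2, -0.1) pixels. Stacking Ix*u + Iy*v = -It over the
# patch gives A d = b, solved via the normal equations.
y, x = np.mgrid[0:32, 0:32].astype(float)

def pattern(x, y):
    return np.sin(0.3 * x) + np.cos(0.4 * y) + 0.5 * np.sin(0.2 * x + 0.25 * y)

u_true, v_true = 0.2, -0.1
frame1 = pattern(x, y)
frame2 = pattern(x - u_true, y - v_true)    # pattern moved by (u, v)

Iy, Ix = np.gradient(frame1)                # axis 0 is y (rows), axis 1 is x
It = frame2 - frame1

# Drop border pixels, where the finite differences are least accurate.
A = np.column_stack([Ix[1:-1, 1:-1].ravel(), Iy[1:-1, 1:-1].ravel()])
b = -It[1:-1, 1:-1].ravel()

d = np.linalg.solve(A.T @ A, A.T @ b)       # d = (A^T A)^{-1} A^T b
print(d)  # approximately (u_true, v_true)
```

Here the whole image acts as one big window; in practice the same solve is done per small window around each tracked point.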

Lucas-Kanade pyramid

Earlier, we assumed that the motion from frame to frame was small. Unfortunately, this is rarely the case in the real world: in videos captured by a camera, large and non-coherent motions are quite commonplace.

As a workaround, we can track the motion over a larger area using an image pyramid and then use our estimated velocity to work down the pyramid until we arrive at the original image size. That is, we solve for the flow at the top (coarsest) layer and use that solution as the initial estimate for the next level, and so on.
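A minimal coarse-to-fine sketch, assuming 1D signals and a simple factor-of-2 averaging pyramid (both are illustrative simplifications of the real 2D scheme):

```python
import numpy as np

# Coarse-to-fine sketch (assumptions: 1D signals, a factor-of-2 pyramid
# built by averaging pairs of samples, and a fixed-gradient iterative
# estimator at each level).
def estimate_shift(f1, f2, v0=0.0, iters=8):
    i = np.arange(len(f1), dtype=float)
    Ix = np.gradient(f1)                        # spatial derivative, kept fixed
    v = v0
    for _ in range(iters):
        It = f2 - np.interp(i - v, i, f1)       # residual after warping
        v += -np.sum(Ix * It) / np.sum(Ix ** 2)
    return v

def pyramid_flow(f1, f2, levels=3):
    p1, p2 = [f1], [f2]
    for _ in range(levels - 1):                 # downsample by 2 per level
        p1.append(p1[-1].reshape(-1, 2).mean(axis=1))
        p2.append(p2[-1].reshape(-1, 2).mean(axis=1))
    v = 0.0
    for l1, l2 in zip(reversed(p1), reversed(p2)):
        v = estimate_shift(l1, l2, v0=2.0 * v)  # upscale estimate, then refine
    return v

i = np.arange(1024, dtype=float)
signal1 = np.exp(-((i - 400.0) / 20.0) ** 2)
signal2 = np.exp(-((i - 406.0) / 20.0) ** 2)    # shifted by 6 samples
v_est = pyramid_flow(signal1, signal2)
print(v_est)  # close to 6
```

The shift looks half as large at each coarser level, so the coarse solve lands within the small-motion regime, and doubling its estimate seeds the finer level.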

Implications:

From the derivation, we can see that the Lucas-Kanade algorithm isn't going to work well on images with smooth shading. The more texture an image has, the better it is for tracking. It's also quite difficult to track edges, since an edge produces a large eigenvalue in one direction only.
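This intuition can be checked numerically. Below is a small sketch with invented toy patches (a flat region, an edge, and a corner) comparing the eigenvalues of AᵀA:

```python
import numpy as np

# Sketch (assumed toy patches): eigenvalues of A^T A, the matrix of
# summed spatial-gradient products over a patch, for a flat region,
# an edge, and a corner.
def structure_eigenvalues(patch):
    Iy, Ix = np.gradient(patch.astype(float))
    A = np.column_stack([Ix.ravel(), Iy.ravel()])
    return np.linalg.eigvalsh(A.T @ A)          # ascending order

y, x = np.mgrid[0:15, 0:15]
flat = np.full((15, 15), 0.5)                   # smooth shading, no texture
edge = (x > 7).astype(float)                    # gradient in one direction
corner = ((x > 7) & (y > 7)).astype(float)      # gradients in two directions

print(structure_eigenvalues(flat))    # both ~ 0: nothing to track
print(structure_eigenvalues(edge))    # one large, one ~ 0: hard to track
print(structure_eigenvalues(corner))  # both large: good feature to track
```

Corner detectors like Shi-Tomasi pick points this way, by keeping only locations where the smaller eigenvalue is large.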

Further reading:

I hope this serves as a brief look under the hood of the Lucas-Kanade algorithm. For a more insightful view, check out these resources:
