
How to write a tracker

So, we've all used a standard point tracker in our favorite compositor, but I've always felt I didn't understand them as well as I should. I thought it would be interesting to try to track down some information on how exactly to do it. Unfortunately, most of the descriptions I've found so far are either extremely vague and not quite specific enough, or they go into detail on some super-algorithm for fully automatic 3D tracking. Open source apps don't seem to have them, but commercial compositing packages all seem to have very similar trackers, so there must be some fairly standard techniques that are well known.

So, how does one write a basic 2D point tracker?


2 answers


lewis saunders

The obvious approach for a pattern-based tracker like that in Shake, Flame and friends is to calculate the cross-correlation function between the pattern you're looking for and the search window. This gives you a measure of how well-matched the pattern is for every possible position in the window.
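As a sketch of that idea (using NumPy; the function name and the brute-force loop are my own illustration, not how Shake or Flame actually implement it), normalized cross-correlation scores the pattern against every position in the search window, and the argmax of the score map is the best match:

```python
import numpy as np

def normalized_cross_correlation(pattern, window):
    """Score `pattern` against every position in `window`.

    Returns a 2D array of scores in [-1, 1]; the argmax is the
    offset of the best-matching position of the pattern.
    """
    ph, pw = pattern.shape
    wh, ww = window.shape
    # Subtract the mean so the score ignores overall brightness
    p = pattern - pattern.mean()
    p_norm = np.sqrt((p * p).sum())
    scores = np.full((wh - ph + 1, ww - pw + 1), -1.0)
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            region = window[y:y + ph, x:x + pw]
            r = region - region.mean()
            r_norm = np.sqrt((r * r).sum())
            if p_norm > 0 and r_norm > 0:
                scores[y, x] = (p * r).sum() / (p_norm * r_norm)
    return scores
```

An exact match scores 1.0, so `np.unravel_index(scores.argmax(), scores.shape)` gives the matched offset. The double loop is the "pretty slow as it is" part mentioned below; real trackers speed this up (e.g. via the FFT).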

Any signal processing or engineering mathematics textbook will tell you more about correlation. The ever-excellent Paul Bourke has some examples working with 2D images here:

It's pretty slow as it is, but can be optimised in a whole bunch of ways, some of which are detailed here in a paper from ILM about the tracker used for Forrest Gump and subsequently in Shake:

There are some implementation details in the Fxguide "Art of Tracking" article too:

There are open-source trackers out there in the computer vision world, such as in the OpenCV and VTK toolkits, but I think they tend to be more focused on tracking a lot of features at once for a camera solve rather than more precise single-feature tracking. They use similar techniques to optical flow or motion estimation - less precise but much faster than pattern-matching. I recall hearing that the Pixel Farm re-jigged PFTrack to use pattern tracking recently though, as part of the stereo update.


“Cross Correlation” was the magic term that I needed for my googling to get me where I was trying to go. I was finding lots of stuff like OpenCV which focuses on many automatically selected features, rather than allowing a user to reliably select a single point and track that. Thanks for the answer! (And, thanks also to Julik)


julik [ Editor ]

I'd go like this:

1) Define the XY coordinates of the starting point, the search region (WHERE to look) and the reference region (WHAT to look for, the inner box). Save the contents of the reference region into a buffer.

2) For each frame, starting from the frame you placed your point in: move the coordinates of the track region inside the search region. Preblur or downrez the two. Look for correlation: when the currently sampled region of the search region has the least difference from the saved reference sample, you've found your match. Set the point at the middle of the found subregion of the search region, move the search region so it's centred on the new point, and advance a frame.
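The steps above can be sketched like this (NumPy; all names, patch sizes and the sum-of-absolute-differences score are my own illustrative choices, and image boundaries aren't handled):

```python
import numpy as np

def track_point(frames, start_xy, ref_half=7, search_half=15):
    """Minimal single-point tracker: grab a reference patch around
    start_xy on the first frame, then on each later frame slide it
    over a search region centred on the last known position and keep
    the position with the least difference (SAD score)."""
    x, y = start_xy
    # 1) save the reference region into a buffer
    ref = frames[0][y - ref_half:y + ref_half + 1,
                    x - ref_half:x + ref_half + 1].astype(float)
    track = [(x, y)]
    # 2) for each following frame, slide the track region inside the
    #    search region and keep the best match
    for frame in frames[1:]:
        best_score, best_xy = np.inf, (x, y)
        for dy in range(-search_half + ref_half, search_half - ref_half + 1):
            for dx in range(-search_half + ref_half, search_half - ref_half + 1):
                cx, cy = x + dx, y + dy
                patch = frame[cy - ref_half:cy + ref_half + 1,
                              cx - ref_half:cx + ref_half + 1].astype(float)
                score = np.abs(patch - ref).sum()  # least difference wins
                if score < best_score:
                    best_score, best_xy = score, (cx, cy)
        # move the search region to the new point and advance
        x, y = best_xy
        track.append((x, y))
    return track
```

A real tracker would add subpixel refinement, boundary handling, and optionally update the reference buffer as the pattern changes over time.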

The secret sauce is in how you compare the pieces of the search region with your reference region. You could do a downrez, subtract (pixel in the search region minus the corresponding pixel in the reference) and look for blacks. You could do a transform on both and check for matches that way. Etc. etc. What you also could do is find the sources of Icarus (what later became PFTrack) and try to look at their 2dt code.
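The "look for blacks" idea in the simplest possible form (my own sketch): subtract the candidate patch from the reference and inspect the result; a near-black (near-zero) difference image means the two match.

```python
import numpy as np

def difference_image(ref, patch):
    """Per-pixel absolute difference between the saved reference and
    a candidate patch; an all-black result is a perfect match."""
    return np.abs(ref.astype(float) - patch.astype(float))
```

In practice you'd reduce this image to a single score (sum or mean) and keep the candidate position that minimises it, which is exactly the "least difference" comparison described above.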

Or, alternatively you could look at this

