Notes on optimising linear features in polygons

Notes on optimising linear features in polygons


Gouraud shading is one example of a linearly interpolated feature. Other examples include linear texture mapping etc. In each case you are interpolating a value across the polygon. There are various properties I have discovered which makes it possible to optimise the algorithm somewhat.

The gradient across the polygon is constant in any one direction. So, the gradient is the same across any of the green lines on the left. This means, of course, that it only needs to be calculated once for the whole polygon. For the best accuracy, calculate the gradient at the widest part of the polygon (purple line). This saves one divide per line on polygon per feature. So, if you were Gouraud shading and texture mapping, this would save three divides per scanline.

If you remember back to the previous article on Gouraud shading, a small adjustment was made on each line to increase the accuracy. These adjustments also change linearly down each edge of the polygon. Now, I haven't investigated this throughly (my monitor has blown up, so I can't do any programming till next week), so I am kind of guessing here. I think that the the adjustment increases by gradient*deltaX, where deltaX is the change of X down an edge of the polygon. But, the adjustment should never go above gradient, so if the adjustment does go above gradient, subtract gradient from it. This should save you one multiply per scanline per feature.
Now, I haven't actually tested any of this yet, so If anyone tries this and has good / bad results I would love to hear if any of this works.


Cache Considerations

Now, I must admit that I'm really still learning this area, but there are a few things I have picked up that might help here. I have been reading some documents on optimising code for the Pentium.

If you can try to keep using the same small bit of memory over and over again, then it's more efficent because that memory stays in the cache and can be accessed quickly. A few articles previously, I suggested using a Y-buffer to store the data produced by the scan converter. I suggested that you have one entry in the buffer for every scanline on the screen. Polygons rendered in the upper part of the screen would be stored in the upper part of the buffer, and those rendered lower down would be stored in the lower parts. From a cache point of view, this can be rather inefficient. If you are rendering thousands of tiny polygons, it would be better if they were always stored in the upper part of the buffer, and then offset to the screen when they are actually rendered. This means that the top few lines of the buffer will be used over and over again, and should be kept in the processor's cache.