Notes on optimising linear features in polygons
Notes on optimising linear features in polygons
|
Gouraud shading is one example of a linearly interpolated feature. Other examples include linear
texture mapping etc. In each case you are interpolating a value across the polygon. There are
various properties I have discovered which makes it possible to optimise the algorithm somewhat.
The gradient across the polygon is constant in any one direction. So, the gradient is the same
across any of the green lines on the left. This means, of course, that it only needs to be
calculated once for the whole polygon. For the best accuracy, calculate the gradient at the
widest part of the polygon (purple line). This saves one divide per line on polygon per feature.
So, if you were Gouraud shading and texture mapping, this would save three divides per scanline.
If you remember back to the previous article on Gouraud shading, a small adjustment was made on
each line to increase the accuracy. These adjustments also change linearly down each edge of the
polygon. Now, I haven't investigated this throughly (my monitor has blown up, so I can't do any
programming till next week), so I am kind of guessing here. I think that the the adjustment
increases by gradient*deltaX, where deltaX is the change of X down
an edge of the polygon. But, the adjustment should never go above gradient, so if the
adjustment does go above gradient, subtract gradient from it. This should save
you one multiply per scanline per feature.
Now, I haven't actually tested any of this yet, so If anyone tries this and has good / bad results
I would love to hear if any of this works.
|
Cache Considerations
Now, I must admit that I'm really still learning this area, but there are a few things I have
picked up that might help here. I have been reading some documents on optimising code for the
Pentium.
If you can try to keep using the same small bit of memory over and over again, then it's
more efficent because that memory stays in the cache and can be accessed quickly. A few articles
previously, I suggested using a Y-buffer to store the data produced by the scan converter. I
suggested that you have one entry in the buffer for every scanline on the screen. Polygons rendered
in the upper part of the screen would be stored in the upper part of the buffer, and those rendered
lower down would be stored in the lower parts. From a cache point of view, this can be rather
inefficient. If you are rendering thousands of tiny polygons, it would be better if they were
always stored in the upper part of the buffer, and then offset to the screen when they are
actually rendered. This means that the top few lines of the buffer will be used over and over
again, and should be kept in the processor's cache.