02/09/2009 GLM optimization results

GLM transformation functions take, as their first parameter, the matrix to be transformed. They could skip that parameter and simply build a transformation matrix, leaving it to be multiplied with the input matrix afterwards. However, because transformation matrices are mostly filled with zeros, a dedicated implementation that replaces the full matrix product can be a lot more efficient.
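To make the idea concrete, here is a minimal sketch of the two approaches for a translation. It is not GLM's actual code: the `Mat4` type, the column-major layout and the function names `translate_generic`, `translate_dedicated` and `same_result` are assumptions made up for this illustration.

```cpp
#include <array>
#include <cassert>

// Hypothetical column-major 4x4 matrix: m[c][r] is column c, row r.
struct Mat4 {
    std::array<std::array<float, 4>, 4> m{};
};

Mat4 identity() {
    Mat4 r;
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
    return r;
}

// Generic product a * b: 64 multiplications for a 4x4 matrix.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int c = 0; c < 4; ++c)
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) sum += a.m[k][row] * b.m[c][k];
            r.m[c][row] = sum;
        }
    return r;
}

// Approach 1: build a full translation matrix, then do a full product.
Mat4 translate_generic(const Mat4& m, float x, float y, float z) {
    Mat4 t = identity();
    t.m[3] = {x, y, z, 1.0f};
    return multiply(m, t);
}

// Approach 2: skip every term that is known to be multiplied by 0 or 1.
// Only the last column changes, which costs 12 multiplications.
Mat4 translate_dedicated(const Mat4& m, float x, float y, float z) {
    Mat4 r = m;
    for (int row = 0; row < 4; ++row)
        r.m[3][row] = m.m[0][row] * x + m.m[1][row] * y
                    + m.m[2][row] * z + m.m[3][row];
    return r;
}

// Helper for checking that both approaches agree on a given input.
bool same_result(const Mat4& m, float x, float y, float z) {
    return translate_generic(m, x, y, z).m == translate_dedicated(m, x, y, z).m;
}
```

The same reasoning applies to a scale, where each of the first three columns is simply multiplied by one factor. A rotation matrix has fewer zeros, which would be consistent with 'rotate' gaining less than the other two.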

I went through all these optimisations and the results are as expected: 'rotate' went from ~900 cycles to ~675 cycles, 'translate' from ~459 cycles to ~153 cycles and 'scale' from ~432 cycles to ~126 cycles. On a Q6600, using the FPU!

Finally, I wrote the code specifically so that the compiler can easily optimize it with SIMD instructions, but obviously my next step is to write a SIMD version.

Available for the next release!

Copyright © Christophe Riccio 2002-2016 all rights reserved