30/11/2011 November 2011 OpenGL drivers status

Summary of the month

Unfortunately, AMD didn't release new OpenGL 4.2 drivers this month: Catalyst 11.11 is an OpenGL 4.1 driver. I really hope we will get something fresh next month!

The changes recorded this month are the result of the newly released OpenGL Samples Pack 4.2.2.0, which has been tested with both last month's and this month's drivers. Besides a new bug highlighted by the new 420-image-store sample on NVIDIA, the main thing to notice is that the 420-buffer-uniform sample now passes successfully on AMD. However, if I am still using the same AMD drivers as last month, how is this possible? Well, I changed the sample: it now indexes the block array with a "dynamically uniform expression" instead of a "general integer expression".

Dynamically Uniform Expressions vs General Integer Expressions

"A fragment-shader expression is dynamically uniform if all fragments evaluating it get the same resulting value. When loops are involved, this refers to the expression's value for the same loop iteration. When functions are involved, this refers to calls from the same call point. This is similarly defined for other shader stages, based on the per-instance data they process. Note that constant expressions are trivially dynamically uniform. It follows that typical loop counters based on these are also dynamically uniform."
GLSL 4.20.8 specification, section 3.8.3 Dynamically Uniform Expressions

This definition raises a question: do dynamically uniform expressions only apply to the fragment shader stage? That doesn't make much sense to me, because the concept deals with the coherence between different fragment shader executions, and this coherence could apply to other stages as well.
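To make the quoted definition more concrete, here is a minimal C++ sketch that models it on the CPU. It is only an illustration: the Invocation structure, its members and the isDynamicallyUniform function are hypothetical names of mine, not part of OpenGL or GLSL, and the model assumes that "dynamically uniform" simply means "every invocation of the group computes the same value".

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical model of the inputs a single shader invocation can see.
struct Invocation
{
    int drawConstant; // stands in for a uniform variable: one value per draw call
    int instanceId;   // stands in for a per-instance value such as gl_InstanceID
};

using IndexExpression = std::function<int(const Invocation&)>;

// Returns true when every invocation evaluates the expression to the same value.
bool isDynamicallyUniform(const std::vector<Invocation>& invocations,
                          const IndexExpression& expression)
{
    if (invocations.empty())
        return true;

    const int reference = expression(invocations.front());
    return std::all_of(invocations.begin(), invocations.end(),
        [&](const Invocation& invocation) {
            return expression(invocation) == reference;
        });
}

// Usage: four invocations spread over two instances of the same draw call.
void usageExample()
{
    const std::vector<Invocation> draw = {{3, 0}, {3, 0}, {3, 1}, {3, 1}};

    // An index read from a uniform is the same everywhere.
    assert(isDynamicallyUniform(draw,
        [](const Invocation& i) { return i.drawConstant; }));

    // An index that changes with the instance is not.
    assert(!isDynamicallyUniform(draw,
        [](const Invocation& i) { return i.instanceId; }));
}
```

In this model an index built from a constant or a uniform qualifies, while an index that depends on the instance does not as soon as the group spans more than one instance.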

The other question is obviously: where do dynamically uniform expressions apply? A little more investigation tells us that they apply to sampler arrays, which led me to include the 400-sampler-array-gtc sample as a feature request more than a year ago. The specification also clearly specifies the indexing of atomic counter arrays, as quoted below.

"When aggregated into arrays within a shader, samplers can only be indexed with a dynamically uniform integral expression, otherwise results are undefined."
GLSL 4.20.8 specification, section 4.1.7.1 Samplers

"When aggregated into arrays within a shader, atomic counters can only be indexed with a dynamically uniform integral expression, otherwise results are undefined."
GLSL 4.20.8 specification, section 4.1.7.3 Atomic Counters

However, in GLSL there is another opaque type that can be declared as an array: images. Interestingly, the behaviour is not consistent here, yet it seems really hard to imagine that samplers and images would follow radically different memory access coherency requirements.

"When aggregated into arrays within a shader, images can be indexed with general integer expressions."
GLSL 4.20.8 specification, section 4.1.7.2 Images

Following this path, we reach another type which involves memory access and can be declared as an array: uniform blocks.

"Any integral expression can be used to index a uniform block array, as per section 4.1.9 "Arrays"."
GLSL 4.20.8 specification, section 4.3.7 Interface Blocks

This leads us to the explanation of why the AMD implementation has never passed the uniform buffer array samples of the OpenGL Samples Pack: it is likely a hardware limitation that the OpenGL specification hasn't addressed correctly and uniformly.

This analysis concludes that NVIDIA OpenGL 4 hardware supports "general integer expression" indexing of uniform block, sampler, image and atomic counter arrays, but AMD OpenGL 4 hardware to date doesn't. As a feature request, I am not really interested in "general integer expression" indexing of such arrays, but it could be good enough to limit the coherence constraint to the gl_InstanceID rate... maybe on future hardware, but I am afraid that geometry shader instancing prevents such an intermediate strategy.

As a general recommendation for OpenGL programmers, using "dynamically uniform expressions" to index uniform block, sampler, image and atomic counter arrays will keep us safe from running into cross-platform problems.
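As an illustration of this recommendation, the following C++ sketch holds two variants of a GLSL 4.20 vertex shader as strings. The shaders are hypothetical and are not the ones shipped in the OpenGL Samples Pack: the block name transform, the Transform array, the TransformIndex uniform and the C++ names are all assumptions of mine. The first variant indexes the uniform block array with a uniform, which is the same for every invocation of a draw call. The second indexes it with gl_InstanceID, which I would treat as a "general integer expression" under the definition quoted above.

```cpp
#include <cassert>
#include <string>

// Declarations shared by both hypothetical shaders.
const std::string kDeclarations = R"GLSL(#version 420 core

layout(binding = 0) uniform transform
{
    mat4 MVP;
} Transform[2];

uniform int TransformIndex; // set once per draw call by the application

layout(location = 0) in vec4 Position;
)GLSL";

// Portable variant: the index comes from a uniform variable.
const std::string kPortableMain = R"GLSL(
void main()
{
    gl_Position = Transform[TransformIndex].MVP * Position;
}
)GLSL";

// Risky variant: the index changes with the instance being drawn.
// Assumes the draw call never uses more than 2 instances.
const std::string kRiskyMain = R"GLSL(
void main()
{
    gl_Position = Transform[gl_InstanceID].MVP * Position;
}
)GLSL";

std::string portableVertexShader() { return kDeclarations + kPortableMain; }
std::string riskyVertexShader()    { return kDeclarations + kRiskyMain; }

// Usage: both variants share the declarations and differ only in the index.
void usageExample()
{
    assert(portableVertexShader().find("Transform[TransformIndex]") != std::string::npos);
    assert(riskyVertexShader().find("Transform[gl_InstanceID]") != std::string::npos);
}
```

With the portable variant, selecting a different block means setting TransformIndex and issuing another draw call, which costs more draw calls but stays within what both vendors are reported to handle in this article.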

  • White: Unsupported or untested.
  • Blue: The sample works but it doesn't follow the OpenGL specification.
  • Green: The sample works following the OpenGL specification.
  • Orange: The sample doesn't work correctly but a workaround is possible.
  • Red: The sample doesn't work and I haven't found any workaround.
  • Black: Really disturbing problem!

The test

These tests have been done on Windows 7 64-bit with the OpenGL Samples Pack 4.2.2.0 on a GeForce GTX 470 and a Radeon HD 5850.

OpenGL Samples Pack 4.2.2.0, OpenGL specification tests
Drivers: AMD Catalyst 11.10 preview 3 (16/10/2011); NVIDIA Forceware 285.62 (25/10/2011); NVIDIA Forceware 290.36 (28/11/2011)
420-transform-feedback-instanced: Can't read back built-in variables. max_vertices affects the alignment in the transform feedback buffer
420-texture-storage
420-texture-pixel-store
420-texture-compressed: Texture storage with BPTC generates invalid operation errors
420-test-depth-conservative
420-sampler-fetch
420-memory-barrier
420-image-unpack: Unpack isn't correct?
420-image-store: Scissor test dysfunctional? | Scissor test dysfunctional?
420-image-load
420-draw-base-instance
420-direct-state-access-ext: Unsupported DSA storage functions
420-buffer-uniform
420-atomic-counter: glMapBufferRange on atomic counter fails
410-program-separate-dsa-ext
410-program-binary
410-program-64
410-primitive-tessellation-5: Bug on the shader interface matching: Block member not active with linked separated program
410-primitive-tessellation-2
410-primitive-instanced
410-fbo-layered
400-transform-feedback-stream: max_vertices affects the alignment in the transform feedback buffer
400-transform-feedback-object
400-texture-buffer-rgb
400-sampler-gather
400-sampler-fetch
400-sampler-array
400-program-varying-structs
400-program-varying-blocks
400-program-subroutine
400-program-64
400-primitive-tessellation
400-primitive-smooth-shading
400-primitive-instanced
400-fbo-rtt-texture-array
400-fbo-rtt
400-fbo-multisample
400-fbo-layered
400-draw-indirect
400-blend-rtt
330-texture-pixel-store
330-transform-feedback-separated
330-transform-feedback-interleaved
330-primitive-point-sprite: Pop free clipping | Pop free clipping
330-fbo-srgb
330-error-sampler-offset
330-draw-without-vertex-array
330-buffer-type: i32 vertex input data not supported

OpenGL Samples Pack 4.2.2.0, proprietary features
Drivers: AMD Catalyst 11.10 preview 3 (16/10/2011); NVIDIA Forceware 285.62 (25/10/2011); NVIDIA Forceware 290.36 (28/11/2011)
420-texture-copy-nv: NV_copy_image not supported
420-primitive-bindless-nv: NV_shader_buffer_load not supported
420-fbo-multisample-position-amd: AMD_sample_positions not supported | AMD_sample_positions not supported
420-fbo-multisample-dsa-nv: NV_texture_multisample not supported
420-draw-indirect-amd: AMD_multi_draw_indirect not supported | AMD_multi_draw_indirect not supported
420-test-depth-clamp-separate-amd: AMD_depth_clamp_separate not supported | AMD_depth_clamp_separate not supported

OpenGL Samples Pack 4.2.2.0, specification bugs workaround
Drivers: AMD Catalyst 11.10 preview 3 (16/10/2011); NVIDIA Forceware 285.62 (25/10/2011); NVIDIA Forceware 290.36 (28/11/2011)
420-glsl-interface-matching-array-gtc: Can write a valid vertex shader output with no valid geometry shader input possible | Can write a valid vertex shader output with no valid geometry shader input possible | Can write a valid vertex shader output with no valid geometry shader input possible
400-sampler-array-gtc: No workaround for this specification bug | Allows dynamic indexing of the sampler array | Allows dynamic indexing of the sampler array
330-draw-instanced-array-dsa-gtc: No workaround for this specification bug | No workaround for this specification bug | No workaround for this specification bug
Copyright Christophe Riccio 2002-2013 all rights reserved