29/02/2012 February 2012 OpenGL drivers status, update 2

This month, the latest released drivers appear to be Forceware 295.73 for GeForce and Catalyst 12.x preview 8.96 for Radeon.

Catalyst 12.x preview 8.96 looks like a leak, but it is available online and performs relatively well as it fixes a few bugs. Following some discussions about SNORM textures and texture format conversions in OpenGL, I decided to add the following write-up to this status for programmers interested in these nitty-gritty OpenGL details.

Texture format to internal format conversions

This is a follow-up to some discussions that resulted from last month's SNORM texture status.

Typically with OpenGL textures, we live in a convenient world where any texture format data is converted automatically by the implementation to whatever internal format the user requested or, to be precise, to an internal format with at least enough precision to store the requested internal format. For example, a lot of GPUs no longer support the RGB5_A1 format, but this format remains part of the OpenGL 4.2 specification. An implementation may implement this format with RGBA8 instead. This conversion support is so wide that we can even submit RGBA32F data and convert it to DXT1 using glTexImage2D if we wish… I am not saying I can imagine a good reason to do that, but it's possible.
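To make the RGB5_A1 case concrete, here is a minimal sketch of how an implementation could widen such a texel to RGBA8. It assumes the GL_UNSIGNED_SHORT_5_5_5_1 packing (red in the top five bits, alpha in the lowest bit) and rounding to nearest; the function names are hypothetical and this is not a description of what any particular driver does.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Expand an n-bit unsigned normalized component to 8 bits with rounding:
// value / (2^n - 1) * 255. (Hypothetical helper, for illustration only.)
inline std::uint8_t expand_unorm_to_8(std::uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 8);
    const std::uint32_t max_in = (1u << bits) - 1u;
    assert(value <= max_in);
    return static_cast<std::uint8_t>((value * 255u + max_in / 2u) / max_in);
}

// Unpack one GL_UNSIGNED_SHORT_5_5_5_1 texel (R in bits 15..11, G in 10..6,
// B in 5..1, A in bit 0) into four 8-bit components.
inline std::array<std::uint8_t, 4> rgb5a1_to_rgba8(std::uint16_t texel)
{
    return {
        expand_unorm_to_8((texel >> 11) & 0x1Fu, 5),
        expand_unorm_to_8((texel >> 6) & 0x1Fu, 5),
        expand_unorm_to_8((texel >> 1) & 0x1Fu, 5),
        expand_unorm_to_8(texel & 0x1u, 1)
    };
}
```

Every 5-bit value maps to a distinct 8-bit value, which is why RGBA8 has enough precision to stand in for RGB5_A1.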

With the release of EXT_texture_integer, it seems that all these conversions did not put the authors of this extension in a good mood: the extension prevents any kind of conversion by forcing an application that wants to use an integer texture to use one of the following formats, with no format conversion possible.

  • GL_RED_INTEGER
  • GL_GREEN_INTEGER
  • GL_BLUE_INTEGER
  • GL_ALPHA_INTEGER
  • GL_RGB_INTEGER
  • GL_RGBA_INTEGER
  • GL_BGR_INTEGER
  • GL_BGRA_INTEGER
  • GL_LUMINANCE_INTEGER (compatibility profile)
  • GL_LUMINANCE_ALPHA_INTEGER (compatibility profile)
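The rule behind this list can be sketched as a small predicate: an integer internal format only accepts one of the *_INTEGER formats, and a non-integer internal format only accepts the non-integer ones; as I read the core specification, a mismatch generates GL_INVALID_OPERATION. The enums and function names below are hypothetical stand-ins for the real GL tokens so that the sketch compiles without an OpenGL header.

```cpp
// Hypothetical stand-ins for a few OpenGL pixel formats.
enum class PixelFormat
{
    Red, RGB, RGBA, BGRA,
    RedInteger, RGBInteger, RGBAInteger, BGRAInteger
};

// Hypothetical classification of an internal format (e.g. GL_RGBA8 is
// Normalized, GL_RGBA32F is Float, GL_RGBA8UI is Integer).
enum class InternalKind { Normalized, Float, Integer };

constexpr bool is_integer_format(PixelFormat format)
{
    return format == PixelFormat::RedInteger
        || format == PixelFormat::RGBInteger
        || format == PixelFormat::RGBAInteger
        || format == PixelFormat::BGRAInteger;
}

// True when the upload is accepted; false models GL_INVALID_OPERATION.
constexpr bool upload_allowed(InternalKind internal, PixelFormat format)
{
    return (internal == InternalKind::Integer) == is_integer_format(format);
}

static_assert(upload_allowed(InternalKind::Integer, PixelFormat::RGBAInteger),
              "integer data into an integer texture is accepted");
static_assert(!upload_allowed(InternalKind::Integer, PixelFormat::RGBA),
              "normalized data into an integer texture is rejected");
```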

In parallel to this extension, the EXT_texture_snorm extension was released, providing SNORM textures (textures normalized between -1.0 and 1.0 instead of 0.0 and 1.0). This extension, written against OpenGL 3.0, followed the same precedent as any other OpenGL core texture, which means that we could convert any texture format to a SNORM texture internal format.
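As an illustration of the two ranges, here is a minimal sketch of how 8-bit components decode to floating point. It assumes the symmetric signed mapping max(c / 127, -1.0), which is how I read the OpenGL 4.2 rule; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Decode an 8-bit signed normalized component to [-1.0, 1.0].
// Both -128 and -127 decode to -1.0 under this mapping.
inline float snorm8_to_float(std::int8_t c)
{
    return std::max(static_cast<float>(c) / 127.0f, -1.0f);
}

// Decode an 8-bit unsigned normalized component to [0.0, 1.0].
inline float unorm8_to_float(std::uint8_t c)
{
    return static_cast<float>(c) / 255.0f;
}
```

With this mapping, 0 decodes exactly to 0.0 for both kinds of textures, which is convenient for data such as normals or signed offsets.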

Unfortunately, and probably because OpenGL tends to build up more exceptions than French grammar does, the OpenGL core specification (from 3.1 onward) allows conversion of only signed format data (GL_BYTE, GL_SHORT, GL_INT) to SNORM textures.

What do implementations do? The AMD implementation exposes EXT_texture_snorm, so we should expect to be able to create a SNORM texture from unsigned integer data. Unfortunately, and this is the result of the status, the resulting texture isn't correct. On the NVIDIA side, EXT_texture_snorm is not exposed, so we should get an OpenGL error when trying to convert unsigned format data to a SNORM texture. However, this passes and creates a functional texture.

Create a classic OpenGL UNORM texture:
  • glGenTextures(1, &TextureName);
  • glBindTexture(GL_TEXTURE_2D, TextureName);
  • // Memory allocation
  • glTexStorage2D(GL_TEXTURE_2D, Levels, GL_RGBA8, Width, Height);
  • // Conversion and memory copy
  • glTexSubImage2D(GL_TEXTURE_2D, Level, 0, 0, Width, Height, GL_RGBA, GL_UNSIGNED_BYTE, Data);
Create an integer texture:
  • glGenTextures(1, &TextureName);
  • glBindTexture(GL_TEXTURE_2D, TextureName);
  • // Memory allocation
  • glTexStorage2D(GL_TEXTURE_2D, Levels, GL_RGBA8UI, Width, Height);
  • // Conversion and memory copy
  • glTexSubImage2D(GL_TEXTURE_2D, Level, 0, 0, Width, Height, GL_RGBA_INTEGER, GL_UNSIGNED_BYTE, Data);
Create a SNORM texture:
  • glGenTextures(1, &TextureName);
  • glBindTexture(GL_TEXTURE_2D, TextureName);
  • // Memory allocation
  • glTexStorage2D(GL_TEXTURE_2D, Levels, GL_RGBA8_SNORM, Width, Height);
  • // Conversion and memory copy
  • glTexSubImage2D(GL_TEXTURE_2D, Level, 0, 0, Width, Height, GL_RGBA, GL_BYTE, Data);

04/03/2012 UPDATE

I received some feedback, and I am taking advantage of this quick update to comment on it and update the drivers status.

Conversion of SNORM texture data

New evidence shows that I was wrong about SNORM textures: the OpenGL specification allows converting any data to SNORM textures.

"Data are taken from the currently bound pixel unpack buffer or client memory as a sequence of signed or unsigned bytes (GL data types byte and ubyte), signed or unsigned short integers (GL data types short and ushort), signed or unsigned integers (GL data types int and uint), or floating point values (GL data types half and float). These elements are grouped into sets of one, two, three, or four values, depending on the format, to form a group. Table 3.3 summarizes the format of groups obtained from memory; it also indicates those formats that yield indices and those that yield floating-point or integer components." OpenGL 4.2 specification, section 3.7.2 Transfer of Pixel Rectangles - Unpacking
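If unsigned data is a legal source for a SNORM texture, the expected result can be sketched as follows for GL_UNSIGNED_BYTE data uploaded to an 8-bit SNORM internal format: each byte is first normalized to [0.0, 1.0], then clamped to [-1.0, 1.0] and encoded on 8 signed bits. This is my reading of the pixel transfer path, with rounding to nearest assumed; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Step 1: unsigned byte source data is converted to floating point in [0, 1].
inline float ubyte_to_float(std::uint8_t c)
{
    return static_cast<float>(c) / 255.0f;
}

// Step 2: the float is clamped to [-1, 1] and encoded as signed normalized.
inline std::int8_t float_to_snorm8(float f)
{
    const float clamped = std::min(std::max(f, -1.0f), 1.0f);
    return static_cast<std::int8_t>(std::lround(clamped * 127.0f));
}

// Full path for one component: unsigned byte in, SNORM8 storage value out.
inline std::int8_t ubyte_to_snorm8(std::uint8_t c)
{
    return float_to_snorm8(ubyte_to_float(c));
}
```

Under these assumptions 255 is stored as 127 (1.0) and 128 as 64 (about 0.5), so unsigned source data only ever fills the non-negative half of the SNORM range. A texture whose bytes were copied without this step would read back different values.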

gl-410-fbo-layered broken on NVIDIA?

I didn't really take the time to get to the bottom of this, but it looks like there is a problem: the sample displays a black screen and no OpenGL error is generated, yet the sample still works on AMD. I can't exclude that I did something wrong.

Vertex inputs and structures

The GLSL 4.20 specification clearly states that vertex inputs can't be structures, but the NVIDIA implementation supports them. Even if the GLSL compiler should generate an error, it sounds like a good idea, and it doesn't look like it's an issue for the enumeration API.

"Vertex shader inputs can also form arrays of these types, but not structures." GLSL 4.20.11 specification, section 4.3.4 Input Variables

Implicit conversions

I can't say I really understand GLSL implicit conversions. If it were up to me, they would all generate a GLSL error, and I think an implementation should at least generate a warning for each, as a C++ compiler would. GLSL defines a clear list of allowed conversions in section "4.1.10 Implicit Conversions". In this list, the allowed conversions are always between types with the same number of components. It seems that in some cases the AMD implementation allows implicit conversions between vectors of different sizes.
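The rule can be summarized in a few lines. The sketch below only covers scalars and vectors (not matrices) and assumes the GLSL 4.20 table, where int converts to uint, float, or double, uint to float or double, and float to double, always with the same number of components. The type and function names are hypothetical.

```cpp
// Base types ordered so that a conversion is allowed only "upward".
enum class Base { Int = 0, Uint = 1, Float = 2, Double = 3 };

struct GlslType
{
    Base base;
    int components; // 1 for scalars, 2 to 4 for vectors
};

constexpr int base_rank(Base base)
{
    return static_cast<int>(base);
}

// True when `from` may be implicitly converted to `to`: same component
// count and a strictly wider base type. Identical types need no conversion.
constexpr bool implicitly_convertible(GlslType from, GlslType to)
{
    return from.components == to.components
        && base_rank(from.base) < base_rank(to.base);
}
```

For example, ivec3 to vec3 is accepted by this predicate, while vec4 to vec3 is rejected, which is the kind of conversion that should produce a compiler error.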

If you have feedback on this, please don't hesitate to drop me a mail.

  • White: Unsupported or untested.
  • Blue: The sample works but it doesn't follow the OpenGL specification.
  • Green: The sample works following the OpenGL specification.
  • Orange: The sample doesn't work correctly but a workaround is possible.
  • Red: The sample doesn't work and I haven't found any workaround.
  • Black: Really disturbing problem!

Once again, don't forget to contribute to the OpenGL community by reporting your bugs!

The test

These tests were run on Windows 7 64-bit with the OpenGL Samples Pack 4.2.4 branch, still in development, on a GeForce GTX 470 and a Radeon HD 5850.

05/03/2012 UPDATE 2

I think that OpenGL 4.2 is so much better than OpenGL 4.1 because it completely clarifies the interface matching between shader stages. However, I discovered a new corner case: with linked programs, if a built-in block is declared in the vertex shader stage but the next shader stage doesn't declare it, the result is undefined and leads either to a silent error (NVIDIA implementation) or to a sort of luck (AMD implementation, where it works).

OpenGL Samples Pack 4.2.4 wip, OpenGL specification tests
Drivers: AMD Catalyst 12.2 preview, 8.94 (25/01/2012) | AMD Catalyst 12.x preview, 8.96 (14/02/2012) | NVIDIA Forceware 290.53 (22/12/2011) | NVIDIA Forceware 295.73 (22/02/2012)
420-transform-feedback-instanced
420-texture-storage: Allows an implicit cast on texture coordinates parameter
420-texture-pixel-store
420-texture-conversion: Immutable texture and BC7 conversions are not working | Immutable texture and BC7 conversions are not working
420-texture-compressed
420-test-depth-conservative
420-sampler-fetch
420-memory-barrier
420-interface-matching: glGetAttribLocation fails to return the location here | Structure for vertex inputs supported
420-image-unpack: Unpack isn't correct?
420-image-store: glClear is skipped for the first frame
420-image-load
420-fbo-layered: If a vertex shader declares a built-in block and the geometry shader doesn't, the result is undefined. | If a vertex shader declares a built-in block and the geometry shader doesn't, the result is undefined.
420-draw-base-instance
420-direct-state-access-ext: Unsupported DSA storage functions | Unsupported DSA storage functions
420-buffer-uniform
420-atomic-counter
410-program-separate-dsa-ext
410-program-binary: May crash if the binary is not AMD's | May crash if the binary is not AMD's
410-program-64
410-primitive-tessellation-5: Bug on the shader interface matching: Block member not active with linked separated program | Bug on the shader interface matching: Block member not active with linked separated program
410-primitive-tessellation-2
410-primitive-instanced
400-transform-feedback-stream
400-transform-feedback-object: EXT_transform_feedback extension string missing | EXT_transform_feedback extension string missing
400-texture-buffer-rgb
400-sampler-gather
400-sampler-fetch
400-sampler-array
400-program-varying-structs
400-program-varying-blocks
400-program-subroutine
400-program-64
400-primitive-tessellation
400-primitive-smooth-shading
400-primitive-instanced
400-fbo-rtt-texture-array
400-fbo-rtt
400-fbo-multisample
400-fbo-layered
400-draw-indirect
420-debug-output: DebugControl doesn't work, null-terminated strings generate errors | DebugControl doesn't work, null-terminated strings generate errors
400-blend-rtt
330-transform-feedback-separated
330-transform-feedback-interleaved
330-texture-pixel-store
330-texture-format: SNORM conversion not performed | SNORM conversion not performed | EXT_texture_snorm string missing | EXT_texture_snorm string missing
330-primitive-point-sprite: Pop free clipping | Pop free clipping
330-fbo-srgb
330-draw-without-vertex-attrib
330-buffer-type: i32 vertex input data not supported
OpenGL Samples Pack 4.2.4-wip, proprietary features
Drivers: AMD Catalyst 12.2 preview, 8.94 (25/01/2012) | AMD Catalyst 12.x preview, 8.96 (14/02/2012) | NVIDIA Forceware 290.53 (22/12/2011) | NVIDIA Forceware 295.73 (22/02/2012)
420-texture-copy-nv
420-test-depth-clamp-separate-amd: AMD_depth_clamp_separate not supported | AMD_depth_clamp_separate not supported
420-primitive-bindless-nv: NV_shader_buffer_load not supported | NV_shader_buffer_load not supported
420-fbo-srgb-decode-ext: EXT_texture_sRGB_decode not supported | EXT_texture_sRGB_decode not supported
420-fbo-multisample-position-amd: AMD_sample_positions not supported | AMD_sample_positions not supported
420-fbo-multisample-dsa-nv: NV_texture_multisample not supported | NV_texture_multisample not supported
420-draw-indirect-amd: AMD_multi_draw_indirect not supported | AMD_multi_draw_indirect not supported
420-buffer-pinned-amd: AMD_pinned_memory not supported | AMD_pinned_memory not supported
420-buffer-barrier-gtc: Works as desired | Works as desired | Generates an invalid operation as specified | Generates an invalid operation as specified
420-blend-op-amd: This is a Radeon 6900+ series feature | This is a Radeon 6900+ series feature
330-fbo-multisample-explicit-nv
Copyright Christophe Riccio 2002-2016 all rights reserved