19/04/2010 OpenGL 3.4 / 4.1: Expectations / Wish-list (DSA!)

This is the last post of my series following the GDC 2010 event: the OpenGL 3.3 and OpenGL 4.0 release! This article is so long that I decided to split it into 4 posts. I could actually have made it longer, as I didn't even talk about all the features I read or thought about. Let's say it's just a quick introduction to the future of OpenGL!

  • 1 - OpenGL 3.3 review
  • 2 - OpenGL 4.0 review
  • 3 - OpenGL 3.4 / 4.1: Expectations / Wish-list (<- here we are!)

Direct State Access

Let's get straight to the point: Direct State Access (DSA) is the feature most wanted by OpenGL developers for OpenGL 3.4 and OpenGL 4.1. When OpenGL 3.0 was released amid a terrible (and well deserved!) controversy, the only ray of light came from the DSA extension.

DSA is an alternative to the traditional 'bind and edit' way of OpenGL. How could someone like me, who values traditions, want to get rid of an old concept? Because 'bind and edit' is a pain for software design, can be inefficient (especially with multithreaded drivers), and makes it really hard to port OpenGL software to Direct3D when the port hasn't been planned from the ground up.

When I write code, I find it a great practice to design an API that tells me how to use it even without documentation. This is hardly possible, but I see it as the optimal API, a direction I try to follow. 'Bind and edit' is nowhere close to this idea, especially because it is decorated with 'selectors' like glActiveTexture:

  • The matrix mode (deprecated)
  • The current bound texture for each supported texture target
  • The active texture
  • The active client texture
  • The current bound program for each supported program target
  • The current bound buffer for each supported buffer target
  • The current GLSL program
  • The current framebuffer object
  • The current VAO object (OpenGL 3.0, tricky with immutable / partly mutable / fully mutable uses)
  • The current bound framebuffer object for draw and read. (OpenGL 3.0)
  • The current bound renderbuffer object. (OpenGL 3.0)
  • The current bound sampler for each supported texture target (OpenGL 3.3)
  • The current transform feedback object (OpenGL 4.0)

Chances are that the number of selectors will increase in the future (immutable state object? blend object?). The 'bind and edit' model raises the question of object use cases (editing objects / issuing draw calls). Do we really need to affect the objects bound for the draw call when we want to edit an object? How do we want to manage the current states of the renderer?

At draw call time we just want to draw, but with the 'bind and edit' model we might need to check the current states just to make sure that the editing side of the software didn't disturb one state or another. We might end up checking everything, which is a massive CPU overhead. Worse! To do this state checking, a developer might think of using glGet* functions to query the current state, but this is a really bad idea for efficiency.
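A common way to avoid glGet* for this kind of check is to shadow the bindings on the CPU side. The following is only a minimal sketch under stated assumptions: TextureBindingCache, its unit count and the bind callback (standing in for glActiveTexture + glBindTexture on a real context) are hypothetical names, not part of OpenGL, and the cache is only valid if every bind in the software goes through it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper, not part of OpenGL: remembers which texture name is
// bound to each unit so the renderer never has to ask the driver.
class TextureBindingCache
{
public:
    static constexpr std::size_t UnitCount = 8; // assumed unit count for the sketch

    // BindFn stands in for glActiveTexture + glBindTexture on a real context.
    using BindFn = std::function<void(std::size_t, unsigned)>;

    explicit TextureBindingCache(BindFn Bind)
        : Bind(std::move(Bind))
    {
        Bound.fill(0); // 0 is the default texture name
    }

    // Returns true when a real bind was issued, false when it was filtered out.
    bool bindTexture(std::size_t Unit, unsigned Texture)
    {
        assert(Unit < UnitCount);
        if(Bound[Unit] == Texture)
            return false;
        Bind(Unit, Texture);
        Bound[Unit] = Texture;
        return true;
    }

    // CPU-side answer to "what is bound on this unit?", no driver round trip.
    unsigned boundTexture(std::size_t Unit) const
    {
        assert(Unit < UnitCount);
        return Bound[Unit];
    }

private:
    BindFn Bind;
    std::array<unsigned, UnitCount> Bound;
};
```

This kind of bookkeeping breaks as soon as one piece of code binds behind the cache's back, which is exactly the fragility of 'bind and edit'.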

The way I decided to solve this problem was to use macro state objects (sets of states with similar semantics), assign a unique identifier to each, and compile the states into an OpenGL display list. When I bind the macro object, I just execute the display list, which changes all the states and probably more than what I need. Simple and efficient (on nVidia; I never ran tests on AMD). However, display lists are 'deprecated'...

I am lucky. I wrote a multithreaded OpenGL renderer where one thread was dedicated to OpenGL and used to process all the commands sent by as many threads as possible, each of which already had a lot to do: the application with user event handling, procedural geometry updates, procedural texture generation, etc. One thread to feed them all.

The design allows, with one frame of latency and a reduced cost (a single mutex per frame to swap the message sender / recipient lists), the message manager (which carries messages from all the threads to the OpenGL thread) to handle this scenario, maximizing macro task multithreading. One problem with this design was the management of feedback from the OpenGL thread to the other threads. To be safe, all communication between threads had to go through the message manager. With this design, if you need to read a single value, you have to go through the message system and wait until the message is processed (up to 1 frame of latency!) before getting the result... We need to stall the rendering pipeline for 1 frame!
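To make the sender / recipient swap more concrete, here is a rough sketch of such a message manager. It is not the original renderer's code: MessageManager, post and processFrame are hypothetical names, and for simplicity this version also takes the mutex in post, which a real design with one list per producer thread would presumably avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical message manager: producer threads append commands to the
// 'sender' list; once per frame the OpenGL thread swaps the two lists and
// executes everything found in the 'recipient' list.
class MessageManager
{
public:
    using Command = std::function<void()>;

    // Called from any producer thread.
    void post(Command Cmd)
    {
        assert(Cmd && "empty command");
        std::lock_guard<std::mutex> Lock(Mutex);
        Sender.push_back(std::move(Cmd));
    }

    // Called once per frame from the OpenGL thread.
    // Returns the number of commands executed this frame.
    std::size_t processFrame()
    {
        Recipient.clear();
        {
            // The only synchronization point of the frame for the OpenGL thread.
            std::lock_guard<std::mutex> Lock(Mutex);
            Sender.swap(Recipient);
        }
        for(Command& Cmd : Recipient)
            Cmd();
        return Recipient.size();
    }

private:
    std::mutex Mutex;
    std::vector<Command> Sender;
    std::vector<Command> Recipient;
};
```

A read-back has to be posted like any other command, so its result only becomes available after the next processFrame call: that is the one frame of latency described above.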

I'm sure AMD and nVidia have designed clever solutions to hide this multithreading latency, but somewhere it remains true that glGet* isn't an optimal scenario.

Old-fashioned state checking, with possible underlying bi-directional thread communication:
  • void setTexture2DParameter(GLuint Texture, GLenum Target, GLenum Pname, GLint Param)
  • {
  • // glGetIntegerv writes a GLint, so the saved name is read as GLint.
  • GLint SavedTexture2D = 0;
  • glGetIntegerv(GL_TEXTURE_BINDING_2D, &SavedTexture2D);
  • // SavedTexture2D could be the same texture object as Texture...
  • if(GLuint(SavedTexture2D) != Texture)
  • glBindTexture(GL_TEXTURE_2D, Texture);
  • glTexParameteri(GL_TEXTURE_2D, Pname, Param);
  • // Restore the previous binding only if we changed it.
  • if(GLuint(SavedTexture2D) != Texture)
  • glBindTexture(GL_TEXTURE_2D, GLuint(SavedTexture2D));
  • }

We really don't want to call this setTexture2DParameter function, yet in real and complex software it's quite tempting from time to time to write that kind of hack... Moreover, we are changing the binding of the currently active 'texture unit' without actually knowing which unit it is. Even if the current texture object is restored, from a debugging perspective... it's odd!

The DSA way, no state checking required, with possible underlying one-direction thread communication:
  • // We directly change a texture object without affecting the bound texture object
  • // which could actually be the bound texture object.
  • glTextureParameteriEXT(Texture, Target, Pname, Param);

Nice isn't it? With DSA, it's nice and handy.

I took textures as an example because OpenGL 3.3 introduces the first DSA object in the API: the sampler object.

Creation and use of a sampler object:
  • glGenSamplers(1, &this->SamplerName);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
  • glSamplerParameterfv(this->SamplerName, GL_TEXTURE_BORDER_COLOR, &glm::vec4(0.0f)[0]);
  • glSamplerParameterf(this->SamplerName, GL_TEXTURE_MIN_LOD, -1000.f);
  • glSamplerParameterf(this->SamplerName, GL_TEXTURE_MAX_LOD, 1000.f);
  • glSamplerParameterf(this->SamplerName, GL_TEXTURE_LOD_BIAS, 0.0f);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_COMPARE_MODE, GL_NONE);
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL);
  • glSamplerParameterf(this->SamplerName, GL_TEXTURE_MAX_ANISOTROPY_EXT, 16.f);
  • ...
  • // Doesn't affect any texture object but used by texture unit 0.
  • glBindSampler(0, this->SamplerName);

Now: scenario! This sampler is used by a chunked terrain renderer and, for some reason on a given configuration, the software is too slow and we want to reduce the texture filtering quality. The sampler object is shared and already used by a set of chunks. We don't have to mess with the texture unit where it is bound, nor with the currently bound sampler. 'Edit' and 'draw' cases are independent. Bonus of the new sampler object: we don't have to browse all the chunks to find the ones that use this sampler!

Update a sampler object:
  • glSamplerParameteri(this->SamplerName, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
  • glSamplerParameterf(this->SamplerName, GL_TEXTURE_MAX_ANISOTROPY_EXT, 1.f);

Sampler objects, even if we could complain about some things, succeed as the first DSA API and call for more DSA.

From time to time during development, selectors become bug prone. What if glActiveTexture(GL_TEXTURE0) is unexpectedly called somewhere in the code without the developers knowing about it? It can mess up the whole rendered image without any OpenGL error. This is why I never call glBindTexture without glActiveTexture on the line just before: to be sure that I am binding the texture where I expect to bind it. With DSA, we would simply have a function like glBindNamedTexture(uint unit, enum target, uint texture) which guarantees the correct behavior. On top of this, the API itself tells us how to use it: wonderful, and no risk that some developers wouldn't be aware of the glActiveTexture selector or simply forget it by mistake.
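The selector hazard can be shown with a toy model. This is deliberately not real OpenGL: ToyContext and its members are hypothetical and only mimic how the glActiveTexture selector silently redirects glBindTexture, compared with a glBindNamedTexture-style call (itself only a wish, not an existing function) where the unit is explicit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model of texture units, NOT real OpenGL.
struct ToyContext
{
    static constexpr std::size_t UnitCount = 4; // assumed unit count for the sketch

    std::size_t ActiveUnit = 0;                 // the hidden selector
    std::array<unsigned, UnitCount> Bound{};    // texture name bound on each unit

    // Mimics glActiveTexture(GL_TEXTURE0 + Unit).
    void activeTexture(std::size_t Unit)
    {
        assert(Unit < UnitCount);
        ActiveUnit = Unit;
    }

    // Mimics glBindTexture: the destination depends on the selector.
    void bindTexture(unsigned Texture)
    {
        Bound[ActiveUnit] = Texture;
    }

    // Mimics the wished-for glBindNamedTexture(unit, target, texture):
    // the destination is explicit and the selector is left untouched.
    void bindNamedTexture(std::size_t Unit, unsigned Texture)
    {
        assert(Unit < UnitCount);
        Bound[Unit] = Texture;
    }
};
```

If some unrelated code calls activeTexture(2) and we then call bindTexture(7) meaning unit 0, the texture lands on unit 2 and nothing reports an error; bindNamedTexture(0, 7) cannot go wrong that way.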

When generalized to the whole API, we notice that the DSA API tells you more about how to use it: which functions need to be called before a given function, each function calling for more functions to be called before itself. In this way, DSA is a better self-documented API.

The current GL_EXT_direct_state_access extension has a few parts I don't really like. For example, it doesn't only allow editing objects, it also allows editing binding point values, like the glMultiTexParameterivEXT function which edits a texture unit binding point. It feels to me that this is messing around with the 'draw' case and, for debugging purposes, I prefer to check at the object level rather than at the level of each parameter value.

The idea of objects that are immutable at draw time feels more reliable at the software development level and maybe more efficient at the driver development level. It's likely that an object is created and edited in one place but used for drawing in multiple places and in multiple combinations and orders.

The super awful function: glMultiTexImage2DEXT changes the data of a texture through the unit where it is bound, a texture whose name we don't even know unless we keep track of it or query it... Odd and scary!

  • glNamedTextureImage2D
  • glNamedTextureSubImage2D
  • glCopyNamedTextureSubImage2D
  • glBindNamedTexture(uint unit, enum target, uint texture)
  • glNamedBufferSubData
  • glMapNamedBuffer
  • glNamedProgramUniform1i
  • glNamedRenderbufferStorage
  • glNamedFramebufferTexture2D
  • glGenerateNamedTextureMipmap(uint texture, enum target)
  • glNamedFramebufferDrawBuffers(uint framebuffer, sizei n, const enum *bufs)
  • glNamedFramebufferReadBuffer(uint framebuffer, enum mode)
  • glNamedVertexArrayVertexAttribFormat
  • ...
  • void glMultiTexParameterfEXT(enum texunit, enum target, enum pname, float param);
  • void glMultiTexImage2DEXT(enum texunit, ...);
  • void glGenerateMultiTexMipmapEXT(enum texunit, enum target);
  • void glMultiTexRenderbufferEXT(enum texunit, enum target, uint renderbuffer);
  • ...

The OpenGL API has a few other clues about the DSA direction taken by OpenGL. One is the OpenGL 4.0 function glDrawTransformFeedback, whose second parameter takes a transform feedback object. Following the 'bind and edit' principle, the ARB would have created a dedicated binding point GL_DRAW_TRANSFORM_FEEDBACK, just like they did for framebuffer blit with the binding points GL_READ_FRAMEBUFFER and GL_DRAW_FRAMEBUFFER. More obvious is the OpenGL 3.1 function glUniformBlockBinding, which takes a 'program' name as its first parameter whereas all the glUniform* functions affect the currently bound program.

Well, you get it: I really, and I mean really, hope to see DSA in OpenGL 3.4 and OpenGL 4.1! Number 1 on my wish-list. Will it really be in OpenGL 3.4 and OpenGL 4.1? From my understanding and what I read here and there, nVidia is really up for DSA but some ARB members are not. Well... one of them might be AMD. At the OpenGL 3.3 release, the lack of DSA was the main cloud in a sky of congratulations. I really like the always constructive contributions of a former Blizzard developer now working for Valve (thanks for the update, Rob!) who said at the OpenGL 3.3 and OpenGL 4.0 release:

"If a dozen separate developers all shout loudly for DSA for example, this could effectively raise its priority for an upcoming release" - Rob Barris

Well, that's my personal loud shout!

Copyright Christophe Riccio 2002-2016 all rights reserved