WebGPU and WSL in Safari

Sep 12, 2019

by Myles Maxfield

WebGPU is a new API being developed by Apple and others in the W3C which enables high-performance 3D graphics and data-parallel computation on the Web. This API represents a significant improvement over the existing WebGL API in both performance and ease of use. Starting in Safari Technology Preview release 91, beta support is available for WebGPU API and WSL, our proposal for the WebGPU shading language.

A few months ago we discussed a proposal for a new shading language called Web High-Level Shading Language, and began implementation as a proof of concept. Since then, we’ve shifted our approach to this new language, which I will discuss a little later in this post. With help from our friends at Babylon.js we were able to adapt a demo to work with WebGPU and Web Shading Language (WSL):

You can see the demo in action with Safari Technology Preview 91 or later, and with the WebGPU experimental feature enabled.1 The demo utilizes many of the best features of WebGPU. Let’s dig in to see what make WebGPU work so well, and why we think WSL is a great choice for WebGPU.

WebGPU JavaScript API

Just like WebGL, the WebGPU API is accessed through JavaScript because most Web developers are already familiar with working in JavaScript. We expect to create a WebGPU API that is accessed through WebAssembly in the future.

Pipeline State Objects and Bind Groups

Most 3D applications render more than a single object. In WebGL, each of those objects requires a collection of state-changing calls before that object could be rendered. For example, rendering a single object in WebGL might look like:

gl.UseProgram(program1);
gl.frontFace(gl.CW);
gl.cullFace(gl.FRONT);
gl.blendEquationSeparate(gl.FUNC_ADD, gl.FUNC_MIN);
gl.blendFuncSeparate(gl.SRC_COLOR, gl.ZERO, gl.SRC_ALPHA, gl.ZERO);
gl.colorMask(true, false, true, true);
gl.depthMask(true);
gl.stencilMask(1);
gl.depthFunc(gl.GREATER);
gl.drawArrays(gl.TRIANGLES, 0, count);

On the other hand, rendering a single object in WebGPU might look like:

encoder.setPipeline(renderPipeline);
encoder.draw(count, 1, 0, 0);

All the pieces of state in the WebGL example are wrapped up into a single object in WebGPU, named a “pipeline state object.” Though validating state is expensive, with WebGPU it is done once when the pipeline is created, outside of the core rendering loop. As a result, we can avoid performing expensive state analysis inside the draw call. Also, setting an entire pipeline state is a single function call, reducing the amount of “chatting” between Javascript and WebKit’s C++ browser engine.

Resources have a similar story. Most rendering algorithms require a set of resources in order to draw a particular material. In WebGL, each resource would be bound one-by-one, leading to code that looks like:

gl.bindBufferRange(gl.UNIFORM_BUFFER, 0, materialBuffer1, 0, size1);
gl.bindBufferRange(gl.UNIFORM_BUFFER, 1, materialBuffer2, 0, size2);
gl.activeTexture(0);
gl.bindTexture(gl.TEXTURE_2D, materialTexture1);
gl.activeTexture(1);
gl.bindTexture(gl.TEXTURE_2D, materialTexture2);

However, in WebGPU, resources are batched up into “bind groups.” When using WebGPU, the behavior of the above code is represented as simply:

encoder.setBindGroup(0, bindGroup);

In both of these examples, multiple objects are gathered up together and baked into a hardware-dependent format, which is when the browser performs validation. Being able to separate object validation from object use means the application author has more control over when expensive operations occur in the lifecycle of their application.

Run-time Performance

We expect WebGPU to perform faster and handle larger workloads than WebGL. To measure performance, we adopted the test harness from our 2D graphics benchmark MotionMark and wrote two versions of a performance test that had a simple yet realistic workload, one in WebGL and the other in WebGPU.

The test measures how many triangles with different properties can be drawn while maintaining 60 frames per second. Each triangle renders with a different draw call and bind group; this mimics the way many games render objects (like characters, explosions, bullets, or environment objects) in different draw calls with different resources. Because much of the validation logic in WebGPU is performed during the creation of the bind group instead of inside the draw call, both of these calls execute much faster than the equivalent calls in WebGL.

Web Shading Language

The WebGPU community group is actively discussing which shading language or languages should be supported by the specification. Last year, we proposed a new shading language for WebGPU. We are pleased to announce that a beta version of this language is available in Safari Technology Preview 91.

Because of community feedback, our approach toward designing the language has evolved. Previously, we designed the language to be source-compatible with HLSL, but we realized this compatibility was not what the community was asking for. Many wanted the shading language to be as simple and as low-level as possible. That would make it an easier compile target for whichever language their development shop uses.

There are many Web developers using GLSL today in WebGL, so a potential browser accepting a different high level language, like HLSL, wouldn’t suit their needs well. In addition, a high-level language such as HLSL can’t be executed faithfully on every platform and graphics API that WebGPU is designed to execute on.

So, we decided to make the language more simple, low-level, and fast to compile, and renamed the language to Web Shading Language to match this pursuit.2

The name change reflects a shift in our approach, but the major tenets of WSL have not changed. It’s still a language that is high-performance. It’s still text-based, which integrates well with existing web technologies and culture. WSL’s semantics still bake in safety and portability. It still works great as a compile target. As expected for every web technology, the language is well-defined with a robust specification. And because the WebGPU Community Group owns the language, it matures in concert with the WebGPU API.

In an upcoming post, we’ll talk more about the technical details of this language update. Here, let’s discuss further about WSL’s performance characteristics.

Wire Size

The Babylon.js demo above demonstrates an area where WSL really shines. The shaders in the demo use complex physically-based algorithms to draw with the highest realism possible. Babylon.js creates these complex shading algorithms by stitching together snippets of shader source code at run-time. This way, Babylon’s shaders are tuned at run-time to the specific type of content the application contains. Since WSL is a textual language like GLSL, this kind of workflow can be accomplished simply with string concatenation.

Building shaders by string concatenation and compiling them directly is a dramatic improvement over languages which cannot do this kind of manipulation easily, like bytecode-based languages. For these other languages, generation of shaders at run-time requires the execution of an additional compiler, written in JavaScript or WebAssembly, to compile the generated source text into the bytecode form.

This extra step slows down performance in two ways. First, the execution of the additional compiler takes time. Second, the compiler source would have to be served with the rest of the web page’s resources, which increases the page size and lengthens the loading time of the web page. On the other hand, since WSL’s format is textual, there’s no additional compile step; you can just write it and run it!

As a point of comparison, let’s take the Babylon.js demo shown above. Babylon.js renders scenes by joining GLSL shader snippets at runtime. When using WSL, Babylon.js would simply serve the WSL snippets just like they’re serving GLSL snippets today. To do the same with SPIR-V, Babylon.js would serve their shader snippets along with a compiler that compiles these textual snippets down to SPIR-V. Chrome recently announced their support for WebGPU with SPIR-V using the same demo. Here is a comparison of the gzipped wire sizes for the shaders as WSL, and that of the shaders as GLSL and the SPIR-V compiler:

The top bar shows the wire size if Babylon.js used WSL directly. The middle bar shows the wire size of the GLSL source, plus the compiler that Babylon.js uses to compile their snippets into SPIR-V. The bottom bar shows the wire size if Babylon.js simply served the results of their SPIR-V compilation. Because of the dynamic nature of Babylon.js’s shaders, they assemble their shaders at runtime and compile them dynamically, so they require the SPIR-V compiler. However, other websites’ shaders may be more static, in which case they wouldn’t need the compiler and could serve the SPIR-V directly, like in the bottom bar in the graph above.

Compilation Performance

We’ve heard from developers that shader compilation performance is extremely important. One way the community group addressed this was designing WebGPU to support asynchronous compilation, which avoids blocking successive rendering commands. Asynchronous compilation isn’t a magic bullet, though. If the next rendering command uses a shader currently being compiled, that rendering command must wait for the compilation complete. So, we’ve put significant effort into making WSL compilation times fast.

The WebGPU community created a compute shader to demonstrate the flocking characteristics of “boids,” and we turned that sample into a compiler performance benchmark. Here we compare the compilation time of WSL with the run-time of the GLSL compiler to SPIR-V running as a WebAssembly library:

As you can see, we’re continuing to optimize compilation performance, and it’s been increasing both throughout and after STP 91’s release.

Give WebGPU a try!

We’re very excited to have implemented a beta version of WebGPU and WSL in the latest version of Safari Technology Preview. Try it out and let us know how well it works for you! We’re interested in feedback so we can evolve the WebGPU and WSL language to better suit everyone’s needs.

And be sure to check out our gallery of WebGPU samples. We’ll be keeping this page updated with the latest demos. Many thanks to Sebastien Vandenberghe and David Catuhe from Babylon.js for their help with Babylon.js in Safari.

For more information, you can contact me at mmaxfield@apple.com or @Litherum, or you can contact our evangelist, Jonathan Davis.

1 To enable WebGPU beta support, in Develop menu, select Experimental Features > WebGPU.
2 We still pronounce it “whistle”, though.