Scenes from an Acid Test

Mar 27, 2008

by Maciej Stachowiak

In our earlier Acid3 post, I mentioned that the final test we passed, test 79, was super tough. This test covered many details of SVG text layout and font support. To pass it, we had to fix many bugs. Unless you are a hardcore standards geek, you may not find all these details exciting, but since we talked a bit about our other recent fixes, I thought I’d tell the story. This subtest was originally contributed by Cameron McCormack.

Bug 1: DOM Support for the altGlyph element

I started on test 79 as a weekend hacking lark. While I have hacked a little on the SVG code, it wasn’t my top area of expertise, and I hadn’t looked at the text layout code at all, so I figured it would be an interesting learning experience. Little did I know that we’d end up racing to fix it.

The first failure we saw on test 79 was that getNumberOfCharacters on an SVG <text> element gave an answer that was off by one. The test hinted that this could be due to improper handling of UTF-16 (something covered in the SVG 1.1 Errata). However, on further study, I realized that our UTF-16 handling was right, but the <text> element contained an <altGlyph> element which we did not recognize, and therefore, according to SVG rules, did not render. For those who are new to text layout, a “glyph” refers to a particular symbol that will be drawn on the screen – most often every character has its own glyph, but it can get more complicated than that.

So, as a first step, in r31240 I made <altGlyph> fully recognized by the engine and allowed it to render initially as a <tspan> element. That’s sort of SVG’s equivalent of an HTML <span>: it just renders its text contents.

After landing the fix, I saw that the code testing character positions got a bad position for character 2.

Interlude: Modified Test Case

At this point, I realized that we’d probably get a lot of bad positions, and it would help to see all the problems at one go, instead of one at a time. Acid3 only reports the first failure on any given test. So I saved off a local copy, hacked it to report every failure on test 79, and changed it to test the spaces between characters instead of the character positions. Any error in measuring one character would propagate to all subsequent characters, so it was more useful to look at the differences in positions (“advances” in typography lingo). This immediately made it obvious that by far the most common failure was failure to support multi-character glyphs in SVG Fonts correctly.
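To illustrate why advances are easier to debug than positions, here is a minimal sketch. The real test harness was JavaScript inside Acid3; the Python below, the function names, the tolerance, and the numbers are all hypothetical and only meant to show the idea.

```python
def advances(positions):
    """Return the differences between consecutive character x positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


def report_all_failures(actual_positions, expected_positions, tolerance=0.01):
    """Collect every mismatched advance instead of stopping at the first."""
    failures = []
    actual = advances(actual_positions)
    expected = advances(expected_positions)
    for index, (got, want) in enumerate(zip(actual, expected)):
        if abs(got - want) > tolerance:
            failures.append((index, got, want))
    return failures


# Hypothetical data: the second character is measured 5 units too wide.
expected_positions = [0, 10, 20, 30, 40]
actual_positions = [0, 10, 25, 35, 45]

# Comparing raw positions flags every character after the bad one...
bad_positions = [i for i, (a, e) in enumerate(zip(actual_positions, expected_positions)) if a != e]

# ...while comparing advances isolates the single bad measurement.
bad_advances = report_all_failures(actual_positions, expected_positions)
```

With this data, three positions are wrong but only one advance is, which is the property that made the modified test case easier to read.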

Bug 2: Multi-character Glyphs

SVG has a feature to allow specification of glyphs that match more than one character. You can use this to do things like create ligatures. For example, it is common for proportional fonts to include a special glyph for the combination “fi”, since those letters can look nicer if drawn together in a special way.

We had the basics of code to support this. However, the problem was that our code only measured one character at a time. There was no way for the font code to decide to pick a glyph for more than one character because it only saw one at a time, and there was no way for it to report back how many characters a glyph matched. So even though the drawing code sort of got this right, the metrics were based on one-character glyphs. In r31310 I fixed this.
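The shape of that fix can be sketched roughly as follows. This is not WebKit's code (which is C++); it is a hypothetical Python illustration in which the font lookup sees the whole string plus an offset and reports back how many characters the chosen glyph consumed. The toy font, the advance values, and the function names are assumptions. The sketch picks the longest matching run, which is the priority the code used at this stage and not the one the SVG rules ultimately require.

```python
# Hypothetical font: maps a character sequence to (glyph name, advance).
FONT = {
    "f": ("f", 6),
    "i": ("i", 4),
    "fi": ("fi-ligature", 9),
}
MAX_GLYPH_LENGTH = max(len(key) for key in FONT)


def find_glyph(text, offset):
    """Return (glyph_name, advance, chars_consumed) for the run at offset.

    Candidate runs are tried from longest to shortest.
    """
    longest = min(MAX_GLYPH_LENGTH, len(text) - offset)
    for length in range(longest, 0, -1):
        entry = FONT.get(text[offset:offset + length])
        if entry is not None:
            name, advance = entry
            return name, advance, length
    raise KeyError(f"no glyph for {text[offset]!r}")


def measure(text):
    """Sum glyph advances, skipping ahead by however many characters each glyph used."""
    total = 0
    offset = 0
    while offset < len(text):
        _, advance, consumed = find_glyph(text, offset)
        total += advance
        offset += consumed
    return total
```

Here `measure("fi")` uses the ligature's advance of 9, whereas measuring one character at a time would have added 6 and 4 to get 10.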

Interlude: Studying the Test Case

After this fix, I studied the test case to classify the remaining failures. I realized there was probably more than one bug, and if I documented them other people could help, or just take over the work if I did not have time.

I found four separate remaining bugs: incorrect priority of SVG glyph matching (it picked the longest match instead of the first), failure to consider the xml:lang attribute in glyph matching, failure to support explicit kerning pairs, and failure to render the actual alternate glyph selected by altGlyph.

Bugs 3 & 4: Glyph Selection

After studying the SVG spec for text, I realized that the data structure we were using for glyph selection was not right. SVG fonts require you to pick the first glyph that matches, in the order in which they are specified, with a number of different rules in addition to text (you have to consider language and the multiple forms of Arabic characters for instance). We just had a hashtable of glyphs, and looked for a candidate by checking possible character runs from longest to shortest. That matched in the wrong priority, and did not properly handle lang at all.

I decided to change the GlyphMap to an n-ary trie (or “prefix tree”), with a vector of possible glyphs at each node, and a hashtable for child nodes. I also stored a “priority” value based on the position of the glyph in the tree. To do lookup, I would walk the trie one character at a time based on available characters, collecting all the glyphs found at each level. Next, the glyphs are sorted by priority. Finally, this vector of candidate glyphs is scanned to find the first that matches the additional constraints such as Arabic form.
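A rough sketch of that data structure, in Python for brevity, might look like the following. The class and field names are hypothetical, the priority is assumed to be the order in which glyphs were declared in the font, and the only extra constraint modeled is a simple exact match on language. Real SVG language matching and Arabic form handling are more involved.

```python
class TrieNode:
    def __init__(self):
        self.glyphs = []    # glyphs whose unicode string ends at this node
        self.children = {}  # next character -> TrieNode


class GlyphTrie:
    def __init__(self):
        self.root = TrieNode()
        self.next_priority = 0  # assumed: declaration order in the font

    def add(self, unicode_string, name, lang=None):
        """Register a glyph under its unicode string, remembering its order."""
        node = self.root
        for ch in unicode_string:
            node = node.children.setdefault(ch, TrieNode())
        node.glyphs.append({
            "name": name,
            "lang": lang,
            "length": len(unicode_string),
            "priority": self.next_priority,
        })
        self.next_priority += 1

    def lookup(self, text, offset, lang=None):
        """Return the first-declared glyph matching text at offset, or None."""
        candidates = []
        node = self.root
        # Walk one character at a time, collecting glyphs at every level.
        for ch in text[offset:]:
            node = node.children.get(ch)
            if node is None:
                break
            candidates.extend(node.glyphs)
        # Earliest-declared glyph wins, regardless of how many characters it spans.
        candidates.sort(key=lambda glyph: glyph["priority"])
        for glyph in candidates:
            if glyph["lang"] is None or glyph["lang"] == lang:
                return glyph
        return None


# Hypothetical font in which the single "f" is declared before the ligature.
trie = GlyphTrie()
trie.add("f", "f")
trie.add("fi", "fi-ligature")
trie.add("a", "a-german", lang="de")
trie.add("a", "a-default")
```

With this font, `trie.lookup("fi", 0)` returns the plain "f" glyph because it was declared first, which is exactly the case a longest-match hashtable lookup gets wrong. The returned "length" tells the caller how many characters to skip.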

Opera Brings its “A” Game

Making a fancy new data structure was a fair bit more complicated than my last two fixes, so I was hacking on this for a couple of days. On Wednesday morning, I heard that Opera was going to announce an Acid3 score of 98/100. The WebKit developers who had been noodling on Acid3 realized that we had fixes in progress for most of the remaining issues, and could likely pass the test by the end of the day, and get it out there.

At this point, I would like to commend the Opera developers for their achievement. At the time they posted the 100/100 screenshot, we had no idea there was a bug in test 79, and hoped at best to be the first to release a 100/100 public build, and both our teams could get some credit and positive exposure. It was not until we were fixing the very last bug in our code that we spotted the bug in the test, as you’ll see below.

Anyway, once we realized it was game on and Opera was much further ahead than we expected, we decided to complete our fixes. Eric fixed handling of charset encoding errors in XML, Antti got two SMIL test fixes in, and I cleaned this patch up, made a regression test, and landed it in r31324 and r31325. (We have a policy that every fix has to come with an automated regression test, no matter how much we’d like to rush – this is so we can move faster in the long run, by having a comprehensive regression test suite that will catch mistakes. Even with all that I had a bit of a commit oops but the team caught it right away, thus the two revisions.)

Bug 5: altGlyph Rendering

At this point, I had a problem. Niko, who had originally written the SVG font and text code, was busy with school, and I’d just learned all about the SVG text system by fixing these bugs, so I was the top expert and sort of on the hook. But there were two bugs left on this test, and neither was trivial to fix. I decided to split the remaining work with rendering master Dave Hyatt. I asked him to take on the altGlyph fix while I took care of kerning.

It turned out there were actually two different problems here. The altGlyph lookup was by ID, but somehow the glyph element wasn’t getting into the id map. With help from Darin and others on the team, Hyatt managed to figure out that we were handling the “id” attribute improperly for the SVG <glyph> element.

The second problem was actually choosing the alternate glyph and using it for metrics and rendering. This turned out to be simpler than it might seem, as it could be handled almost entirely in the SVG font machinery.

You can see the gory details of Hyatt’s fix in r31338. I helped him out with the test case for this, and for a while he thought his code was broken because I sent him a test that showed red boxes when it passes. Kids, never do this in your browser tests. Green for pass, red for fail.

Bug 6: Kerning

The final bug was support for kerning. It was pretty easy to add support to the SVG DOM for the <hkern> element, and to build up a vector of kerning pairs. What was tricky was storing enough info to decide when a kerning pair applied, and the match algorithm to help determine applicability. Kerning pairs can be specified in two ways in the SVG spec, by Unicode value or glyph name.

Darin was looking over my shoulder as I was reading the spec, because I hoped for his help on part of this, the matching algorithm. It was at this point we realized the test had a bug. It had an <hkern> element with a u1 attribute of “EE”, and expected that kerning to apply. But that’s neither a comma-separated sequence of characters nor a Unicode range, which is the syntax the SVG 1.1 spec for <hkern> elements requires. This meant that you couldn’t possibly pass this test without violating the SVG spec. I let Acid3 editor Ian Hickson know right away, and confirmed the error with Cameron, the original author of test 79. While they mulled over how to fix the test (Ian has been very responsive to fixing the few errors that crept into this very complex test), Darin and I continued to work on the final bug.
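To make the problem with “EE” concrete, here is a hedged sketch of a validator for u1/u2 values. It assumes one reading of the syntax described above: each comma-separated item is either a single character or a Unicode range in the CSS-like forms "U+0041", "U+0041-005A", or "U+00??". The function names are hypothetical, and details such as whitespace handling or a literal comma as a kerning character are ignored.

```python
import re

# "U+", then hex digits optionally followed by trailing "?" wildcards,
# then an optional "-HEX" end point.
URANGE = re.compile(r"U\+([0-9A-Fa-f]*\?*)(?:-([0-9A-Fa-f]+))?")


def parse_urange(item):
    """Return (low, high) code points for a Unicode range, or None."""
    match = URANGE.fullmatch(item)
    if match is None:
        return None
    start, end = match.groups()
    if not 1 <= len(start) <= 6:
        return None
    if end is not None:
        # An explicit end point cannot be combined with wildcards.
        if "?" in start or len(end) > 6:
            return None
        return int(start, 16), int(end, 16)
    # Each "?" stands for any hex digit, so it spans 0 through F.
    return int(start.replace("?", "0"), 16), int(start.replace("?", "F"), 16)


def parse_u_attribute(value):
    """Parse a u1/u2 value into a list of (low, high) code point ranges.

    Raises ValueError if an item is neither a single character nor a range.
    """
    ranges = []
    for item in value.split(","):
        urange = parse_urange(item)
        if urange is not None:
            ranges.append(urange)
        elif len(item) == 1:
            ranges.append((ord(item), ord(item)))
        else:
            raise ValueError(f"invalid kerning item: {item!r}")
    return ranges
```

Under these assumptions, `parse_u_attribute("E,E")` is accepted, while `parse_u_attribute("EE")` raises `ValueError`, since "EE" is two characters with no comma between them and is not a range.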

At last, we managed to pass around the glyph info and apply the matching correctly. We decided to handle the full complexity of the SVG spec for this, even though we could have cut corners just to pass the test, since it didn’t test every possible form of kerning pair. In the meantime, Ian had fixed the bug in the test and blogged about it. At last I saw the 100/100 on my screen for the first time. Everyone egged me on to commit but I insisted on getting the patch reviewed and creating a regression test, cause on WebKit that is how we roll. Finally with r31342 we had a score of 100/100, available to the public. And the rest is history.

By Way of Contrast

As you can see, this test was pretty rigorous, and we did a lot of hard work to make it pass. We fixed a lot of bugs, ranging from basic to obscure and easy to advanced. We didn’t have code just lying around that got us most of the way there.

Not all of the tests on Acid3 were equally stringent. Tests 75 and 76 are SVG Animation tests, contributed by Erik Dahlström. We had the beginnings of an implementation of this feature, but it was fairly incomplete and we had it disabled with ifdefs, and had chosen not to ship it in past releases. We believe in incremental development of new features, so we were prepared to turn it on and fix the bugs needed to pass the tests. We thought this would take a lot of extra work. To our surprise, it turned out that the first few of the changes we made on top of the old code were enough to pass the two SVG animation tests.

Antti checked these fixes in, but we don’t think our pre-existing incomplete implementation is truly satisfactory, and it probably wouldn’t be a good idea to ship it as-is in a major public release. We’re not yet satisfied with this code, but it’s a good step forward for our open development trunk.

Thanks

In the end, the main thing that surprised me about Acid3 was how fun it was. Back when Acid2 was all the rage, Hyatt did pretty much all the fixes himself, so the rest of us didn’t have a chance to get into the spirit as much. But Acid3, and the slightly silly competitive atmosphere around it, make Web standards fun.

Web standards can often seem boring compared to super fast performance, whizzy new features, and even the basic Web compatibility work of making sites work properly. Interoperability is critical to the Web as an open platform, but it can be difficult to explain to regular users why it’s so important. The Acid tests make Web standards fun, for browser developers, for Web designers, and for regular users. Whatever the intrinsic value of the tests may be, I think we should all thank Ian Hickson and all the test contributors.

I’d also like to thank Opera for giving us some serious competition and making this a real horse race. We have huge respect for their developers and all the work they do on Web standards.