Surfin’ Safari

CSS Animation

Posted by Dean Jackson on Thursday, February 5th, 2009 at 3:40 pm

WebKit now supports explicit animations in CSS. As a counterpart to transitions, animations provide a way to declare repeating animated effects, with keyframes, completely in CSS.


Screenshot of the leaves animation

With a recent nightly build, you can see the above animation in action.

Let’s take a look at how to use CSS animations, starting with an example of a bouncing box.

Bouncing text animation

See this example live in a recent WebKit nightly build.

Specifying animations is easy. You first describe the animation effect using the @-webkit-keyframes rule.

@-webkit-keyframes bounce {
 from {
   left: 0px;
 }
 to {
   left: 200px;
 }
}

A @-webkit-keyframes block contains rule sets called keyframes. A keyframe defines the style that will be applied for that moment within the animation. The animation engine will smoothly interpolate style between the keyframes. In the above example we define an animation called “bounce” to have two keyframes: one for the start of the animation (the “from” block) and one for the end (the “to” block).

Once we have defined an animation, we apply it using -webkit-animation-name and related properties.

div {
 -webkit-animation-name: bounce;
 -webkit-animation-duration: 4s;
 -webkit-animation-iteration-count: 10;
 -webkit-animation-direction: alternate;
}

The above rule attaches the “bounce” animation, sets the duration to 4 seconds, makes it execute a total of 10 times, and has every other iteration play in reverse.

Now, suppose you want to party like it is 1995 and make your own super-blink style. In this case we specify an animation with multiple keyframes, each with different values for opacity, background color and transform.

@-webkit-keyframes pulse {
 0% {
   background-color: red;
   opacity: 1.0;
   -webkit-transform: scale(1.0) rotate(0deg);
 }
 33% {
   background-color: blue;
   opacity: 0.75;
   -webkit-transform: scale(1.1) rotate(-5deg);
 }
 67% {
   background-color: green;
   opacity: 0.5;
   -webkit-transform: scale(1.1) rotate(5deg);
 }
 100% {
   background-color: red;
   opacity: 1.0;
   -webkit-transform: scale(1.0) rotate(0deg);
 }
}

.pulsedbox {
 -webkit-animation-name: pulse;
 -webkit-animation-duration: 4s;
 -webkit-animation-direction: alternate;
 -webkit-animation-timing-function: ease-in-out;
}

Screenshot of the pulsing text animation

Again, see this example live in a recent WebKit nightly build.

Keyframes are specified using percentage values, defining the moment along the animation duration that the keyframe snapshots. The “from” and “to” keywords are equivalent to “0%” and “100%” respectively.

CSS Animations is one of the enhancements to CSS proposed by the WebKit project that we’ve been calling CSS Effects (eg. gradients, masks, transitions). The goal is to provide properties that allow Web developers to create graphically rich content. In many cases animations are presentational, and therefore belong in the styling system. This allows developers to write declarative rules for animations, replacing lots of hard-to-maintain animation code in JavaScript.

The other good news is that the WebKit on iPhone 2.0 already supports CSS Animations (as well as CSS Transforms and CSS Transitions). The iPhone implementation has been optimized for the platform so you get fantastic performance. Combining animations, transitions and transforms allows for some really impressive content.

We’re documenting these enhancements on webkit.org and are proposing the specifications to standards bodies. Note that since they are currently features specific to WebKit they are implemented with a -webkit- prefix, although the specifications do not use the prefix.

You can find more examples on Apple’s Web Applications Developer Center.

As always, leave feedback in the comments and file bugs at http://bugs.webkit.org/.

Gavin Barraclough is a WebKit Reviewer

Posted by Sam Weinig on Wednesday, December 17th, 2008 at 12:43 am

Gavin Barraclough is now a qualified WebKit Reviewer. Gavin was the driving force behind the initial versions of both SquirrelFish Extreme (our JavaScript JIT) and WREC (our RegExp JIT) and has done a tremendous job of ushering them towards maturity. Please join me in congratulating Gavin.

Tor Arne Vestbø is a WebKit Reviewer

Posted by Maciej Stachowiak on Thursday, October 23rd, 2008 at 5:41 am

Tor Arne Vestbø is now a qualified WebKit Reviewer. Tor Arne has done a huge amount of work on the Qt port of WebKit, including advanced work such as a Phonon port of the media back end for the <video> element. He’s also helped to enhance our cross-platform abstraction layer. Please join me in congratulating Tor Arne on his reviewer status and thanking him for all of his contributions to WebKit.

Web Inspector Redesign

Posted by Timothy Hatcher on Tuesday, September 30th, 2008 at 4:06 pm

It has been nine months since our last Web Inspector update and we have a lot of cool things to talk about. If you diligently use the Web Inspector in nightly builds, you might have seen some of these improvements, while other subtle changes might have gone unnoticed.

Some of the Web Inspector improvements were contributed by members of the WebKit community. We really want to get the whole community involved with making this the best web development tool available. Remember, most of the Web Inspector is written in HTML, JavaScript, and CSS, so it’s easy to get started making changes and improvements.

Redesigned Interface

First and foremost, the Web Inspector is now sporting a new design that organizes information into task-oriented groups — represented by icons in the toolbar. The toolbar items (Elements, Resources, Scripts, Profiles and Databases) are named after the fundamental items you will work with inside the respective panels.

Console

The Console is now accessible from any panel. Unlike the other panels, the Console is not just used for one task — it might be used while inspecting the DOM, debugging JavaScript or analyzing HTML parse errors. The Console toggle button is found in the status bar, causing it to animate in and out from the bottom of the Web Inspector. The Console can also be toggled by the Escape key.

Error and warning counts are now shown in the bottom right corner of the status bar. Clicking on these will also open the Console.

In addition to the visual changes to the Console, we have also greatly improved usability by adding auto-completion and tab-completion. As you type expressions, property names will automatically be suggested. If there are multiple properties with the same prefix, pressing the Tab key will cycle through them. Pressing the Right arrow key will accept the current suggestion. The current suggestion will also be accepted when pressing the Tab key if there is only one matched property.

Our compatibility with Firebug’s command line and window.console APIs has also been greatly improved by Keishi Hattori (服部慶士), a student at The University of Tokyo (東京大学) who tackled this area as a summer project.

Elements Panel

The Elements panel is largely the same as the previous DOM view — at least visually. Under the hood we have made number of changes and unified everything into one DOM tree.

  • Descend into sub-documents — expanding a frame or object element will show you the DOM tree for the document inside that element.
  • Automatic updates — the DOM tree will update when nodes are added to or removed from the inspected page.
  • Inspect clicked elements — enabling the new inspect mode lets you hover around the page to find a node to inspect. Clicking on a node in the page will focus it in the Elements panel and turn off the inspect mode. This was contributed by Matt Lilek.
  • Temporarily disable style properties — hovering over an editable style rule will show checkboxes that let you disable individual properties.

  • Style property editing — double click to edit a style property. Deleting all the text will delete the property. Typing or pasting in multiple properties will add the new properties.
  • Stepping for numeric style values — while editing a style property value with a number, you can use the Up or Down keys to increment or decrement the number. Holding the Alt/Option key will step by 0.1, while holding the Shift key will step by 10.

  • DOM attribute editing — double click to edit a DOM element attribute. Typing or pasting in multiple attributes will add the new attributes. Deleting all the text will delete the attribute.
  • DOM property editing — double click to edit a DOM property in the Properties pane. Deleting all the text will delete the property, if allowed.
  • Metrics editing — double click to edit a any of the CSS box model metrics.
  • Position metrics — the Metrics pane now includes position info for absolute, relative and fixed positioned elements.

Resources Panel

The Resources panel is a supercharged version of the previous Network panel. It has a similar looking timeline waterfall, but a lot has been done to make it even more useful.

  • Graph by size — click Size in the sidebar to quickly see the largest resources downloaded.
  • Multiple sorting options — there are many sorting methods available for the Time graph, including latency and duration.
  • Latency bars — the Time graph now shows latency in the bar with a lighter shade. This is the time between making the request and the server’s first response.
  • Unified resource views — clicking a resource in the sidebar will show you the data pulled from the network (not downloaded again), including the request and response headers.
  • View XHRs — the time and size graphs also show XMLHttpRequests. Selecting an XHR resource in the sidebar will show the XHR data and headers.

Scripts Panel

The previous standalone Drosera JavaScript debugger has been replaced with a new JavaScript debugger integrated into the Web Inspector. The new integrated JavaScript debugger is much faster than Drosera, and should be much more convenient.

From the Scripts panel you can see all the script resources that are part of the inspected page. Clicking in the line gutter of the script will set a breakpoint for that line of code. There are the standard controls to pause, resume and step through the code. While paused you will see the current call stack and in-scope variables in the right-hand sidebar.

The Web inspector has a unique feature regarding in-scope variables: it shows closures, “with” statements, and event-related scope objects separately. This gives you a clearer picture of where your variables are coming from and why things might be breaking (or even working correctly by accident).

Profiles Panel

The brand new JavaScript Profiler in the Profiles panel helps you identify where execution time is spent in your page’s JavaScript functions. The sidebar on the left lists all the recorded profiles and a tree view on the right shows the information gathered for the selected profile. Profiles that have the same name are grouped as sequential runs under a collapsible item in the sidebar.

There are two ways to view a profile: bottom up (heavy) or top down (tree). Each view has its own advantages. The heavy view allows you to understand which functions have the most performance impact and the calling paths to those functions. The tree view gives you an overall picture of the script’s calling structure, starting at the top of the call-stack.

Below the profile are a couple of data mining controls to facilitate the dissection of profile information. The focus button (Eye symbol) will filter the profile to only show the selected function and its callers. The exclude button (X symbol) will remove the selected function from the entire profile and charge its callers with the excluded function’s total time. While any of these data mining features are active, a reload button is available that will restore the profile to its original state.

WebKit’s JavaScript profiler is fully compatible with Firebug’s console.profile() and console.profileEnd() APIs, but you can also specify a title in console.profileEnd() to stop a specific profile if multiple profiles are being recorded. You can also record a profile using the Start/Stop Profiling button in the Profiles panel.

Databases Panel

The Databases panel lets you interact with HTML 5 Database storage. You can examine the contents of all of the page’s open databases and execute SQL queries against them. Each database is shown in the sidebar. Expanding a database’s disclosure triangle will show the database’s tables. Selecting a database table will show you a data grid containing all the columns and rows for that table.

Selecting a database in the sidebar will show an interactive console for evaluating SQL queries. The input in this console has auto-completion and tab-completion for common SQL words and phrases along with table names for the database.

Search

Accompanying the task-oriented reorganization, the search field in the toolbar now searches the current panel with results being highlighted in the context of the panel. Targeting the search to the current panel allows each panel to support specialized queries that are suited for the type of information being shown. The panels that support specialized queries are Elements and Profiles.

The Elements panel supports XPath and CSS selectors as queries in addition to plain text. Any search you perform will be attempted as a plain text search, a XPath query using document.evaluate() and a CSS selector using document.querySelectorAll(). All the search results will be highlighted in the DOM tree, with the first match being revealed and selected.

The Profiles panel supports plain text searches of the function names and resource URLs. Numeric searches are also supported that match rows in the profile’s Self, Total and Calls columns. To facilitate powerful numeric searching, there are a few operators and units that work to extend or limit your results. For example you can search for “> 2.5ms” to find all the functions that took longer than 2.5 milliseconds to execute. In addition to “ms”, the other supported units are: “s” for time in seconds and “%” for percentage of time. The other supported operators are “< ”, “<=”, “>=” and “=”. When no units are specified the Calls column is searched.

In all the panels pressing Enter in the search field or ⌘G (Ctrl+G on Windows and Linux) will reveal the next result. Pressing ⇧⌘G (Ctrl+Shift+G on Windows and Linux) will reveal the previous result. In the Resources, Scripts and Profiles panels the search will be performed on the visible view first and will automatically jump to the first result only if the visible view has a match.

Availability and Contributing

All of these things are available now in the Mac and Windows nightly builds. Give them a try today, and let us know what you like (or don’t like).

If you would like to contribute, there are some really interesting tasks in the list of Web Inspector bugs and enhancements, and other contributors in the #webkit chat room are pretty much always available to provide help and advice.

Full Pass of Acid3

Posted by Maciej Stachowiak on Thursday, September 25th, 2008 at 5:29 pm

Today we would like to announce that WebKit is the first browser engine to fully pass Acid3. A while back, we posted that we scored 100/100 and matched the reference rendering. Now, thanks to recent speedups in JavaScript, DOM and rendering, we have passed the third condition, smooth animation on reference hardware.

Here is a screenshot of a successful run:

Here is the timing reference dialog you get by clicking on the “A” in Acid3 that confirms we pass the smooth animation condition on a 2.4GHz MacBook Pro:

To try it for yourself, grab a nightly. Keep in mind that on slower machines, the timing may not be perfect, and you need to do a cached run of the test (load it once, close window, open new window, load it again) to avoid delays from the network.

Introducing SquirrelFish Extreme

Posted by Maciej Stachowiak on Thursday, September 18th, 2008 at 9:00 pm

Just three months ago, the WebKit team announced SquirrelFish, a major revamp of our JavaScript engine featuring a high-performance bytecode interpreter. Today we’d like to announce the next generation of our JavaScript engine - SquirrelFish Extreme (or SFX for short). SquirrelFish Extreme uses more advanced techniques, including fast native code generation, to deliver even more JavaScript performance.

For those of you who follow WebKit development and are interested in contributing, we’d like to report our results and what we did to achieve them.

How Fast is It?

This chart shows WebKit’s JavaScript performance in different versions - bigger bars are better.

bar graph showing WebKit 3.0: 5.4; WebKit 3.1: 18.8; SquirrelFish: 29.9; SquirrelFish Extreme: 63.6

The metric is SunSpider runs per minute. We present charts this way because “bigger is better” is easier to follow when you have a wide range of performance results. As you can see, SquirrelFish Extreme as of today is more than twice as fast as the original SquirrelFish, and over 10 times the speed you saw in Safari 3.0, less than a year ago. We are pretty pleased with this improvement, but we believe there is more performance still to come.

Quite a few people contributed to these results. I will mention a few who worked on some key tasks, but I’d also like to thank all of the many WebKit contributors who have helped with JavaScript and performance.

What makes it so fast?

SquirrelFish Extreme uses four different technologies to deliver much better performance than the original SquirrelFish: bytecode optimizations, polymorphic inline caching, a lightweight “context threaded” JIT compiler, and a new regular expression engine that uses our JIT infrastructure.

1. Bytecode Optimizations

When we first announced SquirrelFish, we mentioned that we thought that the basic design had lots of room for improvement from optimizations at the bytecode level. Thanks to hard work by Oliver Hunt, Geoff Garen, Cameron Zwarich, myself and others, we implemented lots of effective optimizations at the bytecode level.

One of the things we did was to optimize within opcodes. Many JavaScript operations are highly polymorphic - they have different behavior in lots of different cases. Just by checking for the most common and fastest cases first, you can speed up JavaScript programs quite a bit.

In addition, we’ve improved the bytecode instruction set, and built optimizations that take advantage of these improvements. We’ve added combo instructions, peephole optimizations, faster handling of constants and some specialized opcodes for common cases of general operations.

2. Polymorphic Inline Cache

One of our most exciting new optimizations in SquirrelFish Extreme is a polymorphic inline cache. This is an old technique originally developed for the Self language, which other JavaScript engines have used to good effect.

Here is the basic idea: JavaScript is an incredibly dynamic language by design. But in most programs, many objects are actually used in a way that resembles more structured object-oriented classes. For example, many JavaScript libraries are designed to use objects with “x” and “y” properties, and only those properties, to represent points. We can use this knowledge to optimize the case where many objects have the same underlying structure - as people in the dynamic language community say, “you can cheat as long as you don’t get caught”.

So how exactly do we cheat? We detect when objects actually have the same underlying structure — the same properties in the same order — and associate them with a structure identifier, or StructureID. Whenever a property access is performed, we do the usual hash lookup (using our highly optimized hashtables) the first time, and record the StructureID and the offset where the property was found. Subsequent times, we check for a match on the StructureID - usually the same piece of code will be working on objects of the same structure. If we get a hit, we can use the cached offset to perform the lookup in only a few machine instructions, which is much faster than hashing.

Here is the classic Self paper that describes the original technique. You can look at Geoff’s implementation of the StructureID class in Subversion to see more details of how we did it.

We’ve only taken the first steps on polymorphic inline caching. We have lots of ideas on how to improve the technique to get even more speed. But already, you’ll see a huge difference on performance tests where the bottleneck is object property access.

3. Context Threaded JIT

Another major change we’ve made with SFX is to introduce native code generation. Our starting point is a technique called a “context threaded interpreter”, which is a bit of a misnomer, because this is actually a simple but effective form of JIT compiler. In the original SquirrelFish announcement, we described our use of direct threading, which is about the fastest form of bytecode intepretation short of generating native code. Context threading takes the next step and introduces some native code generation.

The basic idea of context threading is to convert bytecode to native code, one opcode at a time. Complex opcodes are converted to function calls into the language runtime. Simple opcodes, or in some cases the common fast paths of otherwise complex opcodes, are inlined directly into the native code stream. This has two major advantages. First, the control flow between opcodes is directly exposed to the CPU as straight line code, so much dispatch overhead is removed. Second, many branches that were formerly inside opcode implmentations are now inline, and made visible and highly predictable to the CPU’s branch predictor.

Here is a paper describing the basic idea of context threading. Our initial prototype of context threading was created by Gavin Barraclough. Several of us helped him polish it and tune the performance over the past few weeks.

One of the great things about our lightweight JIT is that there’s only about 4,000 lines of code involved in native code generation. All the other code remains cross platform. It’s also surprisingly hackable. If you thought compiling to native code is rocket science, think again. Besides Gavin, most of us have little prior experience with native codegen, but we were able to jump right in.

Currently the code is limited to x86 32-bit, but we plan to refactor and add support for more CPU architectures. CPUs that are not yet supported by the JIT can still use the interpreter. We also think we can get a lot more speedups out of the JIT through techniques such as type specialization, better register allocation and liveness analysis. The SquirrelFish bytecode is a good representation for making many of these kinds of transforms.

4. Regular Expression JIT

As we built the basic JIT infrastructure for the main JavaScript language, we found that we could easily apply it to regular expressions as well, and get up to a 5x speedup on regular expression matching. So we went ahead and did that. Not all code spends a bunch of time in regexps, but with the speed of our new regular expression engine, WREC (the WebKit Regular Expression Compiler), you can write the kind of text processing code you’d want to do in Perl or Python or Ruby, and do it in JavaScript instead. In fact we believe that in many cases our regular expression engine will beat the highly tuned regexp processing in those other languages.

Since the SunSpider JavaScript benchmark has a fair amount of regexp content, some may feel that developing a regexp JIT is an “unfair” advantage. A year ago, regexp processing was a fairly small part of the test, but JS engines have improved in other areas a lot more than on regexps. For example, most of the individual tests on SunSpider have gotten 5-10x faster in JavaScriptCore — in some cases over 70x faster than the Safari 3.0 version of WebKit. But until recently, regexp performance hadn’t improved much at all.

We thought that making regular expressions fast was a better thing to do than changing the benchmark. A lot of real tasks on the web involve a lot of regexp processing. After all, fundamental tasks on the web, like JSON validation and parsing, depend on regular expressions. And emerging technologies — like John Resig’s processing.js library — extend that dependency ever further.

A Word About Benchmarks

We have included some performance results, but don’t take our word for it. You can get WebKit nightlies for Mac and Windows and try for yourself.

The primary benchmark we use to track JavaScript performance is SunSpider. Although, like all benchmarks, it has its flaws, we think it is a balanced test that covers many dimensions of the JavaScript language and many types of code. If you look at test by test results, you will see that different JavaScript implementations have their own strengths and weaknesses. Browser vendors and independent testers have been tracking this benchmark.

Next Steps and How You Can Contribute

We believe the SquirrelFish Extreme architecture has room for lots more optimization, and we’d love to see more developers and testers pitch in. Currently, we are looking at how to use the bytecode infrastructure to perform more information gathering at runtime and then using it to drive better code generation, and we are studying ways to make JS function calls faster. There is also a lot of basic tuning work to do to take more advantage of the basic architectural advances in SFX. In addition, we’re interested in having JIT back ends for other CPU architectures.

If you’d like to follow the development of WebKit’s JavaScript engine more closely, we have created the squirrelfish-dev@lists.webkit.org mailing list (subscribe here) and the #squirrelfish IRC channel on the FreeNode IRC network. Stop on by and you can learn more about our plans, and how you can help.

Try it Out

Try it, test it, browse with it. It’s now available in nightlies. We hope the changes we’ve made help improve your experience of the web.

UPDATE: For the curious, here are some comparisons of SFX to other leading JavaScript engines. Charles Ying has comparisons on a few more benchmarks.

UPDATE 2: For those of you who just can’t get enough of our little mascot, click the SquirrelFish below in a recent WebKit nightly for a demo of SVG animation support.

the SquirrelFish mascot

Cameron Zwarich is a WebKit Reviewer

Posted by Maciej Stachowiak on Monday, June 23rd, 2008 at 12:58 pm

Cameron Zwarich is now a qualified WebKit Reviewer. Cameron has been doing amazing work on the JavaScript implementation and was one of the core developers on the new bytecode engine. He has also branched out into other areas of the code and squashed bugs of all varieties. Please join me in congratulating Cameron on his reviewer status and thanking him for all of his contributions to WebKit.

Announcing SquirrelFish

Posted by ggaren on Monday, June 2nd, 2008 at 5:37 pm

SquirrelFish Mascot

“Hello, Internet!”

WebKit’s core JavaScript engine just got a new interpreter, code-named SquirrelFish.

SquirrelFish is fast—much faster than WebKit’s previous interpreter. Check out the numbers. On the SunSpider JavaScript benchmark, SquirrelFish is 1.6 times faster than WebKit’s previous interpreter.

SunSpider runs per minute

bar graph of SunSpider runs

Longer bars are better.

What Is SquirrelFish?

SquirrelFish is a register-based, direct-threaded, high-level bytecode engine, with a sliding register window calling convention. It lazily generates bytecodes from a syntax tree, using a simple one-pass compiler with built-in copy propagation.

SquirrelFish owes a lot of its design to some of the latest research in the field of efficient virtual machines, including research done by Professor M. Anton Ertl, et al, Professor David Gregg, et al, and the developers of the Lua programming language.

Some great introductory reading on these topics includes:

I’ve also pored over stacks of terrible books and papers on these topics. I’ll spare you those.

Why It’s Fast

Like the interpreters for many scripting languages, WebKit’s previous JavaScript interpreter was a simple syntax tree walker. To execute a program, it would first parse the program into a tree of statements and expressions. For example, the expression “x + y” might parse to

        +
      /   \
     x     y

Having created a syntax tree, the interpreter would recursively visit the nodes in the tree, performing their operations and propagating execution state. This execution model incurred a few types of run-time cost.

First, a syntax tree describes a program’s grammatical structure, not the operations needed to execute it. Therefore, during execution, the interpreter would repeatedly visit nodes that did no useful work. For example, for the block “{ x++; }”, the interpreter would first visit the block node “{…}”, which did nothing, and then visit its first child, the increment node “x++”, which incremented x.

Second, even nodes that did useful work were expensive to visit. Each visit required a virtual function call and return, which meant a couple of indirect memory reads to retrieve the function being called, and two indirect branches—one for the call, and one for the return. On modern hardware, “indirect” is a synonym for “slow”, since indirection tends to defeat caching and branch prediction.

Third, to propagate execution state between nodes, the interpreter had to pass around a bunch of data. For example, when processing a subtree involving a local variable, the interpreter would copy the variable’s value between all the nodes in the subtree. So, starting at the “x” part of the expression “f((x) + 1)”, a variable node “x” would return x to a parentheses node “(x)”, which would return x to a plus node “(x) + 1”. Then, the plus node would return (x) + 1 to an argument list node “((x) + 1)”, which would copy that value into an argument list object, which, in turn, it would pass to the function node for f. Sheesh!

In our first rounds of optimization, we squeezed out as much performance as we could without changing this underlying architecture. Doing so allowed us to regression test each optimization we wrote. It also set a very high bar for any replacement technology. Finally, having realized the full potential of the syntax tree architecture, we switched to bytecode.

SquirrelFish’s bytecode engine elegantly eliminates almost all of the overhead of a tree-walking interpreter. First, a bytecode stream exactly describes the operations needed to execute a program. Compiling to bytecode implicitly strips away irrelevant grammatical structure. Second, a bytecode dispatch is a single direct memory read, followed by a single indirect branch. Therefore, executing a bytecode instruction is much faster than visiting a syntax tree node. Third, with the syntax tree gone, the interpreter no longer needs to propagate execution state between syntax tree nodes.

The bytecode’s register representation and calling convention work together to produce other speedups, as well. For example, jumping to the first instruction in a JavaScript function, which used to require two C++ function calls, one of them virtual, now requires just a single bytecode dispatch. At the same time, the bytecode compiler, which knows how to strip away many forms of intermediate copying, can often arrange to pass arguments to a JavaScript function without any copying.

Just the Beginning

In a typical compiler, conversion to bytecode is just a means to an end, not an end in itself. The purpose of the conversion is to “lower” an abstract tree of grammatical constructs to a concrete vector of execution primitives, the latter form being more amenable to well-known optimization techniques.

Therefore, though we’re very happy with SquirrelFish’s current performance, we also believe that it’s just the beginning. Some of the compile-time optimizations we’re looking at, now that we have a bytecode representation, include:

  • constant folding
  • more aggressive copy propagation
  • type inference—both exact and speculative
  • specialization based on expression context—especially void and boolean context
  • peephole optimization
  • escape analysis

This is an interesting problem space. Since many scripts on the web are executed once and then thrown away, we need to invent versions of these optimizations that are simple and efficient. Moreover, since JavaScript is such a dynamic language, we also need to invent versions of these optimizations that are resilient in the context of an unknown environment.

We’re also looking at further optimizing the virtual machine, including:

  • constant pool instructions
  • superinstructions
  • instructions with implicit register operands
  • advanced dispatch techniques, like instruction duplication and context threading
  • getting computed goto working on Windows

Performance on Windows has extra room to grow because the interpreter on Windows is not direct-threaded yet. In place of computed goto, it uses a switch statement inside a loop.

Getting Involved

If you’re interested in compilers or virtual machines, this is a great project to join. We’re moving quickly, so the best way to come up to speed is to log on to our IRC channel.

As always, testing out nightly builds and reporting bugs is also a great help.

Extra Bonus Updates

We’ve got some extra bonus info: very early draft documentation of the SquirrelFish VM’s opcodes. For those of you who know about VMs, you may find this enlightening, for those who don’t, you may find it is simpler than you expect.

In addition, we have a detailed comparison of Safari 3.1 vs. SquirrelFish, looking at the individual tests, it is interesting to see which sped up the most. If you look at this comparison to Safari 3.0, you can see that we’ve sped up 4.34x overall since Safari 3, and have improved some kinds of code by over an order of magnitude.

SquirrelFish around the web: There’s lots of interesting discussion in the reddit article about this post. And posts from key SquirrelFish developer Cameron Zwarich has performance data and other info, as does occasional WebKit contributor Charles Ying.

Safari Hits 6.25% Market Share

Posted by Maciej Stachowiak on Saturday, May 31st, 2008 at 9:04 pm

The latest browser market share data is in, and Safari has hit 6.25%, breaking 6% for the first time. Last month’s share was 5.81%, so this is a significant increase. It was only nine months ago that Safari broke 5%. Safari market share has now almost tripled from 2.14% in June 2005, when the WebKit Open Source project launched.

This growth, combined with recent WebKit adoption in projects such as the Iris Browser, Qt 4.4, Android, Adobe AIR, Epiphany, KDE Plasma, iCab and more, is breathtaking and shows huge positive momentum for the WebKit project. Thanks and congratulations to everyone who has contributed to the project.

Third Annual WebKit Open Source Party

Posted by Adele Peterson on Wednesday, May 28th, 2008 at 11:08 am

It’s that time of year again! Web developers, WebKit hackers, browser developers, or anyone with an interest in cool technology should come have a drink and some snacks, and meet WebKit contributors from Apple and around the world. This event is open to anyone who is interested, free of charge.

Don’t miss out on the nerd party of the century!!!!

Details

Place: Thirsty Bear Restaurant & Brewery (map)
Date: Tuesday, June 10th
Time: 7:30 PM
upcoming.org