From my experience in coding javascript, I came to realize that as cool as bleeding-edge javascript libraries and full CSS3 and XPath support are, they are fairly slow, particularly when it comes to DOM manipulation.
John Resig recently did a neat benchmark test on different techniques for implementing the getElementsByClassName method (in light of the recent news that Firefox 3 will have its own native implementation). While it gives a good overall idea of what algorithms should be avoided, it would certainly be nicer to just have a copy-and-paste solution that runs the fastest way possible in all major browsers.
Most libraries have pretty good implementations, but I have yet to see one that is truly optimized for speed. Here's my variation, written with the sole goal of being the fastest possible getElementsByClassName() script:
var getElementsByClassName=function(className,parentNode) { var results = []; //if (document.getElementsByClassName) document.getElementsByClassName(className,parentNode);//this is the code for FF3. Prototype 1.5 doesn't play nice with this code, though. if (document.evaluate) { var query = document.evaluate(".//*[contains(concat(' ', @class, ' '), ' " + className + " ')]", parentNode || document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null); for (var i = 0, length = query.snapshotLength; i < length; i++) results.push(query.snapshotItem(i)); } else { var nodes = (parentNode || document).getElementsByTagName("*"); for (var i = 0, j = 0, length = nodes.length; i < length; i++) { var nodes_i=nodes[i]; if ((" "+nodes_i+" ").indexOf(" "+className+" ") > -1) results[j++] = nodes_i; } } return results; };
Benchmarking
For my test, I created a page with 200 sibling divs, out of which, half had class names. Each test queried the DOM 30 times, 1 second after page load, to minimize interference from network-related threads.
Benchmark results for library equivalents were fairly similar in Firefox (all of them scored somewhere near 190-200ms), but there were large performance differences in Internet Explorer 6. Prototype's implementation was 3 times slower than the script above.
Compared to CSS selector query engines like Prototype's $$() or JQuery, the script above performed between 10 to 20 times better, depending on the browser being tested. (In other words, $$(".my_class") is a big no-no).
The limitations of the script
- It only takes spaces as class separators. No tabs or line feeds. This is the main reason why the script is faster in browsers that don't support document.evaluate
- It assumes a browser, and doesn't check whether the "document" global exists. Rhino would always need to specify the root element anyways.
What makes it fast
- No expensive regular expressions
- document.evaluate delegates all the expensive DOM parsing to the underlying, compiled-code engine, where available.
- Using variables instead of object members cuts down a lot of overhead. A lot of people don't believe me when I say this stuff is important until they actually see the numbers. Just today I got the javascript execution time of an e-commerce site down from 970ms to 180ms simply by fixing a few for loops.
Last thoughts
I personally don't see why anyone would want to separate class names with a tab. Line feeds are even worse since they break the indentation flow and make HTML virtually impossible to read. Sure it's part of the specs, but I'd much rather use a fast and sensible subset.