I’ve previously blogged about cookies, specifically that the HTTP standard could benefit from using regex for path matching. Well, I took the idea to the http-state working group (or at least their mailing list) at the IETF and floated it. The idea cannot really go forward, as there is no consistent agreement amongst browser makers as to what regex is, let alone on implementing it in a browser for cookie-matching purposes. Lastly, and most importantly, regex can get incredibly complicated quickly, especially when trying to construct ‘not these and not these’ expressions (see below).

Let me specify again where this would be useful. Consider a web application whose URLs look like this:

Location                                  Type     Size
example.com/ (index)                      dynamic  20K
example.com/account/edit                  dynamic  10K
example.com/account/new                   dynamic  10K
example.com/account/recover_password      dynamic  10K
example.com/buy                           dynamic  10K
example.com/cart/checkout                 dynamic  15K
example.com/cart/view                     dynamic  15K
example.com/help/how_to_buy.swf           static   543K
example.com/prod_img/                     static   10K+ each
example.com/product_search                dynamic  12 - 50K
example.com/style/example_css_sprite.png  static   120K
example.com/style/example_functions.css   static   29K
example.com/style/example_style.js        static   8K

Our wish for small- to medium-load sites is to set a small session cookie, or one or more larger cookies for stateless operation. The cookies could be persistent between visits to the site, or the result of a login; it does not really matter.

What does matter is that we do not need the browser to send us these cookies for the static resources (those marked static in the table above). The browser is going to send the cookies with every request anyway. They could be quite large if we’re using the stateless mode of operation for web apps, and that could hurt the end user’s perception of site performance.
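To get a feel for the overhead, here is a rough back-of-the-envelope sketch in Python. The figures are assumptions for illustration only, not measurements: 13 static requests per page view and a 3000-byte stateless cookie are hypothetical, and the function name is made up. It relies only on the fact that HTTP/1.1 sends the Cookie header uncompressed with every matching request.

```python
# Back-of-the-envelope estimate of bytes uploaded needlessly.
# All figures are hypothetical and chosen only for illustration.

def cookie_upload_overhead(cookie_value_bytes: int, static_requests: int) -> int:
    """Bytes of Cookie header sent with static requests that never use it."""
    # Each request carries "Cookie: " + value + CRLF, uncompressed in HTTP/1.1.
    header_overhead = len("Cookie: ") + len("\r\n")
    return (cookie_value_bytes + header_overhead) * static_requests

# Assumed page: 3 style assets plus 10 product images = 13 static requests.
STATIC_REQUESTS = 13

session_cookie = len("RMID=732423sdfs73242")  # the small session cookie
small = cookie_upload_overhead(session_cookie, STATIC_REQUESTS)

# A hypothetical 3000-byte cookie for the stateless mode of operation.
large = cookie_upload_overhead(3000, STATIC_REQUESTS)

print(f"session cookie: {small} bytes wasted per page view")
print(f"stateless cookie: {large} bytes wasted per page view")
```

Upload bandwidth is usually the narrower direction on a home connection, which is why the larger figure matters more than its size alone suggests.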

The classic solution would be to move these to another domain name, or use a CDN to get them closer to the end user for performance. If you’ve engineered that solution correctly, you don’t need to worry about the cookies, because it is a different domain name. While a CDN could be expensive, you’re going to do it anyway if you have thousands of page impressions a minute.

For small/medium sites, though, it would be nice to keep things simple and serve them from one web server. Nicer still would be a way to mark the static items as not requiring cookies.

Here’s how the web server sets a cookie in the browser in the response to a resource request:

Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; 

The problem is that the path is implicitly inclusive (it matches everything beneath it), and only one path can be set. A simple modification to the spec could be to allow comma-separated paths:

Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/,/account/,/buy,/cart/,/product_search; 
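As a minimal sketch of how a browser might evaluate such a list, the Python below first implements the usual prefix-style path match (the rule later written down in RFC 6265, section 5.1.4) and then applies it to each comma-separated entry. The function `multi_path_match` and the comma-splitting behaviour are hypothetical; no browser or spec defines them. Note one assumption in particular: under prefix matching a bare `/` entry matches every path, so the sketch leaves it out of the list.

```python
def path_match(request_path: str, cookie_path: str) -> bool:
    """Prefix-style cookie path matching (as in RFC 6265, section 5.1.4)."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix only counts if it ends on a "/" boundary,
        # so "/buy" matches "/buy/now" but not "/buyer".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def multi_path_match(request_path: str, path_attribute: str) -> bool:
    """Hypothetical rule: the path attribute is a comma-separated list,
    and the cookie is sent if any entry matches."""
    cookie_paths = [p.strip() for p in path_attribute.split(",") if p.strip()]
    return any(path_match(request_path, p) for p in cookie_paths)


# The bare "/" entry is omitted here because it would match everything.
PATHS = "/account/,/buy,/cart/,/product_search"

for path in ("/account/edit", "/cart/view", "/style/example_style.js"):
    print(path, multi_path_match(path, PATHS))
```

Keeping the index page in the list would need an extra convention for an exact match on `/`, which is one more reason the exclusive form may turn out simpler.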

But perhaps better still would be to allow paths that are logically “not these paths” (exclusive), as that is bound to be a shorter list:

Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; !path=/help/,/prod_img/,/style/; 

Note the bang/shriek/exclamation mark before the word path to denote ‘not’. Hence I propose that HTTP 1.2 (or whatever) allows such a thing, unless there is a way to shoe-horn this into HTTP 1.1 somehow.
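The exclusive rule could be sketched as follows, in Python. This is only an illustration of the proposal: `excluded_path_match` and the `!path` semantics are hypothetical and not part of any spec. For comparison, the sketch also includes a roughly equivalent regex using a negative lookahead, which gives a taste of how quickly the ‘not these and not these’ patterns become hard to read.

```python
import re


def path_match(request_path: str, cookie_path: str) -> bool:
    """Prefix-style cookie path matching (as in RFC 6265, section 5.1.4)."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def excluded_path_match(request_path: str, not_path_attribute: str) -> bool:
    """Hypothetical '!path' rule: send the cookie unless a listed path matches."""
    excluded = [p.strip() for p in not_path_attribute.split(",") if p.strip()]
    return not any(path_match(request_path, p) for p in excluded)


NOT_PATHS = "/help/,/prod_img/,/style/"

# Roughly the same rule as a regex: a leading "/" that is NOT followed
# by one of the static directories.
NOT_STATIC = re.compile(r"^/(?!(?:help|prod_img|style)/)")

for path in ("/", "/cart/view", "/help/how_to_buy.swf", "/style/example_style.js"):
    by_list = excluded_path_match(path, NOT_PATHS)
    by_regex = bool(NOT_STATIC.match(path))
    print(f"{path}: list={by_list} regex={by_regex}")
```

The list form stays readable as directories are added, whereas every addition to the regex has to be threaded into the lookahead.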

By the way, cookies per se have had some criticism this week from Google’s Michal Zalewski in “HTTP cookies, or how not to design protocols”, and there’s a fierce debate raging in his blog comments.


November 1st, 2010