05Jun2012

Understanding cache dependencies

WC Sites (formerly Fatwire ContentServer) is a content-oriented CMS, as opposed to page-oriented CMSs. Being driven by content means that you only need to describe your content, without considering how that content will actually be rendered.

This idea has a few consequences that must be taken into account when you design and code a WCS implementation.

The biggest problem with this separation is that you must keep track of the relation between your content and its rendering.

(Well, this is not entirely true, considering insite editing, which is a presentation-oriented technology. However, it does not change the concept: content and presentation are separated but related.)

The relation between presentation and content

To better explain the problem, consider a generic piece of content, let's say an article on your site.

On the CMS you usually want to add it just once, but on the site the same article can appear a number of times in different web pages, in different formats.

For example, an article can be displayed as full text in one web page, as a summary in another, or just as a simple link in many others.

All those occurrences appear in different web pages, and you have to update all of them when the underlying content model changes. For this reason WCS stores dependency information in the database.

The main reason to keep this information is that it is needed to update the cache easily and efficiently when the content changes. In fact, when the site is rendered, blocks of HTML (conventionally called pagelets) are generated from the content model and usually cached.

For each of those pagelets a dependency is generated and stored.

When the content model changes, all the stored dependencies are walked through, and the dependent pagelets are invalidated and regenerated.
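The mechanism described above can be pictured with a small toy model. This is only a sketch in Python, not WCS code: the class PageletCache, its method names and the "Type:id" strings are all hypothetical, and the real system keeps this information in the database rather than in memory.

```python
from collections import defaultdict


class PageletCache:
    """Toy in-memory model of a pagelet cache with asset dependencies.

    All names here are illustrative assumptions, not WCS APIs.
    """

    def __init__(self):
        self.pagelets = {}                  # pagelet key -> cached HTML
        self.dependents = defaultdict(set)  # asset id -> pagelet keys

    def store(self, key, html, asset_ids):
        """Cache a rendered pagelet and record the assets it depends on."""
        self.pagelets[key] = html
        for asset_id in asset_ids:
            self.dependents[asset_id].add(key)

    def asset_changed(self, asset_id):
        """Drop every cached pagelet that depends on the changed asset."""
        stale = self.dependents.pop(asset_id, set())
        for key in stale:
            self.pagelets.pop(key, None)
        return stale


cache = PageletCache()
# The same article rendered twice, in two formats, plus an unrelated pagelet.
cache.store("article-full", "<h1>Title</h1><p>Body</p>", ["Article:1"])
cache.store("article-summary", "<p>Summary</p>", ["Article:1"])
cache.store("other-link", "<a href='/2'>Other</a>", ["Article:2"])

stale = cache.asset_changed("Article:1")
print(sorted(stale))          # ['article-full', 'article-summary']
print(list(cache.pagelets))   # ['other-link']
```

Editing the article once invalidates both of its renderings, while the pagelet built from a different asset stays cached.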

In practice

You can see all of this in action simply by inspecting the SystemItemCache table. This table is somewhat obscure, but it keeps one of the most important pieces of information used in publishing: the dependencies between assets and pagelets.

This table is populated while the content model is rendered on an actual live site. Many dependencies are calculated by render:calltemplate. When you call a template, you are also declaring that a new pagelet (the one that will be rendered by the template you are invoking) will depend on the asset specified by c and cid. So a new dependency will be recorded for this asset.

Dependencies can also be recorded explicitly using the render:logdep tag. Indeed, you may notice that when you create a new template, some code involving a logdep is automatically added.

That code should NOT be removed: it adds a dependency between the template itself (the code generating the pagelet) and the generated pagelet. Obviously, when you change the code generating the pagelet, that pagelet must be regenerated.

In general, a pagelet rendering a given asset (identified by c and cid) always depends on the asset itself (so a c:cid dependency is required) and on the template that renders it (so a Template:tid dependency is also required).
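To make the two required dependencies concrete, here is a minimal sketch in Python. The functions call_template and invalidated_by, the pagelet names and the numeric ids are hypothetical and only mimic the bookkeeping; they are not the behaviour of the actual render:calltemplate tag.

```python
def call_template(recorded, pagelet, c, cid, tid):
    """Record the two dependencies of a pagelet: its asset and its template."""
    recorded[pagelet] = {f"{c}:{cid}", f"Template:{tid}"}


def invalidated_by(recorded, changed):
    """Return the pagelets that depend on the changed asset or template."""
    return sorted(p for p, deps in recorded.items() if changed in deps)


recorded = {}
# The same article rendered by two different (made-up) templates.
call_template(recorded, "article-detail", "Article", 1001, 500)
call_template(recorded, "article-teaser", "Article", 1001, 501)

print(invalidated_by(recorded, "Article:1001"))  # ['article-detail', 'article-teaser']
print(invalidated_by(recorded, "Template:501"))  # ['article-teaser']
```

Changing the asset invalidates both pagelets, while changing one template invalidates only the pagelet that template generated.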

However, things can become much more complex than this. There are other sources of dependencies: for example, searchstate calls usually generate some dependencies, and an index gathered by a search can depend on so many assets that an "unknown dep" is generated instead.

These special dependencies invalidate every pagelet that carries them whenever any asset of a given type changes.
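The difference between an exact dependency and an "unknown dep" can be sketched as follows. The "Type:*" notation and the function is_invalidated are assumptions made for this illustration only; WCS does not necessarily store unknown deps this way.

```python
def is_invalidated(pagelet_deps, changed_asset):
    """Check whether a change to `changed_asset` invalidates a pagelet.

    Dependencies are strings: "Type:id" for an exact asset, and "Type:*"
    as a hypothetical notation for an unknown dep on a whole asset type.
    """
    asset_type, _, _ = changed_asset.partition(":")
    return changed_asset in pagelet_deps or f"{asset_type}:*" in pagelet_deps


detail_deps = {"Article:1", "Template:500"}   # exact dependencies
index_deps = {"Article:*", "Template:600"}    # search result list

print(is_invalidated(detail_deps, "Article:2"))  # False
print(is_invalidated(index_deps, "Article:2"))   # True
print(is_invalidated(index_deps, "Image:7"))     # False
```

The price of the unknown dep is visible here: the index pagelet is thrown away on every article change, even when the changed article never appeared in it.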

Common Errors

Unfortunately, caching and, even worse, dependency management are often misunderstood.

Some (untrained) developers are vaguely aware of caching, but very often they completely ignore dependencies.

A common error is to create a template that depends on two or more assets (the first one being the c/cid, while the others are specified with extra parameters). Unfortunately, only the dependency specified by c/cid is actually recorded, so when one of the other assets changes, the corresponding pagelet is NOT invalidated.

The net effect is a website that won't be updated when the content changes (unless you invalidate the whole cache).
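The error can be reproduced in a toy model. In this Python sketch the function render, the flag log_extra_deps and the asset ids are hypothetical; the flag stands in for recording the extra dependency explicitly, which is the role described earlier for the render:logdep tag.

```python
def render(c, cid, extra_assets=(), log_extra_deps=False):
    """Return the set of dependencies recorded for one rendered pagelet.

    The c/cid pair is always recorded. Assets passed as extra parameters
    are recorded only when the template logs them explicitly.
    """
    deps = {f"{c}:{cid}"}
    if log_extra_deps:
        deps.update(extra_assets)
    return deps


def stays_cached(deps, changed_asset):
    """A pagelet survives a change only if it has no dependency on the asset."""
    return changed_asset not in deps


buggy = render("Article", 1, extra_assets=["Image:7"])
fixed = render("Article", 1, extra_assets=["Image:7"], log_extra_deps=True)

print(stays_cached(buggy, "Image:7"))  # True: the stale pagelet keeps being served
print(stays_cached(fixed, "Image:7"))  # False: the pagelet is invalidated
```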

 

Posted by msciab