09Jan2013

What is wrong in Fatwire/WCS development 2: Deployment Hell

One "feature" of WCS, being a CMS, is that code is deployed in the same way as content. Code is treated as content and managed in the same way. We will see that this can create many problems.

A content editor doing his job changes some content and then approves it for publishing. WCS is smart enough to detect dependent content, and it requires the related content to be approved too, so that everything is published in a single publishing session.

This is great for a web site, where you only have to update a single content asset to update all the web pages referring to that content in the web site. Furthermore the publishing process is smart enough to invalidate only parts of the cache affected by the changed content.

Developers should work in the same way: on the development server, a developer can change the template code, then approve it and finally publish it. Code should then go from development servers to staging servers and finally to live servers.

Let's put aside for now that having a single development server for multiple developers is a problem in itself (I will say more about this later). Let's have a look at what developers really do, and why this way of developing code does not work as well as it should.

How developers REALLY develop...

Development procedures vary a lot. The most common one, even though better tools are now available, is still to use the aging ContentServer Explorer (now Sites Explorer) to directly edit the JSP code stored in the ElementCatalog table.

Unfortunately, when you edit a JSP, the associated Template or CSElement is not aware that you changed the code with CSExplorer. So to make sure the "code publishing" mechanism works, you have to manually edit and save the Template or CSElement corresponding to the JSP you edited, then approve it and finally publish it.

Being a manual process, it happens way too often that someone forgets either the edit/save or the approval of a changed template.

Also, propagating the code from staging to live requires re-approving the templates. Although you could theoretically just do a bulk approve, many people are scared of republishing everything. So what usually happens is that all the changed templates are manually approved and published, following manually kept release notes.

Since the person who deploys templates from development to staging is usually not the person who developed them, a floating document listing the changed elements, or worse a flow of random emails, is used to pass that information along.

At some point someone makes a mistake, forgets to approve a template, distributes a list with the wrong templates... and problems that did not exist in development start to appear in staging or in production, randomly.

When different developers are involved, or there is turnover in the editorial team, it happens way too often that you no longer know what is deployed on which server. I have seen people periodically spend days comparing each template across servers just to figure out what went wrong and where a bug came from.

But wait... there is more

Actually, things can get much worse than this.

Another problem happens very often when developers are forced to develop on a system disconnected from the staging/delivery chain.

This may happen for many reasons. The most common is some brain-damaged security policy, but there can be other, more practical reasons, for example: "the connection from UK to India is too slow and we had to deploy a local development server".

The current solution to this problem is CSDT but, to be honest, it is not yet very widely used. People are very creative in solving the problem of distributing their development work. Some use CatalogMover, but I have seen people distributing their work as a database dump, and even manually copying and pasting the code into the Fatwire Advanced Interface.

Needless to say, this aggravates the deployment hell already described in the previous section.

But the worst situation, which I have seen too, is when developers work on their development server, other people fix some issues (usually in HTML) by editing templates directly in staging, and at the same time some urgent issues are fixed by manually editing templates directly on the live server. The result, as you can imagine, is a totally unmanageable mess. And unfortunately, even if it is an extreme case, it happens.

What you really need

The whole idea of deploying code as a collection of separate, individually deployable elements is wrong, and it all originates from treating code as content.

Code is inherently different from content. It has a different structure, a different development process, different editing tools. More importantly, code must always be deployed as a single unit, ideally as a single file, easily trackable and recognisable. You should immediately know that you have, for example, version 1.3.14 in production, and bug reports and fixes must refer to a specific version number.

So the fact that Fatwire/WCS allows code to be treated as separate, independently deployable entities is a weakness.

Java has a concept of deployment unit: the "jar" file. Fatwire/WCS is one of the few places in the Java world where code is not deployed in jars, but is instead delivered as separate templates deployed through publishing.

What is really needed is that all the code for a site can be distributed as a single JAR file that can be easily deployed, tracked, compared and versioned.

All the deployment hell I described would go away if, instead of a bunch of files, you had a jar. The jar can be built by developers, tested separately with bug reports and fixes referring to a specific build, delivered to its destination and deployed by just copying the file and, if needed, running some schema update procedure.
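To make the "you should immediately know which version is deployed" point concrete, here is a minimal sketch of a check script. It relies only on the fact that a jar is a zip archive with a `META-INF/MANIFEST.MF` entry, and that `Implementation-Version` is a standard manifest attribute. The function names and the demo jar are assumptions made up for illustration; they are not part of Fatwire/WCS or of any existing tool.

```python
import io
import zipfile


def read_jar_version(jar_file):
    """Return the Implementation-Version from a jar's manifest, or None if absent."""
    with zipfile.ZipFile(jar_file) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in manifest.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Implementation-Version":
            return value.strip()
    return None


def build_demo_jar(version):
    """Build a tiny in-memory jar (hypothetical, for the demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr(
            "META-INF/MANIFEST.MF",
            f"Manifest-Version: 1.0\r\nImplementation-Version: {version}\r\n",
        )
    buffer.seek(0)
    return buffer


deployed_version = read_jar_version(build_demo_jar("1.3.14"))
```

Run against the jar copied to each server, a script like this would answer "what is deployed where" in seconds, which is exactly what a pile of individually published templates cannot do.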

Jars have a shortcoming, of course. Usually they require a restart of the application server to be recognized, which is not usually acceptable in a live environment. Nonetheless, it is not always true that deploying a new jar requires an application server restart. There are plenty of hot-reloading Java systems; to mention just one, JBoss can hot-reload jars. So a system where a site is deployed as jars, without restarting the application server, is possible (and indeed, I have already implemented such a system).

I will continue to list WCS/Fatwire development problems in the next few posts before introducing my solution to those problems. Stay tuned.

Filed under: Features, Tutorial No Comments

27Feb2011

Assets as classes, Templates as methods

When you design a content model and its related templates in FatWire, it helps to think of Asset types as classes, and of Templates as methods that apply to those classes to render the content.

Although it may sound odd, here are a few examples.

Let's start with a Page (a common asset that you normally use as is). Which methods can you apply to this class? Well, methods here always refer to rendering methods.

It is common to want to display a Page as a full page. Because the header and footer are normally managed by the Layout, you can call this method Body. So Page/Body is a template (working like a method) that renders a Page in full, as a body. Another common case is that you just want to render the Page as a simple link. Here another method can help; let's call it Page/Link. It is also very common to want to render a summary, with Page/Summary.

Now let's consider another type, for example a user-defined type like Article. If it is a class, which methods should we use? Obviously we can think of something like before: Article/Body to render an article as the body of a full web page, Article/Link to render it as a simple link, and Article/Summary to render just a summary of it.

The underlying idea

Now, it is important to understand that we should use common method names, and for a precise reason: if you add an Article or a Page to a container, and you want to render, for example, a list of summaries of Pages or Articles, you can do it without having to write code to distinguish the different cases.

So, good rules to follow when designing (or naming) the templates are:

decide on and use a common naming convention for templates, for example Summary, Link, Body, etc.
give a type, and preferably a subtype, to each template you use
reserve the "applies to many types" option for the few templates that (like the Layout) really apply to different types

You have to consider that when you perform a "call template", calling the same template name for different types must invoke a template that is equivalent in function for each type.

Put simply, a call template for "Summary" must call Page/Summary if the requested asset is a Page, or Article/Summary if the requested asset is an Article. And both must produce valid HTML, good for any summary in the design of your site.
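The dispatch described above can be modelled in a few lines. This is only a sketch of the idea, written in Python rather than JSP: the registry, the `call_template` function and the asset dictionaries are all hypothetical and are not the Fatwire API.

```python
# Registry keyed by (asset type, template name); entries are made up for illustration.
TEMPLATES = {}


def template(asset_type, name):
    """Register a rendering function as the template asset_type/name."""
    def register(func):
        TEMPLATES[(asset_type, name)] = func
        return func
    return register


@template("Page", "Summary")
def page_summary(asset):
    return f'<div class="summary">{asset["title"]}</div>'


@template("Article", "Summary")
def article_summary(asset):
    return f'<div class="summary">{asset["headline"]}</div>'


def call_template(name, asset):
    """Pick the template by the asset's type, like a method call on a class."""
    return TEMPLATES[(asset["type"], name)](asset)


def render_summaries(assets):
    # The caller never checks the asset type: the common name does the work.
    return "\n".join(call_template("Summary", a) for a in assets)
```

Note that `render_summaries` contains no `if type == ...` branch. That is the whole benefit of the common naming convention: adding a new asset type only requires adding its own Summary template.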

This is one of the most powerful (and misunderstood) features in Fatwire ContentServer.

Filed under: Features, Tips and Tricks No Comments

25Feb2011

When to use Basic Assets

Nowadays Flex Assets are predominantly used to build Fatwire websites.
Basic Assets are however still there, and there are some cases where I consider (and use) them. Here are a couple of examples:

Data Migration

Basic Assets are closer in their underlying structure to a database table than Flex Assets are. So it is viable to fill them with database queries. It is much more difficult to do so with Flex Assets.

For example, recently I had to migrate a large dataset that was not expected to change. It was simpler to create a basic asset and then move the data into the generated table than to use a flex asset and the Bulk Loader (which is nonetheless available for those tasks as well).

User Generated Content

Another case is storing user-generated content, for example user registrations. Instead of building a full-featured Flex Asset (using for example the Asset API), it was simpler to just create a plain basic asset and register users by creating new instances of that asset.

It is also faster to create an instance of a Basic Asset than of a Flex Asset, and this can be a winning feature sometimes.

Filed under: Features, Tips and Tricks 1 Comment

13Jan2011

What is an ACL?

Have you ever wondered what an ACL in Fatwire really is?

I have seen countless times that a customer who has to create a user gets lost when he sees the following option:

The common reaction is just to select everything and move on...

Not a great choice in terms of security.

Also, very often customers do not clearly distinguish ACLs from Roles. They often think they are the same thing.

So, if you have this problem, read on...

So, what is an ACL, anyway?

First and foremost, an ACL is just a symbolic name describing a set of permissions.

ACLs are configured in the SystemACL table.

You can see the existing ACLs by looking into the SystemACL table itself. If you try to edit, for example, the PageEditor ACL, you will see something like this:

[Screenshot: PageEditor ACL]

Now, think of this set as a set of actions you can execute against a database table.

You can read a specific row, search in the table, create or update a row. Also, as a special permission, you can read the revision tracking (which is basically a log of the modifications) and edit it (removing old revisions).

Ok, they are CRUD permissions. What now?

So far an ACL just describes permissions you have on a single database table. But how do you associate a permission to a database table?

The secret is the SystemInfo table: take a look at the following snapshot, taken using ContentServer Explorer:

[Screenshot: SystemInfo with ACL]

As you can see, each table is protected by an ACL. And the ACL describes what you can do with that table, if the user accessing the table has that ACL.

So, here is how ACLs actually work:

First, you are identified:

If you did not use any password, you are the DefaultUser, and have the ACLs of the DefaultUser.
Otherwise, you were identified using a username and password, and you now have the ACLs associated with that username.

Second, this is what happens:

Whatever you do, at some point you will try to perform an action against a database table.
The system checks if you have one of the ACLs associated with the table. If not, access is denied.
If yes, the system checks if the ACL allows you to perform the action you requested. If not, access is denied.
Otherwise, you will be able to execute the requested action as the logged-on user.
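The two checks above can be sketched as a small model. Everything here is illustrative: the permission names, the ACL contents and the `can_access` function are assumptions, not the real contents of SystemACL or SystemInfo. The sketch also assumes that, when a user shares several ACLs with a table, any one of them granting the action is enough.

```python
from dataclasses import dataclass

# Hypothetical permission names, loosely following the actions described earlier.
READ, SEARCH, CREATE, UPDATE = "read", "search", "create", "update"


@dataclass(frozen=True)
class Acl:
    name: str
    permissions: frozenset


def can_access(user_acls, table_acls, acl_definitions, action):
    """Return True if a user holding user_acls may perform action on the table."""
    # Check 1: the user must hold at least one ACL attached to the table.
    shared = set(user_acls) & set(table_acls)
    if not shared:
        return False
    # Check 2 (assumption): one shared ACL allowing the action is sufficient.
    return any(action in acl_definitions[name].permissions for name in shared)


# Illustrative data only; the real permission sets live in SystemACL / SystemInfo.
acl_definitions = {
    "ElementReader": Acl("ElementReader", frozenset({READ, SEARCH})),
    "ElementWriter": Acl("ElementWriter", frozenset({READ, SEARCH, CREATE, UPDATE})),
}
element_catalog_acls = ["ElementReader", "ElementWriter"]

editor = ["ElementReader"]
developer = ["ElementReader", "ElementWriter"]
outsider = ["SomeOtherAcl"]  # hypothetical ACL not attached to the table
```

With this data, the editor can read the table but not update it, the developer can do both, and the outsider is rejected at the first check, before any permission is even looked at.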

So basically, ACLs protect database tables. They are a very low-level feature.

They have nothing to do with Roles, which are a CMS concept (a more elaborate and very different feature).

Hmm, what do the predefined ACLs mean?

Here are a few random notes taken looking at the standard SystemInfo configuration:

- an ElementReader can read the ElementCatalog while an ElementWriter can modify it. Writing the ElementCatalog is usually required only by developers.

- a PageReader can read the SiteCatalog while a PageWriter can modify it. SiteCatalog configures the cache, so access to it is required mostly by a system administrator (or by a developer who configures cache and permissions).

- SiteGod is present everywhere and allows everything, so it is very dangerous to give this permission. Normally it is there only for maintenance reasons.

- xceleditor is required to access the CMS tables, so any user in the editorial team requires it.

Filed under: Features 1 Comment

03Jan2011

Basic and Flex assets

In Fatwire, there are 2 types of custom assets: Basic Assets and Flex Assets.

Some Fatwire sites, developed ages ago, are still based on Basic Assets, but the vast majority of recent Fatwire web sites are instead based on Flex Assets.

Why Flex Assets?

Modern sites are based on Flex Assets for a reason. Flex Assets are… well, flexible.

The fundamental feature of a Flex Asset is the ability to add fields at any moment.
You can update your site at any time, adding new fields but keeping all the existing content.

Also, Flex Assets are organized in Families, so you can easily build hierarchies and render them in a tree tab.

There are also a lot of other features that I simply cannot describe in this post.

But this flexibility comes at a cost.

The underlying database structure of flex assets is pretty complex.
Attributes are not stored simply as database fields, so the underlying database structure is pretty obscure.
Asset creation is slower than with Basic Assets.
Searching attributes has some limitations and can require complex code.

That said, this is not to discourage the use of Flex Assets. On the contrary, they are the best choice for almost all the common use cases.

However, there are a few cases where you can still consider using Basic Assets.

Why Basic Assets

Basic assets, by contrast, do not have all this flexibility. Once created, they have a fixed structure.

The normal way to upgrade them is to drop them and recreate them with the new format.

You can still save the content, but you may need some database magic.

But the underlying structure is simple and straightforward.

Basically

for each basic asset there is a database table,
for each attribute there is a field in the table.
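The practical consequence is that a basic asset can be loaded with plain SQL. The sketch below uses an in-memory SQLite database as a stand-in. The asset type `Store`, its two attributes and the `legacy_shops` source table are all hypothetical; the table of a real basic asset would also carry system columns that a real migration has to fill in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the table generated for a hypothetical basic asset "Store":
# one table for the asset type, one column per attribute.
conn.execute("CREATE TABLE Store (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Hypothetical legacy table holding the data to migrate.
conn.execute("CREATE TABLE legacy_shops (shop_name TEXT, town TEXT)")
conn.executemany(
    "INSERT INTO legacy_shops VALUES (?, ?)",
    [("Alpha", "London"), ("Beta", "Pune")],
)

# Because attributes are plain columns, the migration is a single INSERT ... SELECT.
conn.execute("INSERT INTO Store (name, city) SELECT shop_name, town FROM legacy_shops")

rows = conn.execute("SELECT name, city FROM Store ORDER BY name").fetchall()
```

Nothing comparable is possible with a flex asset without first understanding how its attributes are spread over several tables.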

This property turns out to be useful in many cases. So I do not consider Basic Assets to be obsolete, but simply a Fatwire feature that is useful for certain tasks.

So far, I have found basic assets to be useful, for example, to:

1. store user-generated content: because it is faster and simpler to create a basic asset than a flex asset

2. store legacy content: because the straightforward structure of a basic asset makes it easy to load the content with simple database queries

Filed under: Features No Comments

31Dec2010

What cache dependencies are

One of the best kept "secrets" of Fatwire is the cache invalidation mechanism.

Rendering an asset is slow.

Really slow. It is slow because there are many database queries involved. So, once a template is rendered, the result is cached (at least, if you have enabled caching for that template).

But, when you cache a component, what happens if the source asset changes? I mean, what happens if you render an article, and then the article is updated? Of course, when the source asset is updated the cached element is invalidated.

Asset dependencies

There is actually a database table tracking all the dependencies. For each cached element (called "pagelet") in the database, a dependency is stored in that table. That is, the information:

"which pagelet must be invalidated when an asset is updated".

Publishing invalidates the cache

When you publish, the cache is updated as well.

For each published asset,

1. all the dependencies are searched for

2. all the cached pagelets that depend on the published asset are invalidated.

Please note that this happens only for the pagelets that have dependencies on the published assets.
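The mechanism can be modelled with two dictionaries: one holding the cached pagelets, one playing the role of the dependency table. This is a conceptual sketch only; the class, its method names and the asset ids are invented for illustration and do not reflect how Fatwire stores this information internally.

```python
from collections import defaultdict


class PageletCache:
    """Toy model of a pagelet cache with asset dependencies."""

    def __init__(self):
        self._pagelets = {}                   # pagelet key -> rendered HTML
        self._dependents = defaultdict(set)   # asset id -> keys of dependent pagelets

    def store(self, key, html, asset_ids):
        """Cache a rendered pagelet and record the assets it was rendered from."""
        self._pagelets[key] = html
        for asset_id in asset_ids:
            self._dependents[asset_id].add(key)

    def get(self, key):
        return self._pagelets.get(key)

    def publish(self, asset_id):
        """Invalidate only the pagelets that depend on the published asset."""
        invalidated = self._dependents.pop(asset_id, set())
        for key in invalidated:
            # A key may already be gone if another asset invalidated it earlier.
            self._pagelets.pop(key, None)
        return invalidated


cache = PageletCache()
cache.store("home", "<h1>Home</h1>", ["page:1", "article:7"])
cache.store("article-7-body", "<p>Body</p>", ["article:7"])
cache.store("about", "<h1>About</h1>", ["page:2"])

invalidated = cache.publish("article:7")
```

After publishing `article:7`, the home page and the article body are dropped from the cache, while the unrelated `about` pagelet stays cached.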

Smart Cache

So you don't need to rebuild the whole cache, only the pagelets that were invalidated: just a handful of them, those that depend on the newly published assets.

It is a smart and powerful mechanism. Unfortunately, it is also a mechanism that is easy to break if you don't follow a design and some simple rules when coding templates.

Or worse, if you ignore the caching and the dependency mechanism.

Filed under: Features 1 Comment

28Dec2010

Fatwire multiple caches

Fatwire Content Server has many caches, and it is very important to understand all of them.

Here is a short list of the caches you have to consider:

1. Content Server Cache

This is a content cache (primarily HTML), stored on disk. It can become very large because it caches the rendered HTML in full. It is fast, but it reads the content from disk, and this can slow down a busy website.

2. Satellite Server Cache

This is another content cache (HTML). This cache is stored in memory. To limit the memory required, it does not store the full HTML but keeps the content split into different pieces, and reassembles it on demand.

3. Blob Server Cache

This cache is for binary content. It is a memory cache as well, up to a certain limit; beyond that it becomes a disk cache.

4. ResultSet Cache

This cache is for the results of queries. To avoid running the same query again and again, Fatwire stores the result and keeps it in memory for a while. Since many queries are the same, this can help a lot. Actually, thanks to this cache, Fatwire often performs well even with very slow databases.
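The "keep it in memory for a while" idea can be sketched as a small time-based cache in front of the database. This models only expiry by age; the class, its parameters and the fake database function are hypothetical, and the sketch makes no claim about how Fatwire actually configures or flushes its resultset caches.

```python
import time


class ResultSetCache:
    """Toy query-result cache: identical SQL is served from memory until it expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the demo does not have to sleep
        self._entries = {}       # sql text -> (expiry time, rows)

    def query(self, sql, run_query):
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and entry[0] > now:
            return entry[1]                      # fresh hit: the database is not touched
        rows = run_query(sql)                    # miss or expired: go to the database
        self._entries[sql] = (now + self._ttl, rows)
        return rows


# Demo with a fake database and a fake clock (both made up for illustration).
calls = []


def fake_db(sql):
    calls.append(sql)
    return [("row",)]


fake_time = [0.0]
rs_cache = ResultSetCache(ttl_seconds=60, clock=lambda: fake_time[0])

rs_cache.query("SELECT * FROM Page", fake_db)
rs_cache.query("SELECT * FROM Page", fake_db)   # served from memory
calls_while_fresh = len(calls)

fake_time[0] = 61.0                              # the entry has now expired
rs_cache.query("SELECT * FROM Page", fake_db)   # runs against the database again
calls_after_expiry = len(calls)
```

Two identical queries within the time window hit the database once, which is why a slow database hurts much less than one would expect.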

Filed under: Features 1 Comment