What is wrong in Fatwire/WCS development 2: Deployment Hell « Fatcoders WebCenter Sites

09Jan2013

What is wrong in Fatwire/WCS development 2: Deployment Hell

One "feature" of WCS, being a CMS, is deployment of code done in the same way as content. Code is treated as content and managed in the same way. We will see that this fact can create many problems.

A content editor doing is job changes some content, and then he approves it for publishing. WCS is smart enough to detect dependent content and requires the approval of related content to publish it in single publishing session.

This is great for a web site, where you only have to update a single content asset to update all the web pages referring to that content in the web site. Furthermore the publishing process is smart enough to invalidate only parts of the cache affected by the changed content.

Developers should work in the same way: in the development server, a developer can change the template code, then approve it and finally publish it. Code should then go from development servers to staging servers and finally to live servers.

Let's put aside for now that having a single development server for multiple developers is a problem in itself (I will say more of this later), let's give a look at what developers really do and why this way of developing the code does not work as good as it should.

How developers REALLY develop...

There is a great variation in development procedures. The more common, even if now there better tools are available, many are still using the aging ContentServer Explorer (now Sites Explorer) editing directly JSP code stored in the ElementCatalog table.

Unfortunately, when you edit a JSP, the associated Template or CSElement is not aware that you changed the code with CSExplorer. So to make sure the "code publishing" mechanism work, you have to manually edit and save the Template or CSElement corresponding to the JSP you edited, then approve it and finally publish it.

Being a manual process, way too often happens that someone forget either the edit/save or the approval of a changed template.

Also the propagation of the code from staging to live requires a re-approval of the templates. Although you can theoretically could just do a bulk approve, many people are scared of republishing everything. So what usually happens is that all the changed templates are manually approved and published, using manually kept release notes.

Since usually who deployed templates from development to staging is a different person from whom developed them, a floating document with the list of the changed element, or worse a flow of random email is used to propagate those informations.

At some point someone makes a mistake, forget to approve a template, distribute a list with the wrong templates ... and problems not existing in development starts to appear in staging or in production, randomly.

When different developers are involved, or there is a turnaround in the editorial team, it happens way to often that you no more know what is deployed in which server. I have seen people periodically spending days comparing each template in different server just to figure out what went wrong and the origin of a bug.

But wait... there is more

Actually things can go much worse than this.

Another problem happens very often when developers are forced to develop on a system disconnected from the staging/deliver chain.

This may happen for many reasons, the more common is some brain-damaged security policy, but there can are other more practical reasons, for example: "the connection from UK to India is too slow and we had to deploy a local development server".

The current solution to this problem is CSDT but to be honest, it is not yet very widely used. People are very creative in solving the problem of distributing their development work. Some uses catalog mover, but I have seen people distributing their work as a database dump and even manually copying and pasting the code in the Fatwire Advance Interface.

Needless to say, this is aggravating the deployment hell already described in the previous paragraph.

But the worse situation, that I have seen too, is when developers are developing in their development server, then some other people fix some issues (usually in HTML) editing directly templates in staging, and at the same time some urgent issues are also fixed manually editing templates directly on the live server. The result as you can imagine is a total unmanageable mess. And unfortunately, even is is an extreme case, it happens.

What you really need

The whole idea of deploying code as a collection of separate elements singularly deployable is wrong, and all this is originated because code is treated as content.

Code is inherently different from content. It has a different structure, a different development process, different editing tools. More important, code must be always deployed always as a single unit, ideally as a single file, easily trackable and recognisable. You should immediately know that you have for example version 1.3.14 on production, and bug report and fixes must refer to a specific version number.

So the fact that Fatwire/WCS allows code to be treated as separate entities independently deployable is a weakness.

Java has a concept of deployment unit: it is called the "jar" file. Fatwire/WCS is one of the few Java places where code is not deployed in jars, but it is instead delivered as separate templates deployed through publishing.

What is really needed is that all the code for a site can be distribuited as single JAR fil, that can be easily deployed, tracked, compared, distribuited and versioned.

All the deployment hell I described would go away if instead of having a bunch of files, you have a jar. The jar can be built by developers, tested separately with bug report and fixed referring to a specific build, delivered to destination and deployed just copying the file and eventually running some schema update procedure.

Jars have a shortcoming, of course. Usually they require the restart of the application server to be recognized. This in a live environment is not usually acceptable. Nonetheless, it is not always true that deploying a new jar requires the application server restart. There are plenty of hot-reloading Java systems. Just to mention one, hot reloading of jars in JBoss. So it is possible a system where a site is deployed in jars that are deployed without restarting the application server (and indeed, I already implemented such a system).

I will continue to list WCS/Fatwire development problems in the next few posts before introducing my solution to those problems. Stay tuned.

Posted by msciab

09Jan2013