[Clfs-support] Simplification of the XML

Wed May 19 18:24:09 PDT 2010

Over the past week or so I've taken an interest in simplifying the XML 
once again, and I've been exploring a couple of different routes. I would 
like a few opinions on the approach.

All of my attempts center on a new element, <archopt>, and its child, 
<archentry>. Together they work a lot like a case statement in a 
programming language: only the entry matching the target architecture 
is kept.

A few examples of how this new element could work:

<archopt>
<archentry arch="x86">some xml that will only appear on x86 here</archentry>
<archentry arch="alpha">some xml that will only appear on alpha here</archentry>
<archentry arch="default">some xml that will appear on every other arch</archentry>
</archopt>

This element can technically be used to wrap chapters, titles, paras, 
programlistings, anything.
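
To make the case-statement behaviour concrete, here is a rough sketch in 
Python. It is only an illustration, not the stylesheet itself. The 
function names are made up for this example, and two details are 
assumptions: an <archopt> with no matching entry and no "default" entry 
is dropped entirely, and the document root is not itself an <archopt>.

```python
import xml.etree.ElementTree as ET


def pick_entry(archopt, arch):
    """Return the <archentry> for arch, else the arch="default" one, else None."""
    default = None
    for entry in archopt.findall("archentry"):
        if entry.get("arch") == arch:
            return entry
        if entry.get("arch") == "default" and default is None:
            default = entry
    return default


def expand(parent, arch):
    """Replace each <archopt> below parent, in place, by its selected content."""
    new_children = []

    def append_text(text):
        # Attach loose text to the previous sibling's tail, or to the
        # parent's leading text when there is no previous sibling yet.
        if not text:
            return
        if new_children:
            last = new_children[-1]
            last.tail = (last.tail or "") + text
        else:
            parent.text = (parent.text or "") + text

    for child in list(parent):
        if child.tag == "archopt":
            entry = pick_entry(child, arch)
            if entry is not None:
                expand(entry, arch)  # handle nested <archopt> first
                append_text(entry.text)
                new_children.extend(list(entry))
            append_text(child.tail)
        else:
            expand(child, arch)
            new_children.append(child)
    parent[:] = new_children


def expand_document(xml_text, arch):
    """Parse xml_text, expand it for arch, and return the result as a string."""
    root = ET.fromstring(xml_text)
    expand(root, arch)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<para>Build for <archopt>
<archentry arch="x86">x86</archentry>
<archentry arch="alpha">alpha</archentry>
<archentry arch="default">a generic target</archentry>
</archopt>.</para>"""

print(expand_document(SAMPLE, "x86"))   # <para>Build for x86.</para>
print(expand_document(SAMPLE, "mips"))  # <para>Build for a generic target.</para>
```

An architecture without its own entry ("mips" above) falls through to 
the "default" entry, the same way a default branch would.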

The method I've devised for rendering the book is a two-part process. The 
first step is an expansion stage: an XSLT script that is run once per 
architecture and produces a single XML file for that architecture. That 
single file can then be processed again using docbook-xsl and turned into 
the html/pdf book as we know it.

To dissect it further: I wrote a very relaxed DTD called expansion.dtd. 
It is built on the DocBook 4.5 DTD and adds support for the 
archopt/archentry elements. It is not very strict and currently doesn't 
allow the elements everywhere, but it does allow them inside and around 
most of the existing DocBook structure. This is more of a rough 
verification to make sure that the document is in the ballpark and that 
the syntax is correct enough for the first stylesheet to process without 
errors. This verification can be skipped without any problem; a separate 
target can be added to the makefile for it.
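
As an idea of what a "ballpark" check looks for, here is a small Python 
stand-in. It is not DTD validation and says nothing about what 
expansion.dtd actually enforces. The three rules below (archopt contains 
only archentry children, every archentry has an arch attribute, no arch 
value is repeated within one archopt) are assumptions chosen for the 
example, and check_archopts is a made-up name.

```python
import xml.etree.ElementTree as ET


def check_archopts(xml_text):
    """Return a list of problems found; an empty list means nothing obvious."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    problems = []
    for elem in root.iter():
        if elem.tag == "archopt":
            seen = set()
            for child in elem:
                if child.tag != "archentry":
                    problems.append(f"<{child.tag}> directly inside <archopt>")
                    continue
                arch = child.get("arch")
                if not arch:
                    problems.append("<archentry> without an arch attribute")
                elif arch in seen:
                    problems.append(f'duplicate <archentry arch="{arch}">')
                seen.add(arch)
        elif elem.tag == "archentry":
            parent = parent_of.get(elem)
            if parent is None or parent.tag != "archopt":
                problems.append("<archentry> outside of <archopt>")
    return problems


GOOD = ('<para><archopt><archentry arch="x86">a</archentry>'
        '<archentry arch="default">b</archentry></archopt></para>')
BAD = ('<para><archopt><archentry arch="x86">a</archentry>'
       '<archentry arch="x86">b</archentry></archopt></para>')

print(check_archopts(GOOD))
print(check_archopts(BAD))
```

A check like this catches copy-and-paste slips, such as two entries for 
the same architecture, before the expansion stylesheet ever runs.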

The first stylesheet is clfsexpansion.xsl. It expands the input based on 
the clfs.arch parameter passed to xsltproc with --stringparam. This 
outputs a single XML file with all entities and includes expanded, just 
like the validation does now. This file is pure DocBook, with no 
customization left at this point, so we can do a real validation on 
these files and make sure they are accurate.

From here we can do a validation if we want, or just continue processing.

The second stylesheet, db_xhtml_chunks.xsl, renders a standard chunked 
XHTML DocBook book using XSLT, just like we do now. From this set of 
files it is very easy to do nochunks, chunks, pdf, wget lists, and dumps.

Thoughts, comments? I would really like to implement this ASAP, but I 
don't want to make a change this major without putting it out there 
first. When it's all said and done, it should make the XML considerably 
smaller. We have something like 6000+ xi:includes right now; I bet we 
could bring that down to under 500 and remove almost every duplicate 
file. (Rough numbers off the top of my head.)

Joe Ciccone
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clfs-simp-r3.tar.bz2
Type: application/octet-stream
Size: 6363 bytes
Desc: not available
URL: <http://lists.clfs.org/pipermail/clfs-support-clfs.org/attachments/20100519/447ae39b/attachment-0001.obj>