[Clfs-support] Simplification of the XML
Joe Ciccone
jciccone at gmail.com
Wed May 19 18:24:09 PDT 2010
Over the past week or so I took up an interest in simplifying the XML
once again. I've been exploring a couple of different routes. I would
like a few opinions on this process.
All of my attempts have been centered around the addition of a new
element, <archopt> and its child, <archentry>. They work a lot like a
case statement in any other programming language.
A few examples of how this new element could work:
<archopt>
<archentry arch="x86">some xml that will only appear on x86 here</archentry>
<archentry arch="alpha">some xml that will only appear on alpha here</archentry>
<archentry arch="default">some xml that will appear on every other arch</archentry>
</archopt>
This element can technically be used to wrap chapters, titles, paras,
programlistings, anything.
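To make the wrapping idea more concrete, here is a small sketch of how archopt might sit inside ordinary DocBook markup. Only archopt, archentry and the arch attribute come from the proposal; the section id, the package, the x86_64 value and the configure commands are invented for illustration.

```xml
<sect1 id="ch-example-package">
  <title>Example Package</title>
  <para>Prepare the package for compilation:</para>
  <!-- Exactly one archentry survives expansion for a given architecture. -->
  <archopt>
    <archentry arch="x86_64">
      <screen><userinput>./configure --prefix=/usr --libdir=/usr/lib64</userinput></screen>
    </archentry>
    <archentry arch="default">
      <screen><userinput>./configure --prefix=/usr</userinput></screen>
    </archentry>
  </archopt>
  <para>Compile the package:</para>
  <screen><userinput>make</userinput></screen>
</sect1>
```

The shared paras and the make step are written once; only the part that differs between architectures goes inside the archopt.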
The method I've devised for rendering the book is a two-part process. The
first step is an expansion stage: an XSLT script that is run once per
architecture and produces a single XML file for that architecture. That
single file can then be processed again using docbook-xsl and turned into
the HTML/PDF book as we know it.
To dissect it further: I wrote a very relaxed DTD called expansion.dtd.
It is built off of the DocBook 4.5 DTD and adds support for the
archopt/archentry elements. It is not very strict and currently doesn't
allow the elements everywhere, but it allows them to exist in and out of
most of the existing DocBook structure. This is more of a rough
verification to make sure that the document is in the ballpark and that
the syntax is correct enough for the first stylesheet to process without
errors. This verification can be skipped without issue; another target
can be added to the makefile for this.
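As a rough idea of how the two elements might be declared on top of DocBook 4.5, here is a minimal sketch. It is not the contents of expansion.dtd. It uses an internal DTD subset instead of a separate file, and it assumes that DocBook's local.para.class extension hook is enough for the example, which only admits archopt where a para-class element is allowed. The content models (archentry+ and ANY) are guesses at what "very relaxed" could mean.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
  <!-- Admit archopt wherever DocBook allows a para-class element. -->
  <!ENTITY % local.para.class "|archopt">
  <!-- archopt is a container for one or more archentry children. -->
  <!ELEMENT archopt (archentry+)>
  <!-- Relaxed on purpose: an archentry may hold text and any declared element. -->
  <!ELEMENT archentry ANY>
  <!ATTLIST archentry arch CDATA #REQUIRED>
]>
<chapter>
  <title>Example</title>
  <archopt>
    <archentry arch="x86"><para>Text for x86 only.</para></archentry>
    <archentry arch="default"><para>Text for every other arch.</para></archentry>
  </archopt>
</chapter>
```

Allowing archopt around titles, chapters or inline content would need further hooks of the same kind, which this sketch leaves out.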
The first stylesheet is clfsexpansion.xsl. It expands the input based on
the clfs.arch parameter passed to xsltproc as a stringparam. This outputs
a single XML file with all entities and includes expanded, just like the
validation does now. This file is pure DocBook, with no more
customization at this point. From here we can do a real validation on
these files to make sure they are accurate, or just continue processing.
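The expansion step described above could look roughly like the following XSLT 1.0 stylesheet. This is a sketch of the idea, not the attached clfsexpansion.xsl: the parameter name clfs.arch and the element names come from this proposal, while the fallback to a "default" entry when no match exists and the use of an identity transform for everything else are assumptions. Entity and XInclude expansion is assumed to be done by the processor itself (for example with xsltproc's xinclude option), not by these templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Target architecture, supplied as a string parameter by the caller. -->
  <xsl:param name="clfs.arch" select="'default'"/>

  <!-- Identity transform: copy everything that is not an archopt as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each archopt with the children of one archentry. -->
  <xsl:template match="archopt">
    <xsl:choose>
      <!-- Prefer the entry whose arch matches the requested one. -->
      <xsl:when test="archentry[@arch = $clfs.arch]">
        <xsl:apply-templates
            select="archentry[@arch = $clfs.arch][1]/node()"/>
      </xsl:when>
      <!-- Otherwise fall back to the default entry, if there is one. -->
      <xsl:otherwise>
        <xsl:apply-templates
            select="archentry[@arch = 'default'][1]/node()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because the archopt and archentry wrappers are dropped and only their contents are copied, the result contains no custom elements and can be validated against the stock DocBook DTD. Nested archopt elements are handled too, since the selected children are processed with apply-templates instead of being copied verbatim.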
The second stylesheet, db_xhtml_chunks.xsl, renders a standard chunked
XHTML DocBook book using XSLT, just like we do now. It is very easy to do
nochunks, chunks, pdf, wget lists, and dumps, all from this set of files.
Thoughts, Comments? I would really like to implement this asap. I don't
want to make a change this major without putting it out there first.
When it's all said and done, it should make the XML considerably
smaller. We have something like 6000+ xi:includes right now; I bet we
could bring that down to under 500 and remove almost every duplicate
file. (Rough numbers off the top of my head.)
Joe Ciccone
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clfs-simp-r3.tar.bz2
Type: application/octet-stream
Size: 6363 bytes
Desc: not available
URL: <http://lists.clfs.org/pipermail/clfs-support-clfs.org/attachments/20100519/447ae39b/attachment-0001.obj>