From 838cad73af7a64f466da25b31a09d784d901c67d Mon Sep 17 00:00:00 2001
From: Frank Steimke <fsteimke.hb@gmail.com>
Date: Mon, 18 Mar 2024 07:54:07 +0100
Subject: [PATCH 1/2] Hint about preprocessing modular DocBook (Ch. 4)

---
 src/guide/xml/ch04.xml     |   7 +
 src/guide/xml/ch04.xml.new | 694 +++++++++++++++++++++++++++++++++++++
 2 files changed, 701 insertions(+)
 create mode 100644 src/guide/xml/ch04.xml.new
diff --git a/src/guide/xml/ch04.xml b/src/guide/xml/ch04.xml
index 3dcb7bc16..8de93a3af 100644
--- a/src/guide/xml/ch04.xml
+++ b/src/guide/xml/ch04.xml
@@ -610,6 +610,13 @@ attribute to the root element (if it doesn’t already have one).</para>
 If it isn’t preserved, relative references to other documents will be resolved against
 the static base URI of the stylesheet and not the URI of the original document. That’s
 unlikely to be correct.</para>
+<para>You must also take into account that no XInclude processing has taken place at this time. 
+If you are using modular DocBook, the <parameter>transform-original</parameter> pipeline is usually a bad choice, 
+because it only operates on the root document, but not on the fragments referenced by XInclude. 
+If it is absolutely necessary to use a <parameter>transform-original</parameter> pipeline together with modular DocBook, 
+you can use Saxons <code>-x</code> switch to enable XInclude when parsing the document (see <xref linkend="java-main" xrefstyle="%label"/>). 
+Otherwise, however, the transform-before pipeline is usually the better choice 
+for modular DocBook documents<indexterm><primary>Modular DocBook</primary><secondary>processing pipelines</secondary></indexterm>.</para>  
 </listitem>
 </varlistentry>
 <varlistentry>
diff --git a/src/guide/xml/ch04.xml.new b/src/guide/xml/ch04.xml.new
new file mode 100644
index 000000000..8de93a3af
--- /dev/null
+++ b/src/guide/xml/ch04.xml.new
@@ -0,0 +1,694 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+	 xmlns:xlink="http://www.w3.org/1999/xlink"
+	 version="5.0" xml:id="implementation">
+<info>
+  <?db filename="ch-implementation"?>
+  <title>Implementation details</title>
+</info>
+
+<para>This section sketches out some features of the implementation.
+It would probably be better to build an annotated
+<link xlink:href="https://tdg.docbook.org/">Definitive Guide</link> or
+something, but this will have to do for now.
+</para>
+
+<section xml:id="custom-chunking">
+<title>Customizing chunking</title>
+
+<para>Chunking is controlled by the <parameter>chunk-include</parameter> and
+<parameter>chunk-exclude</parameter> parameters. These parameters are both
+strings that must contain an XPath expression.</para>
+
+<para>For each node in the document, the <parameter>chunk-include</parameter>
+parameter is evaluated. If it does not return an empty sequence, the element
+is considered a chunking candidate. In this case, the
+<parameter>chunk-exclude</parameter> parameter is evaluated. If the exclude
+expression <emphasis>does</emphasis> return an empty sequence, then the element identified
+becomes a chunk. (If the exclude expression returns a non-empty value, the element
+will not become a chunk.)</para>
+
+</section>
+
+<section xml:id="units">
+<title>Lengths and units</title>
+
+<para>Lengths appear in the context of images (width and height) and
+tables (column widths). Several different units of length are
+possible: absolute lengths (e.g., 3in), relative lengths (e.g., 3*),
+and percentages (e.g., 25%). In some contexts, these can be combined:
+a column width of “3*+0.5in” should have a width equal to 3 times the
+relative width plus ½ inch.</para>
+
+<para>In practice, some of the more complicated forms in <biblioref
+linkend="calstable"/> have no direct mapping to the units available in
+HTML and CSS. The stylesheets attempt to specify a mapping that’s
+close. Broadly, they take the nominal width of the table
+(<parameter>nominal-page-width</parameter>), subtract out the fixed
+widths, divide up the remaining widths proportionally among the
+relative widths, and compute final widths. The final widths can be
+expressed either in absolute terms or as percentages.</para>
+
+<para>In handling the width and height of images, the intrinsic width
+and height of the image in pixels are converted into lengths by
+dividing by <parameter>pixels-per-inch</parameter>. Nominal widths are
+taken into consideration if necessary.</para>
+
+<note>
+<para>Determining the intrinsic size of an image depends on an extension function.
+See <xref linkend="extensions"/>. Many bitmap image formats are supported.
+The bounding box of EPS images is used, if it’s present. The intrinsic size of
+SVG images is not available.</para>
+</note>
+
+<para>The list of recognized units (in, cm, etc.) are taken from
+<varname>v:unit-scale</varname>.</para>
+</section>
+
+<section xml:id="verbstyle">
+<title>Verbatim styles</title>
+
+<para>There are four verbatim styles: <literal>lines</literal>, <literal>table</literal>,
+<literal>plain</literal>, and <literal>raw</literal>.</para>
+
+<variablelist>
+<varlistentry><term><literal>lines</literal></term>
+<listitem>
+<para>In the lines style, each line of the verbatim environment is
+marked up individually. In this style, lines can be numbered and
+callouts can be inserted.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry><term><literal>table</literal></term>
+<listitem>
+<para>In the table style, each line of the verbatim environment is
+marked up individually, very much like the lines style. In this style,
+lines can be numbered and callouts can be inserted. It differs from
+the lines style in that the whole thing is wrapped in a table.</para>
+
+<para>The table has one row and two columns. The line numbers appear in the
+first column, the lines in the second. This format was added in order
+to improve the display in user agents that don’t support CSS.
+Ironically, in the course of adding this style, a number of changes
+were made to the way line numbers are formatted in the lines style
+making it largely, perhaps entirely, unnecessary.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry><term><literal>plain</literal></term>
+<listitem>
+<para>In the plain style, callouts can be inserted, but additional markup is not
+added (except for the callouts). Consequently, it isn’t possible to do line numbering
+or syntax highlighting. (It may be possible to provide these features with JavaScript
+libraries in the browser, however.)
+</para>
+</listitem>
+</varlistentry>
+<varlistentry><term><literal>raw</literal></term>
+<listitem>
+<para>In the raw style, no changes are made to the verbatim content. It’s output as
+it appears. Inline markup that it contains, <tag>emphasis</tag> or other elements, will
+be processed, but you cannot add line numbers, callouts, or syntax highlighting.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+
+<para>Consult <xref linkend="params"/> for a variety of parameters that control
+aspects of verbatim processing.</para>
+
+<section xml:id="line-numbers">
+<title>Line numbers</title>
+
+<para>In the <literal>lines</literal> and <literal>table</literal> styles, line
+numbers may be added to the beginning of some (or all) lines. Prior to version
+1.10.0, the stylesheets inserted the numbers without any padding:</para>
+
+<programlisting language="xml"><![CDATA[<span class="line">
+  <span class="ln">5</span>
+  <span class="ld">The line of text</span>
+</span>]]></programlisting>
+
+<para>(The newlines and indentation in these examples are for clarity. In practice, these
+are inside a <tag namespace="http://www.w3.org/1999/xhtml">pre</tag> where every
+space counts and they’re all run together with line breaks only occurring
+between lines.)</para>
+
+<para>In a graphical browser with CSS support, this looked fine. But
+without CSS, the line numbers and the text that followed them could
+flow together and the alignment of the numbers was unclear.</para>
+
+<para>Starting in version 1.10.0, the stylesheets insert padding
+spaces before each number so that they will all be aligned. If the
+largest line number is three digits long, every number smaller than
+100 will be padded to a width of three characters. A single space is
+added after the number to separate it from the text that follows.
+An additional separator may also be inserted, as shown here.</para>
+
+<programlisting language="xml"><![CDATA[<span class="line">
+  <span class="ln"> 5 <span class="nsep">|</span></span>
+  <span class="ld">The line of text</span>
+</span>]]></programlisting>
+
+<para>These changes have no visible effect when CSS is used to style
+the verbatim environment. But without CSS, the numbers are aligned and
+separated from the text that follows. The
+<parameter>verbatim-number-separator</parameter> is generally
+suppressed by CSS, but is visible in text browsers.</para>
+
+</section>
+</section>
+
+<section xml:id="processing-mediaobjects">
+<title>Processing mediaobjects</title>
+
+<para>Starting in version 1.11.0, the way media objects are processed has been
+refactored. This is designed to support fallback at both the object
+level (<tag>imageobject</tag>, <tag>audioaobject</tag>, <tag>videoobject</tag>,
+<tag>textobject</tag>, and <tag>imageobjectco</tag>) and at the data
+level (<tag>imagedata</tag>, <tag>audiodata</tag>, <tag>videodata</tag>,
+and <tag>textdata</tag> within the objects).</para>
+
+<para>Each data element and object element is processed in the
+<mode>m:mediaobject-info</mode> mode. This returns a map for each object
+that contains an array of maps, one for each data element:</para>
+
+<table>
+<title>The object map</title>
+<tgroup cols="2">
+<thead>
+<row>
+<entry>Key</entry>
+<entry>Value</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry><code>content-types</code></entry>
+<entry>An array of the distinct content types in the data elements</entry>
+</row>
+<row>
+<entry><code>datas</code></entry>
+<entry>An array of data maps</entry>
+</row>
+<row>
+<entry><code>extensions</code></entry>
+<entry>An array of the distinct extensions in the data elements</entry>
+</row>
+<row>
+<entry><code>node</code></entry>
+<entry>The media object node</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+
+<para>Each data map has the following structure:</para>
+
+<table>
+<title>The data map</title>
+<tgroup cols="2">
+<thead>
+<row>
+<entry>Key</entry>
+<entry>Value</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry><code>align</code></entry>
+<entry>The alignment of the data (if specified)</entry>
+</row>
+<row>
+<entry><code>content-type</code></entry>
+<entry>The computed content type for the data</entry>
+</row>
+<row>
+<entry><code>contentheight</code></entry>
+<entry>The content height of the data (if specified<footnote xml:id="fn-depth"><para>DocBook
+uses “depth” instead of “height”, but we convert
+it to height for consistency with most other systems.</para></footnote>)</entry>
+</row>
+<row>
+<entry><code>contentwidth</code></entry>
+<entry>The content width of the data (if specified)</entry>
+</row>
+<row>
+<entry><code>extension</code></entry>
+<entry>The extension of the data file (if there is one)</entry>
+</row>
+<row>
+<entry><code>fileref</code></entry>
+<entry>The original <tag class="attribute">fileref</tag> attribute value</entry>
+</row>
+<row>
+<entry><code>height</code></entry>
+<entry>The height of the data (if specified<footnoteref linkend="fn-depth"/>)</entry>
+</row>
+<row>
+<entry><code>href</code></entry>
+<entry>The computed <tag class="attribute">href</tag> value for the HTML element;
+this takes account of the <parameter>mediaobject-input-base-uri</parameter> and
+<parameter>mediaobject-output-base-uri</parameter>).
+</entry>
+</row>
+<row>
+<entry><code>node</code></entry>
+<entry>The data element</entry>
+</row>
+<row>
+<entry><code>params</code></entry>
+<entry>Any <tag>multimediaparams</tag> associated with the data element</entry>
+</row>
+<row>
+<entry><code>properties</code></entry>
+<entry>The properties of the element (as returned by the extension funtions;
+this can include EXIF data and metadata)</entry>
+</row>
+<row>
+<entry><code>scale</code></entry>
+<entry>The scaling factor (if there is one) or 1.0</entry>
+</row>
+<row>
+<entry><code>scalefit</code></entry>
+<entry>True if the image should be scaled to fit (implicitly or explicitly)</entry>
+</row>
+<row>
+<entry><code>uri</code></entry>
+<entry>The computed absolute URI of the input data</entry>
+</row>
+<row>
+<entry><code>valign</code></entry>
+<entry>The vertical alignment of the data (if specified)</entry>
+</row>
+<row>
+<entry><code>width</code></entry>
+<entry>The width of the data (if specifieid)</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+
+<para>The <literal>uri</literal> and <literal>href</literal> properties are computed
+by processing the data elements in the <mode>m:mediaobject-uris</mode> mode.</para>
+
+<para>Armed with information about the objects and the data associated with them,
+the stylesheets proceed to choose an object and then process it. Each object is
+considered in turn, if any of the data elements it contains were excluded, then it is
+rejected. The first object where all of the elements are acceptable is selected.</para>
+
+<para>Consider this example:</para>
+
+<programlisting language="xml"><![CDATA[<mediaobject>
+<imageobject>
+  <imagedata fileref="image.svg"/>
+  <imagedata fileref="image.eps"/>
+  <imagedata fileref="image.png" width="4in"/>
+</imageobject>
+<imageobject>
+  <imagedata fileref="image.svg"/>
+  <imagedata fileref="image.png" width="40em"/>
+</imageobject>
+</mediaobject>]]></programlisting>
+
+<para>If this is being processed for online presentation, the default
+value of <parameter>mediaobject-exclude-extensions</parameter> will exclude the
+EPS file. Because one of it’s data elements was excluded, the processor will choose
+the object containing only the SVG and PNG images for online presentation.</para>
+
+<para>Once an object is selected, an appropriate wrapper is created and all
+of the alternatives are placed within it. So the example above will result in
+<tag namespace="http://www.w3.org/1999/xhtml">picture</tag> element containing
+a
+<tag namespace="http://www.w3.org/1999/xhtml">source</tag> for the SVG image and an
+<tag namespace="http://www.w3.org/1999/xhtml">img</tag> for the fallback PNG.
+</para>
+
+<note>
+<para>Consistent with HTML, only the size, scaling, and alignment attributes of
+the <emphasis>last</emphasis> alternative data element are considered! These apply
+irrespective of which alternative is selected.</para>
+</note>
+
+<section xml:id="mediaobject-uris">
+<title>Mediaobject URIs</title>
+
+<para>Media object URIs are tricky to handle. It’s most
+convenient if the URIs in the source documents point to the actual media files.
+This allows extensions, like the image properties
+<link linkend="extensions">extension function</link>, to access the files.
+At the same time, the references generated in the HTML have to point to the
+locations where they will be published.</para>
+
+<para>In previous versions, the stylesheets attempted (broadly) to use
+the relative difference between the input and output base URIs to work out
+the correct relative URIs for media. That imposed restrictions on the
+authoring environment that weren’t always easy to work with.
+Starting in version 2.0.6, the mechanisms for finding sources
+and producing references in the output has changed. Three parameters
+are used:</para>
+
+<variablelist>
+<varlistentry>
+<term><parameter>mediaobject-input-base-uri</parameter></term>
+<listitem>
+<para>If the <parameter>mediaobject-input-base-uri</parameter> is empty (the default),
+then URIs in the source document are assumed to be relative to the base URI on
+which they occur. This is the usual case if you mix XML and media into the same
+directory structure on the filesystem.</para>
+<para>If the <parameter>mediaobject-input-base-uri</parameter> is not empty, it is
+used to resolve all media URIs. If it’s initialized with a relative URI, that URI will
+be made absolute against the base URI if the input document.</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term><parameter>mediaobject-output-base-uri</parameter></term>
+<listitem>
+<para>If the <parameter>mediaobject-output-base-uri</parameter> is empty (the default),
+then URIs in the output are treated as parallel to the URIs in the input. If the
+reference <filename>../image.png</filename> works in the source document, it’s assumed
+that will also work in the output document.</para>
+<para>If the <parameter>mediaobject-output-base-uri</parameter> is not empty, it is
+the base URI used for media objects. If this is a relative URI, it is taken to be
+relative to the root of the output hierarchy.</para>
+<para>Suppose the output base URI is <uri>https://images.example.com/</uri>, then
+a reference to <filename>image.png</filename> will appear as
+<uri>https://images.example.com/image.png</uri> in the output.</para>
+<para>If the output base URI is <uri>media/</uri>, then
+a reference to <filename>image.png</filename> will appear as
+<uri>media/image.png</uri> in the output. If the document is chunked, the paths back
+to the output directory are relative. In otherwords, if the reference to
+<filename>image.png</filename> appears in a chunk that will be located at
+<filename>back/appendix.html</filename>, then the media URI will be
+<uri>../media/image.png</uri>.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term><parameter>mediaobject-output-paths</parameter></term>
+<listitem>
+<para>This parameter controls whether the relative paths in the input URIs apply
+to the output URIs as well. If the parameter <glossterm>is true</glossterm>,
+the output base URI is <uri>media/</uri>, and the input URI is
+<filename>path/to/image.png</filename>, then the output reference will be to
+<uri>media/path/to/image.png</uri>. If it’s false, then the output reference
+will be to <uri>media/image.png</uri>.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+
+<para>For a further discussion of the options, see
+<xref linkend="media"/>.</para>
+
+<important>
+<para>The stylesheets are not responsible for actually copying the media files
+into the correct locations in the output. The stylesheets only generate the HTML
+files and the references. You must copy the images and other media with some
+other process.</para>
+</important>
+</section>
+</section>
+
+<section xml:id="header-templates">
+<title>Templates</title>
+
+<para>It’s difficult to make title pages for documents easy to customize. There
+is <emphasis>a lot</emphasis> of variation between documents and the styles can
+have very precise design constraints. At the end of the day, if you need complete control,
+you can define a template that matches the element in the
+<mode>m:generate-titlepage</mode> mode and generate all of the markup you wish.</para>
+
+<para>The default title page handling attempts to make some declarative customization
+possible by using templates. A typical header template looks like this:</para>
+
+<programlisting language="xml"><![CDATA[<db:section>
+  <header>
+    <tmp:apply-templates select="db:title">
+      <h2><tmp:content/></h2>
+    </tmp:apply-templates>
+    <tmp:apply-templates select="db:subtitle">
+      <h3><tmp:content/></h3>
+    </tmp:apply-templates>
+  </header>
+</db:section>]]></programlisting>
+
+<para>Any HTML element in the template will be copied to the output. The semantics
+of a “template apply templates” element (<tag>tmp:apply-templates</tag>) is that
+it runs the ordinary <tag>xsl:apply-templates</tag> instruction on the elements that
+match its select expression. If they result is the empty sequence (e.g, if there is no 
+<tag>subtitle</tag>), nothing is output. If there is a result, the content of the
+<tag>tmp:apply-templates</tag> element is processed. Anywhere that
+<tag>tmp:content</tag> appears, the result of applying templates will be output.</para>
+
+<para>In this example, if the title is “H<subscript>2</subscript>O” and there is no subtitle,
+the resulting HTML title page will be:</para>
+
+<programlisting language="xml"><![CDATA[<header>
+  <h2>H<sub>2</sub>O</h2>
+</header>]]></programlisting>
+</section>
+
+<section xml:id="annotations">
+<title>Annotations</title>
+
+<para>The stylesheets fully support annotations, including a number
+of presentation styles enabled by JavaScript in the browser. They also
+support an extension of the documented semantics of
+<tag>annotation</tag>.</para>
+
+<para>Annotations are applied to elements with links. Either the
+element must point to its annotations (with an <att>annotations</att>
+attribute) or the annotations must point to the elements they annotate
+(with an <att>annotates</att> attribute). These are documented as
+ID/IDREF links but they are not <code>IDREFS</code> attributes
+because annotations may be stored separately.</para>
+
+<para>Starting in version 1.5.1, the <citetitle>DocBook xslTNG
+Stylesheets</citetitle> support a non-standard extension: if you place
+the string <code>xpath:</code> in the <att>annotates</att> attribute of
+an <tag>annotation</tag>, then the rest of the attribute is assumed to contain
+an XPath expression that points to the element(s) to which the annotation
+applies. (If you want to put IDREF values before the <code>xpath:</code> token,
+that’s fine, but you can’t put them after; the expression continues to the end
+of the attribute value.)</para>
+
+<para>Suppose, for example, that you wanted to annotate the stylesheet
+title in the previous paragraph. The standard mechanisms would require that
+you either put an <att>xml:id</att> attribute on the element or point to the
+annotation from the element. With the XPath extension, you can do this:</para>
+
+<annotation annotates="xpath:preceding-sibling::db:para/db:citetitle"
+            xmlns:db="http://docbook.org/ns/docbook">
+<para>This annotation applies to the stylesheet title. For a discussion
+of this annotation, see the following paragraphs.
+</para>
+</annotation>
+
+<programlisting language="xml"><![CDATA[<annotation
+   annotates="xpath:preceding-sibling::db:para/db:citetitle"
+   xmlns:db="http://docbook.org/ns/docbook">
+<para>This annotation applies to the stylesheet title.
+For a discussion of this annotation, see the
+following paragraphs.</para>
+</annotation>]]></programlisting>
+
+<para>When the XPath expression is evaluated, the <tag>annotation</tag>
+element is the context item. Often, this means that you’ll want to start
+the expression with <code>id()</code> or <code>/</code>.</para>
+
+<para>The namespace context for the expression is also the <tag>annotation</tag>
+element, that’s why I’ve added the DocBook namespace binding for the
+<tag class="prefix">db</tag> prefix. In practice, if you’re doing this on
+several annotations, you can just put all the namespace bindings on a common
+ancestor. All of the bindings <emphasis>in scope</emphasis> on the
+<tag>annotation</tag> element are available in the expression.</para>
+
+<para>Caveats:</para>
+
+<orderedlist>
+<listitem>
+<para>There’s no way to have multiple XPath expressions. You can’t put
+“<code>xpath:</code>” in there twice. If you want an annotation to apply to
+multiple elements, you’ll have to construct a single expression that selects
+all the elements, or duplicate the annotation, or use ID/IDREF links.</para>
+<para>If this turns out to be a serious limitation in practice, additional
+syntax could be added to support multiple expressions, but it doesn’t
+<emphasis>seem</emphasis> necessary.</para>
+</listitem>
+<listitem>
+<para>You can only select elements. There’s no way to select the third word
+in a particular paragraph, for example, unless it already has some markup
+around it. There’s also no way to select a comment or a processing instruction.
+</para>
+</listitem>
+</orderedlist>
+
+<para>The placement of the annotation marker (“⌖” by default) can also be
+controlled globally and on individual annotations. The
+<parameter>annotation-placement</parameter> parameter provides global control.
+To specify the position for an individual annotation, put the token
+“<code>before</code>” or “<code>after</code>” in the <att>role</att>
+attribute on the <tag>annotation</tag>.</para>
+
+</section>
+
+<section xml:id="preprocessing-pipeline">
+<title>The pre- and post-processing pipeline</title>
+
+<para>Processing a DocBook document is a multi-stage process. The
+original document is transformed several times before converting it to
+HTML. The standard transformations are:</para>
+
+<orderedlist>
+<listitem xml:id="step-first-00-transform">
+<para>Adjust the logical structure. Adds an XML base attribute to the root of the
+document and converts media object <tag class="attribute">entityref</tag> attributes
+into <tag class="attribute">fileref</tag> attributes.
+</para>
+</listitem>
+<listitem>
+<para>Perform XInclude processing. Only occurs if the appropriate
+extension function is available and if the document contains XInclude
+element.
+</para>
+</listitem>
+<listitem>
+<para>Convert DocBook 4.x to DocBook 5.x. Only occurs if the root element is not in
+a namespace.
+</para>
+</listitem>
+<listitem>
+<para>Peform transclusion.
+</para>
+</listitem>
+<listitem>
+<para>Perform profiling.
+</para>
+</listitem>
+<listitem>
+<para>Normalize the content. This removes a lot of variation that’s allowed for authoring.
+For example, authors aren’t required to use an <tag>info</tag> element if a formal object
+has only a title. This process adds the <tag>info</tag> element if it’s missing.
+</para>
+</listitem>
+<listitem>
+<para>Resolve annotations.
+</para>
+</listitem>
+<listitem>
+<para>Process XLink link bases.
+</para>
+</listitem>
+<listitem>
+<para>Validate. Only occurs if the appropriate
+extension function is available and the stylesheet specifies a
+<parameter>relax-ng-grammar</parameter>.
+</para>
+</listitem>
+<listitem xml:id="step-last-oxy-markup">
+<para>Process Oxygen change markup. Only occurs if 
+<parameter>oxy-markup</parameter> <glossterm>is true</glossterm> and the document contains
+Oxygen change markup processing instructions.
+</para>
+</listitem>
+</orderedlist>
+
+<para>A customization can introduce transformations to the original
+document using three parameters:</para>
+
+<variablelist>
+<varlistentry>
+<term><parameter>transform-original</parameter></term>
+<listitem>
+<para>This transform runs before step <xref linkend="step-first-00-transform"/> in the standard transformations.
+If this transformation is used, it must take special care to preserve the 
+base URI of the original document by adding an <tag class="attribute">xml:base</tag>
+attribute to the root element (if it doesn’t already have one).</para>
+<para>Only the first transformation in the list has access to the original base URI.
+If it isn’t preserved, relative references to other documents will be resolved against
+the static base URI of the stylesheet and not the URI of the original document. That’s
+unlikely to be correct.</para>
+<para>You must also take into account that no XInclude processing has taken place at this time. 
+If you are using modular DocBook, the <parameter>transform-original</parameter> pipeline is usually a bad choice, 
+because it only operates on the root document, but not on the fragments referenced by XInclude. 
+If it is absolutely necessary to use a <parameter>transform-original</parameter> pipeline together with modular DocBook, 
+you can use Saxons <code>-x</code> switch to enable XInclude when parsing the document (see <xref linkend="java-main" xrefstyle="%label"/>). 
+Otherwise, however, the transform-before pipeline is usually the better choice 
+for modular DocBook documents<indexterm><primary>Modular DocBook</primary><secondary>processing pipelines</secondary></indexterm>.</para>  
+</listitem>
+</varlistentry>
+<varlistentry>
+<term><parameter>transform-before</parameter></term>
+<listitem>
+<para>This transformation runs after step <xref linkend="step-last-oxy-markup"/>. Its
+input is the DocBook document that will be transformed into HTML.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term><parameter>transform-after</parameter></term>
+<listitem>
+<para>This transformation runs after the DocBook document has been transformed into HTML.
+The resulting HTML document is not valid HTML, but contains islands of valid HTML that will
+be separated out into chunks by subsequent processing.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+
+<para>(If you need
+to insert a transformation in the middle of the standard transformations,
+you’ll have to update the <varname>v:standard-transforms</varname>
+variable in your customization.)</para>
+
+<para>Each of the transformation variables holds a list of transforms that will
+be applied in the order specified. Each member of the list can be a map or a
+string. If a string is provided, it’s the equivalent of providing this map:
+</para>
+
+<programlisting language="xpath">map {
+  'stylesheet-location': $the-string
+}</programlisting>
+
+<para>The map can have several keys:</para>
+
+<table>
+<title>The transformation map</title>
+<tgroup cols="2">
+<thead>
+<row>
+<entry>Key</entry>
+<entry>Value</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry><code>stylesheet-location</code></entry>
+<entry>The location of the XSLT stylesheet that performs this transformation.
+This key is required.</entry>
+</row>
+<row>
+<entry><code>extra-params</code></entry>
+<entry>A map of QName/value pairs. These parameters will be made available to
+the transformation in addition to all of the standard parameters available to a
+standard DocBook stylesheet.</entry>
+</row>
+<row>
+<entry><code>functions</code></entry>
+<entry>A list of functions (expressed as EQNames). The transformation will only be
+run if every extension function listed is available.</entry>
+</row>
+<row>
+<entry><code>test</code></entry>
+<entry>An arbitrary XPath expression. The expression will be evaluated with the
+document as the context item. If it returns an effective boolean value of true,
+the transformation will be run.</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+
+</section>
+</chapter>

From 4ae778ac74d4ee8d0ce0178d434a89b452290313 Mon Sep 17 00:00:00 2001
From: Frank Steimke <fsteimke.hb@gmail.com>
Date: Thu, 3 Oct 2024 11:09:32 +0200
Subject: [PATCH 2/2] removed ch04.xml.new

---
 src/guide/xml/ch04.xml.new | 694 -------------------------------------
 1 file changed, 694 deletions(-)
 delete mode 100644 src/guide/xml/ch04.xml.new

diff --git a/src/guide/xml/ch04.xml.new b/src/guide/xml/ch04.xml.new
deleted file mode 100644
index 8de93a3af..000000000
--- a/src/guide/xml/ch04.xml.new
+++ /dev/null
@@ -1,694 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<chapter xmlns="http://docbook.org/ns/docbook"
-	 xmlns:xlink="http://www.w3.org/1999/xlink"
-	 version="5.0" xml:id="implementation">
-<info>
-  <?db filename="ch-implementation"?>
-  <title>Implementation details</title>
-</info>
-
-<para>This section sketches out some features of the implementation.
-It would probably be better to build an annotated
-<link xlink:href="https://tdg.docbook.org/">Definitive Guide</link> or
-something, but this will have to do for now.
-</para>
-
-<section xml:id="custom-chunking">
-<title>Customizing chunking</title>
-
-<para>Chunking is controlled by the <parameter>chunk-include</parameter> and
-<parameter>chunk-exclude</parameter> parameters. These parameters are both
-strings that must contain an XPath expression.</para>
-
-<para>For each node in the document, the <parameter>chunk-include</parameter>
-parameter is evaluated. If it does not return an empty sequence, the element
-is considered a chunking candidate. In this case, the
-<parameter>chunk-exclude</parameter> parameter is evaluated. If the exclude
-expression <emphasis>does</emphasis> return an empty sequence, then the element identified
-becomes a chunk. (If the exclude expression returns a non-empty value, the element
-will not become a chunk.)</para>
-
-</section>
-
-<section xml:id="units">
-<title>Lengths and units</title>
-
-<para>Lengths appear in the context of images (width and height) and
-tables (column widths). Several different units of length are
-possible: absolute lengths (e.g., 3in), relative lengths (e.g., 3*),
-and percentages (e.g., 25%). In some contexts, these can be combined:
-a column width of “3*+0.5in” should have a width equal to 3 times the
-relative width plus ½ inch.</para>
-
-<para>In practice, some of the more complicated forms in <biblioref
-linkend="calstable"/> have no direct mapping to the units available in
-HTML and CSS. The stylesheets attempt to specify a mapping that’s
-close. Broadly, they take the nominal width of the table
-(<parameter>nominal-page-width</parameter>), subtract out the fixed
-widths, divide up the remaining widths proportionally among the
-relative widths, and compute final widths. The final widths can be
-expressed either in absolute terms or as percentages.</para>
-
-<para>In handling the width and height of images, the intrinsic width
-and height of the image in pixels are converted into lengths by
-dividing by <parameter>pixels-per-inch</parameter>. Nominal widths are
-taken into consideration if necessary.</para>
-
-<note>
-<para>Determining the intrinsic size of an image depends on an extension function.
-See <xref linkend="extensions"/>. Many bitmap image formats are supported.
-The bounding box of EPS images is used, if it’s present. The intrinsic size of
-SVG images is not available.</para>
-</note>
-
-<para>The list of recognized units (in, cm, etc.) are taken from
-<varname>v:unit-scale</varname>.</para>
-</section>
-
-<section xml:id="verbstyle">
-<title>Verbatim styles</title>
-
-<para>There are four verbatim styles: <literal>lines</literal>, <literal>table</literal>,
-<literal>plain</literal>, and <literal>raw</literal>.</para>
-
-<variablelist>
-<varlistentry><term><literal>lines</literal></term>
-<listitem>
-<para>In the lines style, each line of the verbatim environment is
-marked up individually. In this style, lines can be numbered and
-callouts can be inserted.
-</para>
-</listitem>
-</varlistentry>
-<varlistentry><term><literal>table</literal></term>
-<listitem>
-<para>In the table style, each line of the verbatim environment is
-marked up individually, very much like the lines style. In this style,
-lines can be numbered and callouts can be inserted. It differs from
-the lines style in that the whole thing is wrapped in a table.</para>
-
-<para>The table has one row and two columns. The line numbers appear in the
-first column, the lines in the second. This format was added in order
-to improve the display in user agents that don’t support CSS.
-Ironically, in the course of adding this style, a number of changes
-were made to the way line numbers are formatted in the lines style
-making it largely, perhaps entirely, unnecessary.
-</para>
-</listitem>
-</varlistentry>
-<varlistentry><term><literal>plain</literal></term>
-<listitem>
-<para>In the plain style, callouts can be inserted, but additional markup is not
-added (except for the callouts). Consequently, it isn’t possible to do line numbering
-or syntax highlighting. (It may be possible to provide these features with JavaScript
-libraries in the browser, however.)
-</para>
-</listitem>
-</varlistentry>
-<varlistentry><term><literal>raw</literal></term>
-<listitem>
-<para>In the raw style, no changes are made to the verbatim content. It’s output as
-it appears. Inline markup that it contains, <tag>emphasis</tag> or other elements, will
-be processed, but you cannot add line numbers, callouts, or syntax highlighting.
-</para>
-</listitem>
-</varlistentry>
-</variablelist>
-
-<para>Consult <xref linkend="params"/> for a variety of parameters that control
-aspects of verbatim processing.</para>
-
-<section xml:id="line-numbers">
-<title>Line numbers</title>
-
-<para>In the <literal>lines</literal> and <literal>table</literal> styles, line
-numbers may be added to the beginning of some (or all) lines. Prior to version
-1.10.0, the stylesheets inserted the numbers without any padding:</para>
-
-<programlisting language="xml"><![CDATA[<span class="line">
-  <span class="ln">5</span>
-  <span class="ld">The line of text</span>
-</span>]]></programlisting>
-
-<para>(The newlines and indentation in these examples are for clarity. In practice, these
-are inside a <tag namespace="http://www.w3.org/1999/xhtml">pre</tag> where every
-space counts and they’re all run together with line breaks only occurring
-between lines.)</para>
-
-<para>In a graphical browser with CSS support, this looked fine. But
-without CSS, the line numbers and the text that followed them could
-flow together and the alignment of the numbers was unclear.</para>
-
-<para>Starting in version 1.10.0, the stylesheets insert padding
-spaces before each number so that they will all be aligned. If the
-largest line number is three digits long, every number smaller than
-100 will be padded to a width of three characters. A single space is
-added after the number to separate it from the text that follows.
-An additional separator may also be inserted, as shown here.</para>
-
-<programlisting language="xml"><![CDATA[<span class="line">
-  <span class="ln"> 5 <span class="nsep">|</span></span>
-  <span class="ld">The line of text</span>
-</span>]]></programlisting>
-
-<para>These changes have no visible effect when CSS is used to style
-the verbatim environment. But without CSS, the numbers are aligned and
-separated from the text that follows. The
-<parameter>verbatim-number-separator</parameter> is generally
-suppressed by CSS, but is visible in text browsers.</para>
-
-</section>
-</section>
-
-<section xml:id="processing-mediaobjects">
-<title>Processing mediaobjects</title>
-
-<para>Starting in version 1.11.0, the way media objects are processed has been
-refactored. This is designed to support fallback at both the object
-level (<tag>imageobject</tag>, <tag>audioaobject</tag>, <tag>videoobject</tag>,
-<tag>textobject</tag>, and <tag>imageobjectco</tag>) and at the data
-level (<tag>imagedata</tag>, <tag>audiodata</tag>, <tag>videodata</tag>,
-and <tag>textdata</tag> within the objects).</para>
-
-<para>Each data element and object element is processed in the
-<mode>m:mediaobject-info</mode> mode. This returns a map for each object
-that contains an array of maps, one for each data element:</para>
-
-<table>
-<title>The object map</title>
-<tgroup cols="2">
-<thead>
-<row>
-<entry>Key</entry>
-<entry>Value</entry>
-</row>
-</thead>
-<tbody>
-<row>
-<entry><code>content-types</code></entry>
-<entry>An array of the distinct content types in the data elements</entry>
-</row>
-<row>
-<entry><code>datas</code></entry>
-<entry>An array of data maps</entry>
-</row>
-<row>
-<entry><code>extensions</code></entry>
-<entry>An array of the distinct extensions in the data elements</entry>
-</row>
-<row>
-<entry><code>node</code></entry>
-<entry>The media object node</entry>
-</row>
-</tbody>
-</tgroup>
-</table>
-
-<para>Each data map has the following structure:</para>
-
-<table>
-<title>The data map</title>
-<tgroup cols="2">
-<thead>
-<row>
-<entry>Key</entry>
-<entry>Value</entry>
-</row>
-</thead>
-<tbody>
-<row>
-<entry><code>align</code></entry>
-<entry>The alignment of the data (if specified)</entry>
-</row>
-<row>
-<entry><code>content-type</code></entry>
-<entry>The computed content type for the data</entry>
-</row>
-<row>
-<entry><code>contentheight</code></entry>
-<entry>The content height of the data (if specified<footnote xml:id="fn-depth"><para>DocBook
-uses “depth” instead of “height”, but we convert
-it to height for consistency with most other systems.</para></footnote>)</entry>
-</row>
-<row>
-<entry><code>contentwidth</code></entry>
-<entry>The content width of the data (if specified)</entry>
-</row>
-<row>
-<entry><code>extension</code></entry>
-<entry>The extension of the data file (if there is one)</entry>
-</row>
-<row>
-<entry><code>fileref</code></entry>
-<entry>The original <tag class="attribute">fileref</tag> attribute value</entry>
-</row>
-<row>
-<entry><code>height</code></entry>
-<entry>The height of the data (if specified<footnoteref linkend="fn-depth"/>)</entry>
-</row>
-<row>
-<entry><code>href</code></entry>
-<entry>The computed <tag class="attribute">href</tag> value for the HTML element;
-this takes account of the <parameter>mediaobject-input-base-uri</parameter> and
-<parameter>mediaobject-output-base-uri</parameter>).
-</entry>
-</row>
-<row>
-<entry><code>node</code></entry>
-<entry>The data element</entry>
-</row>
-<row>
-<entry><code>params</code></entry>
-<entry>Any <tag>multimediaparams</tag> associated with the data element</entry>
-</row>
-<row>
-<entry><code>properties</code></entry>
-<entry>The properties of the element (as returned by the extension funtions;
-this can include EXIF data and metadata)</entry>
-</row>
-<row>
-<entry><code>scale</code></entry>
-<entry>The scaling factor (if there is one) or 1.0</entry>
-</row>
-<row>
-<entry><code>scalefit</code></entry>
-<entry>True if the image should be scaled to fit (implicitly or explicitly)</entry>
-</row>
-<row>
-<entry><code>uri</code></entry>
-<entry>The computed absolute URI of the input data</entry>
-</row>
-<row>
-<entry><code>valign</code></entry>
-<entry>The vertical alignment of the data (if specified)</entry>
-</row>
-<row>
-<entry><code>width</code></entry>
-<entry>The width of the data (if specifieid)</entry>
-</row>
-</tbody>
-</tgroup>
-</table>
-
-<para>The <literal>uri</literal> and <literal>href</literal> properties are computed
-by processing the data elements in the <mode>m:mediaobject-uris</mode> mode.</para>
-
-<para>Armed with information about the objects and the data associated with them,
-the stylesheets proceed to choose an object and then process it. Each object is
-considered in turn, if any of the data elements it contains were excluded, then it is
-rejected. The first object where all of the elements are acceptable is selected.</para>
-
-<para>Consider this example:</para>
-
-<programlisting language="xml"><![CDATA[<mediaobject>
-<imageobject>
-  <imagedata fileref="image.svg"/>
-  <imagedata fileref="image.eps"/>
-  <imagedata fileref="image.png" width="4in"/>
-</imageobject>
-<imageobject>
-  <imagedata fileref="image.svg"/>
-  <imagedata fileref="image.png" width="40em"/>
-</imageobject>
-</mediaobject>]]></programlisting>
-
-<para>If this is being processed for online presentation, the default
-value of <parameter>mediaobject-exclude-extensions</parameter> will exclude the
-EPS file. Because one of it’s data elements was excluded, the processor will choose
-the object containing only the SVG and PNG images for online presentation.</para>
-
-<para>Once an object is selected, an appropriate wrapper is created and all
-of the alternatives are placed within it. So the example above will result in
-<tag namespace="http://www.w3.org/1999/xhtml">picture</tag> element containing
-a
-<tag namespace="http://www.w3.org/1999/xhtml">source</tag> for the SVG image and an
-<tag namespace="http://www.w3.org/1999/xhtml">img</tag> for the fallback PNG.
-</para>
-
-<note>
-<para>Consistent with HTML, only the size, scaling, and alignment attributes of
-the <emphasis>last</emphasis> alternative data element are considered! These apply
-irrespective of which alternative is selected.</para>
-</note>
-
-<section xml:id="mediaobject-uris">
-<title>Mediaobject URIs</title>
-
-<para>Media object URIs are tricky to handle. It’s most
-convenient if the URIs in the source documents point to the actual media files.
-This allows extensions, like the image properties
-<link linkend="extensions">extension function</link>, to access the files.
-At the same time, the references generated in the HTML have to point to the
-locations where they will be published.</para>
-
-<para>In previous versions, the stylesheets attempted (broadly) to use
-the relative difference between the input and output base URIs to work out
-the correct relative URIs for media. That imposed restrictions on the
-authoring environment that weren’t always easy to work with.
-Starting in version 2.0.6, the mechanisms for finding sources
-and producing references in the output has changed. Three parameters
-are used:</para>
-
-<variablelist>
-<varlistentry>
-<term><parameter>mediaobject-input-base-uri</parameter></term>
-<listitem>
-<para>If the <parameter>mediaobject-input-base-uri</parameter> is empty (the default),
-then URIs in the source document are assumed to be relative to the base URI on
-which they occur. This is the usual case if you mix XML and media into the same
-directory structure on the filesystem.</para>
-<para>If the <parameter>mediaobject-input-base-uri</parameter> is not empty, it is
-used to resolve all media URIs. If it’s initialized with a relative URI, that URI will
-be made absolute against the base URI if the input document.</para>
-</listitem>
-</varlistentry>
-<varlistentry>
-<term><parameter>mediaobject-output-base-uri</parameter></term>
-<listitem>
-<para>If the <parameter>mediaobject-output-base-uri</parameter> is empty (the default),
-then URIs in the output are treated as parallel to the URIs in the input. If the
-reference <filename>../image.png</filename> works in the source document, it’s assumed
-that will also work in the output document.</para>
-<para>If the <parameter>mediaobject-output-base-uri</parameter> is not empty, it is
-the base URI used for media objects. If this is a relative URI, it is taken to be
-relative to the root of the output hierarchy.</para>
-<para>Suppose the output base URI is <uri>https://images.example.com/</uri>, then
-a reference to <filename>image.png</filename> will appear as
-<uri>https://images.example.com/image.png</uri> in the output.</para>
-<para>If the output base URI is <uri>media/</uri>, then
-a reference to <filename>image.png</filename> will appear as
-<uri>media/image.png</uri> in the output. If the document is chunked, the paths back
-to the output directory are relative. In otherwords, if the reference to
-<filename>image.png</filename> appears in a chunk that will be located at
-<filename>back/appendix.html</filename>, then the media URI will be
-<uri>../media/image.png</uri>.
-</para>
-</listitem>
-</varlistentry>
-<varlistentry>
-<term><parameter>mediaobject-output-paths</parameter></term>
-<listitem>
-<para>This parameter controls whether the relative paths in the input URIs apply
-to the output URIs as well. If the parameter <glossterm>is true</glossterm>,
-the output base URI is <uri>media/</uri>, and the input URI is
-<filename>path/to/image.png</filename>, then the output reference will be to
-<uri>media/path/to/image.png</uri>. If it’s false, then the output reference
-will be to <uri>media/image.png</uri>.
-</para>
-</listitem>
-</varlistentry>
-</variablelist>
-
-<para>For a further discussion of the options, see
-<xref linkend="media"/>.</para>
-
-<important>
-<para>The stylesheets are not responsible for actually copying the media files
-into the correct locations in the output. The stylesheets only generate the HTML
-files and the references. You must copy the images and other media with some
-other process.</para>
-</important>
-</section>
-</section>
-
-<section xml:id="header-templates">
-<title>Templates</title>
-
-<para>It’s difficult to make title pages for documents easy to customize. There
-is <emphasis>a lot</emphasis> of variation between documents and the styles can
-have very precise design constraints. At the end of the day, if you need complete control,
-you can define a template that matches the element in the
-<mode>m:generate-titlepage</mode> mode and generate all of the markup you wish.</para>
-
-<para>The default title page handling attempts to make some declarative customization
-possible by using templates. A typical header template looks like this:</para>
-
-<programlisting language="xml"><![CDATA[<db:section>
-  <header>
-    <tmp:apply-templates select="db:title">
-      <h2><tmp:content/></h2>
-    </tmp:apply-templates>
-    <tmp:apply-templates select="db:subtitle">
-      <h3><tmp:content/></h3>
-    </tmp:apply-templates>
-  </header>
-</db:section>]]></programlisting>
-
-<para>Any HTML element in the template will be copied to the output. The semantics
-of a “template apply templates” element (<tag>tmp:apply-templates</tag>) is that
-it runs the ordinary <tag>xsl:apply-templates</tag> instruction on the elements that
-match its select expression. If they result is the empty sequence (e.g, if there is no 
-<tag>subtitle</tag>), nothing is output. If there is a result, the content of the
-<tag>tmp:apply-templates</tag> element is processed. Anywhere that
-<tag>tmp:content</tag> appears, the result of applying templates will be output.</para>
-
-<para>In this example, if the title is “H<subscript>2</subscript>O” and there is no subtitle,
-the resulting HTML title page will be:</para>
-
-<programlisting language="xml"><![CDATA[<header>
-  <h2>H<sub>2</sub>O</h2>
-</header>]]></programlisting>
-</section>
-
-<section xml:id="annotations">
-<title>Annotations</title>
-
-<para>The stylesheets fully support annotations, including a number
-of presentation styles enabled by JavaScript in the browser. They also
-support an extension of the documented semantics of
-<tag>annotation</tag>.</para>
-
-<para>Annotations are applied to elements with links. Either the
-element must point to its annotations (with an <att>annotations</att>
-attribute) or the annotations must point to the elements they annotate
-(with an <att>annotates</att> attribute). These are documented as
-ID/IDREF links but they are not <code>IDREFS</code> attributes
-because annotations may be stored separately.</para>
-
-<para>Starting in version 1.5.1, the <citetitle>DocBook xslTNG
-Stylesheets</citetitle> support a non-standard extension: if you place
-the string <code>xpath:</code> in the <att>annotates</att> attribute of
-an <tag>annotation</tag>, then the rest of the attribute is assumed to contain
-an XPath expression that points to the element(s) to which the annotation
-applies. (If you want to put IDREF values before the <code>xpath:</code> token,
-that’s fine, but you can’t put them after; the expression continues to the end
-of the attribute value.)</para>
-
-<para>Suppose, for example, that you wanted to annotate the stylesheet
-title in the previous paragraph. The standard mechanisms would require that
-you either put an <att>xml:id</att> attribute on the element or point to the
-annotation from the element. With the XPath extension, you can do this:</para>
-
-<annotation annotates="xpath:preceding-sibling::db:para/db:citetitle"
-            xmlns:db="http://docbook.org/ns/docbook">
-<para>This annotation applies to the stylesheet title. For a discussion
-of this annotation, see the following paragraphs.
-</para>
-</annotation>
-
-<programlisting language="xml"><![CDATA[<annotation
-   annotates="xpath:preceding-sibling::db:para/db:citetitle"
-   xmlns:db="http://docbook.org/ns/docbook">
-<para>This annotation applies to the stylesheet title.
-For a discussion of this annotation, see the
-following paragraphs.</para>
-</annotation>]]></programlisting>
-
-<para>When the XPath expression is evaluated, the <tag>annotation</tag>
-element is the context item. Often, this means that you’ll want to start
-the expression with <code>id()</code> or <code>/</code>.</para>
-
-<para>The namespace context for the expression is also the <tag>annotation</tag>
-element, that’s why I’ve added the DocBook namespace binding for the
-<tag class="prefix">db</tag> prefix. In practice, if you’re doing this on
-several annotations, you can just put all the namespace bindings on a common
-ancestor. All of the bindings <emphasis>in scope</emphasis> on the
-<tag>annotation</tag> element are available in the expression.</para>
-
-<para>Caveats:</para>
-
-<orderedlist>
-<listitem>
-<para>There’s no way to have multiple XPath expressions. You can’t put
-“<code>xpath:</code>” in there twice. If you want an annotation to apply to
-multiple elements, you’ll have to construct a single expression that selects
-all the elements, or duplicate the annotation, or use ID/IDREF links.</para>
-<para>If this turns out to be a serious limitation in practice, additional
-syntax could be added to support multiple expressions, but it doesn’t
-<emphasis>seem</emphasis> necessary.</para>
-</listitem>
-<listitem>
-<para>You can only select elements. There’s no way to select the third word
-in a particular paragraph, for example, unless it already has some markup
-around it. There’s also no way to select a comment or a processing instruction.
-</para>
-</listitem>
-</orderedlist>
-
-<para>The placement of the annotation marker (“⌖” by default) can also be
-controlled globally and on individual annotations. The
-<parameter>annotation-placement</parameter> parameter provides global control.
-To specify the position for an individual annotation, put the token
-“<code>before</code>” or “<code>after</code>” in the <att>role</att>
-attribute on the <tag>annotation</tag>.</para>
-
-</section>
-
-<section xml:id="preprocessing-pipeline">
-<title>The pre- and post-processing pipeline</title>
-
-<para>Processing a DocBook document is a multi-stage process. The
-original document is transformed several times before converting it to
-HTML. The standard transformations are:</para>
-
-<orderedlist>
-<listitem xml:id="step-first-00-transform">
-<para>Adjust the logical structure. Adds an XML base attribute to the root of the
-document and converts media object <tag class="attribute">entityref</tag> attributes
-into <tag class="attribute">fileref</tag> attributes.
-</para>
-</listitem>
-<listitem>
-<para>Perform XInclude processing. Only occurs if the appropriate
-extension function is available and if the document contains XInclude
-element.
-</para>
-</listitem>
-<listitem>
-<para>Convert DocBook 4.x to DocBook 5.x. Only occurs if the root element is not in
-a namespace.
-</para>
-</listitem>
-<listitem>
-<para>Peform transclusion.
-</para>
-</listitem>
-<listitem>
-<para>Perform profiling.
-</para>
-</listitem>
-<listitem>
-<para>Normalize the content. This removes a lot of variation that’s allowed for authoring.
-For example, authors aren’t required to use an <tag>info</tag> element if a formal object
-has only a title. This process adds the <tag>info</tag> element if it’s missing.
-</para>
-</listitem>
-<listitem>
-<para>Resolve annotations.
-</para>
-</listitem>
-<listitem>
-<para>Process XLink link bases.
-</para>
-</listitem>
-<listitem>
-<para>Validate. Only occurs if the appropriate
-extension function is available and the stylesheet specifies a
-<parameter>relax-ng-grammar</parameter>.
-</para>
-</listitem>
-<listitem xml:id="step-last-oxy-markup">
-<para>Process Oxygen change markup. Only occurs if 
-<parameter>oxy-markup</parameter> <glossterm>is true</glossterm> and the document contains
-Oxygen change markup processing instructions.
-</para>
-</listitem>
-</orderedlist>
-
-<para>A customization can introduce transformations to the original
-document using three parameters:</para>
-
-<variablelist>
-<varlistentry>
-<term><parameter>transform-original</parameter></term>
-<listitem>
-<para>This transform runs before step <xref linkend="step-first-00-transform"/> in the standard transformations.
-If this transformation is used, it must take special care to preserve the 
-base URI of the original document by adding an <tag class="attribute">xml:base</tag>
-attribute to the root element (if it doesn’t already have one).</para>
-<para>Only the first transformation in the list has access to the original base URI.
-If it isn’t preserved, relative references to other documents will be resolved against
-the static base URI of the stylesheet and not the URI of the original document. That’s
-unlikely to be correct.</para>
-<para>You must also take into account that no XInclude processing has taken place at this time. 
-If you are using modular DocBook, the <parameter>transform-original</parameter> pipeline is usually a bad choice, 
-because it only operates on the root document, but not on the fragments referenced by XInclude. 
-If it is absolutely necessary to use a <parameter>transform-original</parameter> pipeline together with modular DocBook, 
-you can use Saxons <code>-x</code> switch to enable XInclude when parsing the document (see <xref linkend="java-main" xrefstyle="%label"/>). 
-Otherwise, however, the transform-before pipeline is usually the better choice 
-for modular DocBook documents<indexterm><primary>Modular DocBook</primary><secondary>processing pipelines</secondary></indexterm>.</para>  
-</listitem>
-</varlistentry>
-<varlistentry>
-<term><parameter>transform-before</parameter></term>
-<listitem>
-<para>This transformation runs after step <xref linkend="step-last-oxy-markup"/>. Its
-input is the DocBook document that will be transformed into HTML.
-</para>
-</listitem>
-</varlistentry>
-<varlistentry>
-<term><parameter>transform-after</parameter></term>
-<listitem>
-<para>This transformation runs after the DocBook document has been transformed into HTML.
-The resulting HTML document is not valid HTML, but contains islands of valid HTML that will
-be separated out into chunks by subsequent processing.
-</para>
-</listitem>
-</varlistentry>
-</variablelist>
-
-<para>(If you need
-to insert a transformation in the middle of the standard transformations,
-you’ll have to update the <varname>v:standard-transforms</varname>
-variable in your customization.)</para>
-
-<para>Each of the transformation variables holds a list of transforms that will
-be applied in the order specified. Each member of the list can be a map or a
-string. If a string is provided, it’s the equivalent of providing this map:
-</para>
-
-<programlisting language="xpath">map {
-  'stylesheet-location': $the-string
-}</programlisting>
-
-<para>The map can have several keys:</para>
-
-<table>
-<title>The transformation map</title>
-<tgroup cols="2">
-<thead>
-<row>
-<entry>Key</entry>
-<entry>Value</entry>
-</row>
-</thead>
-<tbody>
-<row>
-<entry><code>stylesheet-location</code></entry>
-<entry>The location of the XSLT stylesheet that performs this transformation.
-This key is required.</entry>
-</row>
-<row>
-<entry><code>extra-params</code></entry>
-<entry>A map of QName/value pairs. These parameters will be made available to
-the transformation in addition to all of the standard parameters available to a
-standard DocBook stylesheet.</entry>
-</row>
-<row>
-<entry><code>functions</code></entry>
-<entry>A list of functions (expressed as EQNames). The transformation will only be
-run if every extension function listed is available.</entry>
-</row>
-<row>
-<entry><code>test</code></entry>
-<entry>An arbitrary XPath expression. The expression will be evaluated with the
-document as the context item. If it returns an effective boolean value of true,
-the transformation will be run.</entry>
-</row>
-</tbody>
-</tgroup>
-</table>
-
-</section>
-</chapter>