From 838cad73af7a64f466da25b31a09d784d901c67d Mon Sep 17 00:00:00 2001 From: Frank Steimke Date: Mon, 18 Mar 2024 07:54:07 +0100 Subject: [PATCH 1/2] Hint about preprocessing modular DocBook (Ch. 4) --- src/guide/xml/ch04.xml | 7 + src/guide/xml/ch04.xml.new | 694 +++++++++++++++++++++++++++++++++++++ 2 files changed, 701 insertions(+) create mode 100644 src/guide/xml/ch04.xml.new diff --git a/src/guide/xml/ch04.xml b/src/guide/xml/ch04.xml index 3dcb7bc16..8de93a3af 100644 --- a/src/guide/xml/ch04.xml +++ b/src/guide/xml/ch04.xml @@ -610,6 +610,13 @@ attribute to the root element (if it doesn’t already have one). If it isn’t preserved, relative references to other documents will be resolved against the static base URI of the stylesheet and not the URI of the original document. That’s unlikely to be correct. +You must also take into account that no XInclude processing has taken place at this point. +If you are using modular DocBook, the transform-original pipeline is usually a bad choice, +because it operates only on the root document, not on the fragments referenced by XInclude. +If it is absolutely necessary to use a transform-original pipeline together with modular DocBook, +you can use Saxon’s -x switch to enable XInclude when parsing the document (see ). +Otherwise, the transform-before pipeline is usually the better choice +for modular DocBook documents. diff --git a/src/guide/xml/ch04.xml.new b/src/guide/xml/ch04.xml.new new file mode 100644 index 000000000..8de93a3af --- /dev/null +++ b/src/guide/xml/ch04.xml.new @@ -0,0 +1,694 @@ + + + + + Implementation details + + +This section sketches out some features of the implementation. +It would probably be better to build an annotated +Definitive Guide or +something, but this will have to do for now. + + +
+Customizing chunking + +Chunking is controlled by the chunk-include and +chunk-exclude parameters. These parameters are both +strings that must contain an XPath expression. + +For each node in the document, the chunk-include +parameter is evaluated. If it does not return an empty sequence, the element +is considered a chunking candidate. In this case, the +chunk-exclude parameter is evaluated. If the exclude +expression does return an empty sequence, then the element identified +becomes a chunk. (If the exclude expression returns a non-empty value, the element +will not become a chunk.) + +
+ +
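The include/exclude rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the stylesheet code: the two callables stand in for the XPath expressions held by chunk-include and chunk-exclude, and the dictionary-based nodes, the function names, and the nochunk role are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for an XPath expression: it takes a node and
# returns a sequence, which may be empty.
Expr = Callable[[dict], Sequence]


def is_chunk(node: dict, include: Expr, exclude: Expr) -> bool:
    """Apply the chunking rule: included and not excluded."""
    if len(include(node)) == 0:
        return False  # not a chunking candidate at all
    # A candidate becomes a chunk only if the exclude expression is empty.
    return len(exclude(node)) == 0


def include(node: dict) -> Sequence:
    # Roughly "self::chapter or self::section" (illustrative only).
    return [node] if node["name"] in ("chapter", "section") else []


def exclude(node: dict) -> Sequence:
    # Roughly "self::*[@role='nochunk']" (illustrative only).
    return [node] if node.get("role") == "nochunk" else []
```

With these stand-ins, a chapter becomes a chunk, a section carrying the hypothetical nochunk role does not, and a paragraph is never considered.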
+Lengths and units + +Lengths appear in the context of images (width and height) and +tables (column widths). Several different units of length are +possible: absolute lengths (e.g., 3in), relative lengths (e.g., 3*), +and percentages (e.g., 25%). In some contexts, these can be combined: +a column width of “3*+0.5in” should have a width equal to 3 times the +relative width plus ½ inch. + +In practice, some of the more complicated forms in have no direct mapping to the units available in +HTML and CSS. The stylesheets attempt to specify a mapping that’s +close. Broadly, they take the nominal width of the table +(nominal-page-width), subtract out the fixed +widths, divide up the remaining widths proportionally among the +relative widths, and compute final widths. The final widths can be +expressed either in absolute terms or as percentages. + +In handling the width and height of images, the intrinsic width +and height of the image in pixels are converted into lengths by +dividing by pixels-per-inch. Nominal widths are +taken into consideration if necessary. + + +Determining the intrinsic size of an image depends on an extension function. +See . Many bitmap image formats are supported. +The bounding box of EPS images is used, if it’s present. The intrinsic size of +SVG images is not available. + + +The list of recognized units (in, cm, etc.) are taken from +v:unit-scale. +
+ +
+Verbatim styles + +There are four verbatim styles: lines, table, +plain, and raw. + + +lines + +In the lines style, each line of the verbatim environment is +marked up individually. In this style, lines can be numbered and +callouts can be inserted. + + + +table + +In the table style, each line of the verbatim environment is +marked up individually, very much like the lines style. In this style, +lines can be numbered and callouts can be inserted. It differs from +the lines style in that the whole thing is wrapped in a table. + +The table has one row and two columns. The line numbers appear in the +first column, the lines in the second. This format was added in order +to improve the display in user agents that don’t support CSS. +Ironically, in the course of adding this style, a number of changes +were made to the way line numbers are formatted in the lines style +making it largely, perhaps entirely, unnecessary. + + + +plain + +In the plain style, callouts can be inserted, but additional markup is not +added (except for the callouts). Consequently, it isn’t possible to do line numbering +or syntax highlighting. (It may be possible to provide these features with JavaScript +libraries in the browser, however.) + + + +raw + +In the raw style, no changes are made to the verbatim content. It’s output as +it appears. Inline markup that it contains, emphasis or other elements, will +be processed, but you cannot add line numbers, callouts, or syntax highlighting. + + + + + +Consult for a variety of parameters that control +aspects of verbatim processing. + +
+Line numbers + +In the lines and table styles, line +numbers may be added to the beginning of some (or all) lines. Prior to version +1.10.0, the stylesheets inserted the numbers without any padding: + + + 5 + The line of text +]]> + +(The newlines and indentation in these examples are for clarity. In practice, these +are inside a pre where every +space counts and they’re all run together with line breaks only occurring +between lines.) + +In a graphical browser with CSS support, this looked fine. But +without CSS, the line numbers and the text that followed them could +flow together and the alignment of the numbers was unclear. + +Starting in version 1.10.0, the stylesheets insert padding +spaces before each number so that they will all be aligned. If the +largest line number is three digits long, every number smaller than +100 will be padded to a width of three characters. A single space is +added after the number to separate it from the text that follows. +An additional separator may also be inserted, as shown here. + + + 5 | + The line of text +]]> + +These changes have no visible effect when CSS is used to style +the verbatim environment. But without CSS, the numbers are aligned and +separated from the text that follows. The +verbatim-number-separator is generally +suppressed by CSS, but is visible in text browsers. + +
+
+ +
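The padding rule introduced in version 1.10.0 is easy to picture with a small Python sketch. It is an approximation under stated assumptions, not the stylesheet implementation: it numbers every line (the stylesheets can number only some), the function name is hypothetical, and the exact text of the separator is assumed to be supplied by the caller.

```python
def number_lines(lines, separator=""):
    """Prefix each line with a right-aligned number, a space, and a separator."""
    # Pad every number to the width of the largest line number.
    width = len(str(len(lines)))
    numbered = []
    for number, text in enumerate(lines, start=1):
        # Padding spaces, the number, one separating space, then the
        # optional separator and the line itself.
        numbered.append(f"{number:>{width}} {separator}{text}")
    return numbered
```

For a 120-line listing, line 5 is rendered with two leading spaces so that it lines up with line 100.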
+Processing mediaobjects + +Starting in version 1.11.0, the way media objects are processed has been +refactored. This is designed to support fallback at both the object +level (imageobject, audioobject, videoobject, +textobject, and imageobjectco) and at the data +level (imagedata, audiodata, videodata, +and textdata within the objects). + +Each data element and object element is processed in the +m:mediaobject-info mode. This returns a map for each object +that contains an array of maps, one for each data element: + + +The object map + + + +Key +Value + + + + +content-types +An array of the distinct content types in the data elements + + +datas +An array of data maps + + +extensions +An array of the distinct extensions in the data elements + + +node +The media object node + + + +
+ +Each data map has the following structure: + + +The data map + + + +Key +Value + + + + +align +The alignment of the data (if specified) + + +content-type +The computed content type for the data + + +contentheight +The content height of the data (if specified). DocBook +uses “depth” instead of “height”, but we convert +it to height for consistency with most other systems. + + +contentwidth +The content width of the data (if specified) + + +extension +The extension of the data file (if there is one) + + +fileref +The original fileref attribute value + + +height +The height of the data (if specified) + + +href +The computed href value for the HTML element; +this takes account of the mediaobject-input-base-uri and +mediaobject-output-base-uri parameters. + + + +node +The data element + + +params +Any multimediaparams associated with the data element + + +properties +The properties of the element (as returned by the extension functions; +this can include EXIF data and metadata) + + +scale +The scaling factor (if there is one) or 1.0 + + +scalefit +True if the image should be scaled to fit (implicitly or explicitly) + + +uri +The computed absolute URI of the input data + + +valign +The vertical alignment of the data (if specified) + + +width +The width of the data (if specified) + + + +
+ +The uri and href properties are computed +by processing the data elements in the m:mediaobject-uris mode. + +Armed with information about the objects and the data associated with them, +the stylesheets proceed to choose an object and then process it. Each object is +considered in turn; if any of the data elements it contains were excluded, it is +rejected. The first object where all of the elements are acceptable is selected. + +Consider this example: + + + + + + + + + + + +]]> + +If this is being processed for online presentation, the default +value of mediaobject-exclude-extensions will exclude the +EPS file. Because the object that contains the EPS file has an excluded data element, it is rejected, and the processor chooses +the object containing only the SVG and PNG images for online presentation. + +Once an object is selected, an appropriate wrapper is created and all +of the alternatives are placed within it. So the example above will result in +a picture element containing +a +source for the SVG image and an +img for the fallback PNG. + + + +Consistent with HTML, only the size, scaling, and alignment attributes of +the last alternative data element are considered! These apply +irrespective of which alternative is selected. + + +
+Mediaobject URIs + +Media object URIs are tricky to handle. It’s most +convenient if the URIs in the source documents point to the actual media files. +This allows extensions, like the image properties +extension function, to access the files. +At the same time, the references generated in the HTML have to point to the +locations where they will be published. + +In previous versions, the stylesheets attempted (broadly) to use +the relative difference between the input and output base URIs to work out +the correct relative URIs for media. That imposed restrictions on the +authoring environment that weren’t always easy to work with. +Starting in version 2.0.6, the mechanisms for finding sources +and producing references in the output have changed. Three parameters +are used: + + + +mediaobject-input-base-uri + +If the mediaobject-input-base-uri is empty (the default), +then URIs in the source document are assumed to be relative to the base URI on +which they occur. This is the usual case if you mix XML and media into the same +directory structure on the filesystem. +If the mediaobject-input-base-uri is not empty, it is +used to resolve all media URIs. If it’s initialized with a relative URI, that URI will +be made absolute against the base URI of the input document. + + + +mediaobject-output-base-uri + +If the mediaobject-output-base-uri is empty (the default), +then URIs in the output are treated as parallel to the URIs in the input. If the +reference ../image.png works in the source document, it’s assumed +that it will also work in the output document. +If the mediaobject-output-base-uri is not empty, it is +the base URI used for media objects. If this is a relative URI, it is taken to be +relative to the root of the output hierarchy. +If the output base URI is https://images.example.com/, then +a reference to image.png will appear as +https://images.example.com/image.png in the output.
+If the output base URI is media/, then +a reference to image.png will appear as +media/image.png in the output. If the document is chunked, the paths back +to the output directory are relative. In other words, if the reference to +image.png appears in a chunk that will be located at +back/appendix.html, then the media URI will be +../media/image.png. + + + + +mediaobject-output-paths + +This parameter controls whether the relative paths in the input URIs apply +to the output URIs as well. If the parameter is true, +the output base URI is media/, and the input URI is +path/to/image.png, then the output reference will be to +media/path/to/image.png. If it’s false, then the output reference +will be to media/image.png. + + + + + +For a further discussion of the options, see +. + + +The stylesheets are not responsible for actually copying the media files +into the correct locations in the output. The stylesheets only generate the HTML +files and the references. You must copy the images and other media with some +other process. + +
+
+ +
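The object selection rule and the way output references are derived can both be modelled in a short Python sketch. It is a rough illustration under assumptions, not the stylesheet code: objects are plain dictionaries shaped loosely like the object map described above, the function names are hypothetical, output base URIs are assumed to end with a slash, and anything containing "://" is treated as an absolute base URI.

```python
import posixpath


def select_object(objects, excluded_extensions):
    """Return the first object none of whose data elements is excluded."""
    for obj in objects:
        if all(d["extension"] not in excluded_extensions for d in obj["datas"]):
            return obj
    return None


def output_reference(input_ref, output_base, output_paths, chunk_path="index.html"):
    """Compute the reference written to the HTML for one media file."""
    if not output_base:
        # Empty output base: output references parallel the input references.
        return input_ref
    # Keep the input path only if output paths are enabled.
    name = input_ref if output_paths else posixpath.basename(input_ref)
    target = posixpath.join(output_base, name)
    if "://" in output_base:
        return target  # absolute base URI: used as is
    # A relative base is relative to the output root, so express the
    # target relative to the directory that holds the chunk.
    chunk_dir = posixpath.dirname(chunk_path) or "."
    return posixpath.relpath(target, start=chunk_dir)
```

For instance, with an output base of media/ and a chunk written to back/appendix.html, a reference to image.png comes out as ../media/image.png, which matches the case described for the mediaobject-output-base-uri parameter.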
+Templates + +It’s difficult to make title pages for documents easy to customize. There +is a lot of variation between documents and the styles can +have very precise design constraints. At the end of the day, if you need complete control, +you can define a template that matches the element in the +m:generate-titlepage mode and generate all of the markup you wish. + +The default title page handling attempts to make some declarative customization +possible by using templates. A typical header template looks like this: + + +
+ +

+
+ +

+
+
+]]>
+ +Any HTML element in the template will be copied to the output. The semantics +of a “template apply templates” element (tmp:apply-templates) is that +it runs the ordinary xsl:apply-templates instruction on the elements that +match its select expression. If the result is the empty sequence (e.g., if there is no +subtitle), nothing is output. If there is a result, the content of the +tmp:apply-templates element is processed. Anywhere that +tmp:content appears, the result of applying templates will be output. + +In this example, if the title is “H2O” and there is no subtitle, +the resulting HTML title page will be: + + +

H2O

+]]>
+
+ +
+Annotations + +The stylesheets fully support annotations, including a number +of presentation styles enabled by JavaScript in the browser. They also +support an extension of the documented semantics of +annotation. + +Annotations are applied to elements with links. Either the +element must point to its annotations (with an annotations +attribute) or the annotations must point to the elements they annotate +(with an annotates attribute). These are documented as +ID/IDREF links but they are not IDREFS attributes +because annotations may be stored separately. + +Starting in version 1.5.1, the DocBook xslTNG +Stylesheets support a non-standard extension: if you place +the string xpath: in the annotates attribute of +an annotation, then the rest of the attribute is assumed to contain +an XPath expression that points to the element(s) to which the annotation +applies. (If you want to put IDREF values before the xpath: token, +that’s fine, but you can’t put them after; the expression continues to the end +of the attribute value.) + +Suppose, for example, that you wanted to annotate the stylesheet +title in the previous paragraph. The standard mechanisms would require that +you either put an xml:id attribute on the element or point to the +annotation from the element. With the XPath extension, you can do this: + + +This annotation applies to the stylesheet title. For a discussion +of this annotation, see the following paragraphs. + + + + +This annotation applies to the stylesheet title. +For a discussion of this annotation, see the +following paragraphs. +]]> + +When the XPath expression is evaluated, the annotation +element is the context item. Often, this means that you’ll want to start +the expression with id() or /. + +The namespace context for the expression is also the annotation +element, that’s why I’ve added the DocBook namespace binding for the +db prefix. 
In practice, if you’re doing this on +several annotations, you can just put all the namespace bindings on a common +ancestor. All of the bindings in scope on the +annotation element are available in the expression. + +Caveats: + + + +There’s no way to have multiple XPath expressions. You can’t put +“xpath:” in there twice. If you want an annotation to apply to +multiple elements, you’ll have to construct a single expression that selects +all the elements, or duplicate the annotation, or use ID/IDREF links. +If this turns out to be a serious limitation in practice, additional +syntax could be added to support multiple expressions, but it doesn’t +seem necessary. + + +You can only select elements. There’s no way to select the third word +in a particular paragraph, for example, unless it already has some markup +around it. There’s also no way to select a comment or a processing instruction. + + + + +The placement of the annotation marker (“⌖” by default) can also be +controlled globally and on individual annotations. The +annotation-placement parameter provides global control. +To specify the position for an individual annotation, put the token +“before” or “after” in the role +attribute on the annotation. + +
+ +
+The pre- and post-processing pipeline + +Processing a DocBook document is a multi-stage process. The +original document is transformed several times before it is converted to +HTML. The standard transformations are: + + + +Adjust the logical structure. Adds an XML base attribute to the root of the +document and converts media object entityref attributes +into fileref attributes. + + + +Perform XInclude processing. Only occurs if the appropriate +extension function is available and if the document contains XInclude +elements. + + + +Convert DocBook 4.x to DocBook 5.x. Only occurs if the root element is not in +a namespace. + + + +Perform transclusion. + + + +Perform profiling. + + + +Normalize the content. This removes a lot of variation that’s allowed for authoring. +For example, authors aren’t required to use an info element if a formal object +has only a title. This process adds the info element if it’s missing. + + + +Resolve annotations. + + + +Process XLink link bases. + + + +Validate. Only occurs if the appropriate +extension function is available and the stylesheet specifies a +relax-ng-grammar. + + + +Process Oxygen change markup. Only occurs if +oxy-markup is true and the document contains +Oxygen change markup processing instructions. + + + + +A customization can introduce transformations to the original +document using three parameters: + + + +transform-original + +This transform runs before step in the standard transformations. +If this transformation is used, it must take special care to preserve the +base URI of the original document by adding an xml:base +attribute to the root element (if it doesn’t already have one). +Only the first transformation in the list has access to the original base URI. +If it isn’t preserved, relative references to other documents will be resolved against +the static base URI of the stylesheet and not the URI of the original document. That’s +unlikely to be correct.
+You must also take into account that no XInclude processing has taken place at this point. +If you are using modular DocBook, the transform-original pipeline is usually a bad choice, +because it operates only on the root document, not on the fragments referenced by XInclude. +If it is absolutely necessary to use a transform-original pipeline together with modular DocBook, +you can use Saxon’s -x switch to enable XInclude when parsing the document (see ). +Otherwise, the transform-before pipeline is usually the better choice +for modular DocBook documents. + + + +transform-before + +This transformation runs after step . Its +input is the DocBook document that will be transformed into HTML. + + + + +transform-after + +This transformation runs after the DocBook document has been transformed into HTML. +The resulting HTML document is not valid HTML, but contains islands of valid HTML that will +be separated out into chunks by subsequent processing. + + + + + +(If you need +to insert a transformation in the middle of the standard transformations, +you’ll have to update the v:standard-transforms +variable in your customization.) + +Each of the transformation variables holds a list of transforms that will +be applied in the order specified. Each member of the list can be a map or a +string. If a string is provided, it’s the equivalent of providing this map: + + +map { + 'stylesheet-location': $the-string +} + +The map can have several keys: + + +The transformation map + + + +Key +Value + + + + +stylesheet-location +The location of the XSLT stylesheet that performs this transformation. +This key is required. + + +extra-params +A map of QName/value pairs. These parameters will be made available to +the transformation in addition to all of the standard parameters available to a +standard DocBook stylesheet. + + +functions +A list of functions (expressed as EQNames).
The transformation will only be +run if every extension function listed is available. + + +test +An arbitrary XPath expression. The expression will be evaluated with the +document as the context item. If it returns an effective boolean value of true, +the transformation will be run. + + + +
+ +
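The way entries in a transformation list are interpreted can be sketched in Python. This is a hedged model of the description above, not the stylesheet code: the function names are hypothetical, the test key is represented by a Python callable rather than an XPath expression, and the sketch assumes that both the functions condition and the test condition must hold when both are present.

```python
def normalize(entry):
    """Expand the string shorthand into the equivalent map."""
    if isinstance(entry, str):
        return {"stylesheet-location": entry}
    if "stylesheet-location" not in entry:
        raise ValueError("stylesheet-location is required")
    return dict(entry)


def should_run(entry, available_functions, document):
    """Decide whether a normalized transformation applies to this document."""
    # Every listed extension function must be available.
    if not all(f in available_functions for f in entry.get("functions", [])):
        return False
    # Stand-in for the XPath test: a callable evaluated against the document.
    test = entry.get("test")
    return True if test is None else bool(test(document))
```

A bare string such as "my-transform.xsl" therefore behaves exactly like a map that contains only the stylesheet-location key.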
+
From 4ae778ac74d4ee8d0ce0178d434a89b452290313 Mon Sep 17 00:00:00 2001 From: Frank Steimke Date: Thu, 3 Oct 2024 11:09:32 +0200 Subject: [PATCH 2/2] removed ch04.xml.new --- src/guide/xml/ch04.xml.new | 694 ------------------------------------- 1 file changed, 694 deletions(-) delete mode 100644 src/guide/xml/ch04.xml.new diff --git a/src/guide/xml/ch04.xml.new b/src/guide/xml/ch04.xml.new deleted file mode 100644 index 8de93a3af..000000000 --- a/src/guide/xml/ch04.xml.new +++ /dev/null @@ -1,694 +0,0 @@ - - - - - Implementation details - - -This section sketches out some features of the implementation. -It would probably be better to build an annotated -Definitive Guide or -something, but this will have to do for now. - - -
-Customizing chunking - -Chunking is controlled by the chunk-include and -chunk-exclude parameters. These parameters are both -strings that must contain an XPath expression. - -For each node in the document, the chunk-include -parameter is evaluated. If it does not return an empty sequence, the element -is considered a chunking candidate. In this case, the -chunk-exclude parameter is evaluated. If the exclude -expression does return an empty sequence, then the element identified -becomes a chunk. (If the exclude expression returns a non-empty value, the element -will not become a chunk.) - -
- -
-Lengths and units - -Lengths appear in the context of images (width and height) and -tables (column widths). Several different units of length are -possible: absolute lengths (e.g., 3in), relative lengths (e.g., 3*), -and percentages (e.g., 25%). In some contexts, these can be combined: -a column width of “3*+0.5in” should have a width equal to 3 times the -relative width plus ½ inch. - -In practice, some of the more complicated forms in have no direct mapping to the units available in -HTML and CSS. The stylesheets attempt to specify a mapping that’s -close. Broadly, they take the nominal width of the table -(nominal-page-width), subtract out the fixed -widths, divide up the remaining widths proportionally among the -relative widths, and compute final widths. The final widths can be -expressed either in absolute terms or as percentages. - -In handling the width and height of images, the intrinsic width -and height of the image in pixels are converted into lengths by -dividing by pixels-per-inch. Nominal widths are -taken into consideration if necessary. - - -Determining the intrinsic size of an image depends on an extension function. -See . Many bitmap image formats are supported. -The bounding box of EPS images is used, if it’s present. The intrinsic size of -SVG images is not available. - - -The list of recognized units (in, cm, etc.) are taken from -v:unit-scale. -
- -
-Verbatim styles - -There are four verbatim styles: lines, table, -plain, and raw. - - -lines - -In the lines style, each line of the verbatim environment is -marked up individually. In this style, lines can be numbered and -callouts can be inserted. - - - -table - -In the table style, each line of the verbatim environment is -marked up individually, very much like the lines style. In this style, -lines can be numbered and callouts can be inserted. It differs from -the lines style in that the whole thing is wrapped in a table. - -The table has one row and two columns. The line numbers appear in the -first column, the lines in the second. This format was added in order -to improve the display in user agents that don’t support CSS. -Ironically, in the course of adding this style, a number of changes -were made to the way line numbers are formatted in the lines style -making it largely, perhaps entirely, unnecessary. - - - -plain - -In the plain style, callouts can be inserted, but additional markup is not -added (except for the callouts). Consequently, it isn’t possible to do line numbering -or syntax highlighting. (It may be possible to provide these features with JavaScript -libraries in the browser, however.) - - - -raw - -In the raw style, no changes are made to the verbatim content. It’s output as -it appears. Inline markup that it contains, emphasis or other elements, will -be processed, but you cannot add line numbers, callouts, or syntax highlighting. - - - - - -Consult for a variety of parameters that control -aspects of verbatim processing. - -
-Line numbers - -In the lines and table styles, line -numbers may be added to the beginning of some (or all) lines. Prior to version -1.10.0, the stylesheets inserted the numbers without any padding: - - - 5 - The line of text -]]> - -(The newlines and indentation in these examples are for clarity. In practice, these -are inside a pre where every -space counts and they’re all run together with line breaks only occurring -between lines.) - -In a graphical browser with CSS support, this looked fine. But -without CSS, the line numbers and the text that followed them could -flow together and the alignment of the numbers was unclear. - -Starting in version 1.10.0, the stylesheets insert padding -spaces before each number so that they will all be aligned. If the -largest line number is three digits long, every number smaller than -100 will be padded to a width of three characters. A single space is -added after the number to separate it from the text that follows. -An additional separator may also be inserted, as shown here. - - - 5 | - The line of text -]]> - -These changes have no visible effect when CSS is used to style -the verbatim environment. But without CSS, the numbers are aligned and -separated from the text that follows. The -verbatim-number-separator is generally -suppressed by CSS, but is visible in text browsers. - -
-
- -
-Processing mediaobjects - -Starting in version 1.11.0, the way media objects are processed has been -refactored. This is designed to support fallback at both the object -level (imageobject, audioobject, videoobject, -textobject, and imageobjectco) and at the data -level (imagedata, audiodata, videodata, -and textdata within the objects). - -Each data element and object element is processed in the -m:mediaobject-info mode. This returns a map for each object -that contains an array of maps, one for each data element: - - -The object map - - - -Key -Value - - - - -content-types -An array of the distinct content types in the data elements - - -datas -An array of data maps - - -extensions -An array of the distinct extensions in the data elements - - -node -The media object node - - - -
- -Each data map has the following structure: - - -The data map - - - -Key -Value - - - - -align -The alignment of the data (if specified) - - -content-type -The computed content type for the data - - -contentheight -The content height of the data (if specified). DocBook -uses “depth” instead of “height”, but we convert -it to height for consistency with most other systems. - - -contentwidth -The content width of the data (if specified) - - -extension -The extension of the data file (if there is one) - - -fileref -The original fileref attribute value - - -height -The height of the data (if specified) - - -href -The computed href value for the HTML element; -this takes account of the mediaobject-input-base-uri and -mediaobject-output-base-uri parameters. - - - -node -The data element - - -params -Any multimediaparams associated with the data element - - -properties -The properties of the element (as returned by the extension functions; -this can include EXIF data and metadata) - - -scale -The scaling factor (if there is one) or 1.0 - - -scalefit -True if the image should be scaled to fit (implicitly or explicitly) - - -uri -The computed absolute URI of the input data - - -valign -The vertical alignment of the data (if specified) - - -width -The width of the data (if specified) - - - -
- -The uri and href properties are computed -by processing the data elements in the m:mediaobject-uris mode. - -Armed with information about the objects and the data associated with them, -the stylesheets proceed to choose an object and then process it. Each object is -considered in turn; if any of the data elements it contains were excluded, it is -rejected. The first object where all of the elements are acceptable is selected. - -Consider this example: - - - - - - - - - - - -]]> - -If this is being processed for online presentation, the default -value of mediaobject-exclude-extensions will exclude the -EPS file. Because the object that contains the EPS file has an excluded data element, it is rejected, and the processor chooses -the object containing only the SVG and PNG images for online presentation. - -Once an object is selected, an appropriate wrapper is created and all -of the alternatives are placed within it. So the example above will result in -a picture element containing -a -source for the SVG image and an -img for the fallback PNG. - - - -Consistent with HTML, only the size, scaling, and alignment attributes of -the last alternative data element are considered! These apply -irrespective of which alternative is selected. - - -
-Mediaobject URIs - -Media object URIs are tricky to handle. It’s most -convenient if the URIs in the source documents point to the actual media files. -This allows extensions, like the image properties -extension function, to access the files. -At the same time, the references generated in the HTML have to point to the -locations where they will be published. - -In previous versions, the stylesheets attempted (broadly) to use -the relative difference between the input and output base URIs to work out -the correct relative URIs for media. That imposed restrictions on the -authoring environment that weren’t always easy to work with. -Starting in version 2.0.6, the mechanisms for finding sources -and producing references in the output have changed. Three parameters -are used: - - - -mediaobject-input-base-uri - -If the mediaobject-input-base-uri is empty (the default), -then URIs in the source document are assumed to be relative to the base URI on -which they occur. This is the usual case if you mix XML and media into the same -directory structure on the filesystem. -If the mediaobject-input-base-uri is not empty, it is -used to resolve all media URIs. If it’s initialized with a relative URI, that URI will -be made absolute against the base URI of the input document. - - - -mediaobject-output-base-uri - -If the mediaobject-output-base-uri is empty (the default), -then URIs in the output are treated as parallel to the URIs in the input. If the -reference ../image.png works in the source document, it’s assumed -that it will also work in the output document. -If the mediaobject-output-base-uri is not empty, it is -the base URI used for media objects. If this is a relative URI, it is taken to be -relative to the root of the output hierarchy. -If the output base URI is https://images.example.com/, then -a reference to image.png will appear as -https://images.example.com/image.png in the output.
-If the output base URI is media/, then -a reference to image.png will appear as -media/image.png in the output. If the document is chunked, the paths back -to the output directory are relative. In other words, if the reference to -image.png appears in a chunk that will be located at -back/appendix.html, then the media URI will be -../media/image.png. - - - - -mediaobject-output-paths - -This parameter controls whether the relative paths in the input URIs apply -to the output URIs as well. If the parameter is true, -the output base URI is media/, and the input URI is -path/to/image.png, then the output reference will be to -media/path/to/image.png. If it’s false, then the output reference -will be to media/image.png. - - - - - -For a further discussion of the options, see -. - - -The stylesheets are not responsible for actually copying the media files -into the correct locations in the output. The stylesheets only generate the HTML -files and the references. You must copy the images and other media with some -other process. - -
-
- -
-Templates - -It’s difficult to make title pages for documents easy to customize. There -is a lot of variation between documents and the styles can -have very precise design constraints. At the end of the day, if you need complete control, -you can define a template that matches the element in the -m:generate-titlepage mode and generate all of the markup you wish. - -The default title page handling attempts to make some declarative customization -possible by using templates. A typical header template looks like this: - - -
- -

-
- -

-
-
-]]>
- -Any HTML element in the template will be copied to the output. The semantics -of a “template apply templates” element (tmp:apply-templates) are that -it runs the ordinary xsl:apply-templates instruction on the elements that -match its select expression. If the result is the empty sequence (e.g., if there is no -subtitle), nothing is output. If there is a result, the content of the -tmp:apply-templates element is processed. Anywhere that -tmp:content appears, the result of applying templates will be output. - -In this example, if the title is “H2O” and there is no subtitle, -the resulting HTML title page will be: - - -

H2O

-]]>
-
- -
-Annotations - -The stylesheets fully support annotations, including a number -of presentation styles enabled by JavaScript in the browser. They also -support an extension of the documented semantics of -annotation. - -Annotations are applied to elements with links. Either the -element must point to its annotations (with an annotations -attribute) or the annotations must point to the elements they annotate -(with an annotates attribute). These are documented as -ID/IDREF links but they are not IDREFS attributes -because annotations may be stored separately. - -Starting in version 1.5.1, the DocBook xslTNG -Stylesheets support a non-standard extension: if you place -the string xpath: in the annotates attribute of -an annotation, then the rest of the attribute is assumed to contain -an XPath expression that points to the element(s) to which the annotation -applies. (If you want to put IDREF values before the xpath: token, -that’s fine, but you can’t put them after; the expression continues to the end -of the attribute value.) - -Suppose, for example, that you wanted to annotate the stylesheet -title in the previous paragraph. The standard mechanisms would require that -you either put an xml:id attribute on the element or point to the -annotation from the element. With the XPath extension, you can do this: - - -This annotation applies to the stylesheet title. For a discussion -of this annotation, see the following paragraphs. - - - - -This annotation applies to the stylesheet title. -For a discussion of this annotation, see the -following paragraphs. -]]> - -When the XPath expression is evaluated, the annotation -element is the context item. Often, this means that you’ll want to start -the expression with id() or /. - -The namespace context for the expression is also the annotation -element, that’s why I’ve added the DocBook namespace binding for the -db prefix. 
In practice, if you’re doing this on -several annotations, you can just put all the namespace bindings on a common -ancestor. All of the bindings in scope on the -annotation element are available in the expression. - -Caveats: - - - -There’s no way to have multiple XPath expressions. You can’t put -“xpath:” in there twice. If you want an annotation to apply to -multiple elements, you’ll have to construct a single expression that selects -all the elements, or duplicate the annotation, or use ID/IDREF links. -If this turns out to be a serious limitation in practice, additional -syntax could be added to support multiple expressions, but it doesn’t -seem necessary. - - -You can only select elements. There’s no way to select the third word -in a particular paragraph, for example, unless it already has some markup -around it. There’s also no way to select a comment or a processing instruction. - - - - -The placement of the annotation marker (“⌖” by default) can also be -controlled globally and on individual annotations. The -annotation-placement parameter provides global control. -To specify the position for an individual annotation, put the token -“before” or “after” in the role -attribute on the annotation. - -
- -
-The pre- and post-processing pipeline - -Processing a DocBook document is a multi-stage process. The -original document is transformed several times before it is converted to -HTML. The standard transformations are: - - - -Adjust the logical structure. Adds an XML base attribute to the root of the -document and converts media object entityref attributes -into fileref attributes. - - - -Perform XInclude processing. Only occurs if the appropriate -extension function is available and if the document contains XInclude -elements. - - - -Convert DocBook 4.x to DocBook 5.x. Only occurs if the root element is not in -a namespace. - - - -Perform transclusion. - - - -Perform profiling. - - - -Normalize the content. This removes a lot of variation that’s allowed for authoring. -For example, authors aren’t required to use an info element if a formal object -has only a title. This process adds the info element if it’s missing. - - - -Resolve annotations. - - - -Process XLink link bases. - - - -Validate. Only occurs if the appropriate -extension function is available and the stylesheet specifies a -relax-ng-grammar. - - - -Process Oxygen change markup. Only occurs if -oxy-markup is true and the document contains -Oxygen change markup processing instructions. - - - - -A customization can introduce transformations to the original -document using three parameters: - - - -transform-original - -This transform runs before step in the standard transformations. -If this transformation is used, it must take special care to preserve the -base URI of the original document by adding an xml:base -attribute to the root element (if it doesn’t already have one). -Only the first transformation in the list has access to the original base URI. -If it isn’t preserved, relative references to other documents will be resolved against -the static base URI of the stylesheet and not the URI of the original document. That’s -unlikely to be correct. 
-You must also take into account that no XInclude processing has taken place at this point. -If you are using modular DocBook, the transform-original pipeline is usually a bad choice, -because it operates only on the root document and not on the fragments referenced by XInclude. -If it is absolutely necessary to use a transform-original pipeline together with modular DocBook, -you can use Saxon’s -x switch to enable XInclude when parsing the document (see ). -Otherwise, the transform-before pipeline is usually the better choice -for modular DocBook documents. - - - -transform-before - -This transformation runs after step . Its -input is the DocBook document that will be transformed into HTML. - - - - -transform-after - -This transformation runs after the DocBook document has been transformed into HTML. -The resulting HTML document is not valid HTML, but contains islands of valid HTML that will -be separated out into chunks by subsequent processing. - - - - - -(If you need -to insert a transformation in the middle of the standard transformations, -you’ll have to update the v:standard-transforms -variable in your customization.) - -Each of the transformation variables holds a list of transforms that will -be applied in the order specified. Each member of the list can be a map or a -string. If a string is provided, it’s the equivalent of providing this map: - - -map { - 'stylesheet-location': $the-string -} - -The map can have several keys: - - -The transformation map - - - -Key -Value - - - - -stylesheet-location -The location of the XSLT stylesheet that performs this transformation. -This key is required. - - -extra-params -A map of QName/value pairs. These parameters will be made available to -the transformation in addition to all of the standard parameters available to a -standard DocBook stylesheet. - - -functions -A list of functions (expressed as EQNames). 
The transformation will only be -run if every extension function listed is available. - - -test -An arbitrary XPath expression. The expression will be evaluated with the -document as the context item. If it returns an effective boolean value of true, -the transformation will be run. - - - -
- -
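As a rough illustration of the keys described above, a transformation list that mixes the string form and the map form might look like the following sketch. The stylesheet locations, the strip-level parameter, and the test expression are all hypothetical, and the sketch assumes that the test expression is supplied as a string:

```xpath
(
  (: A plain string: equivalent to map { 'stylesheet-location': ... } :)
  'custom/first-pass.xsl',

  (: A map: this transform is only meant to run if the test is true :)
  map {
    'stylesheet-location': 'custom/strip-remarks.xsl',          (: hypothetical stylesheet :)
    'extra-params': map { QName('', 'strip-level'): 'draft' },  (: hypothetical parameter :)
    'test': 'exists(//*:remark)'                                (: assumed to be a string :)
  }
)
```

Because the transforms are applied in the order specified, the first stylesheet in this sketch would run before the second.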
-