by Lars Vogel

DocBook - Tutorial

Lars Vogel

Version 2.9

18.12.2011

Revision History
Revision 0.1 - 0.5    28.06.2007                 Lars Vogel    Created
Revision 0.6 - 2.9    16.10.2008 - 18.12.2011    Lars Vogel    bugfixes and updates

DocBook

This article explains how to write DocBook files in Eclipse and how to convert these files into various output formats, e.g. to HTML and pdf. It also explains how to configure XInclude to divide the information into different source files.

This article uses DocBook 4.5, the Saxon XSLT processor in version 6.5.5 and Eclipse 3.7 (Indigo).


Table of Contents

1. Introduction to DocBook
1.1. Overview
1.2. DocBook Example
1.3. The required toolset
2. Installation
2.1. Eclipse
2.2. Docbook and Stylesheets
2.3. XSL processor
2.4. Issues
3. Convert Docbook to HTML
3.1. Project Setup
3.2. Write your first DocBook document
3.3. Use ant to convert DocBook to html
4. Convert Docbook to plain text
5. DocBook Tags
5.1. Tags
5.2. Tables
5.3. Lists
5.4. Links
5.5. Graphics
5.6. Menus
5.7. Keyboard Shortcuts
6. Creating epub
6.1. Overview of EPub
6.2. Creating epub with Apache Ant
7. Create pdf output
7.1. Overview
7.2. Installation
7.3. Define the Ant Task
8. Influencing the output result
8.1. HTML Parameters
8.2. pdf Parameters
8.3. Add content into the HTML output
9. Advanced Features
9.1. Syntax Highlighting
9.2. Remove certain parts
10. Using XInclude with Eclipse XSL
10.1. Overview
10.2. Eclipse XSL Tools
10.3. Using the XInclude ant task
11. Thank you
12. Questions and Discussion
13. Links and Literature

1. Introduction to DocBook

1.1. Overview

DocBook is an XML-based standard for writing structured documents. DocBook documents can be transformed into other output formats.

DocBook files are plain text. They can therefore be written in any text editor and put under version control.

To transform DocBook into other formats you can use XSLT. XSLT stands for Extensible Stylesheet Language Transformations. Stylesheets for converting DocBook into common output formats are available, e.g. for HTML, pdf, Java help or Unix man pages.

DocBook has two main document classes, book and article.

  • Article: Used for writing technical articles. The main tag is article. Article is used in the following example.

  • Book: Used for longer descriptions. The main tag is book. In addition to the sections used in articles you also have the tags "chapter" and "part".
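To illustrate how "part" and "chapter" nest inside a book, here is a minimal sketch. The titles and the text are made up for illustration; the DOCTYPE assumes the same DTD location as the other examples in this article.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
	<title>Book with parts</title>
	<!-- A part groups several chapters -->
	<part>
		<title>First part</title>
		<chapter>
			<title>A chapter in the first part</title>
			<para>Some text.</para>
		</chapter>
	</part>
</book>
```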

1.2. DocBook Example

The following is an example of a DocBook article:

				
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<article>
	<articleinfo>
		<title>DocBook Intro</title>
		<author>
			<firstname>Lars</firstname>
			<surname>Vogel</surname>
		</author>
	</articleinfo>
	<section label="1.0">
		<title>An introduction to DocBook</title>
		<section label="1.1">
			<title>Subsection</title>
			<para> 
				This is text.
			</para>
		</section>
	</section>
</article>

			

The following is an example of a DocBook book:

				
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
	<title>Docbook Book Example</title>
	<chapter>
		<title>This is the first chapter</title>
		<section>
			<title> First section in the chapter</title>
			<para>Random text.
			</para>
		</section>
		<section>
			<title> Second section in the chapter</title>
			<para>Other random text
			</para>
		</section>
	</chapter>	

	<chapter>
		<title>This is the second chapter</title>
		<section>
			<title> My Title</title>
			<para>More...
			</para>
		</section>
		<section>
			<title> Other title</title>
			<para>Blabla
			</para>
		</section>
	</chapter>
</book>
			

Tip

The DOCTYPE declaration above states that the DTD is stored in the directory "docbook-xml-4.5", one level above the document directory.
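As an alternative sketch, the system identifier can point to the DTD published by OASIS instead of a local copy. This assumes that the XML parser is allowed to fetch the DTD over the network (or that an XML catalog maps the URL to a local file); the rest of this article keeps using the local copy.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- The system identifier is a URL here, not a relative path -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
	"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<article>
	<title>DocBook Intro</title>
	<para>This is text.</para>
</article>
```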

1.3. The required toolset

To create DocBook files and to convert them into other formats you need:

  • The DocBook DTD which defines how a DocBook document must be written.

  • XSLT stylesheets to convert your DocBook document into another format.

  • An XSLT processor

We will use Eclipse as the XML editor, Saxon as the XSLT processor and Apache Ant to run the XSLT transformation.

2. Installation

2.1. Eclipse

Install Eclipse. See Eclipse IDE for installing and using Eclipse. We will use the Apache Ant version which is integrated into Eclipse, therefore no additional installation is required.

2.2. Docbook and Stylesheets

Download the DocBook DTD and the XSLT stylesheets. You can download the DocBook DTD from:

http://www.oasis-open.org/docbook/xml/4.5

The XSLT stylesheets can be downloaded from:

http://docbook.sourceforge.net

At the time of writing this article the version "docbook-xsl-1.76.1" is the most recent version.

2.3. XSL processor

Download Saxon from http://saxon.sourceforge.net/ . Download version 6.5.5, as newer Saxon versions do not work well with DocBook 4.5. Saxon 9 is an XSLT 2.0 processor, while the current official version of the XSL stylesheets is based on XSLT 1.0.

2.4. Issues

Sometimes running the XSLT conversion results in the following strange error message:

				
javax.xml.transform.TransformerConfigurationException: java.net.MalformedURLException: no protocol: ../common/entities.ent
			

In this case add the XML parser Xerces-J to your build path and check whether that resolves the error.
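One way to do this from within the Ant build file is to put the Xerces jar on the same classpath as Saxon. The following is only a sketch: the file name "xercesImpl.jar" and its location in the lib folder are assumptions, and whether it resolves the error depends on your environment.

```xml
<!-- Assumption: xercesImpl.jar from the Xerces-J distribution was copied into lib -->
<path id="saxon.class.path">
	<pathelement location="lib/saxon.jar" />
	<pathelement location="lib/xercesImpl.jar" />
</path>
```

The xslt task picks up these jars if it references the path via a nested classpath element with refid="saxon.class.path".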

3. Convert Docbook to HTML

3.1.  Project Setup

In Eclipse create a new project "de.vogella.docbook.first": select File -> New -> Project and then select General -> Project from the proposed list.

Create the following folder structure:

  • input: Contains your DocBook files

  • docbook-xml-4.5: Contains the DTD definition for DocBook

  • docbook-xsl: Contains the XSL stylesheets to convert to the other output formats.

  • lib: Will contain your libraries (e.g. Saxon and the jars for pdf creation)

Place the DocBook DTD and the XSLT stylesheets into the corresponding directories.

Copy the Saxon jar into the lib folder. The result should look like the following.

3.2. Write your first DocBook document

In your folder "input" create the following file "book.xml". The "../docbook-xml-4.5/docbookx.dtd" corresponds to the directory you have created earlier in your project setup.

				
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<article>
	<articleinfo>
		<title>DocBook Intro</title>
		<author>
			<firstname>Lars</firstname>
			<surname>Vogel</surname>
		</author>
	</articleinfo>
	<section label="1.0">
		<title>An introduction to DocBook</title>
		<section label="1.1">
			<title>Subsection</title>
			<para> 
				This is text.
			</para>
		</section>
	</section>
</article>

			

3.3. Use ant to convert DocBook to html

Create an Apache Ant build file "build.xml" in your project directory.

			
<?xml version="1.0"?>
<!--
  - Author:  Lars Vogel
  -->
<project name="docbook-src" default="build-html">

	<description>
            This Ant build.xml file is used to transform DocBook XML to html output
    </description>

	<!--
      - Configure basic properties that will be used in the file.
      -->
	<property name="input.dir" value="input" />
	<property name="output.dir" value="output" />
	<property name="docbook.xsl.dir" value="docbook-xsl" />
	<property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />

	<!-- Making saxon available -->
	<path id="saxon.class.path">
		<pathelement location="lib/saxon.jar" />
	</path>


	<!--
      - target:  usage
      -->
	<target name="usage" description="Prints the Ant build.xml usage">
		<echo message="Use -projecthelp to get a list of the available targets." />
	</target>

	<!--
      - target:  clean
      -->
	<target name="clean" description="Cleans up generated files.">
		<delete dir="${output.dir}" />
	</target>

	<!--
      - target:  depends
      -->
	<target name="depends">
		<mkdir dir="${output.dir}" />
	</target>

	<!--
      - target:  build-html
      - description:  Iterates through a directory and transforms
      -     .xml files into .html files using the DocBook XSL.
      -->
	<target name="build-html" depends="depends" description="Generates HTML files from DocBook XML">
		<xslt style="${html.stylesheet}" extension=".html" basedir="${input.dir}" destdir="${output.dir}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
			<param name="html.stylesheet" expression="style.css" />
			<classpath refid="saxon.class.path" />
		</xslt>
		<!-- Copy the stylesheet to the same directory as the HTML files -->
		<copy todir="${output.dir}">
			<fileset dir="lib">
				<include name="style.css" />
			</fileset>
		</copy>
	</target>

</project>

		

Run the build.xml file (right mouse click, run as -> Ant Build). Then check the output directory. You should find a directory "Example", with the file "book.html".

Congratulations! You created your first DocBook document and converted it into an HTML document.

4. Convert Docbook to plain text

The best way to convert Docbook files to plain text is first to convert them to HTML and then use the text browser Lynx to convert it to text with the following command.

			
lynx -dump myfile.html > myfile.txt
		

This way the text is well structured, e.g. tables are rendered nicely.
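If you prefer to run this step from the Ant build file, the command can be wrapped in an exec task. This is a sketch under a few assumptions: lynx is installed and on the PATH, the target name "build-text" is made up, and the HTML build has already written "book.html" into the output directory.

```xml
<!-- Sketch: assumes lynx is on the PATH and build-html produced ${output.dir}/book.html -->
<target name="build-text" depends="build-html" description="Generates plain text from the HTML output">
	<!-- The output attribute redirects the text printed by lynx into a file -->
	<exec executable="lynx" output="${output.dir}/book.txt" failonerror="true">
		<arg value="-dump" />
		<arg value="${output.dir}/book.html" />
	</exec>
</target>
```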

5. DocBook Tags

The following is an overview of useful DocBook tags.

5.1. Tags

Table 1. Important DocBook tags

Tag | Explanation
<![CDATA[ SPECIAL_SIGN_HERE, e.g. & ]]> | Allows you to enter special characters which would otherwise be interpreted as XML markup.
<programlisting> </programlisting> | Marks the text as source code.
<emphasis> </emphasis> | Emphasizes the text.
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="example1.txt" /> | Includes example1.txt as text, so the file can contain tags etc. without them being interpreted.
<ulink url="http://www.heise.de/newsticker">German IT News</ulink> | Inserts a hypertext link into the document.
&amp; | Creates the & sign. Can for example be used in links.
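The CDATA and programlisting tags are often combined, so that source code containing characters like < and & does not have to be escaped. The following is a small sketch; the Java snippet inside is made up for illustration.

```xml
<programlisting><![CDATA[
if (a < b && b > 0) {
	System.out.println("a & b");
}
]]></programlisting>
```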


5.2. Tables

You can create a table like this:

				
<table frame='all'>
	<title>Sample Table</title>
	<tgroup cols='2' align='left' colsep='1' rowsep='1'>
		<colspec colname='c1' />
		<colspec colname='c2' />
		<thead>
			<row>
				<entry>a4</entry>
				<entry>a5</entry>
			</row>
		</thead>
		<tfoot>
			<row>
				<entry>f4</entry>
				<entry>f5</entry>
			</row>
		</tfoot>
		<tbody>
			<row>
				<entry>b1</entry>
				<entry>b2</entry>
			</row>
			<row>
				<entry>d1</entry>
				<entry>d5</entry>
			</row>
		</tbody>
	</tgroup>
</table>
			

The output then looks like this:

Table 2. Sample Table

a4 a5
f4 f5
b1 b2
d1 d5


5.3. Lists

You can create non-numbered lists like this:

				
<itemizedlist>
	<listitem>
		<para>Item1</para>
	</listitem>
	<listitem>
		<para>Item2</para>
	</listitem>
	<listitem>
		<para>Item3</para>
	</listitem>
	<listitem>
		<para>Item4</para>
	</listitem>
</itemizedlist>
			

The output then looks like this:

  • Item1

  • Item2

  • Item3

  • Item4

You can create numbered lists like this:

				
<orderedlist>
	<listitem> 
	<para>This is a list entry</para>
	</listitem>
	<listitem>
	<para>This is another list entry</para>
	</listitem>
</orderedlist>

			

The output then looks like this:

  1. This is a list entry

  2. This is another list entry

5.4. Links

You can create links like this:

			
<para>
	We use the Ant integrated into Eclipse. See
	<ulink url="http://www.vogella.de/articles/ApacheAnt/article.html"> Apache Ant Tutorial</ulink>
	for an introduction into Apache Ant.
</para>
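The ulink tag points to external addresses. For links inside the same document DocBook also offers the tags xref and link, which refer to the id attribute of another element. The following sketch uses made-up ids and titles.

```xml
<section id="installation">
	<title>Installation</title>
	<para>Install Eclipse.</para>
</section>
<section id="usage">
	<title>Usage</title>
	<para>
		<!-- xref generates the link text from the target -->
		This assumes the setup from <xref linkend="installation" />.
		<!-- link lets you define the link text yourself -->
		See also the <link linkend="installation">installation notes</link>.
	</para>
</section>
```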
		

5.5. Graphics

DocBook does not restrict which graphic format you use, e.g. JPEG, PNG or SVG are possible. You can include graphics via the following tags. The optional "phrase" is used in the HTML output to define the "alt" attribute of the image.

			
<para>
	<mediaobject>
		<imageobject>
			<imagedata fileref="images/antview10.gif"/>
		</imageobject>
		<textobject>
			<phrase>A text for the graphic</phrase>
		</textobject>
	</mediaobject>
</para>
		

You can also specify different graphics for different output formats.

			
<para>
	<mediaobject>
		<imageobject role="html">
			<imagedata fileref="images/antview10.gif" />
			
		</imageobject>
		<imageobject role="fo">
			<imagedata fileref="images/antview10.gif" />
			
		</imageobject>
		<textobject>
			<phrase>A text for the graphic</phrase>
		</textobject>
	</mediaobject>
</para>
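The role attribute becomes more useful if the files actually differ per output format. The following sketch assumes two hypothetical files, a bitmap for the HTML output and a vector graphic for the FO (pdf) output.

```xml
<mediaobject>
	<imageobject role="html">
		<!-- hypothetical bitmap version for browsers -->
		<imagedata fileref="images/diagram.png" format="PNG" />
	</imageobject>
	<imageobject role="fo">
		<!-- hypothetical vector version for print -->
		<imagedata fileref="images/diagram.svg" format="SVG" />
	</imageobject>
	<textobject>
		<phrase>A diagram</phrase>
	</textobject>
</mediaobject>
```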
		

5.6. Menus

To show menu paths, for example File -> New Project, use the following:

					
<menuchoice>
	<guimenu>File</guimenu>
	<guisubmenu>New Project</guisubmenu>
</menuchoice>
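Longer menu paths can be described in the same way. As a sketch, the path File -> New -> Project used earlier in this article could be written with an additional guimenuitem for the final entry:

```xml
<menuchoice>
	<guimenu>File</guimenu>
	<guisubmenu>New</guisubmenu>
	<guimenuitem>Project</guimenuitem>
</menuchoice>
```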
		

5.7. Keyboard Shortcuts

To define keyboard shortcuts, for example CTRL+Space, use the following:

					
<keycombo>
		<keycap>CTRL</keycap>
		<keycap>Space</keycap>
</keycombo>
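A sequence of shortcuts, such as the Emacs-style C-x followed by C-c, can be sketched by nesting keycombo elements. The outer element uses action="seq" to state that the inner combinations are pressed one after the other; how exactly this is rendered depends on the stylesheets.

```xml
<keycombo action="seq">
	<keycombo>
		<keycap>CTRL</keycap>
		<keycap>x</keycap>
	</keycombo>
	<keycombo>
		<keycap>CTRL</keycap>
		<keycap>c</keycap>
	</keycombo>
</keycombo>
```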
		

6. Creating epub

6.1. Overview of EPub

EPUB is a format for electronic books defined by the International Digital Publishing Forum (IDPF). EPUB is based on XHTML and supports styling via CSS. An EPUB file is a zip file with a predefined content. The zip file must contain a folder "META-INF" which contains a file "container.xml". This file contains a pointer to the OEBPS/content.opf file. The content.opf file contains the meta information about the book and pointers to the content pages, which are defined as HTML pages.
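For orientation, a typical "container.xml" looks like the following sketch. You normally do not write this file by hand, as it is generated during the conversion; it is shown here only to illustrate the pointer to the content.opf file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
	<rootfiles>
		<!-- full-path is relative to the root of the zip file -->
		<rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml" />
	</rootfiles>
</container>
```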

The DocBook XSLT stylesheets support the conversion into EPUB. This conversion is based on the XHTML stylesheets and therefore supports the same parameters as the HTML conversion. The final epub document also requires an additional file "mimetype" with a predefined content, next to the content of OEBPS and META-INF. The XSLT transformation does not create the mimetype file and the zip file automatically; we will use Apache Ant to create them.

To validate an epub you can use the jar file from Epubcheck Validation tool. Download the latest 1.x version and put it into the classpath of your Ant file. Make sure that you also extract the lib folder included in the zip file relative to the epub*.jar. After the conversion you can validate your epub via the following command line. We will include the check also in our Ant task.

				
java -jar epubcheck-1.2.jar book.epub
			

6.2. Creating epub with Apache Ant

The following example is based on the same file and directory structure as the other examples. Create the following "book.xml" file in your "input" directory.

				
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
	<title>Docbook Book Example</title>
	
	<bookinfo>
		<title>DocBook Intro</title>
		<author>
			<firstname>Lars</firstname>
			<surname>Vogel</surname>
		</author>
	</bookinfo>
	
	<chapter>
		<title>This is the first chapter</title>
		<section>
			<title> First section in the chapter</title>
			<para>Random text.
			</para>
			<para>
			<mediaobject>
				<imageobject>
					<imagedata fileref="images/vogella_current_logo.png"/>
				</imageobject>
				<textobject>
				<phrase>
				</phrase>
			</textobject>
			</mediaobject>
		</para>
		</section>
		<section>
			<title> Second section in the chapter</title>
			<para>Other random text
			</para>
		</section>
		
	</chapter>

	<chapter>
		<title>This is the second chapter</title>
		<section>
			<title> My Title</title>
			<para>More...
			</para>
		</section>
		<section>
			<title> Other title</title>
			<para>Blabla
			</para>
		</section>
	</chapter>
</book>
			

This book refers to an image "vogella_current_logo.png" in the "images" folder. Either create such an image or delete the mediaobject part.

Also create a folder "epubinput" with a file "mimetype". This file should have the following content. "application/epub+zip" needs to be the only content of the file and it needs to be in the first line of the file.

				
application/epub+zip
			

The following Ant "build.xml" file will create epub output.

				
<?xml version="1.0"?>
<!--
  - Author:  Lars Vogel
  -->

<project name="docbook-src" default="build-epub">
	<description>
            This Ant file is used to transform DocBook XML to epub output
    </description>

	<!--
      - Configure basic properties that will be used in the file.
      -->

	<property name="input.dir" value="input" />
	<property name="output.dir" value="output" />
	<property name="docbook.xsl.dir" value="docbook-xsl-1.76.1" />

	<property name="epub.stylesheet" value="${docbook.xsl.dir}/epub/docbook.xsl" />

	<property name="destfilename" value="book" />

	<!-- Making saxon available -->
	<path id="saxon.class.path">
		<pathelement location="lib/saxon.jar" />
	</path>

	<property name="epubcheck.jar" value="lib/epubcheck/epubcheck-1.2.jar" />
	<!--
      - target:  usage
      -->

	<target name="usage" description="Prints the Ant build.xml usage">
		<echo message="Use -projecthelp to get a list of the available targets." />
	</target>

	<!--
      - target:  clean
      -->

	<target name="clean" description="Cleans up generated files.">
		<delete dir="${output.dir}" />
	</target>

	<!--
      - target:  depends
      -->

	<target name="depends">
		<mkdir dir="${output.dir}" />
		<mkdir dir="${output.dir}/tmp" />
		<copy todir="${output.dir}/tmp">
			<fileset dir="epubinput">
				<include name="mimetype" />
			</fileset>
		</copy>
		<copy todir="${output.dir}/tmp/OEBPS/images">
			<fileset dir="images">
				<include name="vogella_current_logo.png" />
			</fileset>
		</copy>
	</target>

	<!--
      - target:  build-html
      - description:  Iterates through a directory and transforms
      -     .xml files into .html files using the DocBook XSL.
      -->

	<!--
	   - target:  build-epub
	   - description:  Iterates through a directory and transforms
	   -     .xml files into .epub files using the DocBook XSL.
	 -->
	<target name="build-epub" depends="clean, depends" description="Generates EPUB files from DocBook XML">

		<xslt style="${epub.stylesheet}" extension=".html" 
			basedir="${input.dir}" destdir="${output.dir}/tmp">
			<include name="**/*book.xml" />
			<param name="epub.stylesheet" expression="style.css" />
			<!-- The following parameters do not work currently
			
			<param name="epub.metainf.dir" expression="${output.dir}/META-INF/" />
			<param name="epub.oebps.dir" expression="${output.dir}/OEBPS/" />
			-->
			<classpath refid="saxon.class.path" />
		</xslt>

		<copy todir="${output.dir}/tmp/OEBPS">
			<fileset dir="OEBPS">
			</fileset>
		</copy>

		<copy todir="${output.dir}/tmp/META-INF">
			<fileset dir="META-INF">
			</fileset>
		</copy>

		<!-- Don't know how to avoid generation of "${destfilename}.html" by Saxon -->
		<delete file="${output.dir}/tmp/book.html" />

		<echo message="Generating book.epub" level="info" />

		<!-- We create temporary zips so that mimetype is the first entry in the final zip  -->

		<zip destfile="${output.dir}/temp.mimetype" basedir="${output.dir}/tmp" compress="false" includes="mimetype" />
		<zip destfile="${output.dir}/temp.zip" basedir="${output.dir}/tmp/" level="9" compress="true" excludes="mimetype" includes="OEBPS/** META-INF/**" />
		<zip destfile="${output.dir}/book.epub" update="true" keepcompression="true" encoding="UTF-8" excludes="*.html">
			<zipfileset src="${output.dir}/temp.mimetype" />
			<zipfileset src="${output.dir}/temp.zip" />
		</zip>

		<!-- These directories have to be deleted; it would be nicer to place them in the tmp output dir -->
		<delete dir="./OEBPS" />
		<delete dir="./META-INF" />

		<!-- Make sure the epubcheck lib has a subfolder lib with saxon.jar and jing.jar in it
		-->
		<epub.check epub="book" />
		
	</target>

	<!-- epub check macro definition -->
	<macrodef name="epub.check" description="Check an epub">
		<attribute name="epub" description="Name of the EPUB" />
		<sequential>
			<java jar="${epubcheck.jar}" fork="true">
				<arg value="${output.dir}/@{epub}.epub" />
			</java>
		</sequential>
	</macrodef>
</project>
			

I personally see the following issues. Please let me know if you have a solution for them.

  • Target location of META-INF/ can be specified via "epub.metainf.dir" but if you do so this path is also used in the container.xml.

  • Same issue with "epub.oebps.dir".

You find another example Ant file in the blog entry "Ant for EPUB" from Tony Graham.

7. Create pdf output

7.1. Overview

You can convert DocBook to XSL-FO via the DocBook XSL stylesheets. XSL-FO stands for XSL Formatting Objects and is an XML standard which is optimized for print media. XSL-FO can then be translated into PDF via the Apache FOP library.

7.2. Installation

In addition to the existing setup you also require the Apache FOP library. Download the binary FOP distribution from http://xmlgraphics.apache.org/fop/.

Copy all the jars from the FOP distribution into your library directory and add them to the Ant build path. See Apache Ant Tutorial on how to modify the Ant build path.
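As an alternative to changing the Ant build path in Eclipse, the jars can also be referenced directly in the build file. The following is a sketch which assumes that all FOP jars were copied into the "lib" folder; the path id "fop.class.path" is a made-up name.

```xml
<!-- Sketch: assumes all FOP jars were copied into the lib folder -->
<path id="fop.class.path">
	<fileset dir="lib">
		<include name="*.jar" />
	</fileset>
</path>

<!-- The task class is loaded from the jars listed above -->
<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" classpathref="fop.class.path" />
```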

7.3. Define the Ant Task

You have to define the FOP task in your Ant build file and then call it. The following shows how to define the task and how to call it. The second listing is then a full example Ant build.xml file.

				
<!--
	- Defines the ant task for FOP
-->
<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" />


<!-- Transformation into pdf
	- Two steps
	- 1.) First create the FO files 
	- 2.) Then transform the FO files into pdf files
-->

<!--
	- target:  build-pdf
	- description:  Iterates through a directory and transforms
	-     .xml files into .fo files using the DocBook XSL.
-->
<target name="build-pdf" depends="depends, xinclude"
	description="Generates pdf files from DocBook XML">
	<!-- Convert DocBook Files into FO -->
	<xslt style="${fo.stylesheet}" extension=".fo" basedir="${src.tmp}"
		destdir="${src.tmp}">
		<include name="**/*book.xml" />
		<include name="**/*article.xml" />
		<param name="section.autolabel" expression="1" />
	</xslt>
	<!-- Convert FO Files into pdf -->
	<fop format="application/pdf" outdir="${doc.dir}">
		<fileset dir="${src.tmp}">
			<include name="**/*.fo" />
		</fileset>
	</fop>
</target>

			

				
<?xml version="1.0"?>
<!--
	- Author:  Lars Vogel
-->
<project name="docbook-src" default="all">

	<description>
		This Ant build.xml file is used to transform DocBook XML to
		various output formats
	</description>

	<!--
		- Defines the ant task for xinclude
	-->

	<taskdef name="xinclude" classname="de.vogella.xinclude.XIncludeTask" />

	<!--
		- Defines the ant task for FOP
	-->
	<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" />

	<!--
		- Configure basic properties that will be used in the file.
	-->


	<property name="javahelp.dir" value="${basedir}/../Documentation/output/vogella/javahelp" />
	<property name="src" value="${basedir}/documentation" />
	<property name="output.dir" value="${basedir}/../Documentation/output/vogella/articles" />
	<property name="output.tmp" value="${basedir}/output.tmp" />
	<property name="lib" value="${basedir}/lib/" />
	<property name="docbook.xsl.dir" value="${basedir}/docbook-xsl-1.72.0" />
	<property name="xinclude.lib.dir" value="${basedir}/lib/" />

	<!--
		- Usage of the different style sheets which will be used for the transformation
	-->
	<property name="eclipse.stylesheet" value="${docbook.xsl.dir}/eclipse/eclipse.xsl" />
	<property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />
	<property name="fo.stylesheet" value="${docbook.xsl.dir}/fo/docbook.xsl" />
	<property name="javahelp.stylesheet" value="${docbook.xsl.dir}/javahelp/javahelp.xsl" />



	<property name="chunk-html.stylesheet" value="${docbook.xsl.dir}/html/chunk.xsl" />




	<!--
		- target:  usage
	-->
	<target name="usage" description="Prints the Ant build.xml usage">
		<echo message="Use -projecthelp to get a list of the available targets." />
	</target>

	<!--
		- target:  clean
	-->
	<target name="clean" description="Cleans up generated files.">
		<delete dir="${output.dir}" />
	</target>

	<!--
		- target:  depends
	-->
	<target name="depends">
		<mkdir dir="${output.dir}" />
	</target>

	<!--
			- target:  copy 
			- Copies the images from the subdirectories to the target folder
		-->
	<target name="copy">
		<echo message="Copy the images" />
		<copy todir="${output.dir}">
			<fileset dir="${src}">
				<include name="**/images/*.*" />
			</fileset>
		</copy>
	</target>


	<!--
		- target: xinclude
		- description: Creates one combined temporary file for each of the input files. 
		- The combined file will then be processed via different ant tasks
	-->
	<target name="xinclude">

		<xinclude in="${src}/DocBook/article.xml" out="${output.tmp}/DocBook/article.xml" />

		<xinclude in="${src}/JavaConventions/article.xml" out="${output.tmp}/JavaConventions/article.xml" />

		<xinclude in="${src}/JUnit/article.xml" out="${output.tmp}/JUnit/article.xml" />

		<xinclude in="${src}/EclipseReview/article.xml" out="${output.tmp}/EclipseReview/article.xml" />

		<xinclude in="${src}/HTML/article.xml" out="${output.tmp}/HTML/article.xml" />

		<xinclude in="${src}/Eclipse/article.xml" out="${output.tmp}/Eclipse/article.xml" />

		<xinclude in="${src}/Logging/article.xml" out="${output.tmp}/Logging/article.xml" />
		<!--
		<xinclude in="${src}/ant/article.xml" out="${src.tmp}/ant/article.xml" />
		-->

	</target>


	<!--
		- target:  build-html
		- description:  Iterates through a directory and transforms
		-     .xml files into .html files using the DocBook XSL.
	-->
	<target name="build-html" depends="depends, xinclude" description="Generates HTML files from DocBook XML">
		<xslt style="${html.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${output.dir}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
			<param name="html.stylesheet" expression="styles.css" />
			<param name="section.autolabel" expression="1" />
			<param name="html.cleanup" expression="1" />
			<outputproperty name="indent" value="yes" />
		</xslt>
		<!-- Copy the stylesheet to the same directory as the HTML files -->
		<copy todir="${output.dir}">
			<fileset dir="lib">
				<include name="styles.css" />
			</fileset>
		</copy>
	</target>

	<!--
			- target:  build-javahelp
			- description:  Iterates through a directory and transforms
			-     .xml files into .html files using the DocBook XSL.
		-->
	<target name="build-javahelp" depends="depends, xinclude" description="Generates JavaHelp files from DocBook XML">
		<xslt style="${javahelp.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${javahelp.dir}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
			<outputproperty name="indent" value="yes" />
		</xslt>
	</target>





	<!--
		- target:  build-chunks
		- description:  Iterates through a directory and transforms
		-     .xml files into separate .html files using the DocBook XSL.
	-->
	<target name="build-chunks" depends="depends, xinclude" description="Generates chunk HTML files from DocBook XML">
		<xslt style="${chunk-html.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${output.dir}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
			<param name="html.stylesheet" expression="styles.css" />
			<param name="section.autolabel" expression="1" />
			<param name="html.cleanup" expression="1" />
			<param name="chunk.first.sections" expression="1" />
		</xslt>
		<!-- Copy the stylesheet to the same directory as the HTML files -->
		<copy todir="${output.dir}">
			<fileset dir="lib">
				<include name="styles.css" />
			</fileset>
		</copy>
	</target>


	<!-- Transformation into pdf
		- Two steps
		- 1.) First create the FO files 
		- 2.) Then transform the FO files into pdf files
	-->

	<!--
		- target:  build-pdf
		- description:  Iterates through a directory and transforms
		- .xml files into .fo files using the DocBook XSL.
		- Relativebase is set to true to enable FOP to find the graphics which are included 
        - in the images directory
	-->
	<target name="build-pdf" depends="depends, xinclude" description="Generates pdf files from DocBook XML">
		<!-- Convert DocBook Files into FO -->
		<xslt style="${fo.stylesheet}" extension=".fo" basedir="${output.tmp}" destdir="${output.tmp}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
			<param name="section.autolabel" expression="1" />
		</xslt>
		<!-- Convert FO Files into pdf -->
		<fop format="application/pdf" outdir="${output.dir}" relativebase="true">
			<fileset dir="${output.tmp}">
				<include name="**/*.fo" />
			</fileset>
		</fop>
	</target>

	<!--
		- target:  build-eclipse
		- description:  Iterates through a directory and transforms
		-     .xml files into Eclipse help files using the DocBook XSL.
	-->
	<target name="build-eclipse" depends="depends, xinclude" description="Generates Eclipse help files from DocBook XML">
		<xslt style="${eclipse.stylesheet}" basedir="${output.tmp}" destdir="${output.dir}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
		</xslt>
	</target>

	<target name="all" depends="copy, build-html, build-pdf, build-chunks, build-eclipse">
	</target>

</project>
			

8. Influencing the output result

The XSLT stylesheets have several parameters which can influence the result of the conversion.

8.1. HTML Parameters

You find all HTML relevant parameters here http://docbook.sourceforge.net/release/xsl/current/doc/html/ .

Table 3. HTML Parameters

Parameter | Description
name="section.autolabel" expression="1" | Turns on the autolabeling for sections (1. Title, 1.1. Subtitle, etc.)
name="chapter.autolabel" expression="1" | Turns on the autolabeling for chapters
name="html.stylesheet" expression="styles.css" | Defines the CSS stylesheet which should be used.
name="html.cleanup" expression="1" | Will try to clean up the HTML code for better readability
name="chunk.first.sections" expression="0" | Controls whether the first top-level section of a chapter or article gets its own chunk; with 0 it stays in the same chunk as its parent. [TODO: Does not work yet]
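Instead of passing each parameter from the Ant file, the parameters can also be collected in a small customization stylesheet which imports the standard HTML stylesheet. The following is a sketch; it assumes the file is stored in the project directory under the made-up name "custom-html.xsl", next to the "docbook-xsl" folder.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
	<!-- Import the standard HTML stylesheet; the path assumes the project layout of this article -->
	<xsl:import href="docbook-xsl/html/docbook.xsl" />

	<!-- Parameters defined here override the defaults of the imported stylesheet -->
	<xsl:param name="section.autolabel" select="1" />
	<xsl:param name="html.stylesheet" select="'styles.css'" />
</xsl:stylesheet>
```

To use it, the style attribute of the xslt task would point to "custom-html.xsl" instead of the standard stylesheet.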


8.2. pdf Parameters

You find all FO / PDF relevant parameters here http://docbook.sourceforge.net/release/xsl/current/doc/fo/ .

Table 4. pdf Parameters

Parameter | Description
name="section.autolabel" expression="1" | Turns on the autolabeling for sections (1. Title, 1.1. Subtitle, etc.)
name="chapter.autolabel" expression="1" | Turns on the autolabeling for chapters
name="html.stylesheet" expression="styles.css" | Defines the CSS stylesheet which should be used (only relevant for HTML output).
name="html.cleanup" expression="1" | Will try to clean up the HTML code for better readability (only relevant for HTML output).
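As a sketch of an FO specific parameter, the paper format can be set via "paper.type". The property names in the following fragment are taken from the full build file shown earlier; the restriction to article.xml files is only for brevity.

```xml
<xslt style="${fo.stylesheet}" extension=".fo" basedir="${output.tmp}" destdir="${output.tmp}">
	<include name="**/*article.xml" />
	<param name="section.autolabel" expression="1" />
	<!-- paper.type is a parameter of the FO stylesheets; A4 is one of its supported values -->
	<param name="paper.type" expression="A4" />
</xslt>
```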


8.3. Add content into the HTML output

DocBook allows you to include external HTML files in the HTML output. For example you could use this to add JavaScript to your HTML output.

For example use the following statement to include some html code.

				
<?dbhtml-include href="../../myadditonalcontent.html"?>
			

See Inserting external HTML code for details.

9. Advanced Features

9.1. Syntax Highlighting

You can also enable syntax highlighting. This involves creating a customization stylesheet layer, the usage of an external lib and configuration file. Please see Source Code Syntax Highlighting with DocBook for a description of the setup.

To change how the highlighting is done you could adjust the following template file: your_xslt_installation_dir/html/highlight.xsl

9.2. Remove certain parts

Sometimes you want to remove certain parts of your document before processing it. The following is an example in which sections marked with role="wrapper" are removed, while the sections nested inside them are kept.

The following processing rule removes the marked wrapper sections. You would write the result to a temporary folder and run your real conversion on that temporary folder.

				
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	version="1.0">

	<xsl:output method="xml" />

	<xsl:template match="section[@role='wrapper']">
		<xsl:apply-templates select="section" />
	</xsl:template>

	<xsl:template match="@*|node()">
		<xsl:copy>
			<xsl:apply-templates select="@*|node()" />
		</xsl:copy>
	</xsl:template>

</xsl:stylesheet>
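
One way to run this stylesheet is the Ant xslt task. The sketch below assumes that the stylesheet is saved as remove-wrapper.xsl in the project folder and that the properties src and src.tmp point to the source folder and the temporary folder. All of these names are hypothetical.

```xml
<!-- Hypothetical target; file and property names are illustrative assumptions -->
<target name="remove-wrapper" description="Copies the sources without wrapper sections">
	<xslt style="${basedir}/remove-wrapper.xsl" extension=".xml"
		basedir="${src}" destdir="${src.tmp}">
		<include name="**/*.xml" />
	</xslt>
</target>
```

The real conversion target can then use the temporary folder as its basedir.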

		
			

10. Using XInclude with Eclipse XSL

10.1. Overview

XInclude can be used to structure the DocBook source files so that you have one file per chapter / section and one master file which includes these files. Via XInclude these separate files can be combined into one file.

For example, you can include a file "foo.xml" into another one via the following statement:

				
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="foo.xml" />
			

If the file should be included as plain text instead of being parsed as XML, add parse="text":

				
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="bar.xml" />
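
Putting both forms together, a master file might look like the following sketch. The file names introduction.xml, installation.xml and build.xml are made up for illustration; each of the first two is assumed to contain a single section element.

```xml
<?xml version="1.0"?>
<article>
	<title>My article</title>
	<!-- Each included file is parsed as XML and holds one section -->
	<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="introduction.xml" />
	<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="installation.xml" />
	<section id="listing">
		<title>Build file</title>
		<!-- Included as text, so the XML of build.xml shows up as a listing -->
		<programlisting><xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="build.xml" /></programlisting>
	</section>
</article>
```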
			

An XInclude ant task is provided by the Eclipse XSL project. I can proudly say that this ant task was contributed by me to the Eclipse XSL project. :-)

10.2. Eclipse XSL Tools

The Eclipse XSL Tools support XSLT transformations, including editing and debugging of XSL files. We will only use the XInclude task but you have to install the whole package.

Install the XSL tools via the update manager from the standard update site of your Eclipse release (Indigo for Eclipse 3.7). See Using the Eclipse update manager for details.

10.3. Using the XInclude ant task

Take the "org.eclipse.wst.xsl.core.jar" from your Eclipse installation and add this jar to your Ant classpath. See Apache Ant Tutorial - Classpath for details.

You should now be able to create and run the xinclude task. Below is an example ant build.xml file.

				
<?xml version="1.0"?>
<!--
  - Author:  Lars Vogel
  -->
<project name="docbook-src" default="usage">

	<description>
		This Ant build.xml file is used to transform DocBook XML to various
		output formats
	</description>

	<!--
      - Configure basic properties that will be used in the file.
      -->

	<property name="doc.dir" value="${basedir}/output" />
	<property name="src" value="${basedir}/src" />
	<property name="src.tmp" value="${basedir}/src.tmp" />
	<property name="lib" value="${basedir}/lib/" />
	<property name="docbook.xsl.dir" value="${basedir}/docbook-xsl-1.72.0" />
	
	<property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />
	<property name="xinclude.lib.dir" value="${basedir}/lib/" />


	<!--
      - target:  usage
      -->
	<target name="usage" description="Prints the Ant build.xml usage">
		<echo message="Use -projecthelp to get a list of the available targets." />
	</target>

	<!--
      - target:  clean
      -->
	<target name="clean" description="Cleans up generated files.">
		<delete dir="${doc.dir}" />
		<!-- The combined XInclude output is generated as well -->
		<delete dir="${src.tmp}" />
	</target>

	<!--
      - target:  depends
      -->
	<target name="depends">
		<mkdir dir="${doc.dir}" />
	</target>


	<!--
      - target:  xinclude
      - description:  Creates one combined temporary file from the
      -     different input files. The combined file is then processed
      -     by the other targets.
      -->
	<target name="xinclude">
		<xsl.xinclude in="${src}/DocBook/article.xml" out="${src.tmp}/DocBook/article.xml" />
	</target>
	
	
	<!--
      - target:  build-html
      - description:  Iterates through a directory and transforms
      -     .xml files into .html files using the DocBook XSL.
      -->
	<target name="build-html" depends="depends, xinclude" description="Generates HTML files from DocBook XML">
		<xslt style="${html.stylesheet}" extension=".html" basedir="${src.tmp}" destdir="${doc.dir}">
			<include name="**/*book.xml" />
			<include name="**/*article.xml" />
			<param name="html.stylesheet" expression="styles.css" />
		</xslt>
		<!-- Copy the stylesheet to the same directory as the HTML files -->
		<copy todir="${doc.dir}">
			<fileset dir="lib">
				<include name="styles.css" />
			</fileset>
		</copy>
	</target>

</project>
			

11. Thank you

Please help me to support this article.

12. Questions and Discussion

Before posting questions, please see the vogella FAQ. If you have questions or find an error in this article, please use the www.vogella.de Google Group. I have also written a short list on how to ask good questions, which might help you.

13. Links and Literature

Eclipse XSLT

http://www.sagehill.net/docbookxsl/index.html DocBook XSL Online Book from Bob Stayton

http://www.sagehill.net/docbookxsl/HTMLTitlePage.html How to customize the title page of the DocBook output

http://www.ibiblio.org/godoy/sgml/docbook/howto/writing-docbook.html Example Docbook Elements

http://sourceforge.net/projects/docbook/ The XSLT stylesheets

http://code.google.com/p/epub-tools/source/checkout Create epub from DocBook via Python

http://wiki.docbook.org/DocBookAppsMailingList XSLT mailing list

XSLT 2.0 Stylesheets

http://docbook.sourceforge.net/release/xsl/current/doc/ - Reference of the XSLT stylesheet parameters

Syntax Highlighting with XSLTHL Reference of the XSLT stylesheet parameters

http://www.docbook.org/tdg/en/html/docbook.html Reference of the DocBook parameters

http://xmlgraphics.apache.org/fop/ The Apache FOP Distribution