XML schema for the Confluence storage format

View the schema files in GitHub

This repository contains:

Getting started

Before you begin

You will need a validating XML editor.

For example (there are many alternatives):

The following procedure assumes that you have extracted all of the files in this package into one directory.

Procedure

  1. Open the supplied file confluence-page-example.xml in your XML editor.
  2. "Play" (edit the contents).

    Explore the DTD/XSD-aware features of your XML editor, such as (depending on your particular editor):

Editing XML copied from the Confluence Source Editor plugin

About this task

The Confluence Source Editor plugin ("advanced editor") displays the source of a page as an XML snippet: a collection of XML elements, without a single root element.

To edit the source as a document in a validating XML editor, you need to wrap the snippet in a root element.

This package supplies example XML documents that use the root element name ac:confluence.
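If you prefer to script the wrapping step rather than paste into the template, the idea can be sketched as follows. This is a minimal sketch, not part of the package: the function name wrap_snippet is hypothetical, the namespace URIs and the DOCTYPE are copied from the listings in this readme, and the xsi:schemaLocation attribute used by the supplied template is omitted for brevity.

```python
# Namespace URIs as shown in the example documents in this readme.
NAMESPACES = {
    "ac": "http://www.atlassian.com/schema/confluence/4/ac/",
    "ri": "http://www.atlassian.com/schema/confluence/4/ri/",
    None: "http://www.atlassian.com/schema/confluence/4/",  # default namespace
}

# Assumes confluence.dtd is in the same directory as the resulting document.
DOCTYPE = '<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">'


def wrap_snippet(snippet: str) -> str:
    """Wrap a rootless page-source snippet in an ac:confluence root element.

    Hypothetical helper: the snippet text is inserted unchanged (apart from
    trimming leading and trailing whitespace), so nothing is re-serialized.
    """
    attrs = []
    for prefix, uri in NAMESPACES.items():
        name = "xmlns" if prefix is None else f"xmlns:{prefix}"
        attrs.append(f'{name}="{uri}"')
    start_tag = "<ac:confluence\n  " + "\n  ".join(attrs) + ">"
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        DOCTYPE,
        start_tag,
        snippet.strip(),
        "</ac:confluence>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(wrap_snippet("<p>Contents of page body</p>"))
```

The sketch treats the snippet as plain text on purpose: parsing and re-serializing it could change whitespace or entity references that you want to paste back into Confluence unchanged.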

Before you begin

You will need:

Procedure

  1. In your XML editor, open the supplied file confluence-page-template.xml.
  2. In Confluence:
    1. Edit a page (open a page in the rich text editor).
    2. Open the page in the source editor.
    3. Press Ctrl+A to select all of the source.
    4. Press Ctrl+C to copy the source to the Clipboard.
  3. In your XML editor:
    1. Select the following comment:
      <!-- Replace this comment with your page source -->
      
    2. Press Ctrl+V to paste the source copied from Confluence.
    3. Edit the source.
    4. When you have finished editing, select the source between the <ac:confluence> start tag and the </ac:confluence> end tag. Do not select <ac:confluence> or </ac:confluence>.
    5. Copy the selected source to the Clipboard.
  4. In Confluence, press Ctrl+V to paste the edited source into the Source Editor plugin, replacing the original source.

Editing XML accessed via WebDAV

The Confluence WebDAV plugin serves page source in the same way as the Confluence Source Editor plugin: as an XML snippet without a root element, not as an XML document. It also serves the source with the file extension .txt rather than, say, .xml.

As in the previous procedure for the Confluence Source Editor plugin, you need to wrap the snippet in a root element before editing, and then unwrap it before saving it back to Confluence.
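The unwrapping step can be sketched in a few lines of script. This is only an illustration under stated assumptions: the function name unwrap_document is hypothetical, and the sketch assumes that the root element is named ac:confluence (as in the supplied examples) and that no attribute value in the start tag contains a ">" character.

```python
import re


def unwrap_document(document: str) -> str:
    """Return the text between the ac:confluence start tag and end tag.

    Hypothetical helper: works on the raw text, so the page source is
    returned exactly as it was edited, without being re-serialized.
    """
    # The start tag may span several lines because of namespace declarations.
    start = re.search(r"<ac:confluence\b[^>]*>", document)
    end = document.rfind("</ac:confluence>")
    if start is None or end == -1 or end < start.end():
        raise ValueError("no ac:confluence root element found")
    return document[start.end():end].strip() + "\n"


if __name__ == "__main__":
    wrapped = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">\n'
        '<ac:confluence\n'
        '  xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/">\n'
        "<p>Contents of page body</p>\n"
        "</ac:confluence>\n"
    )
    print(unwrap_document(wrapped))
```

You could then write the returned text back to the .txt file served by the WebDAV plugin.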

I have added a comment to the Confluence Storage Format page requesting a change to the behavior of the WebDAV plugin, so far without response from Atlassian.

Using a catalog, rather than explicitly referring to the DTD/XSD in each document instance

To validate an XML document, an XML editor needs to know where to find the DTD/XSD files.

The supplied file confluence-page-template.xml contains explicit references to confluence.dtd and confluence.xsd:

<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">
<ac:confluence ...
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.atlassian.com/schema/confluence/4/ac/ confluence.xsd">

With the references shown above, the DTD and XSD files must be in the same directory as the document (.xml). Alternatively, you can use relative or absolute references that point to another directory.

However, if your XML editor supports catalogs, then your documents do not need to explicitly refer to the correct location of the DTD/XSDs. Instead, the XML editor uses a catalog to locate these files.

A catalog is a file that maps:

(This specific mapping behavior applies to files that you are accessing via a file system — which is typical of a document editing environment — rather than via the web.)

The supplied file confluence-page-template-for-catalog.xml is an example of a document that you can use with a catalog. The schemaLocation attribute is the same as before; the only difference is the DOCTYPE, which contains a formal public identifier (FPI) followed by a system identifier:

<!DOCTYPE ac:confluence PUBLIC "-//Atlassian//Confluence 4 Page//EN" "http://www.atlassian.com/schema/confluence/4/confluence.dtd">

(The system identifier does not, in practice, need to point to an actual resource; by preference, the XML editor attempts to locate the DTD via the catalog, using the FPI.)

A catalog is supplied in the file catalog.xml.
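To illustrate how a lookup by FPI works, here is a minimal sketch of a resolver for an OASIS XML catalog. It rests on several assumptions: SAMPLE_CATALOG is an illustrative catalog written for this example (it is not a copy of the supplied catalog.xml), the function name resolve_public_id is hypothetical, and only "public" entries are handled. Real resolvers also handle other entry types and resolve the uri value relative to the location of the catalog file.

```python
import xml.etree.ElementTree as ET

# Namespace of the OASIS XML Catalogs vocabulary.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Illustrative catalog (an assumption, not the supplied catalog.xml):
# maps the FPI used in the DOCTYPE to a local copy of the DTD.
SAMPLE_CATALOG = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Atlassian//Confluence 4 Page//EN" uri="confluence.dtd"/>
</catalog>
"""


def resolve_public_id(catalog_xml: str, public_id: str):
    """Return the uri mapped to public_id, or None if there is no match."""
    root = ET.fromstring(catalog_xml)
    for entry in root.iter(f"{{{CATALOG_NS}}}public"):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    return None


if __name__ == "__main__":
    fpi = "-//Atlassian//Confluence 4 Page//EN"
    print(resolve_public_id(SAMPLE_CATALOG, fpi))
```

A catalog-aware editor performs an equivalent lookup when it reads the DOCTYPE, which is why the system identifier in the document does not need to point to a real file.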

The method for making a catalog available to an editor depends on the particular editor.

Using a catalog with Altova XMLSpy

Edit RootCatalog.xml in the XMLSpy installation folder (for example, C:\Program Files\Altova\XMLSpy2012\), and insert the following element before the </catalog> end tag:

<nextCatalog catalog="drive letter:/directory path to Confluence schema package/catalog.xml"/>

Restart XMLSpy.

Using a catalog with jEdit

Click Plugins ► Plugin Options... ► XML ► Catalogs, click the + (plus sign) button, and then select the supplied catalog.xml.

Restart jEdit.

Tip: In my experience, clicking Plugins ► XML ► Clear Resource Cache is not always effective; restarting jEdit is more reliable.

Contents of this repository

File                                        Description
catalog.xml                                 OASIS XML catalog
confluence.dtd                              Document type definition (DTD)
confluence.xsd                              Master XSD (W3C XML Schema 1.0 document)
confluence2xhtml.xsl                        XSLT stylesheet: transforms Confluence storage format into XHTML (more like the rich text editor display than a "preview")
confluence-page-example.xml                 Example Confluence page source XML document
confluence-page-example-with-xslt.xml       Example Confluence page source XML document containing a reference to the XSLT stylesheet confluence2xhtml.xsl (tip: open this in Firefox)
confluence-page-example-with-xslt-wiki.xml  Example Confluence page source XML document containing a reference to the XSLT stylesheet wikifier/confluence2wiki.xsl (tip: open this in Firefox)
confluence-page-template.xml                Example Confluence page source XML document with empty body
confluence-page-template-for-catalog.xml    Example Confluence page source XML document for use with catalog (no explicit reference to local copy of DTD/XSD)
confluence-ri.xsd                           Other XSD file used by the master XSD
confluence-xhtml.xsd                        Other XSD file used by the master XSD
xml.xsd                                     Other XSD file used by the master XSD
index.html                                  The file you are reading now
wikifier/*                                  Wikifier source files
xhtml1-lat1.ent                             XHTML character entity definitions (used in the Confluence DTD)
xhtml1-special.ent                          XHTML character entity definitions (used in the Confluence DTD)
xhtml1-symbol.ent                           XHTML character entity definitions (used in the Confluence DTD)

Wikifier: Convert Confluence XML to wiki markup

Go to the Wikifier web page

Wikifier is a web-based test harness for the XSLT stylesheet confluence2wiki.xsl (supplied in the wikifier directory) that transforms Confluence XML into wiki markup.

To convert Confluence XML to wiki markup:

  1. Copy Confluence XML from the Confluence Source Editor plugin to your clipboard. (Or, equivalently, copy the Confluence XML contents of a .txt file served by the Confluence WebDAV plugin.)
  2. Go to the Wikifier web page. (Or host Wikifier on your own server, using the files supplied in the wikifier directory.)
  3. Paste (press Ctrl+V) the Confluence XML into the text area under the "Confluence XML" heading.

To copy the wiki markup from Wikifier to your clipboard:

  1. Select the text under the "Wiki markup" heading (tip: press Ctrl+A).
  2. Copy (press Ctrl+C) the selected text to your clipboard.

Wikifier does not send your Confluence XML to a server; all processing of your Confluence XML is done client-side.

I have tested Wikifier in the following web browsers: IE9, and current "production" versions of Chrome, Firefox, and Safari (all on Windows).

Tip: Instead of using Wikifier, you can paste your XML into the supplied file confluence-page-example-with-xslt-wiki.xml, and then open the file in Firefox to see the converted wiki markup.

What Wikifier is, and is not

Wikifier is a minimal test harness for the XSLT stylesheet I have developed to convert Confluence XML to wiki markup.

The XSLT stylesheet is by no means complete. I welcome your feedback. If Wikifier does not correctly convert some Confluence XML, please let me know, and I will do what I can (no promises, though).

Wikifier is not a replacement for the Confluence 3 wiki markup editor view.

Wikifier is only a test harness; it is not intended to be a fully fledged application. The XSLT stylesheet took me about a day and a half to develop; same again for Wikifier (my cross-browser JavaScript coding skills are both rudimentary and extremely rusty!).

Why I developed the XSLT stylesheet

I developed the XSLT stylesheet for the following use case: to copy relatively simple content from the current version of Confluence (4) to the current version of JIRA (4).

I did not develop the XSLT stylesheet to bring wiki markup back to Confluence. However, if you want to, you can paste the wiki markup from Wikifier into:

The XSLT stylesheet could be used as the "heart" of a plugin, although I have no immediate plan to do that.

Round-tripping?

Bear in mind the following comment from Paul Curren (Atlassian):

Wiki markup can only represent a subset of what can be represented in XHTML.

What Paul says is true. For example, if you paste Confluence XML table markup (which is, in this specific case, XHTML markup) with merged cells into Wikifier, the resulting wiki markup will retain the table cell contents, but will not retain the merged cell formatting.

Also from Paul, also true:

just about anything is possible with appropriate development effort

I can imagine that it might be possible to develop new Confluence 3 macros to match new capabilities in Confluence 4, and to have an XSLT stylesheet transform such Confluence 4 syntax into these new macros (or even, say, into the contents of the existing Confluence 3 HTML macro). For me, though, this is a purely academic issue. I can now copy content from Confluence to JIRA, which is what I was after.

Tips

DTD, or XSD and DTD, but not just XSD

To validate Confluence page source, you need either the DTD alone, or the XSD together with the DTD.

You cannot validate Confluence page source with only the XSD, because Confluence page source can contain references to character entities (for example, &mdash;) that can only be defined in a DTD. If you attempt to validate Confluence page source that contains character entity references, but you do not refer to the DTD, you will get an XML parsing error.
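The parsing error is easy to reproduce outside an XML editor. The following sketch uses the parser in the Python standard library, which does not load an external DTD, so a named character entity such as &mdash; is undefined. The function name parses_without_dtd is hypothetical; the sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET


def parses_without_dtd(source: str) -> bool:
    """Return True if source is well-formed for a parser that has no DTD."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        # Raised, for example, for an undefined entity such as &mdash;
        return False


if __name__ == "__main__":
    # Named entity that is only defined in the DTD: parsing fails.
    print(parses_without_dtd("<p>before &mdash; after</p>"))
    # Numeric character reference: no DTD needed.
    print(parses_without_dtd("<p>before &#8212; after</p>"))
    # Entity predefined by XML itself: no DTD needed.
    print(parses_without_dtd("<p>this &amp; that</p>"))
```

Numeric character references and the five entities predefined by XML parse without a DTD; only the named XHTML entities need the definitions that the DTD pulls in.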

Development notes

Testing

The DTD/XSDs have been tested using the following Confluence page source:

Tip: The <property name="body"> elements in entities.xml wrap the page contents in a CDATA section. The page contents can also contain CDATA sections. However, nested CDATA sections are not allowed in XML, so, to avoid this issue, the ]]> terminators of the CDATA sections in the page contents contain a space (]] >). When extracting the page contents into individual XML files, you need to remove these spaces.
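Removing the inserted spaces can be scripted. This is a minimal sketch, assuming that the page contents have already been extracted from entities.xml as a string; the function name restore_cdata_terminators is hypothetical. Note that it replaces every occurrence of "]] >", so it would also alter page text that legitimately contains that character sequence.

```python
def restore_cdata_terminators(page_source: str) -> str:
    """Turn the escaped terminator "]] >" back into the CDATA terminator "]]>"."""
    return page_source.replace("]] >", "]]>")


if __name__ == "__main__":
    # Made-up sample: a CDATA section as it might appear after extraction.
    extracted = "<ac:plain-text-body><![CDATA[some code]] ></ac:plain-text-body>"
    print(restore_cdata_terminators(extracted))
```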

Coined names and identifiers (not approved by Atlassian)

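The following listing uses names and identifiers of my own coinage, not approved by Atlassian. These include the document root element name and the namespace names (URIs) described after the listing: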

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ac:confluence PUBLIC "-//Atlassian//Confluence 4 Page//EN" "http://www.atlassian.com/schema/confluence/4/confluence.dtd">
<ac:confluence
  xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/"
  xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/"
  xmlns="http://www.atlassian.com/schema/confluence/4/">
<p>Contents of page body</p>
</ac:confluence>

Document root element name

Value: ac:confluence

Notes:

Namespace names (URIs)

Values:

http://www.atlassian.com/schema/confluence/4/ac/
http://www.atlassian.com/schema/confluence/4/ri/
http://www.atlassian.com/schema/confluence/4/

Notes:

Known issues

General

Wikifier

Acknowledgements

I wish to thank the following people for their assistance and/or encouragement in developing this package:

Shannon Greywalker (Confluence user)
I only know Shannon through comments on the Confluence website. What Shannon writes is worth reading.
Sarah Maddox (Atlassian)
Sarah pointed me to the exported Confluence Documentation that Atlassian makes available for download, and gave me tips on finding the page source inside. That source was a valuable bucket of test cases for developing the DTD/XSDs.
Other Atlassians (in particular, Paul Curren)
Thank you for taking the time to respond to my comments openly, positively, and constructively. It's sincerely appreciated.

Change log

Most recent changes first:

Date (yyyy-mm-dd) Description
2024-09-26 Stored these files in GitHub, published as a GitHub Pages site.
2012-06-06 Prettified this readme.
2012-05-01
  • Added a license statement
  • Added source files for Wikifier (in the wikifier subdirectory), which converts Confluence XML to wiki markup
  • Added confluence-page-example-with-xslt-wiki.xml (open this in Firefox to see its Confluence XML converted to wiki markup)
2012-04-23
  • Corrected catalog-related tips in this readme
  • Refined DTD comments after testing with the Eclipse XML editor
  • Refined confluence2xhtml.xsl (improved table formatting)
2012-04-18
  • Added this readme
  • Refined DTD after testing against Atlassian's own Confluence Documentation source
  • Added XSD (constraints match DTD, as far as possible)
  • Added catalog and related sample files
2012-04-13
  • Refined DTD after further experimentation to see what markup the rich text editor can create
  • Added example Confluence page source file (containing a variety of markup; open this in your favorite validating XML editor), with a variant that refers to an XSLT stylesheet (below)
  • Added "Confluence to XHTML" XSLT stylesheet (confluence2xhtml.xsl)
2012-04-12 First draft of DTD.

License

This package and its contents are distributed under the BSD 2-Clause license (also known as the Simplified BSD license):

Copyright © 2012, Fundi Software

All rights reserved.

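Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.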

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.