This repository contains:
You will need a validating XML editor.
For example (there are many alternatives):
The following procedure assumes that you have extracted all of the files in this package into one directory.
Open confluence-page-example.xml in your XML editor.
Explore the DTD/XSD-aware features of your XML editor, such as (depending on your particular editor):
The Confluence Source Editor plugin ("advanced editor") displays the source of a page as an XML snippet: a collection of XML elements, without a single root element.
To edit the source as a document in a validating XML editor, you need to wrap the snippet in a root element.
This package supplies example XML documents that use the root element name ac:confluence.
You will need:
confluence-page-template.xml.
<!-- Replace this comment with your page source -->
<ac:confluence> start tag and the </ac:confluence> end tag.
Do not select <ac:confluence> or </ac:confluence>.
The Confluence WebDAV plugin serves page source in the same manner as the Confluence Source Editor plugin: as an XML snippet (without a root element), rather than an XML document; and with the file extension .txt, rather than, say, .xml.
Similar to the previous procedure for working with the Confluence Source Editor plugin, you need to wrap the snippet in a root element, and then unwrap it before saving it back to Confluence.
I have added a comment to the Confluence Storage Format page requesting a change to the behavior of the WebDAV plugin, so far without response from Atlassian.
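The wrap and unwrap steps described above can be sketched in Python. This is only an illustration, not part of the package: the function names `wrap` and `unwrap` are hypothetical, and the header is modeled on the example document shown elsewhere in this readme rather than copied from confluence-page-template.xml, whose exact contents may differ.

```python
# Header modeled on the example document in this readme (an assumption;
# the supplied template may contain additional attributes).
HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">\n'
    '<ac:confluence'
    ' xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/"'
    ' xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/"'
    ' xmlns="http://www.atlassian.com/schema/confluence/4/">\n'
)
FOOTER = '\n</ac:confluence>\n'


def wrap(snippet):
    """Turn a rootless page-source snippet into an XML document."""
    return HEADER + snippet + FOOTER


def unwrap(document):
    """Return the text between the ac:confluence start and end tags."""
    start_tag = document.index('<ac:confluence')
    body_start = document.index('>', start_tag) + 1
    body_end = document.rindex('</ac:confluence>')
    return document[body_start:body_end].strip('\n')
```

For example, `unwrap(wrap(snippet))` returns the original snippet, ready to paste back into Confluence.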
To validate an XML document, an XML editor needs to know where to find the DTD/XSD files.
The supplied file confluence-page-template.xml contains explicit references to confluence.dtd and confluence.xsd:
<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">
<ac:confluence ... xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.atlassian.com/schema/confluence/4/ac/ confluence.xsd">
This technique requires the DTD/XSDs to be in the same directory as the document (.xml). (You can also use relative or absolute references to another directory.)
However, if your XML editor supports catalogs, then your documents do not need to explicitly refer to the correct location of the DTD/XSDs. Instead, the XML editor uses a catalog to locate these files.
A catalog is a file that maps:
(This specific mapping behavior applies to files that you are accessing via a file system — which is typical of a document editing environment — rather than via the web.)
The supplied file confluence-page-template-for-catalog.xml is an example of a document that you can use with a catalog.
The schemaLocation attribute is the same as before; the only difference is the DOCTYPE, which contains an FPI followed by a system identifier:
<!DOCTYPE ac:confluence PUBLIC "-//Atlassian//Confluence 4 Page//EN" "http://www.atlassian.com/schema/confluence/4/confluence.dtd">
(The system identifier does not, in practice, need to point to an actual resource; the XML editor will, by preference, attempt to locate the DTD via the catalog, using the FPI).
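To make the lookup concrete, here is a minimal Python sketch of how a catalog-aware tool might resolve the FPI to a local file. The sample catalog below is an assumption written for this illustration (the supplied catalog.xml may contain different or additional entries), and `resolve_public` is a hypothetical helper, not a function of any XML editor.

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML Catalogs.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog with a single entry mapping the FPI to a local DTD.
SAMPLE_CATALOG = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Atlassian//Confluence 4 Page//EN" uri="confluence.dtd"/>
</catalog>"""


def resolve_public(catalog_xml, public_id):
    """Return the uri mapped to public_id, or None if there is no entry."""
    root = ET.fromstring(catalog_xml)
    for entry in root.iter("{%s}public" % CATALOG_NS):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    return None
```

With this sample, `resolve_public(SAMPLE_CATALOG, "-//Atlassian//Confluence 4 Page//EN")` returns `confluence.dtd`, so the system identifier in the DOCTYPE is never fetched.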
A catalog is supplied in the file catalog.xml.
The method for making a catalog available to an editor depends on the particular editor.
Edit RootCatalog.xml in the XMLSpy installation folder (for example, C:\Program Files\Altova\XMLSpy2012\), and insert the following element before the </catalog> end tag:
<nextCatalog catalog="drive letter:/directory path to Confluence schema package/catalog.xml"/>
Restart XMLSpy.
Click Plugins ► Plugin Options... ► XML ► Catalogs, click the + (plus sign) button, and then select the supplied catalog.xml.
Restart jEdit.
Tip: In my experience, clicking Plugins ► XML ► Clear Resource Cache is not always effective.
File | Description |
---|---|
catalog.xml | OASIS XML catalog |
confluence.dtd | Document type definition (DTD) |
confluence.xsd | Master XSD (W3C XML Schema 1.0 document) |
confluence2xhtml.xsl | XSLT stylesheet: transforms Confluence storage format into XHTML (more like the rich text editor display than a "preview") |
confluence-page-example.xml | Example Confluence page source XML document |
confluence-page-example-with-xslt.xml | Example Confluence page source XML document containing a reference to the XSLT stylesheet confluence2xhtml.xsl (tip: open this in Firefox) |
confluence-page-example-with-xslt-wiki.xml | Example Confluence page source XML document containing a reference to the XSLT stylesheet wikifier/confluence2wiki.xsl (tip: open this in Firefox) |
confluence-page-template.xml | Example Confluence page source XML document with empty body |
confluence-page-template-for-catalog.xml | Example Confluence page source XML document for use with catalog (no explicit reference to local copy of DTD/XSD) |
confluence-ri.xsd, confluence-xhtml.xsd, xml.xsd | Other XSD files used by the master XSD |
index.html | The file you are reading now |
wikifier/* | Wikifier source files |
xhtml1-lat1.ent, xhtml1-special.ent, xhtml1-symbol.ent | XHTML character entity definitions (used in the Confluence DTD) |
Wikifier is a web-based test harness for the XSLT stylesheet confluence2wiki.xsl (supplied in the wikifier directory) that transforms Confluence XML into wiki markup.
To convert Confluence XML to wiki markup:
To copy the wiki markup from Wikifier to your clipboard:
Wikifier does not send your Confluence XML to a server; all processing of your Confluence XML is done client-side.
I have tested Wikifier in the following web browsers: IE9, and current "production" versions of Chrome, Firefox, and Safari (all on Windows).
Tip: Instead of using Wikifier, you can paste your XML into the supplied file confluence-page-example-with-xslt-wiki.xml, and then open the file in Firefox to see the converted wiki markup.
Wikifier is a minimal test harness for the XSLT stylesheet I have developed to convert Confluence XML to wiki markup.
The XSLT stylesheet is by no means complete. I welcome your feedback. If Wikifier does not correctly convert some Confluence XML, please let me know, and I will do what I can (no promises, though).
Wikifier is not a replacement for the Confluence 3 wiki markup editor view.
Wikifier is only a test harness; it is not intended to be a fully fledged application. The XSLT stylesheet took me about a day and a half to develop; same again for Wikifier (my cross-browser JavaScript coding skills are both rudimentary and extremely rusty!).
I developed the XSLT stylesheet for the following use case: to copy relatively simple content from the current version of Confluence (4) to the current version of JIRA (4).
I did not develop the XSLT stylesheet to bring wiki markup back to Confluence. However, if you want to, you can paste the wiki markup from Wikifier into:
The XSLT stylesheet could be used as the "heart" of a plugin, although I have no immediate plan to do that.
Bear in mind the following comment from Paul Curren (Atlassian):
Wiki markup can only represent a subset of what can be represented in XHTML.
What Paul says is true. For example, if you paste Confluence XML table markup (which is, in this specific case, XHTML markup) with merged cells into Wikifier, the resulting wiki markup will retain the table cell contents, but will not retain the merged cell formatting.
Also from Paul, also true:
just about anything is possible with appropriate development effort
I can imagine that it might, perhaps, be possible to develop new Confluence 3 macros to match new capabilities in Confluence 4, and have an XSLT stylesheet transform such Confluence 4 syntax into these new macros (or even, say, as the contents of the existing Confluence 3 HTML macro). For me, though, this is a purely academic issue. I can now copy content from Confluence to JIRA, which is what I was after.
To validate Confluence page source, you need either:
You cannot validate Confluence page source with only the XSD, because Confluence page source can contain references to character entities (for example, &mdash;) that can only be defined in a DTD. If you attempt to validate Confluence page source that contains character entity references, but you do not refer to the DTD, you will get an XML parsing error.
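The following Python sketch illustrates the point with the standard library parser: a named entity such as &mdash; fails to parse when nothing declares it, and parses once a DTD declaration is available. For brevity, the declaration here sits in an internal DTD subset rather than in confluence.dtd; the document strings and the helper `try_parse` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# No DTD: the parser has no definition for the mdash entity.
WITHOUT_DTD = "<p>a &mdash; b</p>"

# Internal DTD subset declaring mdash (stands in for the external DTD).
WITH_INTERNAL_SUBSET = (
    '<!DOCTYPE p [<!ENTITY mdash "&#8212;">]>'
    "<p>a &mdash; b</p>"
)


def try_parse(document):
    """Return the root element's text, or None on an XML parsing error."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None
```

Here `try_parse(WITHOUT_DTD)` returns `None` because of the undefined entity, while `try_parse(WITH_INTERNAL_SUBSET)` returns the text with the entity expanded.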
The DTD/XSDs have been tested using the following Confluence page source:
A Confluence export consists of a .zip file containing a single entities.xml file, which contains page source inside <property name="body"> elements.
I wrote a script to extract the contents of these elements into individual XML files, which I then validated using a Windows batch (.bat) file that calls xmllint, like this:
for %%f in ("*.xml") do "%_xmllint%" --noent --nowarning --noout --loaddtd --schema "%_schema%" "%%f" >> c:\temp\log.txt 2>&1
where:
_xmllint contains the path of the xmllint executable
_schema contains the path of confluence.xsd (xmllint does not look for schemaLocation attributes)
<!DOCTYPE ac:confluence SYSTEM "confluence.dtd"> for the --loaddtd option.
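For readers who are not on Windows, the same loop can be sketched in Python. This is an assumed equivalent of the batch file above, not the script I used: the function names and the default log path are hypothetical, and only the xmllint options shown in the batch file are passed.

```python
import glob
import subprocess


def xmllint_command(xmllint, schema, xml_file):
    """Build the xmllint argument list used in the batch file above."""
    return [
        xmllint,
        "--noent", "--nowarning", "--noout", "--loaddtd",
        "--schema", schema,
        xml_file,
    ]


def validate_all(xmllint, schema, pattern="*.xml", log_path="log.txt"):
    """Run xmllint on every matching file, appending all output to a log."""
    with open(log_path, "a", encoding="utf-8") as log:
        for xml_file in sorted(glob.glob(pattern)):
            subprocess.run(
                xmllint_command(xmllint, schema, xml_file),
                stdout=log,
                stderr=subprocess.STDOUT,  # same effect as 2>&1
                check=False,  # keep going when a file fails validation
            )
```

A call such as `validate_all("xmllint", "confluence.xsd")` assumes that xmllint is on the path and that the extracted XML files are in the current directory.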
Tip: The <property name="body"> elements in entities.xml wrap the page contents in a CDATA section. The page contents can also contain CDATA sections. However, nested CDATA sections are not allowed in XML, so, to avoid this issue, the ]]> terminators of the CDATA sections in the page contents contain a space (]] >). When extracting the page contents into individual XML files, you need to remove these spaces.
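A minimal Python sketch of the extraction step, including the terminator fix, might look like the following. It is not the script I used. The sample export is a simplified stand-in (the root and wrapper element names are invented, and a real entities.xml is far larger), and the blanket string replacement assumes that the sequence "]] >" occurs only as an escaped terminator.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for entities.xml: one body property
# whose CDATA section contains an escaped inner CDATA terminator.
SAMPLE_EXPORT = (
    '<export><object>'
    '<property name="body"><![CDATA[<p>Intro</p>'
    '<ac:plain-text-body><![CDATA[x = 1]] ></ac:plain-text-body>]]></property>'
    '</object></export>'
)


def extract_bodies(entities_xml):
    """Return the page source of every <property name="body"> element."""
    root = ET.fromstring(entities_xml)
    bodies = []
    for prop in root.iter("property"):
        if prop.get("name") == "body" and prop.text:
            # Restore the inner CDATA terminators by removing the space.
            bodies.append(prop.text.replace("]] >", "]]>"))
    return bodies
```

Each returned string is still a rootless snippet, so it needs to be wrapped in the ac:confluence root element (with the DOCTYPE) before validation.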
Highlighted items in the following listing are my own coinage, not approved by Atlassian:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ac:confluence PUBLIC "-//Atlassian//Confluence 4 Page//EN" "http://www.atlassian.com/schema/confluence/4/confluence.dtd">
<ac:confluence xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/" xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/" xmlns="http://www.atlassian.com/schema/confluence/4/">
<p>Contents of page body</p>
</ac:confluence>
Value: ac:confluence
Notes:
ac:page might be a better choice of name for this root element (more specific).
Values:
http://www.atlassian.com/schema/confluence/4/ac/
http://www.atlassian.com/schema/confluence/4/ri/
http://www.atlassian.com/schema/confluence/4/
Notes:
--nowarning option to suppress warnings about loading a schema that has already been loaded. Suggestions welcome.
<p>Text inside a paragraph element.</p> illegal text
I wish to thank the following people for their assistance and/or encouragement in developing this package:
Most recent changes first:
Date (yyyy-mm-dd) | Description |
---|---|
2024-09-26 | Stored these files in GitHub, published as a GitHub Pages site. |
2012-06-06 | Prettified this readme. |
2012-05-01 | |
2012-04-23 | |
2012-04-18 | |
2012-04-13 | |
2012-04-12 | First draft of DTD. |
2012-04-12 | First draft of DTD. |
This package and its contents are distributed under the BSD 2-Clause license (also known as the Simplified BSD license):
Copyright © 2012, Fundi Software
All rights reserved.
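Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.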
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.