The Mathematica Journal
Volume 9, Issue 1

XML and Mathematica
Pavi Sandhu

Exporting XML

Functions for Exporting XML

Export

You can export XML data from Mathematica using the standard Export function.

The first argument of the function specifies the file to which the data should be exported. The second argument specifies the data to be exported. For exporting XML data, this can be a SymbolicXML expression or any other Mathematica expression. You can also specify an optional third argument to control the form of the output. For exporting XML data, the relevant file formats are "XML", "NotebookML", "ExpressionML", "MathML", and "SVG".

With "XML" as the export format, all expressions are exported as NotebookML or ExpressionML.

With "MathML" specified as the export format, the same expression is written out as MathML.

If Export is used with only two arguments, Mathematica determines the export format from the filename extension. The .xml extension is associated with XML. Hence, Export["filename.xml", expr] is equivalent to Export["filename.xml", expr, "XML"], as seen in the example below.

The .mml extension is associated with MathML. Hence, Export["filename.mml", expr] is equivalent to Export["filename.mml", expr, "MathML"], as seen in the example below.

You can control the various details of the export process using the conversion options feature of the Export function.

The following commands delete the test files created by evaluating the commands in this section.
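The export calls and the cleanup step described in this section can be sketched as follows. The file names and the test expression are hypothetical, chosen only for illustration, and the files are written to the current working directory.

```mathematica
(* Hypothetical test expression and file names. *)
expr = x^2 + 1;

(* Explicit format: a non-SymbolicXML expression exported as "XML". *)
Export["test1.xml", expr, "XML"];

(* Explicit format: the same expression written out as MathML. *)
Export["test2.mml", expr, "MathML"];

(* Format inferred from the extension: .xml -> "XML", .mml -> "MathML". *)
Export["test3.xml", expr];
Export["test4.mml", expr];

(* Record which test files now exist, then delete them. *)
files = {"test1.xml", "test2.mml", "test3.xml", "test4.mml"};
created = FileNames[files];
DeleteFile[files];
```

Collecting the names in a single list keeps the cleanup in step with the exports, so no test file is left behind.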

ExportString

You can convert Mathematica expressions into XML strings using the ExportString function. This function has the following syntax.

For exporting as XML, the relevant formats are "XML", "NotebookML", "ExpressionML", "MathML", and "SVG".

This command produces a SymbolicXML expression.

If the SymbolicXML expression is supplied as the first argument of ExportString, the resulting output is ordinary XML.

If the first argument is some other type of expression, the output is in the form of ExpressionML.
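A minimal sketch of these two cases, assuming a small hand-built SymbolicXML expression (the element name "greeting" and its attribute are hypothetical):

```mathematica
(* A hand-built SymbolicXML expression. *)
sxml = XMLElement["greeting", {"lang" -> "en"}, {"Hello"}];

(* SymbolicXML in, ordinary XML out. *)
plainXML = ExportString[sxml, "XML"];

(* Any other expression exported as "XML" is serialized as ExpressionML. *)
exprML = ExportString[x^2, "XML"];

(* The same expression written out as MathML instead. *)
mathML = ExportString[x^2, "MathML"];
```

Each call returns a string, so the results can be inspected directly in the notebook before anything is written to disk.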

You can control various details of the export process using the conversion options feature of the ExportString function.

Conversion Options

Introduction

The standard ConversionOptions feature of Export can be used for controlling the export process. The syntax for specifying a conversion option is as follows.

Multiple conversion options can be specified by making the right-hand side of ConversionOptions a list of lists. For exporting XML data, the following conversion options are available.

  • "Annotations"
  • "AttributeQuoting"
  • "CheckXML"
  • "ElementFormatting"
  • "Entities"
  • "NamespacePrefixes"
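As an illustration of how one of these options might be passed, the sketch below sets a single conversion option on ExportString. It assumes the ConversionOptions wrapper of the Mathematica version this article describes; later versions may expect the same option names to be given directly as options of Export. The SymbolicXML expression is hypothetical.

```mathematica
(* Hypothetical SymbolicXML expression used only for illustration. *)
sxml = XMLElement["note", {"id" -> "n1"}, {"text"}];

(* Export with the default settings. *)
defaultOut = ExportString[sxml, "XML"];

(* Export with one conversion option set explicitly. *)
tunedOut = ExportString[sxml, "XML",
   ConversionOptions -> {"ElementFormatting" -> None}];
```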

Annotations

This conversion option controls which “annotations” are added to the output MathML. The value of this option is a list whose elements can be any combination of the following: "DocumentHeader", "XMLDeclaration", or "DOCTYPEDeclaration". The order of the elements in the list is irrelevant.

XMLDeclaration

When "XMLDeclaration" is one of the annotations, then an XML declaration is included in the header. That is, the statement <?xml version="1.0"?> appears in the header.

DOCTYPEDeclaration

When "DOCTYPEDeclaration" is one of the annotations, then an XML document type declaration of the form <!DOCTYPE ... > appears in the header. This is a statement that specifies the DTD for the XML application in which the output is written.

DocumentHeader

With the setting "Annotations"->{"DocumentHeader", "XMLDeclaration", "DOCTYPEDeclaration"}, a header containing an XML declaration and a document type declaration for the MathML DTD is automatically added to the output.

When "Annotations" does not contain "DocumentHeader", the output has no header. This is true even if "Annotations" contains other elements such as "XMLDeclaration" or "DOCTYPEDeclaration". Thus "DocumentHeader" is an overall switch that controls whether the output has a header at all.
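The role of "DocumentHeader" as an overall switch can be sketched by comparing two settings. This assumes the ConversionOptions syntax of the Mathematica version described here; the variable names are hypothetical.

```mathematica
(* Full header: XML declaration plus DOCTYPE declaration. *)
withHeader = ExportString[x^2, "MathML",
   ConversionOptions -> {"Annotations" ->
      {"DocumentHeader", "XMLDeclaration", "DOCTYPEDeclaration"}}];

(* "DocumentHeader" is missing, so per the text above no header
   should appear even though "XMLDeclaration" is requested. *)
noHeader = ExportString[x^2, "MathML",
   ConversionOptions -> {"Annotations" -> {"XMLDeclaration"}}];
```

Comparing the first lines of the two strings is a quick way to confirm which annotations actually took effect.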

The setting "Annotations"->{"DocumentHeader"} is useful for controlling the form of the SymbolicXML generated. For instance, you can explicitly add an XMLElement[Document] to the SymbolicXML output, as shown below.

Here, "Annotations"->{"DocumentHeader"} is not specified so the XMLElement[Document] is omitted from the output.

AttributeQuoting

This conversion option determines whether attribute values are enclosed by single quotes or double quotes.

The option "AttributeQuoting" and its possible values.

With the default setting, "AttributeQuoting" -> "'", attribute values are enclosed in single quotes. This ensures that there is no conflict with Mathematica strings, which are typically enclosed in double quotes.

For certain applications, you might prefer attribute values to be enclosed in double quotes. You can achieve this by setting "AttributeQuoting" -> "\"". Note that the double quote character must be preceded by a backslash to escape it.
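A minimal sketch of both quoting styles, assuming the ConversionOptions syntax used in this article and a hypothetical link element:

```mathematica
(* Hypothetical element with a single attribute. *)
link = XMLElement["a", {"href" -> "http://example.com"}, {"home"}];

(* Single quotes around attribute values. *)
singleQuoted = ExportString[link, "XML",
   ConversionOptions -> {"AttributeQuoting" -> "'"}];

(* Double quotes: the quote character is escaped with a backslash
   inside the Mathematica string. *)
doubleQuoted = ExportString[link, "XML",
   ConversionOptions -> {"AttributeQuoting" -> "\""}];
```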

CheckXML

This conversion option determines whether the SymbolicXML expression being exported is first checked for errors. By default this option is set to True.

The option "CheckXML" and its possible values.

You can set this option to False if you are confident the SymbolicXML is correct, because checking the XML for errors can cause processing delays. The following example shows the delay produced by checking a small file for errors.
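One way to measure the cost of checking is to time the same export with the option on and off. The sketch below builds a hypothetical SymbolicXML expression with 100 child elements and assumes the ConversionOptions syntax of this article; the timings depend entirely on the machine and version, so none are quoted here.

```mathematica
(* Hypothetical document: a root element with 100 "item" children. *)
items = Table[XMLElement["item", {}, {ToString[i]}], {i, 100}];
doc = XMLElement["root", {}, items];

(* Time the export with error checking enabled. *)
checkedTime = First[Timing[
    ExportString[doc, "XML", ConversionOptions -> {"CheckXML" -> True}];]];

(* Time the same export with error checking disabled. *)
uncheckedTime = First[Timing[
    ExportString[doc, "XML", ConversionOptions -> {"CheckXML" -> False}];]];
```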

The error checking provided by this option can be quite useful because a small error may completely ruin the exported form of a large SymbolicXML expression. With the option on, small errors can often be fixed. Here the SymbolicXML has an error, but it is fixed to give a reasonable result.

On the other hand, with "CheckXML"->False, nothing is output to the file.

ElementFormatting

This conversion option controls how elements are indented in the XML file. Possible values of the option are All, None, Automatic, or a user-defined function. The default value is Automatic.

The option "ElementFormatting" and its possible values.

The following example shows the result of using "ElementFormatting"->All.

With "ElementFormatting"->None, no extra indentation is added.

With "ElementFormatting"->Automatic, elements with mixed content (strings as content) are not indented, while elements with element-only content are indented.

We saw that with "ElementFormatting"->All, long strings are line wrapped. This can be used to produce output similar to HTML. On the other hand, "ElementFormatting"->Automatic produces one long line of text.

Advanced users can also specify a function to determine the formatting. The function is passed a two-element list, {namespace, localName}. The function should return True when indenting is wanted, False when no indenting is wanted, and Automatic for cases where the element-only content should be indented and mixed content should not be indented.
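The settings above, including a user-defined function, can be sketched as follows. The document, the function name formatFn, and its rule (never indent inside "p" elements, defer to Automatic elsewhere) are hypothetical; the ConversionOptions syntax is the one assumed throughout this article.

```mathematica
(* Hypothetical document with element-only and mixed content. *)
doc = XMLElement["html", {}, {
    XMLElement["body", {}, {
      XMLElement["p", {}, {"Some ", XMLElement["b", {}, {"mixed"}], " content."}]
    }]
  }];

(* Built-in settings. *)
indentAll = ExportString[doc, "XML",
   ConversionOptions -> {"ElementFormatting" -> All}];
noIndent = ExportString[doc, "XML",
   ConversionOptions -> {"ElementFormatting" -> None}];

(* User-defined rule: the argument is the list {namespace, localName}. *)
formatFn = Function[name, If[Last[name] === "p", False, Automatic]];
custom = ExportString[doc, "XML",
   ConversionOptions -> {"ElementFormatting" -> formatFn}];
```

Keeping the rule in a named function makes it easy to test on sample {namespace, localName} pairs before using it in an export.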

Entities

When exporting XML documents, it is sometimes desirable to represent special characters using named character entities. The "Entities" conversion option supports output of these named character entities.

The option "Entities" and its possible values.

You can also specify a list of entity sets as the value of this option. For example, if you want to export both HTML and MathML entities, you could use this setting.

If neither the "HTML" nor the "MathML" setting is used, all characters are still output correctly in XML. However, they may appear as numeric character references or be encoded in UTF-8.

Here we use the "HTML" setting to turn an α in the input into the named character entity &alpha;.

You can also enter your own list of character replacement rules to be used. If you do this, then you are also responsible for including some basic escaping required by XML. For example:
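The three forms of the "Entities" setting discussed above might look like the sketch below. The element name is hypothetical, and the exact shape of a custom rule list (character -> replacement string, with the basic XML escapes included by hand) is an assumption based on the description in the text, not a confirmed signature.

```mathematica
(* Hypothetical element containing a Greek letter and an ampersand. *)
para = XMLElement["p", {}, {"\[Alpha] & \[Beta]"}];

(* Named HTML entities. *)
htmlOut = ExportString[para, "XML",
   ConversionOptions -> {"Entities" -> "HTML"}];

(* Both HTML and MathML entity sets. *)
bothOut = ExportString[para, "XML",
   ConversionOptions -> {"Entities" -> {"HTML", "MathML"}}];

(* Assumed form of a custom rule list, including basic XML escaping. *)
customRules = {"\[Alpha]" -> "&alpha;", "\[Beta]" -> "&beta;",
   "&" -> "&amp;", "<" -> "&lt;", ">" -> "&gt;"};
customOut = ExportString[para, "XML",
   ConversionOptions -> {"Entities" -> customRules}];
```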

If you specify any value for Entities, it is your responsibility to ensure that appropriate entity declarations are present. For example, by using the "HTML" setting, you can easily generate XML with HTML entities. In this example, the Icelandic character “thorn” is exported as the corresponding character entity reference.

Here, the thorn entity is not declared, and so the character is exported as a numeric character reference.

NamespacePrefixes

This option lets you generate XML markup with a specific namespace declaration and namespace prefixes. The option is specified in the form

where url and prefix are strings specifying the URL of the namespace and the namespace prefix. In the following example, the "NamespacePrefixes" option is used to generate presentation markup with each MathML element having a namespace prefix "mml" associated with the MathML namespace.
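A hedged sketch of such a call is given here. It assumes that the option value is a list of rules of the form url -> prefix and that the ConversionOptions syntax of this article applies; the namespace URL is the standard MathML namespace. No option for restricting the output to presentation markup is shown, since the text does not name one.

```mathematica
(* Standard MathML namespace URL and the prefix to associate with it. *)
mathmlNS = "http://www.w3.org/1998/Math/MathML";

(* Assumed option form: a list of url -> prefix rules. *)
prefixed = ExportString[x^2, "MathML",
   ConversionOptions -> {"NamespacePrefixes" -> {mathmlNS -> "mml"}}];
```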



     
Copyright © Wolfram Media, Inc. All rights reserved.