The XML Elements of Style
by Steve Muench
10/18/2000
In honor of the eminently pragmatic William Strunk, Jr. and E. B. White, I present the XML Elements of Style: the rules you must follow as you create your own documents. If your XML document follows these ten basic rules, it qualifies as a "well-formed XML document."
- Begin each document with an XML declaration. The first characters in any XML document should be an XML declaration. The declaration is case-sensitive and looks like this in its simplest form:
- Use only one top-level, enclosing document element. The first, outermost element in an XML document is called the document element because its name announces what kind of document it is: <FAQ-List>, <Book>, <Transaction>, <TrackingStatus>, etc. You must have only one document element per document. So the following is legal:
- Match opening and closing tags properly. XML is case-sensitive, so the following are not considered matching tag names:
- Add comments between <!-- and --> characters. You can include comments anywhere after the XML declaration as long as they are not inside attribute values and don't occur in the middle of the < and > boundaries of a tag. So the comments in the following document are legal:
- Start element and attribute names with a letter. Element and attribute names must be a contiguous sequence of name characters (letters, digits, and a few punctuation symbols) and cannot start with a digit or include spaces in the name. The following are not allowed:
- Put attributes in the opening tag. Attributes are listed inside the opening tag of the element to which they apply. The following is correct:
- Enclose attribute values in matching quotes. Either of the following is fine:
- Use only simple text as attribute values. Elements are the only things that can be nested. Attributes only contain simple text values. So the following is illegal:
- Use &lt; and &amp; instead of < and & for the literal less-than and ampersand characters. The less-than and ampersand characters have a special meaning in XML files, so when you need to use either of these characters literally, you need to use &lt; and &amp; instead:
- Write empty elements as <ElementName/>. Elements that do not contain other elements or text nested within them can be written with the more compact empty element syntax of:
<?xml version="1.0"?>
The special tag delimiters of <? and ?> distinguish this declaration from other tags in the document. The <?xml characters in the XML declaration must be the very first characters in the document. No spaces or carriage returns or anything can come before them.
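To see the "very first characters" requirement in action, here is a minimal sketch. It assumes Python and its standard-library xml.etree.ElementTree parser, neither of which this article otherwise uses, and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


declaration_first = '<?xml version="1.0"?>\n<Question>Is this legal?</Question>'
leading_space = " " + declaration_first  # a single space before <?xml

print(is_well_formed(declaration_first))  # True
print(is_well_formed(leading_space))      # False
```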
<?xml version="1.0"?>
<Question>Is this legal?</Question>
But the following is not:
<?xml version="1.0"?>
<Question>Is this legal?</Question>
<Answer>No</Answer>
because both <Question> and <Answer> are top-level elements. You can't even have the same element name repeated at the top level: there must be exactly one. So the following is also illegal:
<?xml version="1.0"?>
<Question>Is this legal?</Question>
<Question>Is that your final answer?</Question>
You need to pick a single name and use that element to enclose the others, like:
<?xml version="1.0"?>
<FAQ-List>
<Question>Is this legal?</Question>
<Question>Is that your final answer?</Question>
</FAQ-List>
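The single-document-element rule can be checked the same way. The following is only a sketch, assuming Python's standard-library xml.etree.ElementTree parser (my choice for illustration, not something the rule depends on). It prints whatever error the parser reports for the two-root document, then inspects the corrected one.

```python
import xml.etree.ElementTree as ET

two_roots = (
    '<?xml version="1.0"?>\n'
    "<Question>Is this legal?</Question>\n"
    "<Answer>No</Answer>"
)
one_root = (
    '<?xml version="1.0"?>\n'
    "<FAQ-List>\n"
    "<Question>Is this legal?</Question>\n"
    "<Question>Is that your final answer?</Question>\n"
    "</FAQ-List>"
)

# The second top-level element makes the parser give up.
error = None
try:
    ET.fromstring(two_roots)
except ET.ParseError as exc:
    error = exc
print("two roots rejected:", error)

# With a single enclosing element, both questions become its children.
root = ET.fromstring(one_root)
print(root.tag, len(root))  # FAQ-List 2
```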
<Question>Is this legal?</question>
<QUESTION>Is this legal?</Question>
You'll find that XML syntax is rigid and unforgiving. You cannot get away with being sloppy about the order of closing tags. The following is illegal:
<Question><Link href="http://qa.com/">Is this
legal?</Question></Link>
You need to close </Link> before closing </Question>, like this:
<Question><Link href="http://qa.com/">Is this
legal?</Link></Question>
Simply keeping your tags neatly indented helps you avoid this mistake:
<Question>
<Link href="http://qa.com/">Is this legal?</Link>
</Question>
Note that adding extra spaces, carriage returns, or tabs between nested tags to make an XML document look indented to the human eye does not affect its structural meaning when working with datagrams, although clearly it increases the document's size slightly.
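The tag-matching rule and the note about indentation can both be tried out in a few lines. This sketch assumes Python's standard-library xml.etree.ElementTree parser; is_well_formed and outline are hypothetical helper names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def outline(text):
    """List (tag, attributes) pairs in document order, ignoring all text."""
    return [(el.tag, dict(el.attrib)) for el in ET.fromstring(text).iter()]


wrong_case = "<Question>Is this legal?</question>"
bad_nesting = '<Question><Link href="http://qa.com/">Is this legal?</Question></Link>'
compact = '<Question><Link href="http://qa.com/">Is this legal?</Link></Question>'
indented = (
    "<Question>\n"
    '  <Link href="http://qa.com/">Is this legal?</Link>\n'
    "</Question>"
)

print(is_well_formed(wrong_case))             # False
print(is_well_formed(bad_nesting))            # False
print(outline(compact) == outline(indented))  # True
```

The comparison deliberately ignores text. In this parser the indentation still shows up as whitespace in the text of the Question element, so code that reads text content may want to strip it.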
<?xml version="1.0"?>
<!-- Comment Here ok -->
<FAQ-List>
<!--
| And here, multiple lines are fine
+-->
<Question>Is this legal?<!-- Here is fine --></Question>
<!-- Here too -->
<Answer>Yes</Answer>
</FAQ-List>
<!-- Even Here -->
but all four comments in this example are not:
<!-- NOT before XML declaration -->
<?xml version="1.0"?>
<FAQ-List>
<FAQ Submitter="<!-- NOT in an attribute value -->" >
<Question <!-- NOT between < and > of a tag --> >Is this
legal?</Question>
<Answer>Yes</Answer>
<!-- Illegal for comment to contain two hyphens -- like this -->
</FAQ>
</FAQ-List>
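Each of the four illegal comment placements can be tested on its own. The sketch below assumes Python's standard-library xml.etree.ElementTree parser; the test documents are cut-down versions of the examples above, and the labels and the is_well_formed helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


legal = (
    '<?xml version="1.0"?>\n'
    "<!-- Comment here ok -->\n"
    "<FAQ-List>\n"
    "<Question>Is this legal?<!-- Here is fine --></Question>\n"
    "</FAQ-List>\n"
    "<!-- Even here -->"
)
illegal = {
    "before the XML declaration": '<!-- first --><?xml version="1.0"?><FAQ-List/>',
    "inside an attribute value": '<FAQ Submitter="<!-- no -->"/>',
    "between < and > of a tag": "<Question <!-- no --> >Is this legal?</Question>",
    "two hyphens inside a comment": "<FAQ-List><!-- two hyphens -- like this --></FAQ-List>",
}

print("legal placements:", is_well_formed(legal))
for label, document in illegal.items():
    print(label + ":", is_well_formed(document))
```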
<2-Part-Question> <!-- Error: element name starts with a digit -->
<Two Part Question> <!-- Error: has spaces in the name -->
<Question 4You="Yes"> <!-- Error: attribute name starts with a digit -->
Some punctuation symbols (like underscore and hyphen) are allowed in names, but most others are illegal:
<_StrangeButLegal>Legal</_StrangeButLegal>
<More-Normal-Looking>Legal</More-Normal-Looking>
<OK_As_Well>Legal</OK_As_Well>
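If you are unsure whether a particular name is allowed, one pragmatic option is simply to ask a parser. This is a sketch under the assumption that Python's standard-library xml.etree.ElementTree is available; the two helper functions are hypothetical names made up for this example.

```python
import xml.etree.ElementTree as ET


def usable_as_element_name(name):
    """Try the name as an empty element and confirm it comes back unchanged."""
    try:
        root = ET.fromstring(f"<{name}/>")
    except ET.ParseError:
        return False
    return root.tag == name


def usable_as_attribute_name(name):
    """Try the name as an attribute on a Question element."""
    try:
        root = ET.fromstring(f'<Question {name}="Yes"/>')
    except ET.ParseError:
        return False
    return name in root.attrib


for name in ["2-Part-Question", "Two Part Question", "_StrangeButLegal",
             "More-Normal-Looking", "OK_As_Well"]:
    print(name, usable_as_element_name(name))

print("4You", usable_as_attribute_name("4You"))
print("Submitter", usable_as_attribute_name("Submitter"))
```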
<FAQ Submitter="smuench@oracle.com">
<!-- etc. -->
</FAQ>
while the following is illegal:
<FAQ>
<!-- etc. -->
</FAQ Submitter="smuench@oracle.com">
<FAQ Submitter="smuench@oracle.com">
<FAQ Submitter='smuench@oracle.com'>
but the following two are not. You can't forget the quotes:
<FAQ Submitter=smuench@oracle.com>
<FAQ Submitter='smuench@oracle.com">
or be sloppy about using the same closing quote character as your opening one.
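The quoting rule is easy to confirm with a parser. In this sketch I assume Python's standard-library xml.etree.ElementTree, and I write each tag as a self-closing element so that it forms a complete document on its own. That is a change from the opening tags shown above, made only so the snippet can run.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


double_quoted = '<FAQ Submitter="smuench@oracle.com"/>'
single_quoted = "<FAQ Submitter='smuench@oracle.com'/>"
unquoted = "<FAQ Submitter=smuench@oracle.com/>"
mismatched = "<FAQ Submitter='smuench@oracle.com\"/>"

# Either quote style yields the same attribute value.
same_value = ET.fromstring(double_quoted).attrib == ET.fromstring(single_quoted).attrib
print(same_value)                  # True
print(is_well_formed(unquoted))    # False
print(is_well_formed(mismatched))  # False
```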
<Task Subtasks="<Task Name='Learn XML Syntax'>"/>
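A parser rejects the nested-tag attribute above, and the usual alternative is to express the nesting with child elements. The sketch below assumes Python's standard-library xml.etree.ElementTree. The restructured document and the outer Name value "Parent Task" are my own illustration of that alternative, not an example taken from this article.

```python
import xml.etree.ElementTree as ET

tag_in_attribute = "<Task Subtasks=\"<Task Name='Learn XML Syntax'>\"/>"
# Hypothetical restructuring: the subtask becomes a child element.
nested_elements = '<Task Name="Parent Task"><Task Name="Learn XML Syntax"/></Task>'

rejected = False
try:
    ET.fromstring(tag_in_attribute)
except ET.ParseError:
    rejected = True
print(rejected)  # True

root = ET.fromstring(nested_elements)
subtask_names = [child.get("Name") for child in root]
print(subtask_names)  # ['Learn XML Syntax']
```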
<Company>AT &amp; T</Company> <!-- AT & T -->
<Where-Clause>SAL &lt; 5000</Where-Clause> <!-- SAL < 5000 -->
On occasion, &quot; and &apos; also come in handy to represent literal " and ' in attribute values:
<Button On-Click="alert('Print a &quot; and &apos;');"></Button>
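When XML is generated by a program, the escaping is usually left to a library rather than done by hand. As one possible sketch, Python's standard library provides escape and quoteattr in xml.sax.saxutils. Using Python here is my assumption, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

company = "AT & T"
where_clause = "SAL < 5000"
handler = "alert('Print a \" and '');"  # contains both quote characters

print(escape(company))       # AT &amp; T
print(escape(where_clause))  # SAL &lt; 5000

# quoteattr picks the surrounding quotes and escapes whatever conflicts.
button = f"<Button On-Click={quoteattr(handler)}></Button>"
print(button)

# Parsing the escaped text gives back the original literal characters.
parsed_company = ET.fromstring(f"<Company>{escape(company)}</Company>").text
parsed_handler = ET.fromstring(button).get("On-Click")
print(parsed_company == company, parsed_handler == handler)  # True True
```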
<Task Name="Learn XML Syntax">
<Task Name="Use Empty Elements"/> <!-- Empty Element -->
</Task>
As shown above with the Name attribute on the empty <Task> element, attributes on empty elements are still legal.
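To a parser, the compact form and the spelled-out form of an empty element look the same. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree and a hypothetical summary helper:

```python
import xml.etree.ElementTree as ET


def summary(text):
    """Reduce an element to its tag, attributes, text, and child count."""
    element = ET.fromstring(text)
    return (element.tag, dict(element.attrib), element.text or "", len(element))


compact = '<Task Name="Use Empty Elements"/>'
spelled_out = '<Task Name="Use Empty Elements"></Task>'

print(summary(compact) == summary(spelled_out))  # True
print(summary(compact)[1])  # {'Name': 'Use Empty Elements'}
```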
Steve Muench is Oracle's lead XML Technical Evangelist and development lead for Oracle XSQL Pages.