What characters do I need to escape in XML documents?
XML escape characters
Which characters need escaping depends on where the special character is used.
The five special characters and their escaped forms are:
- " &quot;
- ' &apos;
- < &lt;
- > &gt;
- & &amp;
The safe approach is to escape all five characters in text. Strictly, however, the three characters ", ' and > need not be escaped in text:
<?xml version="1.0"?> <valid>"'></valid>
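The text rule above can be sketched in Python. This is a minimal illustration, assuming the standard-library helper `xml.sax.saxutils.escape`, which replaces only &, < and >; the wrapper name `escape_text` and the sample string are hypothetical and chosen just for this example.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def escape_text(value: str) -> str:
    """Escape a string for use as XML element text (hypothetical helper).

    saxutils.escape replaces &, < and > and leaves both quote
    characters untouched, which is sufficient for text content.
    """
    return escape(value)


raw = 'if a < b && c > d: print("it\'s fine")'
document = f"<valid>{escape_text(raw)}</valid>"

# Parsing the document back should recover the original string.
round_tripped = ET.fromstring(document).text
print(document)
print(round_tripped == raw)
```

Note that the quotes pass through unchanged, which matches the statement that " and ' need not be escaped in text.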
The safe approach is to escape all five characters in attributes. However, the > character need not be escaped in attributes:
<?xml version="1.0"?> <valid attribute=">"/>
The ' character need not be escaped in attributes if the attribute value is quoted with ":
<?xml version="1.0"?> <valid attribute="'"/>
Likewise, the " character need not be escaped in attributes if the attribute value is quoted with ':
<?xml version="1.0"?> <valid attribute='"'/>
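The attribute quoting rules can be tried out with a short Python sketch. It assumes the standard-library function `xml.sax.saxutils.quoteattr`, which escapes &, < and > and then picks a quote style that avoids escaping where possible; the helper name `make_element` and the sample values are hypothetical.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET


def make_element(value: str) -> str:
    """Build <valid attribute=.../> with a safely quoted value (hypothetical helper)."""
    # quoteattr returns the value already wrapped in " or ' as appropriate.
    return f"<valid attribute={quoteattr(value)}/>"


samples = ["it's", 'say "hi"', "both ' and \"", "a > b & c < d"]
for value in samples:
    element = make_element(value)
    # The parsed attribute should equal the original, unescaped value.
    parsed = ET.fromstring(element).get("attribute")
    print(element, parsed == value)
```

When the value contains only one kind of quote, the other kind is used as the delimiter and nothing is escaped; only when both kinds occur is " replaced by its entity.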
All five special characters must not be escaped in comments, because entity references are not expanded there:
<?xml version="1.0"?> <valid> <!-- "'<>& --> </valid>
All five special characters must not be escaped in CDATA sections:
<?xml version="1.0"?> <valid> <![CDATA["'<>&]]> </valid>
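A small Python sketch can confirm that CDATA content is taken literally. The helper `wrap_cdata` is hypothetical, not a library function. It also handles the one sequence a CDATA section cannot contain, "]]>", by splitting it across two adjacent sections, which is a common workaround rather than anything this article prescribes.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Wrap text in a CDATA section without escaping any character.

    The only sequence that cannot appear inside CDATA is "]]>", so it
    is split across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


raw = "\"'<>& and even ]]> survive"
document = f"<valid>{wrap_cdata(raw)}</valid>"

# The parser hands back the CDATA content exactly as written.
recovered = ET.fromstring(document).text
print(recovered == raw)
```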
The five special characters must not be escaped in XML processing instructions.
<?xml version="1.0"?> <?process <"'&> ?> <valid/>
XML vs. HTML:
HTML has its own set of escape codes that covers many more characters.
Consider this statement:
In SGML, HTML and XML documents, the logical constructs known as character data and attribute values consist of sequences of characters, in which each character can appear directly (representing itself) or can be represented by a series of characters called a character reference. There are two types of character reference: a numeric character reference and a character entity reference. Character entity references are valid in both HTML and XML documents.
The five predefined XML entities are:
- quot "
- amp &
- apos '
- lt <
- gt >
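The difference between the five predefined XML entities and the larger HTML set can be checked with a short Python sketch. It uses only the standard library; the helper name `xml_text` is hypothetical, and `&nbsp;` is picked here merely as one example of an HTML-only entity.

```python
import html
import xml.etree.ElementTree as ET


def xml_text(fragment: str):
    """Return the text of a one-element fragment, or None if it does not parse."""
    try:
        return ET.fromstring(fragment).text
    except ET.ParseError:
        return None


# The five predefined entities are understood by any XML parser.
predefined = xml_text("<valid>&quot;&amp;&apos;&lt;&gt;</valid>")

# An HTML-only entity is undefined in plain XML (no DTD), so parsing fails.
html_only = xml_text("<valid>&nbsp;</valid>")

# A numeric character reference for the same character works in XML.
numeric = xml_text("<valid>&#160;</valid>")

print(repr(predefined), html_only, repr(numeric), repr(html.unescape("&nbsp;")))
```

In practice this means HTML-style named entities have to be rewritten as numeric character references (or declared in a DTD) before the text is placed in an XML document.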
As a quick reference:
Tags and attributes have different escaping requirements.
- < &lt;
- > &gt; (only for compatibility, read below)
- & &amp;
- " &quot;
- ' &apos;
The ampersand character (&) and the left angle bracket (<) must not appear in their literal form, except when used as markup delimiters or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and must, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
To allow attribute values to contain both single and double quotes, the single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;".
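As a rough illustration of the numeric character references mentioned above, the following Python sketch parses several spellings of the same character and shows that they decode identically. The helper name `decode` is hypothetical, and the decimal and hexadecimal code points (60 / 3C for <, 38 / 26 for &) are the standard Unicode values.

```python
import xml.etree.ElementTree as ET


def decode(body: str) -> str:
    """Parse <valid>body</valid> and return the decoded text (hypothetical helper)."""
    return ET.fromstring(f"<valid>{body}</valid>").text


# Entity reference, decimal reference and hexadecimal reference.
less_than_forms = ["&lt;", "&#60;", "&#x3C;"]
ampersand_forms = ["&amp;", "&#38;", "&#x26;"]

print([decode(form) for form in less_than_forms])
print([decode(form) for form in ampersand_forms])

# Escaping > lets the sequence ]]> appear in ordinary content.
print(decode("]]&gt;"))
```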