Today the XML 1.0 Fifth Edition is officially out, and it comes with its errata corrections. See here.

One of them has more impact than the others: the change in the definition of the characters allowed in Names (elements, attributes, processing instructions, entities, IDs and IDREFs).

This change makes the XML specification refer to Unicode version 5 and beyond instead of being stuck at version 2.0. In fact, instead of giving the list of allowed characters, it now effectively gives the list of forbidden characters.
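To make this concrete, here is a minimal sketch of a Name check in Python. The ranges are transcribed from the NameStartChar and NameChar productions of the Fifth Edition; the function and constant names are mine and purely illustrative, and this is not a replacement for a real parser.

```python
# NameStartChar ranges, transcribed from the XML 1.0 Fifth Edition grammar.
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Characters additionally allowed after the first position (NameChar).
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),        # "-" and "."
    (0x30, 0x39),        # 0-9
    (0xB7, 0xB7),        # middle dot
    (0x300, 0x36F),      # combining diacritical marks
    (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls inside one of the (low, high) ranges."""
    return any(low <= code_point <= high for low, high in ranges)


def is_name_start_char(ch):
    """Hypothetical helper: may ch start a Name under the Fifth Edition?"""
    return _in_ranges(ord(ch), NAME_START_RANGES)


def is_name_char(ch):
    """Hypothetical helper: may ch appear after the first character of a Name?"""
    return is_name_start_char(ch) or _in_ranges(ord(ch), NAME_EXTRA_RANGES)


def is_fifth_edition_name(text):
    """Check a whole string against the Name production (non-empty)."""
    if not text:
        return False
    return is_name_start_char(text[0]) and all(is_name_char(c) for c in text[1:])
```

For instance, `is_fifth_edition_name("\u2d30\u2d31")` (two Tifinagh letters) holds, because the Tifinagh block sits inside the broad `#x2C00-#x2FEF` range, while a name starting with a digit is still rejected.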

That is a worthy goal, but it will definitely come with a lot of burden.

Part of the burden comes from the tons of specifications that point to XML without further precision, on the assumption that XML never changes much.

Some of them are more important than others:

  • Namespaces in XML 1.0
  • XML Schema 1.0
  • XPath 2.0

A very interesting post, with good links to other posts, can be found on James Clark's blog (don't forget to follow the link at the top and to read the comments).

It is now clear that a bunch of errata will be published to take this into account.

Anyway, it is also an improvement, since it will now be possible to write tag names and IDs in Tifinagh or with any Devanagari character!