I’ve been working on an interesting data description system: something for automatically generating the API, as well as the database procedures, needed to collect metrics just the way you want. In my mind, this is what XML was built for. An XML description of the objects, the events that occur on them, and their hierarchical relationships is probably the easiest and cleanest way to drive the generator (short of using a graphical tool).
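To make the idea concrete, here is a minimal sketch in Python of what such a description and a toy generator might look like. The element names (`metrics`, `object`, `event`, `param`) and the `log_...` naming scheme are invented for illustration and are not the actual format; the real generator would emit API code and database procedures rather than strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical description file: every element and attribute name here is
# an assumption made for this example, not the real format.
DESCRIPTION = """
<metrics>
  <object name="Player">
    <event name="Spawn"/>
    <object name="Weapon">
      <event name="Fire">
        <param name="ammo" type="int"/>
      </event>
    </object>
  </object>
</metrics>
"""

def generate_stubs(node, path=()):
    """Walk nested <object> elements and return one stub signature per event."""
    stubs = []
    for obj in node.findall("object"):
        obj_path = path + (obj.get("name"),)
        for event in obj.findall("event"):
            params = ", ".join(
                f"{p.get('name')}: {p.get('type')}" for p in event.findall("param")
            )
            name = "_".join(obj_path + (event.get("name"),)).lower()
            stubs.append(f"log_{name}({params})")
        # Recurse so that child objects inherit the parent's path.
        stubs.extend(generate_stubs(obj, obj_path))
    return stubs

stubs = generate_stubs(ET.fromstring(DESCRIPTION))
for stub in stubs:
    print(stub)
```

The nesting of `object` elements carries the hierarchy, so the generator only needs a recursive walk to know which object each event belongs to.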
Now, XML has a bad reputation, especially in the business world, for being used in situations where it’s not really welcome. A lot of people do use XML for the wrong reasons, but it has real benefits over arbitrary or custom file formats, even other plain text formats.
First, XML is (mostly) human readable, but still fairly fast to parse. There are many text based file formats out there that are certainly more human readable, but have parsing problems because they are so concerned with human readability. Most people, after learning the initial concepts of reading XML, can edit it just fine, and that’s what I’m looking for.
More importantly, XML is supported by a number of tools, IDEs and APIs (including heavy integration in .NET), which means that I don’t have to find or create these tools, or even recommend any to our customers. They can use any IDE that supports XML. Some IDEs even support validating XML against schemas as well as auto-completion. You can’t get that kind of support with many other text-based file formats, and definitely not with those that you create yourself.
Lastly, and I’ve touched on it already, is the validation support through XML Schemas. Even before beginning to parse the XML, I can run through it quickly and check to make sure that I can make certain assumptions. With the right schema, I can assume order of elements, uniqueness of names, and even make sure that references point to valid types. This makes parsing the file about a hundred times easier. I only have to check for a few oddities that Schemas can’t validate (e.g. only using the ref attribute under specific circumstances, or only adding parameters to specific event types). Since this is all standardized into XML, using it just makes sense, even if the generated files are slightly less than human readable.
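As a rough sketch of that last step, the leftover checks can be a short pass over an already-validated tree. Both rules below are hypothetical stand-ins for the two kinds of oddity mentioned above, and the element and attribute names (`object`, `event`, `param`, `ref`, `kind`) are assumptions made for the example. Python’s standard library does not validate against XML Schema, so the schema pass is assumed to have already run with a separate validator.

```python
import xml.etree.ElementTree as ET

def semantic_errors(root):
    """Return messages for rule violations; assumes schema validation passed."""
    errors = []
    for obj in root.iter("object"):
        # Hypothetical rule: an object with ref= is a pure reference,
        # so it must not contain child elements.
        if obj.get("ref") is not None and len(obj) > 0:
            errors.append(f"object ref='{obj.get('ref')}' must not have children")
    for event in root.iter("event"):
        # Hypothetical rule: only kind="custom" events may declare parameters.
        if event.get("kind") != "custom" and event.find("param") is not None:
            errors.append(f"event '{event.get('name')}' may not declare parameters")
    return errors

# Invented sample that breaks both rules once each.
SAMPLE = """
<metrics>
  <object name="Player">
    <event name="Spawn" kind="simple"><param name="x" type="int"/></event>
    <event name="Score" kind="custom"><param name="points" type="int"/></event>
  </object>
  <object ref="Player"><event name="Extra" kind="custom"/></object>
</metrics>
"""

errors = semantic_errors(ET.fromstring(SAMPLE))
for message in errors:
    print(message)
```

Because the schema has already guaranteed element order, name uniqueness, and valid references, this pass never has to guard against missing or malformed structure; it only encodes the handful of rules the schema language cannot express.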
And then there’s the biggest tool support of all: revision control. You can check XML files into your source control tool and actually see diffs! I guess that would be true of any text format, but it is certainly not true of some random binary format you’ve written a custom editor for. That’s why all our random formats are XML.
I actually said the same thing at the tools round table, and suggested it as a way to go in the future at my previous employer. That said, any text format will do the trick; it’s just that XML has so many tools available for it that it’s easier to use than any other.
[...] thinks that would be an interesting talk. I’d probably go into some things I’ve talked about here (why use XML over other possibilities and where to use it) as well as some things I haven’t (the [...]