October 30, 2002

Enterprise XML Schema Advice Needed

You may have read that we're thinking pretty hard about re-architecting things at work to use lots of XML. Now we're facing an interesting challenge, and I'd like to ask the blog world for advice. Surely we're not the first group to encounter this.

The Problem

The problem is that we have tons of data that we need to represent in XML. Much of the data is related to stock tickers. For example, given YHOO, we have earnings information, P/E ratio, average daily volume, EPS, full company name, and so on. However, we also have some data that doesn't map to particular tickers. Instead, it maps to more general symbols that we use internally (industry codes, etc.).

The Goal

We're trying to figure out how to create an XML Schema (or several?) that encompasses the full collection of data that we may want to publish both on-line and internally. It's hard to figure out the right approach or where to start.

Do we build one gigantic schema? If so, what problems do we run into down the road? Will we be generating new versions too often?

Should we instead build several schemas? One for us, one for those who consume our data, and others as needed?

If anyone has written about this from a practical point of view, we'd love pointers to it. (At least I would.) Theory is all well and good, but if you haven't been through this exercise before, I'm going to be skeptical about your recommendations. Why? This feels like a hard problem that looks easy on the surface.

Also, what about naming conventions for both namespaces and elements? That's likely to become a semi-religious debate quickly, but it can't hurt (too much) to ask. :-)

Posted by jzawodn at 11:20 AM