Identifying XBRL Documents

Question

After reading about XBRL validation, I think it would be a great feature to add to a program I am working on. However, for performance reasons, I can't read each entire document into the system for validation: a large number of documents may be flowing into the system for processing, and a single document may itself be large.

I thought that by reading the first few bytes of the document, we could identify whether it is an XBRL document or not, assuming that the first few bytes of an XBRL document (ignoring the XML declaration) always start with either "xbrl" or "xbrli:xbrl".

Would it be safe to assume that an XBRL document is defined by its root tag being either "xbrl" or "xbrli:xbrl"? Or is there a better way to identify an XBRL document without having to parse the entire document?

Thanks!

score 1 · Accepted Answer · edited May 23 '17 at 12:08


It is not safe to assume this, although if a hit rate of roughly 95% is good enough for you, then it will do.

It would be almost 100% safe if you checked for the namespace declaration and its prefix explicitly:

check for xmlns:prefix="http://www.xbrl.org/2003/instance" and for a root <prefix:xbrl ...>
check for xmlns="http://www.xbrl.org/2003/instance" and for a root <xbrl ...>

Maybe you will find a working regular expression to match those. The point is that you cannot assume that the prefix is always either absent or xbrli.

The safe way to do it is to use a SAX parser, which reports elements as it reads them and so does not need to parse the entire document. See for example this question: Determine root Element during SAX parsing
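The check described above can be sketched with the JDK's namespace-aware SAX parser. This is only a minimal illustration, not a definitive implementation: the class name `XbrlRootCheck`, the method `isXbrl`, and the `RootSeen` exception are hypothetical names chosen for this example. It stops at the first start tag by throwing an exception from the handler, and it compares the namespace URI rather than the prefix, so any prefix (or none) is accepted.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the names are illustrative only.
final class XbrlRootCheck {
    static final String XBRL_INSTANCE_NS = "http://www.xbrl.org/2003/instance";

    // Thrown on purpose to abort parsing once the root element has been seen.
    static final class RootSeen extends SAXException {
        final boolean xbrl;

        RootSeen(boolean xbrl) {
            super("root element seen");
            this.xbrl = xbrl;
        }
    }

    // Returns true if the root element is {XBRL instance namespace}xbrl.
    public static boolean isXbrl(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Needed so that uri/localName are filled in for any prefix.
        factory.setNamespaceAware(true);
        try {
            factory.newSAXParser().parse(in, new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts)
                        throws SAXException {
                    // The first start tag is the root; decide and stop.
                    throw new RootSeen(XBRL_INSTANCE_NS.equals(uri)
                            && "xbrl".equals(localName));
                }
            });
        } catch (RootSeen seen) {
            return seen.xbrl;
        }
        // Only reached if the parser finished without reporting any element.
        return false;
    }
}
```

Input that is not well-formed XML before the root element will still surface as an ordinary `SAXException`, which the caller would need to treat as "not XBRL" or as an error, depending on the application.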


answered Sep 02 '15 at 09:13

Dennis Münkle


Thanks for that. I've managed to use SAXParser to retrieve the root element and check whether it is a valid XBRL root element for the document (i.e. checking the xmlns declaration and the root element name). But I currently have a problem with stopping the parse after the root element is found. I followed the link that was provided, which also links to the current solution for stopping the parse at any time, and I am quite against throwing an exception just to stop the parse... – sincreadys Sep 03 '15 at 04:22
sincreadys: It's polite to mark a reply as the answer if it's helped you. – DdW Jul 15 '16 at 13:18
Apologies for my ignorance. I've marked the reply as the answer. thank you! – sincreadys Sep 05 '16 at 00:43
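
On the concern raised in the comments about throwing an exception just to stop a SAX parse: one alternative that the answer does not mention is a pull parser (StAX, `javax.xml.stream`), where the caller drives the parse and can simply stop reading. The following is a rough sketch under that assumption, not the answerer's method; the class name `XbrlRootPull` and the method `isXbrl` are hypothetical.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; the names are illustrative only.
final class XbrlRootPull {
    static final String XBRL_INSTANCE_NS = "http://www.xbrl.org/2003/instance";

    // Returns true if the first element is {XBRL instance namespace}xbrl.
    public static boolean isXbrl(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: DTD processing is not needed just to identify the root.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    // The first start tag is the root; no need to read further.
                    return XBRL_INSTANCE_NS.equals(reader.getNamespaceURI())
                            && "xbrl".equals(reader.getLocalName());
                }
            }
            return false;
        } finally {
            // Releases parser resources; does not close the underlying stream.
            reader.close();
        }
    }
}
```

Because the loop returns at the first `START_ELEMENT`, no exception is needed for control flow, and the rest of the document is never read by the parser.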
