Some colleagues and I were comparing languages we had programmed in and talking about our experience with VBScript and its odd features, such as 1-based indexes instead of the 0-based indexes that almost every other language has. The reasoning was that it was a language for users (e.g. Excel VBA) rather than a language for developers.

Then someone said, "XPath also has 1-based indexes," which I couldn't believe until I found this article, in which many reasons are given in favor of the 0-based approach, including some from Michael Kay himself:

  • "...zero-based indexing tends to make the index formulae simpler when accessing a multi-dimensional array with a one-dimensional array access expression"
  • "when handling tables, or subscripting into strings, zero-based addressing would often be much more convenient"
  • "...hardware addressing is not the only benefit of 0-based addressing ... it also makes computations easier..."

but then Michael Kay is quoted as concluding:

...1-based logic was the right choice for XPath and XSLT...because the language was designed for users, not for programmers, and users still have this old-fashioned habit of referring to the first chapter in a book as Chapter One...

Can someone explain that to me? (1) How is XPath designed for users? I can't imagine anyone who is not a developer wrangling with the syntactic rigidity of XPath or the declarative, functional-programming aspects of XSLT. (2) Why did the creators of XPath really go against the norm of modern programming languages by choosing a 1-based index?
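
As a rough illustration of the first quoted point (it is not taken from the article), here is how the row-major formula for reaching a two-dimensional table through a one-dimensional array might look under each convention. The function names and the 3x4 table size are hypothetical, chosen only for this sketch.

```python
def flat_index_zero_based(row, col, n_cols):
    """Row-major offset when rows, columns, and the flat array all start at 0."""
    return row * n_cols + col


def flat_index_one_based(row, col, n_cols):
    """Row-major position when rows, columns, and the flat array all start at 1.

    The row has to be shifted down by one before scaling, which is the extra
    term the quote is referring to.
    """
    return (row - 1) * n_cols + col


n_rows, n_cols = 3, 4  # hypothetical table size

# The last cell of the table under each convention.
last_zero_based = flat_index_zero_based(n_rows - 1, n_cols - 1, n_cols)  # 11
last_one_based = flat_index_one_based(n_rows, n_cols, n_cols)            # 12
```

Both formulas address the same cell; the 1-based version simply carries an additional `- 1` correction, while its result equals the element count without adjustment.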

Edward Tanguay
  • In the same article Michael is also quoted with the following words: "I can't tell you what the actual history of the decision was; I can only post-rationalize it". If even he doesn't know, then there is probably no satisfying answer. – Dirk Vollmar Jul 23 '10 at 15:46
  • I have voted to CLOSE this question as subjective and argumentative. 0-based indexing is in no way better than 1-based indexing, and the reverse is also true: 1-based indexing is in no way better than 0-based indexing. Both have pluses and minuses. 1-based indexing is more natural for non-programmers. It also allows the upper boundary of a range to be specified as `n`, not the very unnatural and error-prone `n - 1`. For anyone whose logic has been perverted by "modern programming", starting to use 1-based indexing would be an enjoyable and refreshing experience :) – Dimitre Novatchev Jul 23 '10 at 16:25
  • The answers to this Stack Overflow question show that 0-based indexes are preferred for many reasons: http://stackoverflow.com/questions/393462/defend-zero-based-arrays – Edward Tanguay Jul 23 '10 at 20:17
  • Your question isn't really a question, it's a rant. 1-based indexing in action has the desirable property that what you'd write for "give me the second element in this collection" maps naturally to the same numeral you'd use in speech or thought. It's the obvious explanation, and it's probably the right one. Don't like it? Tough, it ain't gonna change. And there are far fatter fish to fry in this wild world of the Web. – Owen S. Jul 24 '10 at 06:10
  • My question is actually a real question, as I teach programming and want to have an answer regarding XPath indexes in case it comes up. I think the best answer is that a 1-based index maps to position(), which is used heavily in XPath. – Edward Tanguay Jul 24 '10 at 16:06
  • @Edward: But that just begs the question: why did XPath choose 1-based indexing for position(), i.e. for the definition of "context position"? I think the same answer applies. – Owen S. Jul 26 '10 at 02:51
  • I think this is a legit question and should not have been closed. It asks for a historical fact that is not a matter of opinion, and the answer would be enlightening. – Ben Flynn Oct 20 '11 at 16:44
  • "0-based indexing is in no way better than 1-based indexing and the reverse is also true: 1-based indexing is in no way better than 0-based indexing." The question was why one is true for XPath as opposed to the other; nobody said one is better than the other. People overstepped their bounds by closing this question. – Shon Jun 06 '15 at 01:06
  • I agree. Introducing a 1-based technology into a predominantly 0-based system is just an accident waiting to happen. It's like the old imperial vs. metric: oh darn, our Mars mission probe is on its way to interstellar space. –  Aug 11 '15 at 05:11
  • Actually VB*Script* is zero based as well. – Michel de Ruiter Oct 20 '15 at 15:36
  • Only the C programming language, and things derived from it, count from 0. Everything else counts from 1, including all non-C-derived programming languages. Sadly, since about 2000, the C horror has increased in influence. XPath predates 2000 by a bit. – Tuntable Mar 14 '17 at 01:53

2 Answers

Array and other collection indexes represent memory offsets, so logically enough they begin at zero. XML and XPath indexes represent positions and counts, so logically enough they begin at one (and zero is therefore representative of "empty").
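
A small sketch of that contrast, using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset including positional predicates. The `<book>` document below is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a book with three chapters.
book = ET.fromstring(
    "<book>"
    "<chapter>One</chapter>"
    "<chapter>Two</chapter>"
    "<chapter>Three</chapter>"
    "</book>"
)

# XPath predicate: a position, so the first chapter is chapter[1].
first_by_position = book.find("chapter[1]")

# Python sequence access: an offset, so the first child is book[0].
first_by_offset = book[0]

# With 1-based positions, the last position equals the count.
chapter_count = len(book.findall("chapter"))
last_by_position = book.find("chapter[last()]")
third_by_position = book.find("chapter[3]")

# A count of zero means "nothing matched".
appendix_count = len(book.findall("appendix"))
```

Here `first_by_position` and `first_by_offset` refer to the same element, and `chapter[3]` is the same node as `chapter[last()]` because the document has three chapters.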

mike mckechnie

To answer this question, we must examine the history of some technologies.

RSS, XML, XSLT, and XPath History

Version 0.9 of RSS was originally released as RDF Site Summary in 1999 by a couple of guys at Netscape for Netscape’s my.netscape.com portal. Later that year, it was renamed to RSS (Rich Site Summary) with the v0.91 update. Development of the project changed hands several times, but RSS version 1.0 was released by December of 2000. With the v1.0 update, RSS included support for XML.

In September 2002, v2.0 was released as RSS (Really Simple Syndication) and began to evolve into a major internet technology. In its early history, RSS feeds (and the XML data they contained) were read by humans in the raw format. Blogs and other news sources used RSS feeds and XML to output continuously updated information. Since XML was being read by mere mortals (non-programmers), XPath and XSLT also needed to be easily understandable, so that these mere mortals would not be overwhelmed by complexity when interacting with them. That is why XPath mimics the style of URIs, which end users were already familiar with. One of the concessions made for the sake of readability by users was to use old-fashioned numbering, i.e. 1-based indexes instead of 0-based indexes. That is the same concession that you mentioned with VBScript, and it was made for similar reasons.

Although RSS feeds and XML were made to be readable for most people, RSS readers were developed to provide a more pleasant interface for humans to read RSS feeds. Now, raw RSS and XML data are read almost exclusively with some sort of reader or graphical interface. XML is still in frequent (perhaps permanent) use across the web, but it is masked by fancy graphical user interfaces to provide a better experience for end users.

*The term "mere mortals" refers to humans who are not programmers.

Andrew
  • I'm not convinced it has so much to do with RSS. For example, in [this XSL specification](http://www.w3.org/TR/1999/WD-xslt-19990421) (which [later separated out into XPath and others](https://www.w3.org/standards/history/xpath)) from April 1999, "The position() function returns the position of the context node in the context node list. The first position is 1, and so the last position will be equal to last()." Were any "mere mortals" using RDF in the run-up to April 1999, when that was drafted? – Matt Gibson Oct 08 '15 at 17:15