4

An old code base I'm refactoring has a rather unorthodox way of accessing n-dimensional vectors: it overloads operator() instead of operator[].

To illustrate, suppose Vec is a class that overloads both operators to index into an internal list of doubles.

    Vec v, w;
    int index = 651;
    double x = v(index);   // the code base indexes like this...
    double y = w[index];   // ...instead of the usual way

I can only think of one good reason why you would do this: The operator[] requires an argument of type size_t, which may or may not be a 64-bit unsigned integer depending on the platform. The index type used in the code, on the other hand, is mostly int, which would mean a lot of implicit or explicit conversions when indexing. The code is extremely indexing-heavy, so I'm quite hesitant to change this.

Does this really make any sense for today's modern compilers and 64-bit platforms? I quickly whipped up a test on https://godbolt.org/. You can try it out here.

I was primarily concerned with the latest x64 gcc and Visual Studio compilers, but the difference shows up when compiling with x86-64 gcc (11.2): gcc adds a cdqe instruction (a sign extension of the 32-bit index to 64 bits) for the () operator, which may be less efficient.
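The exact test is not reproduced here. A minimal sketch of the kind of comparison described, assuming two free functions with hypothetical names (`load_int`, `load_size`) rather than the real Vec class, could look like this:

```cpp
#include <cstddef>

// Hypothetical helpers, not the code from the original test.

// Signed 32-bit index: on x86-64 the index has to be widened to 64 bits
// (a sign extension) before it can take part in the address calculation.
double load_int(const double* data, int i) { return data[i]; }

// size_t index: already pointer-sized on x86-64, so no widening is needed.
double load_size(const double* data, std::size_t i) { return data[i]; }
```

Compiling both with optimizations enabled and comparing the output side by side shows whether the sign extension is the only difference on a given compiler.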

I am a bit puzzled. Why would one make this choice?

Update

As explained in the answers, my assumption that operator[] is locked to a size_t argument was incorrect. If you index with int instead of uint64 indices, the compiled code is nearly the same, save for different registers and constants.

The idea of providing multiple indices for multiple dimensions with () is quite a nice one, I must say. It wasn't actually used anywhere in the codebase, though, so I think it's better to simply replace it with []. I'm also happy to see this handled by the proper indexing operator in future versions of C++.

StarShine
  • 5
    Given that they are "n dimensional vectors", the decision may be because `()` allows multiple parameters. – Drew Dormann Feb 04 '22 at 15:52
  • 4
    I don't think `[]` forces the parameter to be `size_t`. – HolyBlackCat Feb 04 '22 at 15:53
  • Ok, the funny thing is it never actually indexes beyond the first dimension. – StarShine Feb 04 '22 at 15:54
  • _"Why would one make this choice?"_ may not be answerable here. It's along the lines of "What were they thinking?" Execution speed is not the answer, and "interface decision" definitely is. Overloading `operator()` allows the type to be put in a `std::function` for example. But this is just guesswork. – Drew Dormann Feb 04 '22 at 16:31

2 Answers

5

One "problem" with [] is when you have multiple dimensions. With an array you can do arr[val1][val2], and arr[val1] will give you an array you can index with [val2]. With a class type this isn't as simple. There is no [][] operator so the [val1] needs to return an object that [val2] can be applied to. This means you need to create another type and provide all the functionality you want. With (), you can do data(val1, val2) and no intermediate object is needed.
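To make the trade-off concrete, here is a minimal sketch of both approaches on one class. `Grid` and `RowProxy` are hypothetical names chosen for illustration, and the row-major layout is an assumption:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major 2D container, for illustration only.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0.0) {}

    // Option 1: operator() takes both indices directly.
    double& operator()(std::size_t r, std::size_t c) {
        assert(c < cols_ && r * cols_ + c < data_.size());
        return data_[r * cols_ + c];
    }

    // Option 2: operator[] returns an intermediate object so that
    // grid[r][c] works. This extra type is the cost mentioned above.
    class RowProxy {
    public:
        RowProxy(double* row, std::size_t cols) : row_(row), cols_(cols) {}
        double& operator[](std::size_t c) {
            assert(c < cols_);
            return row_[c];
        }
    private:
        double* row_;
        std::size_t cols_;
    };

    RowProxy operator[](std::size_t r) {
        assert(r * cols_ < data_.size());
        return RowProxy(data_.data() + r * cols_, cols_);
    }

private:
    std::size_t cols_;
    std::vector<double> data_;
};
```

With this sketch, `g(1, 2)` and `g[1][2]` refer to the same element. The `()` version needs one member function, while the `[][]` version needs a second type.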

NathanOliver
  • 2
    We might be able to do `data[val1, val2]` in the future. As a first step, C++20 deprecates comma inside `[]`. – Evg Feb 04 '22 at 16:00
  • @Evg That would be nice, and can allow some nice generic code with non-type template parameter packs. – NathanOliver Feb 04 '22 at 16:02
  • 4
    Since C++23, `operator[]` can take more than one subscript. For example, an `operator[]` of a 3D array class declared as `T& operator[](std::size_t x, std::size_t y, std::size_t z);` can directly access the elements. See [operator overloading, **Array subscript operator**](https://en.cppreference.com/w/cpp/language/operators). – 273K Feb 04 '22 at 16:04
  • 1
    @273K Thanks for the link. Nice to see that making it into the language. – NathanOliver Feb 04 '22 at 16:06
  • 2
    @Evg it's unfortunate that a breaking change was chosen. Multiple indexes could have been supported via tuples or `std::initializer_list`. – Ben Voigt Feb 04 '22 at 16:13
  • 4
    @BenVoigt While it is technically a breaking change, I don't know what real world code it would actually break. I don't recall ever seeing an intentional use of `,` inside `[]`. – cigien Feb 04 '22 at 18:53
  • I'm also curious how the code gen compares on different platforms, performance-wise and how registers are spent. – StarShine Feb 07 '22 at 09:55
3

The operator[] requires an argument of type size_t

This is incorrect. operator[] can accept an operand of any type, including non-integral types. The limitation is that it must take exactly one argument. C++23 lifts that restriction, but an old code base could not have relied on it.

Does this really make any sense for today's modern compilers and 64-bit platforms?
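As a small sketch of that point: the `Vec` below is a hypothetical stand-in for the class in the question, and `Registry` is an invented example of a non-integral subscript.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the Vec class in the question.
class Vec {
public:
    explicit Vec(std::size_t n) : data_(n, 0.0) {}

    // operator[] taking a plain int: size_t is not required.
    double& operator[](int i) {
        assert(i >= 0 && static_cast<std::size_t>(i) < data_.size());
        return data_[static_cast<std::size_t>(i)];
    }

    // operator() forwarding to the same element access.
    double& operator()(int i) { return (*this)[i]; }

private:
    std::vector<double> data_;
};

// A non-integral subscript is equally legal: here the key is a string.
class Registry {
public:
    double& operator[](const std::string& name) { return values_[name]; }

private:
    std::map<std::string, double> values_;
};
```

Since both operators of `Vec` take the same `int` parameter, `v[index]` and `v(index)` go through identical code, so neither needs a conversion at the call site.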

Using 32-bit indexes on a 64-bit platform may be beneficial if you have to store a lot of indexes and don't need the extra range. Keeping your data compact not only saves memory but may also improve the CPU cache hit rate and thus performance.
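A rough sketch of the storage argument, assuming the indexes are kept in a `std::vector`. The helper names `index_bytes` and `gather_sum` are made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed to store `count` indexes of type Index (payload only,
// ignoring the vector's own bookkeeping).
template <typename Index>
constexpr std::size_t index_bytes(std::size_t count) {
    return count * sizeof(Index);
}

// Sums the values selected by a list of stored 32-bit indexes.
// Hypothetical helper, not taken from the code base in the question.
inline double gather_sum(const std::vector<double>& values,
                         const std::vector<std::int32_t>& indices) {
    double sum = 0.0;
    for (std::int32_t i : indices) {
        assert(i >= 0 && static_cast<std::size_t>(i) < values.size());
        sum += values[static_cast<std::size_t>(i)];
    }
    return sum;
}
```

Storing the same number of indexes as `std::uint64_t` takes twice the bytes of `std::int32_t`, so half as many fit in each cache line. Whether that shows up as a measurable speedup depends on the workload.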

Andrey Semashev