
I'm writing a small tensor library (inspired by FTensor: http://www.wlandry.net/Projects/FTensor), relying on C++11 or 14.

For the sake of clarity, I would like to access the indices starting from 1, not 0. For instance, for a vector v, use v(1), v(2) and v(3) to access its components, stored in a regular 1D array. (Edit: this is for a continuum mechanics application. People are used to starting at 1. Yes, it is unnatural for programmers.)

Now, I already know how to do it at runtime. I would like to know if we can avoid this computation at runtime, as it gets more complex for higher order tensors in case of symmetries.

I found this way of writing a compile-time map where I manually set the shift: https://stackoverflow.com/a/16491172/1771099

However, using the code in the SO answer above, I cannot do something like this:

    for (int i = 1; i < 4; i++) {
        std::cout << mymap::get<i>::val << std::endl;
    }

since i is not known at compile time. Is there a smart way to do it? Build a kind of static map, for each tensor order and symmetry, that we can look up at runtime?

Thanks

Edit: here is what I consider "bad". The constexpr qualifiers are actually useless:

    // subtract 1 from an index
    inline constexpr size_t
    sub_1_to_idx(size_t const &i) {
        return i - 1;
    }

    // build a parameter pack from (a,b,c,...) to (a-1,b-1,c-1,...)
    template<typename... idx>
    constexpr std::array<size_t, sizeof...(idx)>
    shift_idx(const idx &... indices) {
        return std::array<size_t, sizeof...(idx)>{{sub_1_to_idx(indices)...}};
    }

    // returns the index into the 1D array where the values of a possibly
    // multidimensional, possibly symmetric tensor are stored
    template<>
    template<typename... idx>
    size_t
    TA_Index<2, Tsym::IJ>::of(const idx&... indices) {
        static_assert(sizeof...(indices) == 2,
                      "*** ERROR *** Please access second order tensor with 2 indices");

        const std::array<size_t, sizeof...(idx)> id = shift_idx(indices...);
        return id[0] > id[1] ? (id[0] + (id[1] * (2 * 3 - id[1] - 1)) / 2)
                             : (id[1] + (id[0] * (2 * 3 - id[0] - 1)) / 2);
    }
Napseis
    What's wrong with `double& MyVec::get(int i) { return data[i-1]; }`? It won't get more compile-time than that... For higher dimensional stuff, it would really help if you could show some code that you consider suboptimal (not compile-time enough?) so that concrete answers can be given. Right now it's very unclear what you really want the answers to contain. – Max Langhof Nov 29 '19 at 15:35
    "sake of clarity" hehehehehehehehehehehe – Yakk - Adam Nevraumont Nov 29 '19 at 15:45
  • @MaxLanghof, I edited the post to show what I am doing currently – Napseis Nov 29 '19 at 15:49
  • Building upon Yakk's passive-aggressive comment: Why do you think it improves clarity to have indices start at 1? If anything, it reduces clarity, since 0-based indices are accepted standard basically everywhere. I for sure would be very confused by such an implementation and would probably use it wrong unless someone or something pointed this obscure behavior out to me prominently. – Max Vollmer Nov 29 '19 at 15:57
    To folks hating on one-based indices: Yes, in the programming world that is out of the norm. But in the math world, indices start at 1. Depending on whether the users/audience is leaning more toward math or more toward programming, one or the other might be better (although I would still be scared of people stuck in a one-based world writing any C++). – Max Langhof Nov 29 '19 at 16:00
  • @Napseis Your question is not clear to me, do you want to know how to index your tensor with a compile-time constant or how to do it with a runtime constant? You cannot do the index calculation at compile-time if it is based on a runtime-dependent value and you can assure that calculation is done at compile-time rather than runtime whenever possible by making the runtime calculation function `constexpr`. – walnut Nov 29 '19 at 16:02
  • If it's about user input from mathematicians using 1-based indices, I don't see why the implementation in code shouldn't be 0-based. Just reduce the user input by 1, there is no performance issue, no matter where the data comes from, I/O is always slower than a simple subtraction. – Max Vollmer Nov 29 '19 at 16:03
  • @Napseis Whether a lookup table is preferable for performance over doing the calculation is a matter of the exact sizes and CPU it is run on, but I doubt that there is ever any reasonable benefit. I think you should probably ask about the concrete index calculation function and how to improve its runtime performance instead, if that is what you are worried about. – walnut Nov 29 '19 at 16:06
  • Also, what is the range of the indices coming in? A look up table (computed at compile-time) won't help much once you have significantly more than like 128-256 distinct index combinations. – Max Langhof Nov 29 '19 at 16:08

1 Answer


There isn't much you can change about this. Yes, you could create a dense compile-time lookup table, but would you access it one-based (requires padding, i.e. space overhead) or zero-based (requires re-indexing, i.e. time overhead)? And would it actually buy you any meaningful performance?

Have you compared the assembly resulting from one-based vs zero-based indices?

https://godbolt.org/z/9pKt6Q

They are virtually identical, save for two `lea` instructions (to, well, subtract 1 from both input indices) and an additional `cmova` (no idea what's going on there). If the zero-based code was good enough, the one-based one will be equally fine.

Also, in either case, the compiler perfectly constant-folded the indices that were known at compile time. That's bread-and-butter optimization for a compiler, so whether or not your functions are `constexpr` is going to be insubstantial when the indices are known at compile time.

And finally, for operations on higher-dimensional/larger tensors, the semi-random access pattern incurred from the index calculations will likely be a much bigger performance bottleneck (due to not using the cache efficiently) than the index calculations themselves.

Max Langhof