
Ok so i want to understand what this does:

#define NR_END 1

float *vector(long nl, long nh)
/* allocate a float vector with subscript range v[nl..nh] */
{
    float *v;

    v=(float *)malloc((size_t) ((nh-nl+1+NR_END)*sizeof(float)));
    if (!v) nrerror("allocation failure in vector()");
    return v-nl+NR_END;
}

I mean, it creates a pointer to a malloc'd block of the size given there, so v[0] is the first element, and so on until the end of the block.

But why is an adjusted pointer (v-nl+NR_END) returned at the end instead of v itself?

I thought you couldn't move anywhere in memory with a pointer, but could only access the space you were given?

int main()
{
    printf("Hello world\n\n");

    float* f = vector(3,5);

    f[0]=1;
    f[1]=2;
    f[2]=3;
    f[7] = 5;


    printf("%f", f[7]);

    return 0;

}

I ran these tests, and the program compiles without errors or warnings with gcc filename.c -o filename.

And why does f[7]=5 work when it shouldn't?

And what is a subscript range?

And how are you supposed to use this correctly?

I am very confused. Please help me understand this.

The file also seems to be viewable at https://www.cfa.harvard.edu/~sasselov/rec/code/nrutil.c

  • Note that C and C++ are two very different languages, with different rules and semantics. As such, please refrain from using terms like "C/C++" or tagging both languages in your questions. That's why the C++ tag has been removed: it doesn't seem applicable, because you're programming in C. – Some programmer dude Dec 17 '19 at 10:15
  • As for part of your problem: C doesn't have any kind of bounds-checking. It's your responsibility as the programmer to not go out of bounds of allocated memory. Going out of bounds leads to *undefined behavior*. – Some programmer dude Dec 17 '19 at 10:17
  • Does this answer your question? [How dangerous is it to access an array out of bounds?](https://stackoverflow.com/questions/15646973/how-dangerous-is-it-to-access-an-array-out-of-bounds) – kaylum Dec 17 '19 at 10:20
  • @someprogrammerdude I was aware of that, but I thought that with a pointer you could move just one unit outside the given space and that's it; otherwise it would give an error or something. I mean, could you just point somewhere and then overwrite a unit 1000 places further on, like `*(p+1000)=500`? I don't think this works. –  Dec 17 '19 at 10:22
  • @kaylum no. I want someone to explain to me exactly what the function above does. –  Dec 17 '19 at 10:23
  • We told you what it does and so does the link. It's **undefined behaviour** because of the out of bounds accesses. Which means we can't actually tell you for sure what the result will be. – kaylum Dec 17 '19 at 10:24
  • @kaylum i mean what `vector()` does and how to use it. this function is from a library. i have not written it. i mean this `return v-nl+NR_END;` would be also undefined behaviour since v starts at 0, wouldn't it? –  Dec 17 '19 at 10:27
  • What is `NR_END`? It seems that this is *critical* in determining what the returned value of the `vector` function will be - presumably, it will *always* be > `nl`, so the calculated return value will *never* be less than `v`. – Adrian Mole Dec 17 '19 at 10:30
  • and what is this "subscript range"? –  Dec 17 '19 at 10:30
  • NR_END is just 1, this was written at the top of the file `#define NR_END 1` @adrianmole –  Dec 17 '19 at 10:31
  • The file also seems to be online: https://www.cfa.harvard.edu/~sasselov/rec/code/nrutil.c –  Dec 17 '19 at 10:33
  • Not sure what the point is but it's returning an array with valid indices between nl and nh instead of the usual 0-based indices. That is, for example, `f[0]` is not a valid memory access unless nl=0. – kaylum Dec 17 '19 at 10:33
  • @AndiHamolli a _subscript range_ is something that allows you to access an array with subscripts other than [0..n], for example [5..10]. Then `array[5]` would be the first element and `array[10]` would be the last element. However, this concept does not exist in C; the `vector` function in your code tries to emulate it and fails (see the various answers below). – Jabberwocky Dec 17 '19 at 11:04

3 Answers

That is some pretty sketchy-looking code; this is not a good idea.

The function seems to allocate a vector where the caller promises to only use indexes in the closed interval [nl, nh], e.g. vector(100, 104) would try to allocate a 5-element vector indexed 100, 101, 102, 103 and 104. Any other index results in undefined behavior. This includes 0 which typically is the first valid index of a C array.

The two core lines of code are:

// 1
v=(float *)malloc((size_t) ((nh-nl+1+NR_END)*sizeof(float)));

This can be re-written a bit cleaner without the pointless casts:

v = malloc((nh - nl + 1 + NR_END) * sizeof *v);

This computes the size of the desired index interval (nh - nl + 1, which is 104 - 100 + 1 = 5 in our example) and adds an extra NR_END elements to it; I have no idea why.

And then there's:

// 2
return v - nl + NR_END;

This takes the base pointer v, moves it back by nl elements (100 in our example) and then forward by NR_END elements, and returns the result. Indexing the returned pointer with 100 therefore hits v[NR_END], which is the second allocated element when NR_END is 1; v[0] is never reached by a valid index. The addition of NR_END biases the use of the allocated vector towards the end; again, I have no idea why.

So in memory it would look like this, with 6 elements allocated:

   +---+---+---+---+---+---+
v: | 0 | 1 | 2 | 3 | 4 | 5 |
   +---+---+---+---+---+---+

But by subtracting, we return a pointer to lower addresses, relying on the caller indexing with at least nl to back up into the allocated space.
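To make the offsets concrete, here is a minimal sketch that only does the integer arithmetic behind `vector()`, so it never forms an out-of-range pointer. The helper names `nr_alloc_count` and `nr_slot` are hypothetical and are not part of nrutil.c; `NR_END` is assumed to be 1, as in the question.

```c
#include <stddef.h>

#define NR_END 1  /* assumed: same value as in the question */

/* Hypothetical helper: number of floats that vector(nl, nh) requests
   from malloc, i.e. nh - nl + 1 + NR_END. */
size_t nr_alloc_count(long nl, long nh)
{
    return (size_t)(nh - nl + 1 + NR_END);
}

/* Hypothetical helper: which slot of the malloc'd block f[i] refers to,
   where f is the pointer returned by vector(nl, nh).
   Since f == v - nl + NR_END, f[i] is v[i - nl + NR_END]. */
long nr_slot(long i, long nl)
{
    return i - nl + NR_END;
}
```

For vector(100, 104), `nr_alloc_count(100, 104)` is 6, `nr_slot(100, 100)` is 1 and `nr_slot(104, 100)` is 5, so slot 0 is never touched by a valid index. For the question's vector(3, 5), only 4 slots (0 to 3) exist, yet `nr_slot(0, 3)` is -2 and `nr_slot(7, 3)` is 5, so both f[0] and f[7] fall outside the block.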

All that said, I'm pretty sure this is undefined behavior and assumes a bit much about address computations. You're not supposed to work with addresses outside the range [0,N] for an array of N elements, which rules out forming addresses before the start of the array.

unwind
  • Excellent answer! Like you, I am also unsure why the extra (NR_END) element is added. Also, I think this is from the "Traditional K&R" version of the "Numerical Recipes in C" book/code - very old and prone to much abuse. – Adrian Mole Dec 17 '19 at 10:53
  • Oh, so he returns a wrong pointer on purpose (is this actually good practice, or is there a better way to do that?) so that when you do `*(p+nl)` you land in the spot where you should. The NR_END doesn't really make sense; it would work without it, I think. But since he uses it in the return, I would think that he leaves the first element of the created array empty, so that v[100] accesses the second element and not the first one? –  Dec 17 '19 at 11:08

To see that the code you wrote actually fails, you need to do more than compile it without any extra flags. If you use clang's MemorySanitizer and disable compiler optimisations, you will see that there is a problem with the memory access done with f[0].

C does not do bounds checking and leaves it to the programmer. The vector() function returns a pointer that may only be indexed as v[nl] ... v[nh].

Line 23 of a.c is f[0]=1;
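Since the language will not check subscripts for you, one option is to check them yourself. The following is a minimal sketch under that assumption; `nr_index_ok` and `nr_set` are hypothetical helpers and are not part of nrutil.c. The caller has to pass the same nl and nh that were given to vector().

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if i is a valid subscript for a
   vector created with vector(nl, nh), and 0 otherwise. */
int nr_index_ok(long i, long nl, long nh)
{
    return nl <= i && i <= nh;
}

/* Hypothetical checked store: aborts through assert() when i is out
   of range instead of silently writing outside the allocation.
   The check disappears when compiled with -DNDEBUG. */
void nr_set(float *f, long nl, long nh, long i, float value)
{
    assert(nr_index_ok(i, nl, nh));
    f[i] = value;
}
```

With the question's vector(3, 5), `nr_index_ok(0, 3, 5)` and `nr_index_ok(7, 3, 5)` are both 0, so a call such as `nr_set(f, 3, 5, 0, 1.0f)` would stop the program at the assertion rather than write to f[0].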

% clang -fsanitize=memory -fno-omit-frame-pointer -g -O0 a.c
% ./a.out 
Hello world

MemorySanitizer:DEADLYSIGNAL
==26321==ERROR: MemorySanitizer: SEGV on unknown address 0x600ffffffff8 (pc 0x0000010a9670 bp 0x7fffffffeaa0 sp 0x7fffffffea10 T101363)
==26321==The signal is caused by a WRITE memory access.
    #0 0x10a966f in main /home/fnoyanisi/a.c:23:9
    #1 0x1060b5a in __sanitizer::ReportDeadlySignalImpl(__sanitizer::SignalContext const&, unsigned int, void (*)(__sanitizer::SignalContext const&, void const*, __sanitizer::BufferedStackTrace*), void const*) /usr/src/contrib/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cc:211:3
    #2 0x1060b5a in __sanitizer::ReportDeadlySignal(__sanitizer::SignalContext const&, unsigned int, void (*)(__sanitizer::SignalContext const&, void const*, __sanitizer::BufferedStackTrace*), void const*) /usr/src/contrib/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cc:225
    #3 0x1060e8e in __sanitizer::HandleDeadlySignal(void*, void*, unsigned int, void (*)(__sanitizer::SignalContext const&, void const*, __sanitizer::BufferedStackTrace*), void const*) /usr/src/contrib/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cc:234:3

MemorySanitizer can not provide additional info.
SUMMARY: MemorySanitizer: SEGV /home/fnoyanisi/a.c:23:9 in main
==26321==ABORTING
% sed '23q;d' a.c
    f[0]=1;
% clang a.c
% ./a.out 
Hello world

5.000000
% 
fnisi
  • what about f[1] and f[2] and f[7]? how can i make gcc to show me warnings. i don't quite understand what this % actually is? –  Dec 17 '19 at 11:12
  • `%` is my prompt (_tcsh_), the example is from a FreeBSD machine (for some reason, clang MemorySanitizer is broken in macOS). I am not sure whether gcc has any tools like MemorySanitizer but if you are on gnu/Linux you can try using `-O0` to compile and use Valgrind maybe? – fnisi Dec 17 '19 at 11:15
  • And why doesn't it show the error for f[2] and f[7]? What is this clang thingy? –  Dec 17 '19 at 11:17
  • can you please explain to me what exactly your error says? –  Dec 17 '19 at 11:18
  • It does not show an error for `f[2]` or `f[7]` because the program already quits when an attempt is made to write to the memory pointed to by `f[0]`...if you are asking `what clang is` and posting a C programming question...well, this is like riding in a car and asking what a Mercedes is...https://en.wikipedia.org/wiki/Clang – fnisi Dec 17 '19 at 11:23
  • is there some way to show all possible warnings when using gcc. i think something like this, should at least appear as a warning when doing f[0] –  Dec 17 '19 at 13:51
  • To see all the warnings, you can use -Wall. But as I said, your compiler does not do bounds checking and expects you to do it. In case of an attempt to access memory out of your range, you would get a runtime error (SIGSEGV) rather than a compile time warning/error. Clang MemorySanitizer does runtime checks, and so does Valgrind. – fnisi Dec 17 '19 at 18:50

What the vector(long nl, long nh) function does is create an array of the necessary size to hold the given range of numbers, and adds an 'extra' element (possibly, in order to permit the C practice of taking an address "one beyond the bounds"). So, if you call it with values of 3 and 5, it allocates space for 4 float variables at the address in v.

However, simply using this v address, one would have to access the elements using indexes starting at 0. So, before returning an address, the function subtracts your lower index (nl) from it and adds NR_END. Although the resulting value may point to an invalid memory address, if you only ever access elements from your specified range, then the pointer arithmetic used in calculating, say, f[3] in your main function will add a suitable offset to this "invalid" pointer, so that you are then actually accessing the v[NR_END] variable, i.e. v[1]; f[4] and f[5] land on v[2] and v[3].
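If the goal is just an array with subscripts nl..nh, the same effect can be had without ever forming a pointer outside the allocation, by storing the real malloc pointer and applying the offset at access time. This is a sketch of one possible alternative, not how nrutil.c does it; the type `fvec` and the functions `fvec_new`, `fvec_at` and `fvec_free` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical type: keeps the pointer malloc returned together with
   the subscript range it is meant to serve. */
typedef struct {
    float *data;  /* exactly what malloc returned; pass this to free() */
    long nl, nh;  /* valid subscripts are nl..nh inclusive */
} fvec;

/* Allocate storage for subscripts nl..nh; data is NULL on failure
   or when the range is empty. */
fvec fvec_new(long nl, long nh)
{
    fvec v = { NULL, nl, nh };
    if (nh >= nl)
        v.data = malloc((size_t)(nh - nl + 1) * sizeof *v.data);
    return v;
}

/* Address of element i. The offset i - nl is computed first, so for
   nl <= i <= nh the result always lies inside the allocation. */
float *fvec_at(fvec v, long i)
{
    return &v.data[i - v.nl];
}

/* Release the storage obtained by fvec_new. */
void fvec_free(fvec v)
{
    free(v.data);
}
```

A call such as `fvec v = fvec_new(3, 5);` followed by `*fvec_at(v, 3) = 1.0f;` writes to the first allocated element, and `fvec_free(v)` hands the original pointer back to free(). No extra NR_END element is needed in this variant.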

Adrian Mole
  • do you mean 0 to 3, since there is that one extra element in the array? –  Dec 17 '19 at 10:47
  • Yes and no! I don't *fully* understand why the extra element is added but, sticking to the requested range, the indexes of 3 … 5 would 'equate' to 0 … 2 where I mention it in my answer. – Adrian Mole Dec 17 '19 at 10:49
  • i was thinking maybe he leaves an extra element in the array, bc he returns smth + NR_END, meaning when you have vector(100,104) and you do v[100] you actually go to the 2nd place in the array created by malloc. meaning, if you do v[99] it would also be correct and would access the first element of the array which makes no sense why you would want that –  Dec 17 '19 at 11:16