
How can I find the size of a string array passed to a function? The size should be computed inside the function.

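#include <iostream>
#include <string>

using namespace std;

// Returns the element count S, deduced from the array's type.
template <typename T, unsigned S>
unsigned arraysize(const T (&v)[S]) { return S; }

void func(string args[])
{
    unsigned m = arraysize(args);  // this is the line that fails to compile
    cout << m;
}

int main()
{
    string str_arr[] = {"hello", "foo", "bar"};

    func(str_arr);
}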

What I don't understand is:

If the statement arraysize(str_arr) is used in main, it doesn't pose a problem. str_arr is an array, so it acts as a pointer, and calling arraysize(str_arr) sends its address to the arraysize function (correct me if I'm wrong).

But in func(), I don't understand why there is a problem: the statement arraysize(args) sends the address of the string array args (or the address of the pointer args). Or is it more complicated, since it becomes some kind of double pointer? Could you explain?

Also, please correct the above code.

John Kugelman
bkcpro
  • it is impossible to determine array size and pointer size. Not without a dedicated terminating byte. This is why we have argc to begin with. – Dmytro Mar 06 '13 at 20:05
  • One more reason to use `std::vector`. – Some programmer dude Mar 06 '13 at 20:07
  • This is one of the advantages of java arrays over c++ pointers, the size of the array is available from the array itself. Dmitry is correct, in C++ what you are asking for is not possible. – Ramón J Romero y Vigil Mar 06 '13 at 20:07
  • @Dmitry Determining array size is easy (it's in the type, so `sizeof` can use it), provided you actually do have an array, and not a pointer to the first element of an array ;-) –  Mar 06 '13 at 20:10
  • Yeah, but I don't think this is the case, and if it was it would rely on compiler magic. – Dmytro Mar 06 '13 at 20:10
  • @Dmitry You're right, in `func` there is no array. But in `main` there is, which clashes with your wording (due to you mixing up the array with the pointer it decays to e.g. when writing `func(str_arr)`). –  Mar 06 '13 at 20:11
  • When you pass an array into a function, it is nothing but an integer that stores the address of the lowest byte of the array inside a dword. From this point of view, there is no difference between a pointer and an array. – Dmytro Mar 06 '13 at 20:14
  • So, it would only be possible in main function but not in a function where the string array is passed. – bkcpro Mar 06 '13 at 20:15
  • @bkunchanapalli exactly, but you can use a function template and then all is good. – juanchopanza Mar 06 '13 at 20:15
  • @Dmitry Why would there be a difference if sent the address through main? Can you explain this part? Using arraysize(str_arr), why wouldnt this pose a problem? – bkcpro Mar 06 '13 at 20:16
  • possible duplicate of [Why does a C-Array have a wrong sizeof() value when it's passed to a function?](http://stackoverflow.com/questions/2950332/why-does-a-c-array-have-a-wrong-sizeof-value-when-its-passed-to-a-function) – Bo Persson Mar 06 '13 at 20:18
  • @Dmitry A pointer is not an integer. More importantly, [an array is not a pointer](http://stackoverflow.com/q/1641957/395760). In `main`, `str_arr` is not a pointer: It is a name for storage space for three `std::string` objects (*that's* what an array is). Using the name `str_arr` in *most* contexts (including passing it to `func`) *creates* a pointer, a pointer to the first object in that array. But the array itself isn't a pointer, much like `int a;` is not a pointer despite the possibility of writing `&a`. –  Mar 06 '13 at 20:18
  • a pointer is an integer. int a = (int)malloc(sizeof(char)); *((char*)a) = 'c'; printf("c=%c\n", (char)*((char*)a)); more clearly. A pointer is a dword, and 32 bits can represent a float, an int, 32 true/false values, 4 characters, or anything of that sort. – Dmytro Mar 06 '13 at 20:24
  • @delnan So you say that only when we start using str_arr the compiler creates a pointer to the first object in the array?? So this is how Cpp is designed.. where can i find this kind of info?? the crux of the subject? How is a pointer actually defined inside machine or how is it converted?? – bkcpro Mar 06 '13 at 20:27
  • @Dmitry A pointer may be implemented as an integer. But conceptually, and as far as the C and C++ languages (both the international standards and the logic inside various compilers) is concerned, integers and pointers are very different beasts in how they are used, operated on, modelled, and implemented. You can do unsafe casts between them, but that enters implementation-defined or even undefined behavior (on quite a few platforms, the two are of different sizes), and most of the time one of the two representations has absolutely no meaning. –  Mar 06 '13 at 20:27
  • Download easy68k simulator and you will learn everything you need. @delnan. No, the only difference is how the language makes you think it works. same as when you cast a float to an int, it doesn't copy memory, it does magic to make you think float is represented simpler than it actually is. IN TRUTH though, C variables are nothing but bytes, words, dwords, and structs containing combinations of those, or mallocs with dedicated structure. – Dmytro Mar 06 '13 at 20:27
  • @bkunchanapalli Yeah, pretty much. There are standards that define these languages (though the full, final document is costly and not a whole lot of people own it). But for most purposes, Stackoverflow is a great source, both with regards to the standards and the implementations. As for internals of pointers: You'll need an accurate model of the guts of computers to understand, this is really outside the scope of a question (let alone comments!)... –  Mar 06 '13 at 20:29
  • @delnan, it is very important in C to understand EXACTLY how c thinks memory is represented. The only thing is, if C has a new addressing model, a pointer will stop being an int and will become int64. In any case, all data is collection of bytes, and understanding this helps people a great deal, as thinking of data in terms of "char" "int" and "double" is wrong, these are specialized uses of dedicated amount of bytes. It is an abstraction for yourself, not the compiler. Using them makes your life easier, like units in physics. But it is still important to understand what the units really are. – Dmytro Mar 06 '13 at 20:31
  • @Dmitry *Of course* everything's bits and bytes in the end, but that doesn't mean I have to restrict my mental model and reasoning to this simplistic, error-prone, unintuitive representation. Your precious dword is just a number of transistors maintaining a certain current -- does that mean you have to confuse the circuitry in your coffee machine with the memory range that stores your user name? Edit: Actually, the compiler cares a great bit about types too -- for example, types guide large parts of the code generation process (`imul` or `fmul`?) *and* help optimizations (such as TBAA). –  Mar 06 '13 at 20:33
  • No, but when you start debugging data structures, you will realize that C does not care what you think the data is. Imagine dealing with void*s to void*s to void*s to void*s: all of a sudden all data disappears and you are left with "Oh gosh, is this an int or a char?" The answer is "it's either an address or a value of some size, you just don't know what size nor what the type of the address is". This is cured by understanding your data very well in terms of sizes and length. I had an assignment to make a trie data structure. Understanding size is very powerful, and helps a lot. – Dmytro Mar 06 '13 at 20:35
  • @delnan So, you need to get it by practice or start machine implementation from scratch?.. Any references where i can start(books,sites)?? – bkcpro Mar 06 '13 at 20:36
  • Try easy68k, it takes seconds to download, and you will have a clear understanding of machine if you can do basic things in it. Books are good for reinforcing what you know, learning from books is a pain, try experimentation instead, C,C++,assembly ARE computer science. – Dmytro Mar 06 '13 at 20:37
  • @Dmitry Now, I'm not saying the underlying representation don't matter or should be ignored. In fact, I am quite excitable about optimizing the hell out of programs based on such low-level reasoning, using individual bits to design more optimal data structures. What I'm opposed to is jumping to conclusions and ignoring more helpful models when there is no need to get down and dirty. This is what you're doing: You're implying every abstraction above machine words is without merit and should be ignored (but below that, abstractions magically become useful again?). –  Mar 06 '13 at 20:38
  • @Dmitry Do we need to refer to documentation for memory representation? Also, if char and int are both representations of bytes, is it finally up to the compiler to differentiate between the two? – bkcpro Mar 06 '13 at 20:40
  • @bkunchanapalli This is really hard for me to answer because my own understanding developed over *years* of trying, failing, having my assumptions challenged, and so on ad infinitum. My knowledge base is likewise assembled from many places, many of which I can't even make out any more. I'm really not sure how you could or should start, sorry. –  Mar 06 '13 at 20:40
  • Actually, all I'm saying is that strong understanding of C data helps a great deal. ADTs 1) keep data together increasing efficiency of cache 2) keep data in convenient addressable form, bit masking is not time efficient. 3) Without ADTs we cannot make effective simulations and tools. I am simply emphasizing that people are confused about what a pointer really is, and this causes a great amount of grief when making complicated projects. There is an important distinction between structs, primitives, and pointers, the distinction is hard for beginners, and often experts to grasp. – Dmytro Mar 06 '13 at 20:41
  • Also, the difference between char and int is that char is defined as a byte, and int is defined as a dword. This is also subject to change as older ints used to be 16 bit, and unicode characters are 16 bit as well. This causes a huge amount of confusion, and is the reason why we need primitives to begin with, they abstract away specialized use of data. rather than saying "allocate 12 bytes" you can say "allocate space for 12 characters - hello world". The latter makes more sense for C programmers. – Dmytro Mar 06 '13 at 20:43
  • @Dmitry Speaking of confusion, I also see a lot of confusion in your definitions ;-) A char is indeed a byte in that it's the smallest addressable unit of memory *for C-the-language*, but it's not an octet (8 bit) on all platforms. An `int` is not defined to be a dword except perhaps by some **ABIs** (which honestly isn't worth much when you want your software to run on more than a single platform). A unicode character is not 16 bit in any sense of the word - a UTF-16 *code unit* is, but that's rarely useful or important (there are other, better encodings; not all *code points* fit into 16 bit). –  Mar 06 '13 at 20:48
  • Yeah a byte isnt 8 bit on all platforms, but lately most platforms agreed that a byte is 8 bits. And yes, the whole point of C is to allow you to deal with memory in various ways rather than assembly's "a pointer is just a number which stores an address". However, in situations where you are dealing with addresses, values, and structs, it is very important to distinguish between the three. Pointers aren't fixed on all platforms, but using the tools provided by C such as sizeof and strlen and so on, you can determine sizes by calling library functions and maintain portability. – Dmytro Mar 06 '13 at 20:59
  • Why can't str_arr[sizeof(str_arr)/sizeof(string)] (that is, str_arr["length of str_arr"]) be set to NULL ('\0') when the array is initialized? That would solve a lot of problems, I guess :) The size of the string array could then be calculated in the function by traversing the pointer until NULL is reached. – bkcpro Mar 06 '13 at 21:08

5 Answers


There is no way to determine the size of an array inside a function that receives it as string args[]. Only a pointer to the array's first element is passed to the function, so the information needed to calculate the array's size is simply not available there.

Vivek Ghaisas

The information of the array's size is never visible in your function, as you threw it away when you decided to use string args[] for the argument. From the compiler's perspective, it's the same as string* args. You could change the function to:

template<size_t M>
void func(string (&args)[M])
{
   cout<<M;
}

but it seems you already know that, right?
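For completeness, a minimal sketch of how the question's program might look when built around this template. The `demo` helper and the return value are added here purely for illustration and are not part of the original code; the parameter is also made `const`, which is an assumption about intent:

```cpp
#include <cstddef>
#include <iostream>
#include <string>

// Takes the array by reference, so the element count N is deduced
// from the argument's type instead of being lost to pointer decay.
template <std::size_t N>
std::size_t func(const std::string (&args)[N])
{
    std::cout << N << '\n';
    return N;  // returned only so that callers can inspect the deduced size
}

// Hypothetical call site, standing in for the question's main().
std::size_t demo()
{
    std::string str_arr[] = {"hello", "foo", "bar"};
    return func(str_arr);  // N is deduced as 3
}
```

Note that every distinct array length instantiates a separate copy of `func`.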

Daniel Frey

str_arr is an array of strings. When you do sizeof(str_arr), you get the size of that array. However, despite the fact that args looks like an array of strings, it's not. An argument that looks like an array is really of pointer type. That is, string args[] is transformed to string* args by the compiler. When you do sizeof(args) you are simply getting the size of the pointer.
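The difference can be checked directly. The following is only a sketch with hypothetical function names (`callee_sees_pointer`, `count_seen_in_caller`); it uses `decltype` to inspect the adjusted parameter type and `sizeof` in the scope where the array is still an array:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Declared with array syntax, but the parameter type is adjusted to a pointer.
bool callee_sees_pointer(std::string args[])
{
    // decltype reports the adjusted type of the parameter: std::string*.
    return std::is_same<decltype(args), std::string*>::value;
}

// In the scope where the array is defined, its full type is still known,
// so sizeof yields the size of the whole array.
std::size_t count_seen_in_caller()
{
    std::string str_arr[] = {"hello", "foo", "bar"};
    return sizeof(str_arr) / sizeof(str_arr[0]);
}
```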

You can either pass the size of the array into the function or take a reference to the array with a template parameter size (as you did with arraysize):

template <size_t N>
void func(string (&args)[N])
{
   // ...
}
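
The first option, passing the size explicitly, might look like the sketch below. The names `total_length` and `demo_total_length` are made up for this example; the point is only that the caller computes the count where the array type is still visible:

```cpp
#include <cstddef>
#include <string>

// The caller supplies the element count, since a pointer cannot carry it.
std::size_t total_length(const std::string* args, std::size_t count)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += args[i].size();
    return total;
}

// Hypothetical call site.
std::size_t demo_total_length()
{
    std::string str_arr[] = {"hello", "foo", "bar"};
    // sizeof works here because str_arr is still an array in this scope.
    return total_length(str_arr, sizeof(str_arr) / sizeof(str_arr[0]));
}
```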
Joseph Mansfield

If the statement arraysize(str_arr) is used in main, it doesn't pose a problem. str_arr is an array, so it acts as a pointer, and calling arraysize(str_arr) sends its address to the arraysize function (correct me if I'm wrong).

I have to correct you here. You state a correct premise, but draw the wrong conclusion.

The key point is indeed that str_arr is an array in main. While an array decays to a pointer in many (most) expression contexts, this does not apply when a reference to array is initialized. That is why arraysize is declared to take a reference-to-array parameter: this is the only way to have a parameter of array type, which implies that it comes with a defined length.

That is not the case for func. When a function parameter is declared to be of plain array type, the array-to-pointer decay is applied to that declaration. Your declaration of func is equivalent to void func(string * args). Thus args is a pointer, not an array. You could call func as

string str_non_array;
func(&str_non_array);

Because of this, a reference-to-array can't bind to it. And anyways, args has completely lost all information about the size of the array it is pointing to.

You could use the same reference-to-array trick as is used in arraysize: declare func as

template <std::size_t N>
void func(string (&args)[N]);

But this gets impractical to do everywhere (and may lead to code bloat, if applied naively to all array-handling code). The C++ equivalent of an array-with-length as available in other languages is std::vector<string> (for dynamically sized arrays) or std::array<string,N> (for a fixed size known at compile time). Note that the latter can cause the same code bloat as mentioned above, so in most cases, std::vector<string> would be the preferred type for arrays that you need to pass to various functions.
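
A minimal sketch of the std::vector<string> variant, assuming a C++11 compiler for the brace initializer. The names `func_vector` and `demo_func_vector` are hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A vector knows its own length, so neither a template parameter
// nor a separate size argument is needed.
std::size_t func_vector(const std::vector<std::string>& args)
{
    return args.size();
}

// Hypothetical call site.
std::size_t demo_func_vector()
{
    std::vector<std::string> strs = {"hello", "foo", "bar"};
    return func_vector(strs);
}
```

Because `func_vector` is an ordinary function, a single copy of it serves vectors of any length.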

JoergB
  • `std::array` leads to as much code-bloat as versions of `func` with `template`. – Daniel Frey Mar 06 '13 at 20:33
  • As you said, a reference to array is initialized. How come str_arr is bound as a reference but doesn't convert into a pointer? Sorry if this isn't a valid question. Basically, what I am asking is why str_arr is not being used as a pointer in the context of my code. – bkcpro Mar 06 '13 at 20:50
  • @bkunchanapalli: `str_arr` in `main` *is* an array. It is used as an array in contexts that require an array, but decays to a pointer in contexts that expect a pointer. In your call to `func` (your version) the parameter type is `string *`, so `str_arr` is used as (decays to) a pointer. If you would use `arraysize(str_arr)` in `main`, the parameter type is reference-to-array and that reference `v` would bind directly to `str_arr` *without* array to pointer conversion. In `func` that is not possible, because `args` is a pointer rather than an array. – JoergB Mar 07 '13 at 09:15

Dmitry is right, and I would like to explain it a bit further. This happens because an array is not a first-class citizen in C++: when passed as a parameter it decays to a pointer, so what the called function gets is a pointer to its first element, and the size is lost. You can refer to "C++ arrays as function arguments" to see what alternative options are available.

Community
manu4543