
My question is related to this one, but approaches it from a slightly different side.

Say we have the same array

const char *data[] = {
    "Str1",
    "Str Two",
    "Long string three",
};

Okay, the compiler knows the length of the array, and we can get it at compile time: sizeof(data)/sizeof(data[0]). The compiler also knows the size of each string; for example, the result of strlen(data[1]) is effectively known at compile time. But strlen is a runtime call, and sizeof(data[i]) always returns sizeof(char *). Is there any way to get the length of each string in the array without defining each string separately as a variable or macro?
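
To make the difference concrete, here is a minimal sketch of the three sizeof forms involved. The enum names (DATA_COUNT, ELEM_SIZE, LITERAL_SIZE) are made up for illustration and are not part of the original code.

```c
#include <stddef.h>

const char *data[] = {
    "Str1",
    "Str Two",
    "Long string three",
};

/* Element count: an integer constant expression, known at compile time. */
enum { DATA_COUNT = sizeof(data) / sizeof(data[0]) };

/* Each element is a pointer, so this is sizeof(const char *),
   not the length of "Str Two". */
enum { ELEM_SIZE = sizeof(data[1]) };

/* Applied to the literal itself, sizeof sees a char array:
   7 characters plus the terminating '\0'. */
enum { LITERAL_SIZE = sizeof("Str Two") };
```

So the size is only visible to sizeof while the literal is still an array; once it has been stored as a pointer in data, that information is gone as far as the type system is concerned.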

shved
  • `sizeof(data[0])/sizeof(data[0][0]);` – hetepeperfan May 17 '21 at 14:39
  • Compiler also knows `sizeof("whatever")`, so make your strings macros and use the macros to initialize, and also use `sizeof(macro)` where you need the sizes. – Eugene Sh. May 17 '21 at 14:41
  • @hetepeperfan correct me if I'm wrong, but doesn't that just divide the size of a char pointer by the size of a char? – lulle2007200 May 17 '21 at 14:44
  • What's wrong with `strlen`? – klutt May 17 '21 at 14:48
  • @lulle is right, something strange was suggested there. – shved May 17 '21 at 14:49
  • @lulle yep that's correct: https://godbolt.org/z/xGM5sWo5c – yano May 17 '21 at 14:50
  • @EugeneSh. This is the same as extra variables. I want to avoid this. – shved May 17 '21 at 14:50
  • @shved Macro is not a variable. It does not consume any memory. – Eugene Sh. May 17 '21 at 14:51
  • @shved The problem is that your array is just an array of char pointers that point to a string literal. You can take the size of an array element, but you effectively get the size of a char pointer. – lulle2007200 May 17 '21 at 14:52
  • You can declare the strings separately (e.g. 'const char string1[] = "abc"; const char string2[] = "def"'), then make a const array of char pointers that point to those strings (e.g. 'const char *data[] = {string1, string2}') and a const array that holds the sizes (e.g. 'const size_t sizes[] = {sizeof(string1), sizeof(string2)}') – lulle2007200 May 17 '21 at 14:55
  • @EugeneSh. Yep, I know, sometimes a variable does not consume any memory. I meant something else: it takes more effort to update the array. This is a theoretical question. I found this case and suggest thinking about it; maybe someone knows the perfect solution? – shved May 17 '21 at 14:58
  • *Is there any way to get the length of each string of array without separate definition of each string as variable or macro* - Simply **NO** – Eugene Sh. May 17 '21 at 15:24
  • `sizeof("whatever")` is a valid approach, but note that it evaluates to 1 larger than `strlen("whatever")` – William Pursell May 17 '21 at 15:47
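
For reference, the separate-declaration approach suggested in the comments could look roughly like the sketch below. It is exactly the kind of extra definition the question wants to avoid, so it is shown only to illustrate the trade-off. The names string1, string2, string3 and lengths are illustrative.

```c
#include <stddef.h>

/* Each string is a named array, so sizeof sees its full size. */
static const char string1[] = "Str1";
static const char string2[] = "Str Two";
static const char string3[] = "Long string three";

const char *const data[] = { string1, string2, string3 };

/* sizeof counts the terminating '\0', so subtract 1 to match strlen. */
const size_t lengths[] = {
    sizeof(string1) - 1,
    sizeof(string2) - 1,
    sizeof(string3) - 1,
};
```

Every new string has to be added in three places, which is the maintenance cost mentioned above.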

2 Answers

2

UPDATE 2

Per your request, I will state unequivocally that no part of the standard lets you obtain the length of a string constant through a pointer such as data[i], except by calling a function like strlen. You won't be able to write a macro that does what you want.

You wrote:

strlen is runtime call

It can be a run-time call or a compile-time call depending upon the compiler's optimization level, which, so far, no other person has mentioned. I'm a huge fan of letting the compiler do the work for me.

$ cat t.c
#include <stdio.h>
#include <string.h>
int main (int argc, char *argv[]) {
    char *data[] = {"abc"};
    printf("sl=%zu\n", strlen(data[0]));
}
$ gcc t.c
$ nm a.out
                 U ___stack_chk_fail
                 U ___stack_chk_guard
0000000100008018 d __dyld_private
0000000100000000 T __mh_execute_header
0000000100003f00 T _main
                 U _printf
                 U _strlen
                 U dyld_stub_binder
$ gcc -O1 t.c
$ nm a.out
0000000100008008 d __dyld_private
0000000100000000 T __mh_execute_header
0000000100003f70 T _main
                 U _printf
                 U dyld_stub_binder

UPDATE 1, prompted by chqrlie. See the instruction at 100003f7b. Altering the number of characters in that string will produce a different constant being loaded into the esi register.

$ objdump --disassemble-symbols=_main a.out

a.out:  file format mach-o 64-bit x86-64


Disassembly of section __TEXT,__text:

0000000100003f70 <_main>:
100003f70: 55                           pushq   %rbp
100003f71: 48 89 e5                     movq    %rsp, %rbp
100003f74: 48 8d 3d 33 00 00 00         leaq    51(%rip), %rdi  # 100003fae <dyld_stub_binder+0x100003fae>
100003f7b: be 03 00 00 00               movl    $3, %esi    #### This is the length of the string constant
100003f80: 31 c0                        xorl    %eax, %eax
100003f82: e8 05 00 00 00               callq   0x100003f8c <dyld_stub_binder+0x100003f8c>
100003f87: 31 c0                        xorl    %eax, %eax
100003f89: 5d                           popq    %rbp
100003f8a: c3                           retq

But even if it is a run-time call, there are two things to remember:

  1. The cost of an optimized strlen call is quite small compared to many other operations that you would probably perform on the string.
  2. You can minimize the frequency of your calls to strlen with responsible factoring.
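
As a rough illustration of point 2, one possible factoring is to measure each string once and reuse the result. This is only a sketch: data_len, init_lengths and copy_entry are hypothetical names, and it assumes init_lengths() is called once before any copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

const char *data[] = { "Str1", "Str Two", "Long string three" };

#define DATA_COUNT (sizeof(data) / sizeof(data[0]))

/* Hypothetical cache, filled once by init_lengths(). */
static size_t data_len[DATA_COUNT];

/* Measure every string once, e.g. at program start. */
void init_lengths(void)
{
    for (size_t i = 0; i < DATA_COUNT; i++)
        data_len[i] = strlen(data[i]);
}

/* Copy entry i, terminator included, without measuring it again. */
void copy_entry(char *dst, size_t dst_size, size_t i)
{
    assert(i < DATA_COUNT && data_len[i] < dst_size);
    memcpy(dst, data[i], data_len[i] + 1);
}
```

With optimization enabled the compiler may still fold the strlen calls into constants, but the code no longer depends on that.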
Jeff Holt
  • The fact that `_strlen` is absent from the symbols does not mean it was computed at compile time... It might just have been inlined... – chqrlie May 17 '21 at 16:10
  • Perfect point! But again this is the compiler's decision, not the standard. We can not have any guarantee that ``strlen`` will be optimized. – shved May 17 '21 at 16:35
  • I see that no one has suggested a variant that definitely works, so, as Eugene said in a comment, the shortest answer is *NO*, but yours is the closest to a solution. Can you please add a short *You cannot* answer? – shved May 17 '21 at 16:44
  • @shved Do you mean "can you please say that there is no such way to implement this with a macro"? – Jeff Holt May 17 '21 at 16:47
  • @JeffHolt I mean there is no magic call like ``super_sizeof(data[i])`` which returns the size of a string in the array at compile time, the way ``sizeof(data)`` does for the array, without adding extra variables or macros. But most of the time the compiler will optimize the ``strlen`` call to do exactly what I was asking for. Just to be clear. – shved May 17 '21 at 20:25
0

You can if you don't mind changing the definition of data a bit.

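#include <stddef.h>  /* size_t */

struct Elem
{
  size_t len;        /* length without the terminating '\0' */
  const char *str;
};

/* Number of elements in an array (not valid for pointers). */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
/* Initializer for one Elem; x must be a string literal. */
#define ELEM(x) { ARRAY_SIZE(x) - 1, (x) }

const struct Elem data[] = {
  ELEM("Str1"),
  ELEM("Str Two"),
  ELEM("Long string three")
};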
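
A possible way to use such a table for copying is sketched below. The definitions are repeated so the snippet compiles on its own, and copy_elem is a hypothetical helper, not part of the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct Elem {
    size_t len;        /* length without the terminating '\0' */
    const char *str;
};

/* x must be a string literal so that sizeof sees the whole array. */
#define ELEM(x) { sizeof(x) - 1, (x) }

static const struct Elem data[] = {
    ELEM("Str1"),
    ELEM("Str Two"),
    ELEM("Long string three"),
};

/* Hypothetical helper: copy entry i into dst, terminator included.
   Returns 1 on success, 0 if dst is too small. */
int copy_elem(char *dst, size_t dst_size, size_t i)
{
    assert(i < sizeof(data) / sizeof(data[0]));
    if (data[i].len >= dst_size)
        return 0;
    memcpy(dst, data[i].str, data[i].len + 1);
    return 1;
}
```

The length comes from the stored len field, filled in at compile time, rather than from sizeof(data[i]), which would only give the size of the struct.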
rveerd
  • Nice try, but it is useless for a generic approach like `memcpy(dst, data[1], sizeof(data[1]));` – shved May 17 '21 at 16:37