The most intuitive way to lowercase a string is to loop through it and lower each character:

while (*Str) {
    /* cast: passing a negative char to tolower() is undefined */
    *Str = tolower((unsigned char)*Str);
    ++Str;
}

However, just as memcpy() is faster than copying each byte in a for loop, is there a faster way to lowercase a string in C? For example, can I lowercase multiple chars at a time using bitwise magic or something similar?

  • 1
    You could try to vectorize or parallelize it but it's a good idea to give more information about why looping is too slow for your use case. How big is your input, for example? – ggorlen Jul 20 '20 at 23:28
  • 6
    Are you having a performance problem? I'd stick with this unless it's a confirmed hotspot in your program. – John Kugelman Jul 20 '20 at 23:28
  • 2
    There's no standard C way that's better. There might be implementation-specific extensions. – Barmar Jul 20 '20 at 23:28
  • @Barmar I am making the assumption that the character encoding is ASCII –  Jul 20 '20 at 23:29
  • 2
    You know that won't lowercase the first character? And it only works with single-byte character-sets? – Deduplicator Jul 20 '20 at 23:29
  • @user13783520 Some of the answers at the linked question work with that assumption. They still all loop, but use different methods to lowercase each character. – Barmar Jul 20 '20 at 23:31
  • 1
    @Barmar This is not a duplicate. I am asking for a _faster_ way, not _a_ way. –  Jul 20 '20 at 23:31
  • There's no faster way than looping with standard C. – Barmar Jul 20 '20 at 23:32
  • If you have a perf problem with lower-casing a string, it's not from the lowercase loop itself. It's from the design that requires you to invoke that routine so much. – selbie Jul 20 '20 at 23:32
  • The ASCII assumption has nothing to do with it. – Barmar Jul 20 '20 at 23:32
  • @Barmar There's nothing faster? That's a strong claim. You may think optimization is unnecessary, but saying it's impossible is a whole nother matter. – John Kugelman Jul 20 '20 at 23:33
  • @Barmar Maybe bitwise magic? –  Jul 20 '20 at 23:34
  • 1
    @JohnKugelman I said "with standard C". There may be implementation-specific optimizations. – Barmar Jul 20 '20 at 23:34
  • @user13783520 There's bitwise magic in one of the answers at the linked question. – Barmar Jul 20 '20 at 23:34
  • Standard C doesn't disallow loop unrolling, table lookups, and other generic optimizations. (Those may or may not be helpful but that's a matter for answers to address.) Also, the OP didn't say "standard C answers only". An answerer could very well write non-standard C with appropriate disclaimers. – John Kugelman Jul 20 '20 at 23:36
  • the only thing I can think of is checking the assembly code to make sure tolower is being inlined because a function call is unnecessary here – M47 Jul 20 '20 at 23:41
  • @M47 `tolower()` is unlikely to be inlined, as the compiler probably cannot prove the locale is ASCII-only. Nor is it likely to try all that hard. – Deduplicator Jul 20 '20 at 23:42
  • @Barmar: A program with `#if` statements that test prerequisites and implements a specialized fast solution if they are met and a general solution otherwise could be strictly conforming. – Eric Postpischil Jul 21 '20 at 00:37

1 Answer

For portable code, a lookup table will probably be the fastest. For a specific encoding (for example ASCII), simple char arithmetic is enough.

const char table[256] = {
                         ['a'] = 'a', ['A'] = 'a',
                         ['b'] = 'b', ['B'] = 'b',
                         ['c'] = 'c', ['C'] = 'c',
                         ['d'] = 'd', ['D'] = 'd',
                         ['e'] = 'e', ['E'] = 'e',
                         ['f'] = 'f', ['F'] = 'f',
                         ['g'] = 'g', ['G'] = 'g',
                         ['h'] = 'h', ['H'] = 'h',
                         ['i'] = 'i', ['I'] = 'i',
                         ['j'] = 'j', ['J'] = 'j',
                         /* etc etc */
};


char *strl_p(char *str)
{
    char c;
    char *saved = str;
    /* index the table with unsigned values; the cast is required */
    unsigned char *ustr = (unsigned char *)str;

    while(*ustr)
    {
        c = table[*ustr];
        if(c) *ustr = c;
        ustr++;
    }
    return saved;
}

char *strl_ascii(char *str)
{
    char c;
    char *saved = str;

    while(*str)
    {
        c = *str;
        if(c >= 'A' && c <= 'Z') *str = c + ('a' - 'A');
        str++;
    }
    return saved;
}
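
The table above is abbreviated. One way to get a complete table without typing every entry is to fill it once at startup from tolower(). This is only a sketch: the names lower_table, init_lower_table and strl_table are hypothetical, and it assumes the locale does not change after the table is built. Because every byte maps either to its lowercase form or to itself, the loop needs no branch.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical names, not part of the answer above. */
static unsigned char lower_table[UCHAR_MAX + 1];

/* Fill the table once, using tolower() under the current locale.
   Assumption: the locale is not changed afterwards. */
void init_lower_table(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++)
        lower_table[i] = (unsigned char)tolower(i);
}

/* Lowercase a NUL-terminated string in place via the table.
   init_lower_table() must have been called first. */
char *strl_table(char *str)
{
    for (unsigned char *p = (unsigned char *)str; *p; p++)
        *p = lower_table[*p];
    return str;
}
```

Call init_lower_table() once before the first strl_table() call. Whether this beats the arithmetic version depends on the compiler and the data, so measure before committing to it.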

0___________
  • Yes, such a small table is liable to be unbeatable. Unless one can just mindlessly OR with 0x20 for any stretch known to contain only letters. – Deduplicator Jul 20 '20 at 23:48
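
As for the "bitwise magic" the question asks about, the OR-with-0x20 idea can be applied to eight bytes at once while still leaving non-letters alone. The following is a rough sketch, not a benchmarked solution: it assumes ASCII and a 64-bit uint64_t, and the names lower8 and strl_swar are hypothetical. It also calls strlen() first, which costs an extra pass over the string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Lowercase the ASCII uppercase letters in 8 packed bytes (assumes ASCII). */
static uint64_t lower8(uint64_t w)
{
    const uint64_t ones = 0x0101010101010101ULL;
    uint64_t low7  = w & (0x7f * ones);             /* drop high bits so sums cannot carry */
    uint64_t gt_Z  = low7 + ((0x7f - 'Z') * ones);  /* bit 7 set where byte >  'Z' */
    uint64_t ge_A  = low7 + ((0x80 - 'A') * ones);  /* bit 7 set where byte >= 'A' */
    uint64_t ascii = ~w & (0x80 * ones);            /* bit 7 set where original byte < 0x80 */
    uint64_t upper = ascii & (ge_A ^ gt_Z);         /* bit 7 set where 'A' <= byte <= 'Z' */
    return w | (upper >> 2);                        /* 0x80 >> 2 == 0x20 */
}

/* Lowercase a NUL-terminated ASCII string in place, 8 bytes per step. */
char *strl_swar(char *str)
{
    size_t len = strlen(str);
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, str + i, 8);   /* memcpy avoids alignment and aliasing problems */
        w = lower8(w);
        memcpy(str + i, &w, 8);
    }
    for (; i < len; i++) {        /* remaining 0..7 bytes, one at a time */
        char c = str[i];
        if (c >= 'A' && c <= 'Z') str[i] = c + ('a' - 'A');
    }
    return str;
}
```

Since each byte is handled independently inside the word, the result does not depend on endianness. An optimizing compiler may already auto-vectorize the plain ASCII loop, so this is only worth keeping if profiling shows a gain.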