2

So I am learning C and to get practice I do some "LeetCode" challenges. For a palindrome checker I have tried two approaches, one using indices and one using pointers.

  • Where does the large performance difference come from?

And as a follow up question:

  • What method is usually preferred and for what reasons?

enter image description here enter image description here

For reference here are the code blocks.

Pointers:

bool isPalindrome(char *s) {

    size_t length = strlen(s);
    char *left = s;
    char *right = s + length - 1;

    while (left < right) {
        
        while (left < right && !isalnum(*left))
            ++left;
        
        if (left == right)
            return true;

        while (right > left && !isalnum(*right))
            --right;

        if (right == left)
            return true;
        
        if (tolower(*left) != tolower(*right))
            return false;

        ++left;
        --right;
    }

    return true;
}

Indices:

bool isPalindrome(char *s) {

    size_t length = strlen(s);
    size_t left = 0;
    size_t right = length - 1;

    while (left < right) {
        
        while (left < right && !isalnum(s[left]))
            ++left;
        
        if (s[left] == '\0')
            return true;

        while (right > left && !isalnum(s[right]))
            --right;

        if (right == 0)
            return true;
        
        if (tolower(s[left]) != tolower(s[right]))
            return false;

        ++left;
        --right;
    }

    return true;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
OldCrow
  • 37
  • 6
  • 3
    The codes are not functionally identical. – Weather Vane Jul 02 '23 at 19:11
  • 2
    "have tried two approaches, one using indices and one using pointers. ... What method is usually prefered" ? --> 1) The one with the correct functionality 2) Use the approach with the lower [order of complexity](https://en.wikipedia.org/wiki/Big_O_notation) 3) Use the one with the most clarity. – chux - Reinstate Monica Jul 02 '23 at 19:20
  • 2
    `s[left]` is *exactly* the same as `*(s + left)`. There is no reason to expect a performance difference. – BoP Jul 02 '23 at 19:21
  • 2
    Both codes are broken when `s[0] == 0`. Fix that first. – chux - Reinstate Monica Jul 02 '23 at 19:22
  • Anyway IMO 8 ms and 2 ms are too short to realistically compare the codes. Aim for an execution time of at least one second, by providing a large enough data sample. – Weather Vane Jul 02 '23 at 20:04
  • @chux-ReinstateMonica One constraint of the challenge was that the string will always be bigger than zero. Otherwise yeah you would be right. – OldCrow Jul 02 '23 at 20:05
  • @WeatherVane Yeah I was not caring about performance for performance reasons, just courious what was causing the performance difference, as I said I am new to C and still trying to understand some concepts. – OldCrow Jul 02 '23 at 20:08
  • 1
    My point stands: you cannot realistically compare them with such short timing. – Weather Vane Jul 02 '23 at 20:09
  • 1
    Modern compilers will generate pretty similar, if not identical, code for both cases. LeetCode results are not benchmarks and are not indicative of performance. If you are interested in performance, run your own benchmarks. Mine [show](https://quick-bench.com/q/zZ7OMMXRrRdtzgovEDgw9GwIOUA) that there is no consistent difference in performance. (I fixed your code a bit but did not test it extensively so there still could be bugs). – n. m. could be an AI Jul 02 '23 at 20:30
  • Obligatory: Leetcode, Hacker Rank, and other competition sites are poor ways to learn a programming language. – Chris Jul 02 '23 at 20:41
  • @Chris Okay what do you suggest? I basically use leetcode to give me test tasks to tease my old brain a bit and to test if can do a task with my knowledge of a language. – OldCrow Jul 02 '23 at 20:56
  • 1
    If you're just learning the language? [A good book.](https://stackoverflow.com/questions/562303/the-definitive-c-book-guide-and-list) – Chris Jul 02 '23 at 21:02

2 Answers2

3

Where does the large performance difference come from?

There are many potential reasons for a performance difference, but it would be helpful to understand how you compile the code, how you measure performance and with what input. Here are a few hints:

  • the 2 functions posted do not implement the same semantics: in the pointer version, you exit the loop if (left == right) whereas the equivalent test in the index version is if (s[left] == '\0') which is not true in most cases where if (left == right) is true. If the functions behave differently, no surprise they perform differently.

  • did you enable optimisations? without optimisations, s[left] may generate more code than *left, but with optimisations enabled, most compilers will generate code with similar performance, either converting the index approach to a pointer approach, or using addressing modes that use multiple registers, with or without scaling, without a penalty.

What method is usually preferred and for what reasons?

Both methods are equivalent if implemented properly, and compile to equivalent performance with modern compilers.

Here are reasons to use pointers:

  • Using pointers may be more appropriate if local rules encourage it
  • Pointers are quite specific to C (and C++), it makes the code more familiar to die hard C programmers in their 60s

Reasons to use index variables with type size_t:

  • using indices is more general, works in most languages and probably more idiomatic in C++ language (for .
  • code is more readable for programmers from various backgrounds
  • code is easier to port to other languages.

Before you focus on performance, focus on correctness!

  • both functions have undefined behavior on the empty string, a trivial palindrome.
  • you have multiple redundant tests.
  • the macros isalnum() and tolower() from <ctype.h> are only defined for a restricted range of int values: all values of type unsigned char and the special negative value EOF expands to. Passing negative char values has undefined behavior on platforms where type char is signed by default. You should use a cast to (unsigned char).

Here are modified versions:

bool isPalindrome_idx(const char *s) {
    size_t len = strlen(s);
    if (len > 1) {
        size_t left = 0;
        size_t right = len - 1;
        while (left < right) {
            while (!isalnum((unsigned char)s[left])) {
                if (++left >= right)
                    return true;
            }
            while (!isalnum((unsigned char)s[right])) {
                if (left >= --right)
                    return true;
            }
            if (tolower((unsigned char)s[left]) != tolower((unsigned char)s[right])) {
                return false;
            } else {
                left++;
                right--;
            }
        }
    }
    return true;
}

bool isPalindrome_ptr(const char *s) {
    size_t len = strlen(s);
    if (len > 1) {
        const char *left = s;
        const char *right = s + len - 1;
        while (left < right) {
            while (!isalnum((unsigned char)*left)) {
                if (++left >= right)
                    return true;
            }
            while (!isalnum((unsigned char)*right)) {
                if (left >= --right)
                    return true;
            }
            if (tolower((unsigned char)*left) != tolower((unsigned char)*right)) {
                return false;
            } else {
                left++;
                right--;
            }
        }
    }
    return true;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • Thank you so much for that detailed explanation. – OldCrow Jul 02 '23 at 20:15
  • And for clarity I do not performance test this code. I will write and test it in Visual Studio 2022(Visual C compiler) and then submit it on leetcode.com. The site will test it. And give me a result. They compile it with gcc 8.2 using the gnu11 standard and using -O1 optimisation. – OldCrow Jul 02 '23 at 20:35
  • @OldCrow: testing performance with `-O1` is meaningless. Is performance the actual goal of this test? – chqrlie Jul 02 '23 at 20:38
  • No no performance is not a goal. They just show you how you compared against other submitted solutions. – OldCrow Jul 02 '23 at 20:48
1

Change if (right == 0) in the indexed example to if (right == left), to make them the same. Then their performance should be very similar.

Mark Adler
  • 101,978
  • 13
  • 118
  • 158
  • Totally overlooked that part. Thank you.! So pointer and indices make no difference got that. – OldCrow Jul 02 '23 at 20:10
  • By themselves, generally not, with modern processors. However the choice may result in more or fewer other operations to control the loop and counting, and either might end up faster. Though a good optimizing compiler would ideally wipe out those differences as well. – Mark Adler Jul 02 '23 at 21:57
  • 3
    I would venture my opinion that the choice should come down to not how a compiler will read your code, but rather how a human will. – Mark Adler Jul 02 '23 at 21:58