2

I found code:

char *BoyerMoore_negative(char *string, int strLength)
{
    char *data ="nice bad good worst";
    int dataLength = strlen(data) - 1;
    int skipTable[256], i;
    char *search;
    register char lastChar;

    if (strLength == 0)
        return NULL;

    // Initialize skip lookup table
    for (i = 0; i < 256; i++)
        skipTable[i] = strLength;   //strlength is numbers of words in sentence

    search = string; //string is array of words in the sentence
    i = --strLength; // Decrease strLength here to make it an index

    do {
        skipTable[*search++] = i; // ---> skiptable is int array and search is string. Then what happens on LHS on each iteration???
    } while (i--);

    lastChar = *--search; // ---> Decreamenting string pointer? What does it mean on RHS?

    do {
        // Have we found the entire string?
        if (i-- == 0)
            return search;
    } while (*--search == string[i]);

    // Skip past the part of the string that we scanned already
    search += (strLength - i + 1);     ---> what happends here?
    dataLength--;

Can someone give hint of ---> pointers? I dont know whether I can ask this kinda question or not but for my understanding these concept I did!

mirod
  • 15,923
  • 3
  • 45
  • 65
user123
  • 5,269
  • 16
  • 73
  • 121
  • 5
    Don't use `char *` to point to a string literal, use `const char *`, or, better, `std::string`. – chris Aug 15 '13 at 09:36
  • 5
    You need a good book or tutorial. See e.g. [the lists here](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list). – Some programmer dude Aug 15 '13 at 09:36
  • @chris: ok, but what is the problem with it? – user123 Aug 15 '13 at 09:37
  • 1
    @Karimkhan, String literals are not modifiable. If you try to do so, the `const` is the difference between a compiler error and UB. In C++11, you're not even allowed to have a `char *` pointing to a string literal. – chris Aug 15 '13 at 09:38
  • @chris Not even allowed to have the pointer? Got a reference for that (in The Standard, or just some page/SO question with a description)? – BoBTFish Aug 15 '13 at 09:46
  • 1
    @BoBTFish, I think that going from the '03 standard to the '11 standard, the part that said you could have a `char *` point to a constant character disappeared. – chris Aug 15 '13 at 09:50
  • 1
    Anybody has noticed that there is anything about C++ in that code snippet? Its C. – Manu343726 Aug 15 '13 at 09:55
  • @Manu343726: If I write it in my c++ program then will you hang me? – user123 Aug 15 '13 at 09:59
  • 4
    @Karimkhan I might refuse to work with you if you wrote this in a c++ program I had to work on... – BoBTFish Aug 15 '13 at 10:01
  • 1
    @Karimkhan C++ has a subset compatible with C (Nowadays thats not entirely true, C++ and C standards have diverged since the end of the 90's), and you are using things that have any meaning in C++. For example, the `register` keyword: http://www.drdobbs.com/keywords-that-arent-or-comments-by-anoth/184403859 – Manu343726 Aug 15 '13 at 10:05
  • @BoBTFish: What improvement? – user123 Aug 15 '13 at 10:05
  • @Manu343726: Thanks, it was new for me! so register does not make any sense .. – user123 Aug 15 '13 at 10:10
  • 2
    @Karimkhan the rule of thumb here is: If you use C, use C. If you use C++, use C++. **But don't use C and give it as C++**. – Manu343726 Aug 15 '13 at 10:12
  • 1
    Everyday I log in to stackoverflow, and edit questions like this (*A C question tagged as a C++ question*) **one or two times per day**. – Manu343726 Aug 15 '13 at 10:16
  • @chris, right, in C++03 the conversion from string literals to `char*` is defined in 4.2 [conv.array] and listed in D.4 [depr.string] as deprecated. That wording is not in C++11. – Jonathan Wakely Aug 15 '13 at 10:37
  • @JonathanWakely, Thanks, I knew it was something similar to that. I'll have to remember it's string literals -> `char *`. – chris Aug 15 '13 at 10:46

2 Answers2

1

I presume you know what the code is supposed to do. The function has a very evocative name.

So, addressing the queries in code directly, without analysing the code too much (it's incomplete):

search is a pointer. Not the string itself, but a pointer to that string. One might expect string to be pointing the start of some string, and search is set to point to the same value.

skipTable[*search++] = i; dereferences the pointer, obtaining the value at whatever search was pointing at, and then increments the pointer. Since skipTable is an array of 255 ints and search is a char pointer, this will never fall out of bounds on x86 if a char is unsigned (compiler dependent). Really, there should be a cast; skipTable[(unsigned char) *search++]. (strictly speaking, this still isn't portable).

So i is set to the value at skipTable[*search] and search (which is a pointer) is incremented (presumably to point to the next value in some array).

lastChar = *--search; is similar to before. search is decremented and then dereferenced. So this gets the previous value in the string.

search += (strLength - i + 1); again, a pointer is being increased. If strLength represents the length of a string

As has been pointed in out in comments, the following is not a good idea: char *data ="nice bad good worst"; The string you've defined in code cannot be modified, but because you have a pointer to it, you're allowed to try. The result of trying to modify the string pointed to be data is undefined, but will almost invariably result in a crash.

xen-0
  • 709
  • 4
  • 12
  • "Since skipTable is an array of 255 ints and search is a char pointer, this will never fall out of bounds on x86." Except that char could be signed. And quite likely is. – aragaer Aug 15 '13 at 10:20
  • That is true. There should be a cast as well. – xen-0 Aug 15 '13 at 10:38
1
skipTable[*search++] = i;

This is actually hazardous on a platform where char may be signed, in which case some values could produce a negative index and an illegal access. Should be:

skipTable[*search++ & 255] = i;

What it and its surrounding loop do is to populate skipTable with some information about each character you might discover in your search space. Specifically, if you find character c as you search, then the value of skipTable[c] will be how far from the start of the string you're searching for that that character appears. If it does not appear in that string then we get the default maximum value -- the length of the whole string.

lastChar = *--search;

This means that the value of search before the expression is pointing one character further on than we wanted it to. That is, it counted right off the end of the string and is now pointing at (probably) a NUL character. We want to subtract one from that, and then see what character it's pointing to. We also want to keep the modified value of search.

search += (strLength - i + 1);

You seem to have deleted some code for brevity, so it's hard to say exactly what's going on, here.

What it looks like is that it's just performed a string comparison, starting at the far end of the strings, and found a mismatch. i probably counted down from strLength while string decremented from the end of the string to be compared. In that case, the line above would re-set search to where it pointed before the start of the comparison, plus one more character to begin the search from that point.

sh1
  • 4,324
  • 17
  • 30