
I've come across some strange behavior. While debugging, on the first iteration of the while loop, after stepping through the /* "data-url" */ and /* "data-author" */ parts of the code, I get the following result in Debugging windows -> Watches:

(I'm using Code::Blocks IDE, Ubuntu 13.04)

The length of dataUrl_tempString is 8 bytes, the length of dataAuthor_tempString is 11 bytes, the length of dataName_tempString is 9 bytes...

1st pic

But after stepping through the /* "data-name" */ part of the code, I get a result that confuses me:

2nd pic

Now they are not of size 8, 11 and 9 bytes!

What is the matter?

Could you help me find the reason for this behavior?


Here is the code of that function:

int SubString_Search(char *fnameNew, char *strUrl, char *strAuthor, char *strName) {

    FILE *fp;
    FILE *ofp_erase;
    FILE *ofp;
    char ch_buf;
    int count = 0;

    char dataUrl[8] = "";
    char dataAuthor[11] = "";
    char dataName[9] = "";
    char *dataUrl_tempString = &dataUrl[0];
    char *dataAuthor_tempString = &dataAuthor[0];
    char *dataName_tempString = &dataName[0];


    if( (fp = fopen("output_temp.txt", "r")) == NULL) {
        printf("File could not be opened.\n");
        return (-1);
    }
    else {
        /* Erasing 'NEW' file if exists */
        ofp_erase = fopen(fnameNew, "w");
        fclose(ofp_erase);
    }



    ofp = fopen(fnameNew, "a");
    rewind(fp);

    while(!feof(fp)) {

        /* "data-url" */
        fread(dataUrl_tempString, 8, sizeof(char), fp);
        if(memcmp(dataUrl_tempString, strUrl, 8) == 0) {   // memcmp() needs the byte count as 3rd argument
            fseek(fp, 2, SEEK_CUR);     // going up to required place to copy a string
            while( (ch_buf = getc(fp)) != '"') {
                fputc(ch_buf, ofp);
            }
            fputc('\n', ofp);
        }
        fseek(fp, -8, SEEK_CUR);


        /* "data-author" */
        fread(dataAuthor_tempString, 11, sizeof(char), fp);
        if(memcmp(dataAuthor_tempString, strAuthor, 11) == 0) {   // memcmp() needs the byte count as 3rd argument
            fseek(fp, 2, SEEK_CUR);     // going up to required place to copy a string
            while( (ch_buf = getc(fp)) != '"') {
                fputc(ch_buf, ofp);
            }
            fputc(' ', ofp);
            fputc('-', ofp);
            fputc(' ', ofp);
        }
        fseek(fp, -11, SEEK_CUR);


        /* "data-name" */
        fread(dataName_tempString, 9, sizeof(char), fp);
        if(memcmp(dataName_tempString, strName, 9) == 0) {   // memcmp() needs the byte count as 3rd argument
            fseek(fp, 2, SEEK_CUR);     // going up to required place to copy a string
            while( (ch_buf = getc(fp)) != '"') {
                fputc(ch_buf, ofp);
            }
            //fputc() not needed
        }
        fseek(fp, -8, SEEK_CUR); // jumping over 1 symbol from the beginning: `-8` instead of `-9`...


        count++;
        if(count == 5)
            break;
    }

    rewind(fp);
    fclose(fp);
    fclose(ofp);

    return 0;
}
yulian
  • Side note: `while (!feof(fp))` is [almost always wrong](http://stackoverflow.com/q/5431941/1256452). Here it's wrong-but-harmless, acting like `while (1)` (`fseek` clears the `feof` indicator) but your `break` takes care of it. – torek Aug 19 '13 at 17:36
  • OT: `sizeof(char)` is always `1`. Always check the outcome of system calls. – alk Aug 19 '13 at 17:37
  • @alk, I've read one book about **C** and have just begun reading S. McConnell's "Code Complete", and I think I'll follow your advice in the future... But not now ;) – yulian Aug 19 '13 at 17:40
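
The two comments above (avoid `while (!feof(fp))`, and check the result of every I/O call) can be illustrated with a minimal sketch. The helper name `count_full_chunks` and the 64-byte buffer are assumptions made for illustration; they do not appear in the question's code. The loop is driven by the return value of `fread()` rather than by `feof()`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count how many complete chunks of chunk_len bytes
 * can be read from fp. Returns -1 on a bad argument or a read error. */
static long count_full_chunks(FILE *fp, size_t chunk_len)
{
    char buf[64];
    long chunks = 0;

    assert(fp != NULL);
    if (chunk_len == 0 || chunk_len > sizeof buf)
        return -1;

    /* fread() returns the number of complete items read. We ask for one
     * item of chunk_len bytes, so anything other than 1 means end of file,
     * a short final chunk, or an error. */
    while (fread(buf, chunk_len, 1, fp) == 1)
        chunks++;

    /* Only after the loop do we ask why it stopped. */
    if (ferror(fp))
        return -1;
    return chunks;
}
```

Called on a stream holding 19 bytes with `chunk_len` set to 8, this should report two complete chunks and stop at the short remainder instead of processing it as if it were valid data.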

2 Answers


A string needs to have space for a '\0' termination - you only allocated 8 bytes for a string with 8 characters (which therefore needs 9 bytes minimum). Depending on what follows in memory, you will get unpredictable results.
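
As a minimal sketch of that point, assuming the tag being searched for is `"data-url"`: the buffer below is declared one byte larger than the tag and is terminated explicitly, so `strcmp()` and a debugger's string display both stop where expected. The names `URL_TAG_LEN`, `copy_tag` and `starts_with_url_tag` are hypothetical and are not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define URL_TAG_LEN 8   /* number of characters in "data-url" */

/* Hypothetical helper: copy exactly n bytes from src into dst and
 * terminate the result. dst must hold at least n + 1 bytes. */
static void copy_tag(char *dst, size_t dst_size, const char *src, size_t n)
{
    assert(dst_size >= n + 1);  /* room for the '\0' */
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Returns 1 if input begins with "data-url", 0 otherwise.
 * input must have at least URL_TAG_LEN readable bytes. */
static int starts_with_url_tag(const char *input)
{
    char dataUrl[URL_TAG_LEN + 1];  /* 8 characters + terminator = 9 bytes */

    copy_tag(dataUrl, sizeof dataUrl, input, URL_TAG_LEN);
    return strcmp(dataUrl, "data-url") == 0;
}
```

In the question's function the same idea would mean declaring `dataUrl[9]`, `dataAuthor[12]` and `dataName[10]` and writing the terminator after each `fread()`.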

Floris
  • Yeah - I knew straight off it was going to be that or strlen(). – Martin James Aug 19 '13 at 17:29
  • Do you mean `dataUrl_tempString` variable? I manipulate them as an array, not a string. – yulian Aug 19 '13 at 17:30
  • `dataUrl_tempString` points to `dataUrl[8]` - therefore you only "own" the first eight bytes. A string needs a termination. You don't have space for that - so anything could happen. – Floris Aug 19 '13 at 17:32
  • Uh, I've been using `strcmp` on `dataUrl_tempString`... I'll fix this error and if it helps, I'll accept your answer. – yulian Aug 19 '13 at 17:36
  • @YulianKhlevnoy Use `strncmp()` or `memcmp()` instead. –  Aug 19 '13 at 17:55

You might like to change the call to

int strcmp(const char *s1, const char *s2);

to become calls to

int memcmp(const void *s1, const void *s2, size_t n);

This should fix the issue, as long as you do not use other members of the str*() family of functions on those (non-0-terminated) char arrays.

Note, however, that memcmp() always compares exactly the number of bytes passed as the 3rd parameter (n). This might not be what you want.


Update:

Alternatively (as a mixture of both calls above) there is also:

int strncmp(const char *s1, const char *s2, size_t n);

This compares until it finds a 0-terminator in either s1 or s2, up to a maximum of n characters.
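
A short sketch of the difference between the two bounded calls, using only standard `<string.h>` functions. The wrapper names `same_bytes`, `same_string_prefix` and `is_url_tag` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* memcmp() compares exactly n bytes and never looks for a terminator. */
static int same_bytes(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}

/* strncmp() stops at the first '\0' in either argument,
 * or after n characters, whichever comes first. */
static int same_string_prefix(const char *a, const char *b, size_t n)
{
    return strncmp(a, b, n) == 0;
}

/* Hypothetical check for an 8-byte buffer that has no terminator.
 * Neither call reads past the 8th byte of buf. */
static int is_url_tag(const char buf[8])
{
    return same_bytes(buf, "data-url", 8)
        && same_string_prefix(buf, "data-url", 8);
}
```

The two functions disagree only when a `'\0'` appears inside the first n bytes: for the 4-byte arrays `{'a','b','\0','X'}` and `{'a','b','\0','Y'}`, `strncmp()` reports equality because it stops at the terminator, while `memcmp()` reports a difference in the last byte.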

alk
  • So `memcmp()` works for arrays as `strcmp()` does for strings? I want to compare arrays as strings.. – yulian Aug 19 '13 at 17:46
  • @YulianKhlevnoy: Please see the note I just added to my answer. – alk Aug 19 '13 at 17:46
  • And what is the *3rd parameter*? – yulian Aug 19 '13 at 17:47
  • @YulianKhlevnoy: The number of bytes to be compared. Also please see `man memcmp`. – alk Aug 19 '13 at 17:49
  • Here it is! These two functions are equal in some sense. So, if I compare "data-url" and "data-url" it will return `0` (as `strcmp()` does). Thanks! – yulian Aug 19 '13 at 17:56
  • @H2CO3, I've been reading a lot lately, so my brain is at ~100 degrees above zero... But I follow your advice every time (now I've looked up the `memcmp()` reference). – yulian Aug 19 '13 at 17:58
  • @YulianKhlevnoy Very good! I know what you are talking about :) –  Aug 19 '13 at 17:59
  • @YulianKhlevnoy: Your IDE cheats you. It can not handle non-`0`-terminated "strings". – alk Aug 19 '13 at 18:29
  • You are storing the string in successive locations in memory that is initially all set to zeros. From the output you show you can deduce that the strings are stored in the order url, name, author. Prove this to yourself by changing the strings (so they each start with something different). Once the "middle string" (name) is used, the only string termination is at the end of the block (after author). Give yourself more space. Anything that is asked to display a string looks for the `'\0'` and keeps going until it finds it (or segfaults). – Floris Aug 19 '13 at 19:33
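
The layout described in the last comment can be simulated without relying on undefined behavior. The sketch below is an illustration under stated assumptions: a single zero-filled array stands in for the three adjacent locals, in the order url, name, author deduced above, with one zero byte after them. A real compiler is free to place locals differently, and the helper name `visible_length` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length a string viewer would report for the "url" array,
 * before (fill_name == 0) or after (fill_name != 0) the name is stored. */
static size_t visible_length(int fill_name)
{
    /* 8 bytes url + 9 bytes name + 11 bytes author + 1 trailing zero */
    char block[8 + 9 + 11 + 1];

    memset(block, 0, sizeof block);
    memcpy(block, "data-url", 8);               /* fills all 8 bytes, no '\0' */
    memcpy(block + 8 + 9, "data-author", 11);   /* fills all 11 bytes, no '\0' */
    if (fill_name)
        memcpy(block + 8, "data-name", 9);      /* overwrites the zeros in between */

    /* strlen() keeps going until it meets a zero byte. */
    return strlen(block);
}
```

With the name slot still zeroed the reported length is 8. Once the name is stored there is no zero byte left until the end of the block, so the same pointer appears to hold a 28-character string, which matches what the Watches window showed.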