0

I have managed to produced a code that works perfectly fine and does exactly what it should. My problem is: I absolutely don't understand why. In my understanding it should NOT work.

#define PARAMS 12

char** parseFile(char* tmp[], char* location) {

    // Parse a file at <location> and retrieve 6 key: value pairs from it.
    // All key: value pairs are stored in tmp, resulting in an array of
    // 12 char pointers, each pointing to a char array that contains either
    // a key or a value.

    // ...

    do {
        yaml_parser_scan(&parser, &token);
        if (token.type == YAML_SCALAR_TOKEN && i<PARAMS) {
            strcpy(tmp[i], (char*)token.data.scalar.value);
            i++;
        }
        if (token.type != YAML_STREAM_END_TOKEN) {
            yaml_token_delete(&token);
        }
    } while (token.type != YAML_STREAM_END_TOKEN);

    // ...

    return tmp;
}

int main(int argc, char* argv[]) {
    int i=0;
    char **tmp = (char **)calloc(PARAMS, sizeof(char *));
    char **params = (char **)calloc(PARAMS, sizeof(char *));
    for (i=0; i<PARAMS; i++) {
        tmp[i] = (char *)calloc(32, sizeof(char));
        params[i] = (char *)calloc(32, sizeof(char));
    }

    memcpy(params, parseFile(tmp, argv[1]), sizeof(char) * 56); // WHY 56?
    for (i=0; i<PARAMS; i++) {
        printf("PARAM_[%d] = %s\n", i, params[i]);
    }

    free(tmp);
    free(params);

    return 0;
}

I could swear that I would have to specify a size of sizeof(char) * 384 (1 * 12 * 32) in memcpy for the code to work. But if I do, it doesn't. It only works and produces the correct result if I specify sizeof(char) * 56. Why 56? What's that value supposed to mean? I do realise that I have 6 key: value pairs and that 6 * 8 + 8 is 56, but I still don't get it.

[EDIT]

The use case is this:

The program connects to a server based on connection parameters (IP address, port) that are provided in a yaml configuration file. A minimalistic version of that file could look like this:

---
  # Connection parameters
  tx_addr: 192.168.1.124
  tx_port: 8080

The program needs the values of the key:value pairs as those are passed to functions which establish the connection to the server. So, what I need to do is to parse the configuration file and look for allowed keys (the parameters could be anything; I only know the keys). E.g., if the key "tx_addr" is found, the program knows that the value to that key shall be passed as the IP address parameter of the functions establishing the server connection.

ci7i2en4
  • 834
  • 1
  • 13
  • 27
  • 1
    first of all, `sizeof(char)` is by definition `1`. Then, your `params` is an array of pointers, which you overwrite with your `memcpy()`, this is just undefined behavior. Anything can happen, even what you expected -- your program is still horribly broken. –  Apr 04 '18 at 09:42
  • So, what would be the best way to copy the content of tmp to params? – ci7i2en4 Apr 04 '18 at 09:44
  • So, what **is** `tmp`? I can't figure out what you're actually *trying* to accomplish with that code... –  Apr 04 '18 at 09:45
  • Not entirely related, but if I had written this code, it would have at least checked if `calloc()` returned anything other than `NULL`. – WedaPashi Apr 04 '18 at 09:48
  • The goal is to get all 6 key: value pairs from parseFile() such that I can access the keys and values individually. I thought that by passing an empty two-dimensional array (tmp) to the function, let the function fill and return it, and then copying it to another local two-dimensional array in main(), I could work with the latter array. – ci7i2en4 Apr 04 '18 at 09:48
  • @WedaPashi The code produces the desired result every time I run it, and I have run it many times already. This is not meant to be an excuse but because of that it just didn't occur to me until now that something could be wrong with it. I'm happy to hear suggestions on how to improve it. – ci7i2en4 Apr 04 '18 at 09:50
  • When I saw that 384 didn't work, I started trying other values that would make any sense in the given mathematical context. 56 was the first and only value that led to a working code. PARAMS is the number of keys and values in the file I'm reading in parseFile(). PARAMS/2 would be the number of key: value pairs. – ci7i2en4 Apr 04 '18 at 09:58
  • sizeof(char *) is 8 byte – ci7i2en4 Apr 04 '18 at 09:59
  • @4386427 and then it would be a shallow copy at best, overwriting the pointers that were assigned with `params[i] = (char *)calloc(32, sizeof(char))`, therefore leaking memory ... so I still can't make sense out of that code (I doubt it does what was intended). –  Apr 04 '18 at 10:06
  • @4386427 I have added the part where tmp is updated to the code, please see the original posting. – ci7i2en4 Apr 04 '18 at 10:07
  • @FelixPalmen Like I said, the code gets me exactly the desired result every time I run my program. But if you do know how to correct what you believe to be wrong, I would be more than happy to hear it. – ci7i2en4 Apr 04 '18 at 10:08
  • @ci7i2en4 I can't tell you because I still don't get the *intention* behind that code.... Code doing what you expect doesn't mean anything in C :o –  Apr 04 '18 at 10:10
  • @4386427 so do I, but maybe in this case, a concise description of the use case would tell more than this weird code ;) –  Apr 04 '18 at 10:15
  • @ci7i2en4 a [mcve] could help as well, including a *sample input* and the *desired output*... –  Apr 04 '18 at 10:19
  • I have added the use case to my original posting. – ci7i2en4 Apr 04 '18 at 10:58
  • Please read and understand [the question on why not to cast the return value of `malloc()` and family in C](/q/605845). – Toby Speight Apr 04 '18 at 13:32

1 Answers1

1

I assume that your aim is to copy all key:value pairs from tmp to params.

You can't copy all the key:value pairs from tmp to params with a single memcpy. You can't be sure that malloc have given you two consecutive memory areas.

You have to copy the strings one-by-one.

parseFile(tmp, argv[1]);
for (i=0; i<PARAMS; i++) {
    memcpy(params[i], tmp[i], 32);
}

In order to get two consecutive memory areas (so that you can copy with a single memcpy), you need to allocate the variables as 2D arrays (https://stackoverflow.com/a/40847465/4386427 and https://stackoverflow.com/a/42094467/4386427).

Support Ukraine
  • 42,271
  • 4
  • 38
  • 63
  • You are starting to see what I want but I think I'm already ahead of you: I'm not trying to copy the pairs, I'm already trying to copy a key, than (to the next position of the array) a value, a key, a value, ... until I have all parameters. Ultimately, I need an array (params) that looks like: params[0] = key1, params[1] = value1, params[2] = key2, etc. – ci7i2en4 Apr 04 '18 at 10:56
  • ^^ If we imagine this two-dimensional array with an x and y axis, my thought was that I have 12 (PARAMS macro) elements (pointers to char arrays) on the x axis, and each of those elements has 32 byte on the y axis as this is the amount of memory I'm allocating (for each array within the one holding the char pointers). – ci7i2en4 Apr 04 '18 at 11:01
  • @ci7i2en4 Well, you can see it like that. But it's more common to talk about rows and columns. You would have 12 rows, each with 32 columns. – Support Ukraine Apr 04 '18 at 11:09
  • @ci7i2en4 BTW: Why do you have both `tmp` and `params`? Why don't you read directly into `params`? Another thing - since you deal with key:value pairs, why not define a struct with a key and a value – Support Ukraine Apr 04 '18 at 11:11
  • Yes, you are right. In any case it's a two-dimensional matrix and those do have rows and columns. I'm still trying to figure out your solution, and how it differs from what I'm doing. I thought that I had already allocated two 2D arrays (params, tmp), one of which I pass to parseFile()?! – ci7i2en4 Apr 04 '18 at 11:13
  • @ci7i2en4 You don't allocate a 2D array. You allocate a 1D array of 12 pointers. Then you set each of these pointers to point a 1D array of 32 chars. Functionally yours can be used as a 2D array (i.e. array[x][y]) but memory wise the layout is very different. A 2D array use a single contiguous memory area. Your method uses 13 different memory areas (which is why you can't do the memcpy) – Support Ukraine Apr 04 '18 at 11:25
  • Okay, now I finally understand what the problem is! Thank you. I was (wrongly) assuming that my params and tmp arrays were being allocated as matrixes, one area in memory for each matrix. But now I see that the 12 arrays (the pointer array is pointing to) each can be anywhere ... So, the solution, as you have pointed out, is either using two true 2D arrays, or structs. I'll try that. – ci7i2en4 Apr 04 '18 at 11:28