After reading the documentation, I still cannot figure out how to write data from memory (variables) to datasets correctly. For example, with a dataset created by CREATE TRUNCATE CHUNKED DATASET dsetint AS INT(UNLIMITED), the statement "INSERT INTO dsetint (-1) VALUES (" << val << ")" works without problems: the variable is substituted and the record is written.

But with, for example, "CREATE TRUNCATE CHUNKED DATASET dsetchr AS VARCHAR(UNLIMITED)" and char bl[] = "blabal";, the statement "INSERT INTO dsetchr(-1) FROM MEMORY " << HDFql::variableRegister(&bl); writes only zeros.

Here is the example that grew out of the linked question.

The input is an array of bytes and a second array holding the sizes of the records packed into the first array.

  uint8_t* ptr ;
    HDFql::execute("CREATE TRUNCATE FILE test.h5");
    HDFql::execute("USE FILE test.h5");

    status = HDFql::execute("CREATE TRUNCATE CHUNKED DATASET inclusion AS UNSIGNED VARTINYINT(UNLIMITED)");

    std::vector<int> s{ 6,12 };
    std::vector<uint8_t> v{0x35,0x34, 0x35, 0x00, 0x00, 0x35,0x36,0x36, 0x36, 0x36, 0x36, 0x36,0x36,0x00, 0x36, 0x00, 0x36, 0x36 };
    ptr = &v[0];

    int number = HDFql::variableRegister(&ptr);
    for (int i = 0; i < s.size();i++)
    {
        scriptst << "INSERT INTO inclusion(-1) VALUES FROM MEMORY " << number << " SIZE " << s[i];
        status = HDFql::execute(scriptst);
        ptr = ptr + s[i];
        HDFql::execute("ALTER DIMENSION Inclusions TO +1");
        scriptst.str(std::string());
        scriptst.clear();
    }
    // my expectation is that the file will contain these records:
    // dataset  ->row1      0x35,0x34, 0x35, 0x00, 0x00, 0x35
    // dataset  ->row2      0x36,0x36, 0x36, 0x36, 0x36, 0x36,0x36,0x00, 0x36, 0x00, 0x36, 0x36
    // dataset  ->row3  

I would like to write the data from the array to the file without errors and be able to append to it at any time.

UPDATE: My experiments and further reading of the documentation led me to the example below, but it still doesn't do what I need. Here an array of bytes is stored in a dataset, 1 byte per row. What I want is to store byte arrays of different lengths, one array per dataset row.

    status = HDFql::execute("CREATE TRUNCATE FILE test_Titnyint.h5");
    status = HDFql::execute("USE FILE test_Titnyint.h5");
    status = HDFql::execute("CREATE TRUNCATE CHUNKED DATASET inclusion AS UNSIGNED TINYINT(UNLIMITED)");
    std::vector<uint8_t> array{0x35,0x34, 0x00, 0x37, 0x35, 0x35,0x36,0x36, 0x36, 0x36, 0x00, 0x36,0x36,0x36, 0x36, 0x36, 0x36, 0x36 };
     uint8_t* ptr ;
    ptr = &array[0];
    for (int i = 0; i < array.size();i++)
    {
        int number = HDFql::variableRegister(ptr);
        if (i > 0) { status =HDFql::execute("ALTER DIMENSION inclusion TO +1"); }
        scriptst << "INSERT INTO DATASET inclusion(-1) VALUES FROM MEMORY " << number ;
        status = HDFql::execute(scriptst);
        HDFql::variableUnregister(ptr);
        ptr = ptr + 1;
        scriptst.str(std::string());
        scriptst.clear();
    }
     status = HDFql::execute("CLOSE FILE");

Remark: the vectors "s" and "v" should be treated as input coming from outside.

OlegMart

1 Answer

To write variable-length data into an HDF5 dataset (in your case of data type VARTINYINT) using HDFql, you need to use a variable of type struct HDFQL_VARIABLE_LENGTH. As an example:

HDFQL_VARIABLE_LENGTH my_data;
std::stringstream script;
int number;

std::vector<uint8_t> values1 {0x35, 0x34, 0x35, 0x00, 0x00, 0x35};
std::vector<uint8_t> values2 {0x36, 0x36, 0x36, 0x36, 0x36, 0x36, 0x36, 0x00, 0x36, 0x00, 0x36, 0x36};

HDFql::execute("CREATE TRUNCATE AND USE FILE test.h5");
HDFql::execute("CREATE TRUNCATE CHUNKED DATASET inclusion AS UNSIGNED VARTINYINT(UNLIMITED)");

// register the variable once; the script refers to it by its number
number = HDFql::variableRegister(&my_data);

// first row: point the variable at values1 and insert
my_data.address = values1.data();
my_data.count = values1.size();
script << "INSERT INTO inclusion(-1) VALUES FROM MEMORY " << number;
HDFql::execute(script);

// second row: point the variable at values2, grow the dataset by one, insert again
my_data.address = values2.data();
my_data.count = values2.size();
HDFql::execute("ALTER DIMENSION inclusion TO +1");
HDFql::execute(script);
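
If the bytes arrive as one flat buffer plus a list of row sizes (like `v` and `s` in the question), the same `address`/`count` pair can point straight into that buffer, so nothing has to be copied into separate vectors. The following is only a minimal sketch of that slicing step and does not call HDFql at all: `VarLenSlice` and `make_slices` are hypothetical names introduced here for illustration, and `VarLenSlice` is assumed to mirror the two fields of `HDFQL_VARIABLE_LENGTH` used above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for HDFQL_VARIABLE_LENGTH: only the two fields
// used in the snippet above (address, count) are mirrored here.
struct VarLenSlice {
    void* address;
    int count;
};

// Split one flat byte buffer into per-row (address, count) descriptors.
// No bytes are copied: every descriptor points into `bytes`, so `bytes`
// must stay alive and unmodified in size while the descriptors are used.
std::vector<VarLenSlice> make_slices(std::vector<std::uint8_t>& bytes,
                                     const std::vector<int>& sizes) {
    std::vector<VarLenSlice> slices;
    slices.reserve(sizes.size());
    std::size_t offset = 0;
    for (int size : sizes) {
        // Reject negative sizes and sizes that run past the end of the buffer.
        if (size < 0 || offset + static_cast<std::size_t>(size) > bytes.size()) {
            throw std::out_of_range("row sizes exceed the buffer length");
        }
        slices.push_back({bytes.data() + offset, size});
        offset += static_cast<std::size_t>(size);
    }
    return slices;
}
```

In a loop over the returned descriptors, each `address` and `count` would be assigned to `my_data` before executing the INSERT script, with `ALTER DIMENSION inclusion TO +1` issued before every row except the first, as in the snippet above.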
SOG
  • What if I don't know the exact content of the vectors? In the question I showed them as variables for simplicity; in fact, they are input parameters of the save-to-file module and can be quite large. Therefore, I would like to avoid unnecessary copying. – OlegMart Dec 17 '21 at 15:47
  • I have updated the code snippet above to reflect your comment! – SOG Dec 19 '21 at 23:53
  • Do I understand correctly that reading such a record requires a double loop, and that it is not possible to get an `HDFQL_VARIABLE_LENGTH` directly? `while(HDFql::cursorNext() == HDFQL_SUCCESS) { while(HDFql::subcursorNext() == HDFQL_SUCCESS) {*HDFql::subcursorGetUnsignedTinyint() } }` – OlegMart Dec 21 '21 at 10:18
  • Not sure if I understand your comment, but one thing is to read data from a dataset and populate memory (i.e. a user-defined variable) with it (which is what the code snippet that I have posted above does). Another is to read data from a dataset and populate a cursor with it (which is what your last comment seems to indicate). You can pick either method knowing that if you choose one (e.g. populate memory), you can't retrieve the data using the other (e.g. from a cursor). – SOG Dec 21 '21 at 11:25
  • I apologize for the clumsy wording. I wanted to ask whether there is a way to read a record directly into a variable of type `HDFQL_VARIABLE_LENGTH`, rather than with a double loop through the cursor that reads each byte. – OlegMart Dec 21 '21 at 14:39
  • No problem! Yes, you can get a record directly into a variable of type `HDFQL_VARIABLE_LENGTH` by doing something similar to this: `script << "SELECT FROM inclusion INTO MEMORY " << number; HDFql::execute(script);`. – SOG Dec 21 '21 at 20:03