Here is my code snippet:

    void DirectIO::writeFileUnix(const std::string &file_path,
                                 unsigned long long int index,
                                 size_t size,
                                 std::string &bytes_data) {
        WRITE_COUNT++;

        const int file_descriptor = open(file_path.c_str(),
                                         O_WRONLY | O_DIRECT | O_SYNC | O_CREAT, S_IRUSR | S_IWUSR);
        if (file_descriptor == -1 || size % SECTOR_SIZE != 0) {
            throw std::exception();
        }

        char *buffer = &bytes_data[0];
        ssize_t number_of_bytes_written = pwrite(file_descriptor, buffer, size, index * size);

        if (number_of_bytes_written == -1 || number_of_bytes_written < size) {
            throw std::exception();
        }

        close(file_descriptor);
    }

The function looks logically correct, but number_of_bytes_written is always -1. The file does get created if it doesn't already exist. I don't understand why the write fails.

Update 1

Okay, I think I found the problem: a char* buffer is treated as ending at the first null character. Hence, the line char *buffer = &bytes_data[0]; will point to only a small part of the string bytes_data.

Once I realized this, I changed that part of my code to:

    void *buffer = bytes_data.data();
    ssize_t number_of_bytes_written = pwrite(file_descriptor, buffer, size, index * size);

    if (number_of_bytes_written == -1 || number_of_bytes_written < size) {
        throw std::exception();
    }

The data() function of std::string does not care about null characters, according to its C++ reference page as well as this SO post.

But it still doesn't work.

If anyone wants to try the code, here is my sample data:

bytes_data - "\000\000\000\000\005\000\000\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\000....\000" (octal representation as supported by clion)
index = 0
size = 4096 (length of bytes_data)
file_path = "abc.binary"

Update 2

I tried writing a string with no null characters and it still throws an error. Hence, the problem isn't the null characters. This looks like a memory alignment issue, since the disk (file offset) alignment is already taken care of (I might be wrong because of my limited knowledge of C++).

Kadam Parikh
  • If `pwrite()` returns -1, it sets `errno` to the reason why. You should include a useful error message - most people would use `perror()` or `strerror()` to turn it into a human-readable error. – Shawn Apr 21 '20 at 10:50
  • Your file offsets are aligned to a sector boundary, but what about the alignment of the userspace buffer? – Ian Abbott Apr 21 '20 at 10:59
  • @Shawn I wasn't aware of `perror()`. Thanks for the tip. Will update my code using it. @IanAbbott Can't say about this. But I found the problem. I am converting integer to bytes. Hence, for an integer of value 0, string will contain "0000" (considering 2 bytes hex value). Now, the thing is, this 0 is counted as '\0' by c_str (C string). And hence, the buffer is empty. Will also update the question. – Kadam Parikh Apr 21 '20 at 11:05
  • Doesn't '0' and '\0' have different meaning? @IanAbbott – Kadam Parikh Apr 21 '20 at 11:08
  • `pwrite` won't care about the contents of the buffer, only its address and length and whether the memory it occupies is valid. – Ian Abbott Apr 21 '20 at 11:24
  • So, instead of std::string, I should use something else for storing bytes, right? Can you suggest any options? @IanAbbott – Kadam Parikh Apr 21 '20 at 11:33
  • @KadamParikh No, std::string is perfectly capable of storing '\0'. The problem is that you are interpreting the data in the std::string as a null-terminated string. But you are doing that somewhere in the code you haven't shown. If you want the size of a std::string, just use the size method. This works correctly even if the string contains null bytes. – john Apr 21 '20 at 11:57
  • @john I already have the size of string. Also, I realized the same and hence tried to eliminate this. bytes_data contains null character but the buffer I want doesn't. I tried modifying the code but still it doesn't work. I have updated the question including the changes I have tried and sample data. Can you have a look at it please? Thank you.. – Kadam Parikh Apr 21 '20 at 12:14
  • @KadamParikh You've misunderstood the problem. `char* buffer = &bytes_data[0];` is perfectly OK. buffer is just a pointer, and it will point to exactly the same place in both your new and old code. I'm afraid I don't know the answer to your problem, but judging by the code above it's got nothing to do with nulls in your data. – john Apr 21 '20 at 12:18
  • What is the memory address `&bytes_data[0]`? I know Linux cares about the alignment of the buffer passed to `pwrite` when using `O_DIRECT` -- it needs to be sector aligned. You never mentioned you were using Linux, but other operating systems that support `O_DIRECT` may have similar restrictions. – Ian Abbott Apr 21 '20 at 12:21
  • @IanAbbott As john mentioned that this has nothing to do with null characters, I too have now started thinking of memory alignment. The address will be different in different runs but here a value from my current run - `0x55cadc0a8100` – Kadam Parikh Apr 21 '20 at 12:24
  • @KadamParikh `0x55cadc0a8100` doesn't seem to be aligned to a sector boundary. Check the man page for open(2) on your system to check what restrictions it imposes for the use of `O_DIRECT`. – Ian Abbott Apr 21 '20 at 12:28
  • @IanAbbott yes you are right. I tried writing all 1s and still it throws error. That means, this is surely the alignment problem. I wasn't aware about memory alignment. Will need to read it first. Will update shortly. Thank you.. – Kadam Parikh Apr 21 '20 at 12:36
  • Do you really need to use `O_DIRECT`? You might be better off removing `O_DIRECT` and calling `posix_fadvise(file_descriptor, 0, 0, POSIX_FADV_DONTNEED);` between the `pwrite` and `close` calls. – Ian Abbott Apr 21 '20 at 12:54
  • @IanAbbott Yes, I needed to use O_DIRECT (project specific usecase). Thanks for your suggestion. I wasn't aware about fadvise till now. Also, thank you for the help. I was able to solve the problem. Got late replying here due to some other work. Thanks.. – Kadam Parikh Apr 23 '20 at 06:08
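
The `errno` suggestion from the comments above can be sketched roughly as follows. This is only an illustration, not code from the question: `checkedPwrite` is a hypothetical helper name, and throwing `std::system_error` is one possible choice in place of `perror()` or `strerror()`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <system_error>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: wraps pwrite() and turns a failure into an
// exception that carries the errno value and a readable message.
void checkedPwrite(int fd, const void *buffer, std::size_t size, off_t offset) {
    assert(buffer != nullptr);

    const ssize_t written = pwrite(fd, buffer, size, offset);
    if (written == -1) {
        // Read errno right away, before another call can overwrite it.
        throw std::system_error(errno, std::generic_category(), "pwrite");
    }
    if (static_cast<std::size_t>(written) < size) {
        // A short write is not an errno failure, so report it separately.
        throw std::runtime_error("pwrite: short write");
    }
}
```

Catching the exception and printing `e.what()` shows the reason for the failure. On Linux, write(2) documents `EINVAL` for a file opened with `O_DIRECT` when the buffer address, the length, or the file offset is not suitably aligned.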

1 Answer

As noted in the last update to the question, the issue was not caused by null characters. They were a concern at one point, but the modified code in the last edit made sure that null characters caused no issues.

With the help of @IanAbbott, I found out that the issue was with (virtual) memory alignment. Until then, I had always thought that only the file offsets needed to be aligned. But after looking at the man page of open(), I found that the address of the user-space buffer also needs to be aligned to SECTOR_SIZE.
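
A minimal sketch of one way to meet this buffer-alignment requirement, assuming a sector size of 512 bytes (the real value depends on the device and file system). This is not the code from the project: the names `kSectorSize`, `AlignedBuffer`, `isAligned`, and `makeAlignedCopy` are hypothetical, and `posix_memalign` is just one of several ways to obtain aligned memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <stdlib.h>
#include <string>

// Assumed sector size; query the device for the real logical block size.
constexpr std::size_t kSectorSize = 512;

// posix_memalign() memory must be released with free(), not delete.
struct FreeDeleter {
    void operator()(void *p) const { free(p); }
};
using AlignedBuffer = std::unique_ptr<char, FreeDeleter>;

// True if the address is a multiple of `alignment`.
bool isAligned(const void *p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Copies `data` into a newly allocated, sector-aligned buffer.
// Returns an empty pointer if the size is not a non-zero multiple of the
// sector size or if the allocation fails.
AlignedBuffer makeAlignedCopy(const std::string &data) {
    if (data.empty() || data.size() % kSectorSize != 0) {
        return AlignedBuffer();
    }
    void *raw = nullptr;
    if (posix_memalign(&raw, kSectorSize, data.size()) != 0) {
        return AlignedBuffer();
    }
    assert(isAligned(raw, kSectorSize));
    std::memcpy(raw, data.data(), data.size());
    return AlignedBuffer(static_cast<char *>(raw));
}
```

The pointer returned by `get()` on the result could then be passed to `pwrite` in place of `&bytes_data[0]`. The extra copy is the price of keeping `std::string` as the input type.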

After reading a few more articles and documents, I found that mmap can be used for this. The address returned by mmap (a memory region mapped to a file) is always page-aligned. So I ended up using mmap, and the code ran without any errors.
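
The answer does not show the final code, so the following is only a sketch of one possible approach, assuming an anonymous mapping is used as a page-aligned staging buffer (the actual solution may have mapped the file itself). `MappedBuffer` and `isPageAligned` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// RAII wrapper around an anonymous, private mapping. mmap() returns
// page-aligned addresses, which satisfies a sector alignment requirement
// whenever the page size is a multiple of the sector size.
class MappedBuffer {
public:
    explicit MappedBuffer(std::size_t size) : size_(size) {
        void *p = mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        data_ = (p == MAP_FAILED) ? nullptr : p;
    }
    ~MappedBuffer() {
        if (data_ != nullptr) {
            munmap(data_, size_);
        }
    }
    MappedBuffer(const MappedBuffer &) = delete;
    MappedBuffer &operator=(const MappedBuffer &) = delete;

    bool ok() const { return data_ != nullptr; }
    void *data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    void *data_ = nullptr;
    std::size_t size_ = 0;
};

// True if the address is a multiple of the system page size.
bool isPageAligned(const void *p) {
    const long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return reinterpret_cast<std::uintptr_t>(p) %
               static_cast<std::uintptr_t>(page) == 0;
}
```

With this in place, the bytes would be copied into `data()` with `std::memcpy` and that pointer handed to `pwrite`. The mapping is released automatically when the object goes out of scope.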

Kadam Parikh