
I thought you could just open a file, seek to the position you want to overwrite, and start writing, but it seems that this only adds bytes at that position. How do I remove bytes from a file or overwrite bytes?

An example:

use std::fs::OpenOptions;
use std::io::{prelude::*, Seek, SeekFrom};

fn main() {
    let mut file = OpenOptions::new()
        .read(true)
        .append(true)
        .create(true)
        .open("/tmp/file.db")
        .unwrap();

    let bytes: [u8; 4] = [1, 2, 3, 4];

    file.seek(SeekFrom::Start(0)).unwrap();
    file.write_all(&bytes).unwrap();
}

Output file before:

00000000: 0102 0304 0a                             .....

Output file after:

00000000: 0102 0304 0102 0304 0a                   .........

As you can see, seeking to 0 does not overwrite the 4 bytes already in the file. Instead it prepends them to the file.

Shepmaster
Fallen
  • Are you specifically trying to avoid reading the whole file contents, modifying it, and then writing it all back again? – loganfsmyth Jun 04 '18 at 23:02
  • @loganfsmyth Yes, I am. The file is a database so it could be very large. – Fallen Jun 04 '18 at 23:50
  • If you're looking to write randomly into a large file, you might be interested in reading about memory mapped files? – Simon Whitehead Jun 05 '18 at 00:16
  • @loganfsmyth This is why I am confused, I had the same initial expectation. I seem to be doing something wrong. – Fallen Jun 05 '18 at 00:37
  • @SimonWhitehead Reliability is very important to me. I read about MMap a little on Wikipedia and the part that worries me is "The data is saved to the source file on the disk once the last process is finished." Does this mean that I will lose all changes if the system loses power? – Fallen Jun 05 '18 at 00:46
  • @Fallen Based on the code sample you've provided, I have to assume you understand the principles behind database design as you appear to be writing your own custom one? Transactional consistency can be wrapped around flushing a memory mapped file to disk. That is, the concept of a transaction can be implemented and a transaction is "finalised" once you flush the memory mapped file. "Data is saved to the source file on the disk once the last process is finished" is just default behaviour for most operating systems ... but you can actually ask the OS to flush the mmap to disk before that happens – Simon Whitehead Jun 05 '18 at 01:09
  • You are correct, I am writing a custom database. I want to store financial transactions so I want the reliability of the data to be as high as possible. Thank you for pointing me in the right direction, I found the flush behavior you were talking about at https://docs.rs/memmap/0.6.2/memmap/struct.MmapMut.html#method.flush. Looks like I'll be moving on to using Mmaps. – Fallen Jun 05 '18 at 01:20
  • While you're at it, I would advise you revise [what ACID means](https://en.wikipedia.org/wiki/ACID) and specifically [what a transaction is and how it works](https://en.wikipedia.org/wiki/Database_transaction). Taking note of the part which says "a successful transaction must get written to durable storage" i.e. flush the change to disk. Good luck on your journey ... although I'll say this: I hope your custom database is not expected to hold important financial transactions... writing a proper/reliable database is no easy task. – Simon Whitehead Jun 05 '18 at 01:25
  • Just as another thing - RE performance: you might want to review how memory paging works and specifically how you might be able to get the most out of your database via paging your database reads/writes. – Simon Whitehead Jun 05 '18 at 01:26
  • And I applaud you for turning a nigh-unanswerable question into an answered one, by responding to feedback! That is really too rare on this site. – trent Jun 05 '18 at 02:01
  • @SimonWhitehead Thank you for your input on how to write a better database. I will definitely ensure all transactions are fully ACID and look into memory paging. – Fallen Jun 05 '18 at 03:16
  • @trentcl Didn't want to make it too easy for you guys. – Fallen Jun 05 '18 at 03:17

1 Answer


It's because you're using append(true). From the documentation:

This option, when true, means that writes will append to a file instead of overwriting previous contents. Note that setting .write(true).append(true) has the same effect as setting only .append(true).

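Using write(true) instead opens the file for writing without forcing each write to the end of the file, so writes land at the current cursor position: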

let mut file = OpenOptions::new()
    .read(true)
    .write(true) // <--------- this
    .create(true)
    .open("/tmp/file.db")
    .unwrap();

...and your code will work as expected.
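For completeness, here is a minimal, self-contained sketch of the overwrite case. The helper name `overwrite_at` and the temporary file name are assumptions made for illustration; they are not part of the question or of the standard library.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Overwrites `bytes.len()` bytes of the file at `path`, starting at `offset`.
/// Hypothetical helper for illustration only.
fn overwrite_at(path: &Path, offset: u64, bytes: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true) // plain write access: writes land at the cursor
        .create(true)
        .open(path)?;

    // Move the cursor, then write over whatever is already there.
    file.seek(SeekFrom::Start(offset))?;
    file.write_all(bytes)?;
    Ok(())
}

fn main() -> io::Result<()> {
    // Assumed scratch file; any writable path would do.
    let path = std::env::temp_dir().join("overwrite_demo.db");

    // Start from the same five bytes as the "before" dump in the question.
    fs::write(&path, [1u8, 2, 3, 4, 0x0a])?;

    overwrite_at(&path, 0, &[9, 8, 7, 6])?;

    // The first four bytes are replaced and the file length is unchanged.
    let contents = fs::read(&path)?;
    assert_eq!(contents, vec![9u8, 8, 7, 6, 0x0a]);

    fs::remove_file(&path)?;
    Ok(())
}
```

A file opened this way can still grow: seeking to the end with `SeekFrom::End(0)` and writing there extends it. Overwriting never removes bytes, though; to drop bytes from the end of a file, `File::set_len` truncates it to a given length.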

Simon Whitehead
  • So append mode ignores the seek call? Seems odd but matches with https://stackoverflow.com/a/10631901/155423 – Shepmaster Jun 05 '18 at 02:20
  • I believe this is also how it works in .NET land - you're either overwriting or appending and all write operations move the cursor. – Simon Whitehead Jun 05 '18 at 02:23
  • :( I had `.write(true)` in the very beginning but saw the very same line in the documentation about `.append()` implies `.write()`, and of course I want to extend the file as well, so I swapped `.write()` for `.append()`. Seems unintuitive that append should ignore seeks and yet imply write. But fine, take your points you god among men! :P – Fallen Jun 05 '18 at 03:12
  • It seems unintuitive but it's consistent with a long tradition of such functions inherited from C. If you'd like to open an issue to ask for clarification on the documentation, I'm sure someone will be glad to do it. – mcarton Jun 05 '18 at 07:53
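
The append behaviour discussed in these comments can be checked with a small sketch. The helper name `append_after_seek` and the temporary file name are assumptions made for illustration; the sketch relies on the documented behaviour that writes to a file opened with append(true) go to the end of the file.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Opens `path` in append mode, seeks to the start, writes `bytes`,
/// and returns the resulting file contents.
/// Hypothetical helper for illustration only.
fn append_after_seek(path: &Path, bytes: &[u8]) -> io::Result<Vec<u8>> {
    let mut file = OpenOptions::new().append(true).open(path)?;

    // The seek succeeds, but in append mode it does not change
    // where the following write lands.
    file.seek(SeekFrom::Start(0))?;
    file.write_all(bytes)?;
    drop(file);

    fs::read(path)
}

fn main() -> io::Result<()> {
    // Assumed scratch file; any writable path would do.
    let path = std::env::temp_dir().join("append_demo.db");
    fs::write(&path, [1u8, 2, 3, 4])?;

    let contents = append_after_seek(&path, &[9, 9])?;

    // The new bytes follow the existing ones despite the seek to 0.
    assert_eq!(contents, vec![1u8, 2, 3, 4, 9, 9]);

    fs::remove_file(&path)?;
    Ok(())
}
```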