
I am somewhat new to Rust and I am trying to speed up my multithreaded downloader. The program takes a URL, a number of chunks, and a file path.

What is the best way to write to a file shared across threads? I have tried wrapping the file in Arc<Mutex<File>> and locking it before each write, and I build with cargo build --release. I have also tried opening the file separately in each thread and writing to it without locking (somehow that was the fastest).
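For context, the locked variant described above can be sketched roughly as follows. This is only an illustration under assumptions: `locked_write_at` is a hypothetical helper name, not code from the downloader, and it assumes each thread already knows the byte offset of its chunk.

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom, Write};
use std::sync::{Arc, Mutex};

/// Hypothetical helper: seek and write while holding one lock, so another
/// thread cannot move the shared file cursor between the two calls.
fn locked_write_at(file: &Arc<Mutex<File>>, offset: u64, data: &[u8]) -> io::Result<()> {
    let mut guard = file.lock().expect("file mutex poisoned");
    guard.seek(SeekFrom::Start(offset))?; // first system call: move the cursor
    guard.write_all(data) // then write; the lock is released when `guard` drops
}
```

The lock has to cover both the seek and the write, which means only one thread can be writing at any moment.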

A similar program in Go shares one file across goroutines and writes to it from each of them. I am surprised that Go beats Rust here, since I expected Rust to have C/C++ performance.

The Rust function below runs in 10 threads, all writing to the same file.

One of the methods for writing in Rust:

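use std::io::{BufReader, BufWriter, Read, Seek, SeekFrom, Write};

let read_size: usize = 64 * 1024;
let mut buf_reader = BufReader::with_capacity(read_size, r.body_mut());

let mut f = open_file("test.mp4", true)?;
let mut buf_writer = BufWriter::with_capacity(read_size, &mut f);

let mut buf = [0u8; 64 * 1024]; // read buffer, 64 KiB

// Move to the start of this thread's byte range
buf_writer.seek(SeekFrom::Start(range.start))?;

loop {
    let n = buf_reader.read(&mut buf)?; // number of bytes read this round
    if n == 0 {
        break; // end of the response body
    }

    // write_all retries partial writes; a plain write may accept fewer than n bytes
    buf_writer.write_all(&buf[..n])?;
    written += n as u64; // `written` is a u64 counter declared earlier (not shown)
}

buf_writer.flush()?; // push anything still buffered to the file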

Writing in Go:

readSize := 32 * 1024
buf := make([]byte, readSize)
for {
    nr, err := res.Body.Read(buf)
    if err != nil {
        if err == io.EOF {
            t.File.WriteAt(buf[:nr], start)
            break
        }
        fmt.Println(err)
        return
    }

    nw, err := t.File.WriteAt(buf[:nr], start)
    handleErr(&err)

    start += int64(nw)
}

EDIT: According to my research, Go's File.WriteAt uses syscall.Pwrite, which is a single system call that writes at an offset (no seek required first). In my Rust code I seek before writing, which is two system calls. "It's usually an universal truth that the less system calls program issues the more efficient it is." I hope there is a way to write at an offset in Rust with one system call.
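For what it's worth, the standard library does expose positional writes through platform-specific extension traits: `std::os::unix::fs::FileExt` (`write_at`, `write_all_at`) on Unix and `std::os::windows::fs::FileExt` (`seek_write`) on Windows. Below is a minimal sketch of how they could be used, not a measured fix. The names `write_all_at` (as a free function) and `write_chunks_in_parallel` are hypothetical, and the chunks are assumed to be already in memory rather than streamed from a response body.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;
use std::sync::Arc;
use std::thread;

/// Unix: write all of `buf` at `offset` using positional writes (pwrite),
/// without touching the file cursor.
#[cfg(unix)]
fn write_all_at(file: &File, buf: &[u8], offset: u64) -> io::Result<()> {
    use std::os::unix::fs::FileExt;
    file.write_all_at(buf, offset)
}

/// Windows: `seek_write` takes the offset with the write itself.
/// It may write fewer bytes than requested, so loop until done.
#[cfg(windows)]
fn write_all_at(file: &File, mut buf: &[u8], mut offset: u64) -> io::Result<()> {
    use std::os::windows::fs::FileExt;
    while !buf.is_empty() {
        let n = file.seek_write(buf, offset)?;
        if n == 0 {
            return Err(io::Error::new(
                io::ErrorKind::WriteZero,
                "failed to write whole buffer",
            ));
        }
        buf = &buf[n..];
        offset += n as u64;
    }
    Ok(())
}

/// Hypothetical driver: each (offset, data) pair is written by its own
/// thread through one shared `File`, with no Mutex.
fn write_chunks_in_parallel(path: &Path, chunks: Vec<(u64, Vec<u8>)>) -> io::Result<()> {
    let file = Arc::new(OpenOptions::new().create(true).write(true).open(path)?);
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|(offset, data)| {
            let file = Arc::clone(&file);
            thread::spawn(move || write_all_at(&file, &data, offset))
        })
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked")?;
    }
    Ok(())
}
```

Both traits take `&self`, so an `Arc<File>` is enough to share the handle. On Windows, `seek_write` still moves the file cursor as a side effect, which does not matter as long as every write passes an explicit offset. Whether this closes the gap with the Go version would need to be measured.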

  • Why are you using such a small capacity on the read/write buffers? – loganfsmyth May 13 '20 at 23:17
  • @loganfsmyth you're right. I thought docs said Kb but it is in bytes. Will let you know how it goes once I change that. – Hussein Elguindi May 13 '20 at 23:32
  • @loganfsmyth After increasing read/write size to 64 KiB. CPU usage is down, memory is up a little, disk usage per second remained around the same. Can't think of anything to speed it up now. – Hussein Elguindi May 13 '20 at 23:42
  • How are you benchmarking this and how many times are you calling the function and such? It may help to have a more complete example. – loganfsmyth May 13 '20 at 23:50
  • @loganfsmyth I am using task manager to look at memory usage, disk usage, etc. The function in my test purposes is called 10 times, each on its own thread, writing to the same file. The data that is being written is from a get request body in each thread. – Hussein Elguindi May 13 '20 at 23:55
  • Where is the multithreading? – Stargateur May 14 '20 at 00:33
  • @Stargateur It isn't included in the code that I posted, but I will update it. I was hoping to get tips on writing to a file. – Hussein Elguindi May 14 '20 at 02:23
  • You would probably get better performance using `async/await` file operations as provided by `async-std` or Tokio. (Which is probably more or less what Go does) – Jmb May 14 '20 at 06:36
  • @Jmb I hadn't thought of that and to me it sounds like it would work. Will implement it and see if I have better results. – Hussein Elguindi May 14 '20 at 21:57
  • also note that Rust's standard threads are system threads, where Go's standard threads are green threads (goroutines as they call them), which are much more lightweight. You can create such green threads in Rust with the `crossbeam` or `rayon` crates – LotB May 15 '20 at 04:50
  • @LotB I read that green threads aren't as powerful as regular threads. I am convinced that the problem is with the write system calls, as Go uses pwrite, while in Rust I have to seek and then write. – Hussein Elguindi May 15 '20 at 17:25
