I'm trying to speed up a data pipeline using Rust. The pipeline contains bits of Python code that I don't want to modify, so I'm trying to run them as-is from Rust using rust-cpython and multiple threads. However, the performance is not what I expected: it is actually the same as running the Python snippets sequentially in a single thread.
Reading the documentation, I understand that the following calls give you a handle to a single Python interpreter, which can only be created once, even if you acquire it separately from multiple threads:
let gil = Python::acquire_gil();
let py = gil.python();
If that's the case, it means the Python GIL is effectively serializing the Python calls made from my Rust threads, so they never run in parallel. Is there a way to solve this problem?
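To make the suspicion above concrete, here is a minimal sketch that uses only the Rust standard library and no Python at all. It models the GIL as one process-wide `Mutex` and counts how many worker threads are ever inside the locked section at the same time. The function name `max_concurrent_holders` and the busy loop standing in for `py.run(...)` are my own assumptions for illustration. It is also a simplification: as far as I know, CPython periodically hands the GIL from one thread to another, so real threads interleave, but only one of them executes Python bytecode at any instant.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical model of the GIL: a single lock shared by all worker threads.
/// Returns the largest number of workers that were ever inside the locked
/// section at the same time.
fn max_concurrent_holders(threads: usize) -> usize {
    let gil = Arc::new(Mutex::new(())); // stand-in for the interpreter lock
    let inside = Arc::new(AtomicUsize::new(0)); // workers currently holding the lock
    let peak = Arc::new(AtomicUsize::new(0)); // highest value `inside` ever reached

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let gil = Arc::clone(&gil);
            let inside = Arc::clone(&inside);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                // Analogous to `Python::acquire_gil()`: blocks until the lock is free.
                let _guard = gil.lock().unwrap();
                let now = inside.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);

                // Stand-in for `py.run(...)`: CPU-bound work done while holding the lock.
                let mut j: u64 = 0;
                for i in 0..1_000_000u64 {
                    j = j.wrapping_add(i);
                }
                std::hint::black_box(j);

                inside.fetch_sub(1, Ordering::SeqCst);
                // `_guard` is dropped here, releasing the lock for the next worker.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `max_concurrent_holders(4)` from a `main` function or a test reports a peak of one holder, however many threads are spawned. Under this model, the total wall-clock time is roughly the sum of the per-thread work, which matches what I'm observing.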
Here's the code of my test:
use cpython::Python;
use std::sync::mpsc;
use std::thread;
use std::time::Instant;

#[test]
fn python_test_parallel() {
    let start = Instant::now();

    let (tx_output, rx_output) = mpsc::channel();

    let tx_output_1 = mpsc::Sender::clone(&tx_output);
    thread::spawn(move || {
        let gil = Python::acquire_gil();
        let py = gil.python();
        let start_thread = Instant::now();
        py.run("j=0\nfor i in range(10000000): j=j+i;", None, None).unwrap();
        println!("{:27} : {:6.1} ms", "Run time thread 1, parallel", (Instant::now() - start_thread).as_secs_f64() * 1000f64);
        tx_output_1.send(()).unwrap();
    });

    let tx_output_2 = mpsc::Sender::clone(&tx_output);
    thread::spawn(move || {
        let gil = Python::acquire_gil();
        let py = gil.python();
        let start_thread = Instant::now();
        py.run("j=0\nfor i in range(10000000): j=j+i;", None, None).unwrap();
        println!("{:27} : {:6.1} ms", "Run time thread 2, parallel", (Instant::now() - start_thread).as_secs_f64() * 1000f64);
        tx_output_2.send(()).unwrap();
    });

    // Block on the receiver until both threads have reported completion
    let _output_1 = rx_output.recv().unwrap();
    let _output_2 = rx_output.recv().unwrap();

    println!("{:37} : {:6.1} ms", "Total time, parallel", (Instant::now() - start).as_secs_f64() * 1000f64);
}