I'm trying to decode some non-standard AIS data (a bunch of NMEA strings with extra info tagged on) using the gpsd library. The AIS data is read from a continuously growing text file (one file per day, with each new line being new data). Some processing is done in Python, and the result is then pushed to the gpsd decoder to be decoded and returned to Python for more work. At the moment this is done using os.system or subprocess.check_output (they both take the same amount of time) with the command:
echo "single_nmea_string" | gpsdecode
This works, but it's painfully slow. If I write all the NMEA strings to a text file and do a bulk decode, it's 10-50 times faster:
cat all_processed_nmea_strings.txt | gpsdecode
but this can't work in real time since I need the incoming data to be processed as soon as possible.
Is there a way to open a pipe to gpsdecode (or any other command-line tool) in Python, send it the NMEA strings as they become ready, and read the results without having to start and stop the tool for every string? I'm already using multiprocessing with queues to speed up the processing, but the bottleneck is the decoding step.
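To make the question concrete, here is a minimal sketch of the kind of long-lived pipe I have in mind. The class name StreamingDecoder and its send/results/close interface are made up for illustration; only subprocess.Popen, threading and queue are real library pieces. It assumes the tool writes each decoded result as one line and flushes it promptly when its stdout is a pipe. I have not verified that gpsdecode does this; if it block-buffers its output, something like stdbuf -oL or a pseudo-terminal might be needed. A background thread collects the output, so a sentence that produces no output on its own (for example one fragment of a multi-part message) does not block the writer.

```python
import queue
import subprocess
import threading


class StreamingDecoder:
    """Hypothetical helper: keep one child process alive and stream lines through it."""

    def __init__(self, cmd=("gpsdecode",)):
        # Start the tool once; it stays running until close() is called.
        self.proc = subprocess.Popen(
            list(cmd),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on the Python side only
        )
        self.results = queue.Queue()
        self._reader = threading.Thread(target=self._pump, daemon=True)
        self._reader.start()

    def _pump(self):
        # Runs in a background thread: move each output line into the queue.
        for line in self.proc.stdout:
            self.results.put(line.rstrip("\n"))
        self.results.put(None)  # sentinel: the child closed its stdout

    def send(self, sentence):
        # Write one sentence and flush so the child sees it immediately.
        self.proc.stdin.write(sentence.rstrip("\n") + "\n")
        self.proc.stdin.flush()

    def close(self):
        # Closing stdin signals end of input; then wait for the child to exit.
        self.proc.stdin.close()
        self._reader.join()
        self.proc.stdout.close()
        self.proc.wait()
```

Usage would be along the lines of creating one StreamingDecoder per worker, calling send() for each processed string as it arrives, and pulling decoded lines from results (with a timeout) until the None sentinel appears after close().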
Any ideas?
EDIT: Further testing shows that the slowness might not be in the command-line decoding. It might be in the step where I split the data up between workers. I will need to do some profiling.
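For the profiling, a rough sketch using the standard library's cProfile and pstats is below. The helper name profile_call is made up for illustration. Note that cProfile only measures the process it runs in, so with multiprocessing each worker function would have to be profiled separately, or the splitting step run in a single process for the measurement.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Hypothetical helper: run func under cProfile and return (result, report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Calling profile_call on the function that splits the data between workers, and again on the function that does the decoding, should show which of the two dominates.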