I am currently working on a project that involves connecting two devices to a python script, retrieving data from them and outputting the data.

Code outline:

• Scans for paired devices

• Paired device found creates thread instance (Two devices connected = two thread instances )

• Data is printed within the thread i.e. each instance has a separate bundle of data

Basically, when two devices are connected, two instances of my thread class are created. Each thread instance returns a different bundle of data.

My question is: Is there a way I can combine the two bundles of data into one bundle of data?

Any help on this is appreciated :)

Ilish
  • Thanks for giving a code overview, but I think an MCVE would help as well. – Christian Dean Sep 26 '16 at 14:10
  • You can put data from both threads in a single structure (like a list) under two conditions: add it in relatively small chunks, and use [locks](http://stackoverflow.com/questions/10525185/python-threading-how-do-i-lock-a-thread) for access to the structure. – Stanislav Ivanov Sep 26 '16 at 14:24

2 Answers

I assume you are using the threading module.

Threading in Python

CPython threads do not run Python bytecode in parallel. The interpreter uses a GIL (Global Interpreter Lock), which serializes most operations in a Python script. Threading is still good for I/O, however, because other threads can run while one thread waits for I/O.

Idea

Because of the GIL, we can just use a standard list to combine our data; in CPython a single append on a list is atomic. The idea is to pass the same list or dictionary to every Thread we create using the args parameter. See pydoc for threading.

Our simple implementation uses two Threads to show how it can be done. In real-world applications you would probably use a thread group or something similar.

Implementation

from threading import Thread

def worker(data):
    # Retrieve data from the device (placeholder values here)
    data.append(1)
    data.append(2)

l = []
# Pass the same list to every thread via args.
a = Thread(target=worker, args=(l,))
b = Thread(target=worker, args=(l,))
# Start our threads
a.start()
b.start()
# Wait for both threads to finish, then print the combined result
a.join()
b.join()
print(l)

Further thoughts

If you want to be 100% correct and not rely on the GIL to serialize access to your list, you can guard it with a simple mutex (threading.Lock), or use the Queue module (named queue in Python 3), which implements the necessary locking.
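As a rough sketch of the queue-based variant, assuming Python 3 (where the module is named queue): each worker puts its readings on a shared queue and then a sentinel, and the main thread drains the queue until every worker has signalled that it is done. The names worker, collect, DONE and the device IDs are hypothetical, and the loop over range(3) only stands in for real device reads.

```python
import queue
import threading

DONE = None  # sentinel a worker puts on the queue when it has finished


def worker(device_id, out_queue):
    # Placeholder for reading from a real device.
    for reading in range(3):
        out_queue.put((device_id, reading))
    out_queue.put(DONE)


def collect(device_ids):
    out_queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(device_id, out_queue))
        for device_id in device_ids
    ]
    for t in threads:
        t.start()

    combined = []
    remaining = len(threads)
    while remaining:
        item = out_queue.get()  # blocks until some worker supplies an item
        if item is DONE:
            remaining -= 1
        else:
            combined.append(item)  # new data could also be printed here

    for t in threads:
        t.join()
    return combined


combined = collect(["device_a", "device_b"])
print(sorted(combined))
```

Because the consumer handles items as they arrive, this shape also lets you print new data immediately instead of waiting for the threads to finish. The order in which the two devices interleave is not deterministic, hence the sorted call.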

Depending on the nature of the data a dictionary might be more convenient to join data by certain keys.
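A minimal sketch of the dictionary idea, assuming each device has some unique identifier to use as the key: every thread stores its bundle under its own key, and an explicit threading.Lock guards the shared dictionary. The device IDs and the readings are made up for illustration.

```python
import threading


def worker(device_id, results, lock):
    readings = [1, 2]  # placeholder for data retrieved from the device
    with lock:  # only one thread modifies the dictionary at a time
        results[device_id] = readings


results = {}
lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=(device_id, results, lock))
    for device_id in ("device_a", "device_b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Keying by device keeps the two bundles distinguishable inside the combined structure, which a flat list does not.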

Other considerations

Threads should be carefully considered. Alternatives such as asyncio might be better suited.
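For comparison, here is a sketch of how the same combination might look with asyncio, assuming Python 3.7+ and a device library that offers awaitable reads (many do not, in which case threads remain the simpler choice). read_device is a hypothetical coroutine, and asyncio.sleep only stands in for waiting on device I/O.

```python
import asyncio


async def read_device(device_id):
    await asyncio.sleep(0.01)  # stands in for awaiting real device I/O
    return [(device_id, 1), (device_id, 2)]


async def main():
    # gather runs both reads concurrently and returns results in call order
    bundles = await asyncio.gather(
        read_device("device_a"),
        read_device("device_b"),
    )
    # Flatten the per-device bundles into one combined list
    return [item for bundle in bundles for item in bundle]


combined = asyncio.run(main())
print(combined)
```

Everything runs in a single thread here, so no locking is needed around the combined list.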

somnium
  • An implementation of "Further thoughts" described here could make use of a `Queue` (from Queue module). It implements the necessary locking to be thread-safe. – mguijarr Sep 26 '16 at 14:39
  • This works great, thank you. When I added this to my code it combined all the data from both devices; however, it did not print until I interrupted the script. How should I go about printing the data when there is new data to send? – Ilish Sep 26 '16 at 15:16
  • You could have a third thread that consumes the queue/list and prints it. Depending on the data, maybe printing inside the thread is still feasible? Another approach some Python people use is multiprocessing: spawn one process that handles the devices and another that continues reading and computing. See https://docs.python.org/2/library/multiprocessing.html. – somnium Sep 26 '16 at 21:59
  • How about multi-threading the data into a database and retrieving it from there? All of a sudden the input is decoupled from the output. Yes, this assumes a DB capable of handling multiple connections at once, but those aren't exactly rare. Most of them are built for it. – Mast Jul 24 '17 at 13:15

My general advice: avoid using any of these things:

  • avoid threads
  • avoid the multiprocessing module in Python
  • avoid the futures module in Python

Use a tool like http://python-rq.org/

Benefits:

  • You need to define the input and output data well, since only serializable data can be passed around.
  • You have distinct interpreters.
  • No deadlocks.
  • Easier to debug.
guettli