8

Let's say we are using Python and calling some DLLs written in C++. We open a very large dataset in Python and then want to call a C++ library function, passing an array holding that data as a parameter. The library would do something with the array and then return it to the Python code.

So the question is: is it possible for both sides to use the same memory location? In that case we would not need to copy a huge amount of data twice.

Matphy

2 Answers

5

It all comes down to how you load your data in memory and what type of data it is.

If it's numerical data and you use e.g. a numpy array, it is already stored with a memory layout trivially usable from C or C++ code. It's easy to obtain the address of the block of data (numpy.ndarray.ctypes.data) and pass it to the C++ code through ctypes. You can see a nice example here. Image data is similar in this regard (PIL images are in a simple memory format and the pointer to their data can be obtained easily).
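A minimal sketch of the zero-copy idea, using only the standard library so it runs without a compiled DLL. Here `array.array` stands in for the numpy array and `ctypes.memset` stands in for the C++ function; both are assumptions for illustration, not part of the original setup. With numpy, the address would come from `ndarray.ctypes.data` instead of `buffer_info()`.

```python
import array
import ctypes

# Hypothetical stand-in for a large dataset: one million C ints in one block.
data = array.array("i", range(1_000_000))

# buffer_info() returns (address, element_count) of the underlying C buffer.
address, count = data.buffer_info()

# from_buffer() creates a ctypes array over the same memory; nothing is copied.
c_view = (ctypes.c_int * count).from_buffer(data)
assert ctypes.addressof(c_view) == address

# ctypes.memset stands in for a C/C++ function that receives a raw pointer:
# it zeroes the first four elements directly in the shared block.
ctypes.memset(address, 0, 4 * data.itemsize)

# Writes through the ctypes view land in the same memory as well.
c_view[5] = -1

print(data[:6])  # array('i', [0, 0, 0, 0, 4, -1])
```

The Python object has to stay alive (and must not be resized) for as long as the native side holds the pointer.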

If, on the other hand, your data is in regular "native" Python structures (e.g. regular lists or regular objects), the situation is trickier. You can pass them straight to C++ code, but only to code that knows about Python data structures, i.e. code written especially for this purpose, using Python.h and dealing with the non-trivial Python API.
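One possible workaround, sketched here as an assumption rather than something the answer prescribes: a regular list holds references to separate Python objects, so it can be packed once into a contiguous buffer, which can then be shared without further copies. The names `values`, `packed`, and `c_view` are hypothetical.

```python
import array
import ctypes

# A regular list: each element is a reference to a separate float object.
values = [1.5, 2.5, 3.5]

# One copy into a contiguous block of C doubles.
packed = array.array("d", values)

# A ctypes view over that block; this step shares memory, it does not copy.
c_view = (ctypes.c_double * len(packed)).from_buffer(packed)

# A write through the view is visible in the packed array ...
c_view[0] = 10.0
print(packed[0])  # 10.0

# ... but not in the original list, which was copied from.
print(values[0])  # 1.5
```

The one-off packing copy is the price for not writing Python-API-aware C++ code.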

Matteo Italia
  • Thank you! However I am currently using **Shiboken** wrapper because I am using **PySide2** in order to call **Qt** written in C++. – Matphy Feb 14 '18 at 08:54
  • I often shared large datasets between Python and C++ in SIP/PyQt extensions, so it should be possible without much hassle. In that case, generally you keep the data on the C++ side and export accessors for Python. But again, all depends on what dataset exactly you want to share; you should add some more details to your question. – Matteo Italia Feb 14 '18 at 09:03
2

This works using memory mapped files. I do not claim high speed or efficiency in any way. These are just to show an example of it working.

 $ python --version
 Python 3.7.9

 $ g++ --version
 g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

The C++ side only monitors the values it needs. The Python side only provides the values.

Note: the file name "pods.txt" must be the same in the C++ and Python code, and both programs must be run from the same directory.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    // the file must already exist (the Python script creates it)
    int fd = open("pods.txt", O_RDWR);
    if (fd == -1)
    {
        perror("unable to open pods.txt");
        return 1;
    }

    // map the first 8 bytes of the file; MAP_SHARED makes writes
    // from other processes visible through this mapping
    unsigned char* shared = (unsigned char*) mmap(NULL, 8, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED)
    {
        perror("mmap failed");
        close(fd);
        return 1;
    }

    // print the mapped bytes once per second (stop with Ctrl-C)
    while (true)
    {
        printf("0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X\n",
               shared[0], shared[1], shared[2], shared[3],
               shared[4], shared[5], shared[6], shared[7]);
        sleep(1);
    }
}

The Python side:

import mmap
import os
import time

fname = './pods.txt'
if not os.path.isfile(fname):
    # create the initial 8-byte file
    with open(fname, "w+b") as fd:
        fd.write(b'\x01\x00\x00\x00\x00\x00\x00\x00')

# at this point, the file exists, so memory map it
with open(fname, "r+b") as fd:
    # the context manager closes the mapping on exit (e.g. on Ctrl-C)
    with mmap.mmap(fd.fileno(), 8, access=mmap.ACCESS_WRITE, offset=0) as mm:

        # set one of the pods to true (== 0x01), all the rest to false
        posn = 0
        while True:
            print(f'writing posn:{posn}')

            # reset to the start of the file
            mm.seek(0)

            # write the true/false values, only one is true
            for count in range(8):
                curr = b'\x01' if count == posn else b'\x00'
                mm.write(curr)

            # admire the view
            time.sleep(2)

            # set up for the next position in the next loop
            posn = (posn + 1) % 8

To run it, in terminal #1:

 ./a.out  # or whatever you called the C++ executable
 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00

i.e. you should see the 0x01 move one step every couple of seconds because of the sleep(2) in the Python code. The C++ side polls once per second, so each state is printed about twice.

In terminal #2:

python my.py  # or whatever you called the python file
writing posn:0
writing posn:1
writing posn:2

i.e. you should see the position change from 0 through 7 back to 0 again.
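To check the mechanism without compiling anything, here is a small single-process sketch in which a second mapping plays the role of the C++ reader. The scratch file name `pods_demo.bin` and the variable names are hypothetical; the sketch relies on the OS keeping shared mappings of the same file coherent, which is the same property the two-process example depends on.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical scratch file standing in for "pods.txt": 8 zero bytes.
    path = os.path.join(tmp, "pods_demo.bin")
    with open(path, "wb") as f:
        f.write(bytes(8))

    # Two independent mappings of the same file, as two processes would have.
    with open(path, "r+b") as fw, open(path, "r+b") as fr:
        writer = mmap.mmap(fw.fileno(), 8, access=mmap.ACCESS_WRITE)
        reader = mmap.mmap(fr.fileno(), 8, access=mmap.ACCESS_READ)

        # Write through one mapping ...
        writer[3] = 1

        # ... and read the same bytes back through the other.
        snapshot = reader[:]

        # Close the mappings before the temporary directory is removed.
        reader.close()
        writer.close()

print(list(snapshot))  # [0, 0, 0, 1, 0, 0, 0, 0]
```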

JohnA