2

I'm passing data between programs written in C++ and Python.

I've found that one of the simplest ways to do this is to compile the C++ program and then call it from Python using `subprocess.call('cprog.exe arg1 arg2', shell=True)`, transferring the data via the arguments arg1, arg2, etc.

This avoids using Cython, Boost, etc., which I have found are a huge pain to get working on Windows, especially on an aging server not connected to the internet. (Please don't reply to this post trying to help with Cython or Boost etc.; I want to restrict discussion to the main question.)

My question is: What are the limits of this approach?

Can I serialise/encode/decode entire arrays of data and pass them via the command line arguments in this fashion? What about files several gigabytes in size?

Would this be a faster approach than writing to, then reading from, the hard drive?


EDIT: It seems this is relevant: Maximum Length of Command Line String

Ben
  • 2
    The C++ language specification does not specify the maximum size of command line arguments that are passed to `main()`. The actual limits depend entirely on your operating system. – Sam Varshavchik Jun 14 '17 at 12:29
  • 1
    It depends on your OS where the limit is, but there will be one. The usual way to pass data to a subprocess would be to use stdin. –  Jun 14 '17 at 12:29
  • Based on your responses I changed my search terms and found this: https://stackoverflow.com/questions/3205027/maximum-length-of-command-line-string Which I think contains the answer! Thanks – Ben Jun 14 '17 at 12:36
  • IIRC the cmd shell's command line is limited to 8K characters. However, the example you give doesn't require the shell, so if you use `shell=False` you can increase that to the Windows limit of 32,766 characters. – Eryk Sun Jun 14 '17 at 12:38
  • Not using `shell=True` also has other advantages pointed out in this answer: https://stackoverflow.com/a/3172488/2305545 – NOhs Jun 14 '17 at 12:41
  • Great information - thank you! – Ben Jun 14 '17 at 15:15

2 Answers

1

Not a full answer (I can't comment on any hard restrictions), but remember that any data that you pass on the command line will have to fit in memory, and as far as I know there is no way to free this memory for the duration of that process... so this would not be a practical way to share gigabytes of data.

As you're using shell=True, the command that you build will be interpreted by the shell, which may (will) impose its own set of restrictions on how many arguments are accepted, and how large the arguments may be. It may also impose a restriction on how much memory the arguments may consume in total. It might be interesting to state your OS / shell so that others may pitch in - I'm suspecting Windows and cmd.exe due to your tags, but can't be sure!
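As the comments on the question point out, the example doesn't actually need the shell. A minimal sketch of the list form with `shell=False` is below. It assumes a stand-in child process (a second Python interpreter that echoes its arguments) in place of `cprog.exe`, so the names here are illustrative only:

```python
import subprocess
import sys

# Hypothetical stand-in for cprog.exe: a child process that echoes its arguments.
child = [sys.executable, "-c", "import sys; print('|'.join(sys.argv[1:]))"]

# Passing a list with shell=False hands the arguments to the program directly,
# so no shell parses them and spaces or '&' need no manual quoting.
result = subprocess.run(
    child + ["arg 1", "a&b"],
    shell=False,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
echoed = result.stdout.strip()
print(echoed)
```

This only sidesteps the shell's own restrictions; the OS limit on the total command-line length still applies.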


This is a bad idea in general, and (though it's not an answer to your question), you should seriously look at using pipes for inter-process communication (IPC). That will remove your encode/decode overhead, any concern for size restrictions, and any concern for writing the data to disk.
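A rough sketch of the pipe approach from the Python side, assuming the child reads its input from stdin. A second Python interpreter stands in for the compiled C++ program here, and `child_code` and `payload` are made-up names for illustration:

```python
import subprocess
import sys

# Hypothetical stand-in for the compiled C++ program: it reads all of stdin
# as raw bytes and writes the byte count to stdout.
child_code = (
    "import sys; "
    "data = sys.stdin.buffer.read(); "
    "sys.stdout.write(str(len(data)))"
)

# 1 MiB of binary data, well past the 32,766-character Windows command-line
# limit mentioned in the comments on the question.
payload = bytes(range(256)) * 4096

# input= feeds the bytes to the child's stdin through a pipe; nothing is
# placed on the command line and nothing is written to disk.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    input=payload,
    stdout=subprocess.PIPE,
)
reported = int(result.stdout.decode())
print(reported)
```

If the real child is a C++ program on Windows, it would likely need to switch stdin to binary mode before reading raw bytes, otherwise newline translation can corrupt the data.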

Attie
0

The signature of main is

int main(int argc, char** argv)

You are therefore limited by the range of argc, which is an int, and are not guaranteed to be able to pass in more than 32767 arguments. There may be other limits you hit before that based on your OS and machine, but this is the one that is built into the language.

Note that a given compiler may have a larger int and so may be able to support a greater number of arguments, but this is not guaranteed by the standard.

UKMonkey
  • 1
    A) arguments are backwards B) why 2147483647 ? – NathanOliver Jun 14 '17 at 12:35
  • A) whoops - semantics anyhow ;) B) int has a well defined range - or are you saying argc could be negative? interesting – UKMonkey Jun 14 '17 at 12:37
  • @NathanOliver If we assume that an `int` is 32-bits, then the maximum value for argc would be 2147483647. – CodingHero Jun 14 '17 at 12:37
  • We can't assume `int` is 32 bits. The standard only mandates that its maximum must be at least what a 16-bit integer can hold, but it can hold more than that. `int` could be 64 bits, which would give you a much bigger value. – NathanOliver Jun 14 '17 at 12:39
  • You forgot that the shell can put a lower restriction on the number of command line arguments than the size of int. – nefas Jun 14 '17 at 12:42
  • @NathanOliver fair point - the 16 bit one is the only one that is guaranteed by the language (I thought int was 32). I'll update the answer – UKMonkey Jun 14 '17 at 12:43
  • I didn't see that. Sorry – nefas Jun 14 '17 at 12:44