
The problem statement is as follows:

I am working with Abaqus, a program for analyzing mechanical problems. It is essentially a standalone Python interpreter with its own objects etc. Within this program, I run a Python script to set up my analysis (so this script can be modified). This script also contains a method that has to be executed when an external signal is received. These signals come from the main script that I am running in my own Python engine.

For now, I have the following workflow: The main script sets a boolean to True when the Abaqus script has to execute a specific function, and pickles this boolean into a file. The Abaqus script regularly checks this file to see whether the boolean has been set to true. If so, it does an analysis and pickles the output, so that the main script can read this output and act on it.

I am looking for a more efficient way to signal the other process to start the analysis, since there is a lot of unnecessary checking going on right now. Data exchange via pickle is not an issue for me, but a more efficient solution is certainly welcome.

Search results always give me solutions based on subprocess or the like, which are for two processes started from within the same interpreter. I have also looked at ZeroMQ, since it is supposed to achieve things like this, but I think it is overkill and I would prefer a solution in pure Python. Both interpreters are running Python 2.7 (although different versions).

mooisjken
    What are the requirements vis-a-vis the two processes? Do they need to be separate processes? Do they have to run on separate machines? Separate userids? Why have you chosen two processes for your approach, and what constraints are there? – aghast Apr 10 '17 at 23:18
  • can't you structure things so the abaqus process is main, and call the other as a subprocess? – agentp Apr 11 '17 at 00:49
  • Probably better actually to have the "main" process be a `python 3.x` environment, and have it call `"abaqus python aq_script.py"` for example which reads and pickles `abaqus` data. And then another output script back to `abaqus` called the same way after pickling again. It's slow, but if you need `scipy` or `tensorflow` etc it'll stil be faster than implementing in abaqus's ancient `numpy` implementation – Daniel F Apr 11 '17 at 12:56
  • Would communicating over sockets not suit your needs? – Right leg Apr 11 '17 at 21:40
  • @AustinHastings: both processes need to be separate and are running on the same machine. My 'main script' has to run on my main python engine since it does not need to start abaqus in every case. This way, abaqus is just a part of the program which is accessed whenever needed. – mooisjken Apr 11 '17 at 23:06
  • @agentp: abaqus is called as a subprocess from within my main script: `subprocess.call('abaqus cae script = test.py',shell=True)`. This outputs a message that the abaqus license server has been started and opens the abaqus program with the given script. I need a way to communicate with this script from my main script. Structuring it the other way around is not logical as my previous answer pointed out. – mooisjken Apr 11 '17 at 23:10
  • @DanielForsman: avoiding abaqus's ancient `numpy` is indeed one of the benefits of having both separate programs running. I am right in thinking that your solution aligns with my current workaround? And how can I make Abaqus more efficient in receiving the signal, instead of tirelessly checking for a change in a pickled file? – mooisjken Apr 11 '17 at 23:15
  • @Rightleg: during my search on the web, I came across sockets, but did not understand really well how the capabilities could be leveraged in my problem. Probably this is because I am not familiar with this concept and it is hard to understand. Basically I need some sort of communication line between both processes. Conceptually speaking, this communication line can be setup in my main script, and its details can be given to abaqus when the `subprocess.call` is executed. I do not have an idea how to make this work in practice. – mooisjken Apr 11 '17 at 23:20
  • @mooisjken There might be more efficient ways than using sockets to solve your problem, but this is IMO the easiest solution to implement. Basically, you make your two applications communicate over the localhost. You'd need to design a protocol in a more complex situation, but in Python, you can easily send (and parse when receiving) a string, which make is easy to understand. Say you want to run the operation 5 with the parameters 11 and 32.7: send `"operation:5; parameters:(11, 32.7)"` from your main script to your worker. There are plenty of code examples on the web for socket programming. – Right leg Apr 12 '17 at 09:01
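The ad-hoc string protocol suggested in the last comment is easy to handle in Python. Below is an illustrative sketch of a parser for it; the message format comes from the comment, while the helper name and its details are made up:

```python
# Parse a request of the form "operation:5; parameters:(11, 32.7)".
# The message format is taken from the comment above; this helper is
# illustrative only, not part of any answer.
import ast

def parse_request(message):
    """Return (operation, parameters) from an 'op; params' message string."""
    op_part, param_part = message.split(";")
    operation = int(op_part.split(":", 1)[1])
    # literal_eval safely evaluates the "(11, 32.7)" tuple literal
    parameters = ast.literal_eval(param_part.split(":", 1)[1].strip())
    return operation, parameters
```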

3 Answers


Edit:

Like @MattP, I'll add this statement of my understanding:

Background

I believe that you are running a product called abaqus. The abaqus product includes a linked-in python interpreter that you can access somehow (possibly by running abaqus python foo.py on the command line).

You also have a separate python installation, on the same machine. You are developing code, possibly including numpy/scipy, to run on that python installation.

These two installations are different: they have different binary interpreters, different libraries, different install paths, etc. But they live on the same physical host.

Your objective is to enable the "plain python" programs, written by you, to communicate with one or more scripts running in the "Abaqus python" environment, so that those scripts can perform work inside the Abaqus system, and return results.

Solution

Here is a socket based solution. There are two parts, abqlistener.py and abqclient.py. This approach has the advantage that it uses a well-defined mechanism for "waiting for work." No polling of files, etc. And it is a "hard" API. You can connect to a listener process from a process on the same machine, running the same version of python, or from a different machine, or from a different version of python, or from ruby or C or perl or even COBOL. It allows you to put a real "air gap" into your system, so you can develop the two parts with minimal coupling.

The server part is abqlistener. The intent is that you would copy some of this code into your Abaqus script. The abq process would then become a server, listening for connections on a specific port number, and doing work in response. Sending back a reply, or not. Et cetera.

I am not sure if you need to do setup work for each job. If so, that would have to be part of the connection. This would just start ABQ, listen on a port (forever), and deal with requests. Any job-specific setup would have to be part of the work process. (Maybe send in a parameter string, or the name of a config file, or whatever.)

The client part is abqclient. This could be moved into a module, or just copy/pasted into your existing non-ABQ program code. Basically, you open a connection to the right host:port combination, and you're talking to the server. Send in some data, get some data back, etc.

This stuff is mostly scraped from example code on-line. So it should look real familiar if you start digging into anything.

Here's abqlistener.py:

# The below usage example is completely bogus. I don't have abaqus, so
# I'm just running python2.7 abqlistener.py [options]
usage = """
abaqus python abqlistener.py [--host 127.0.0.1 | --host mypc.example.com ] \\
        [ --port 2525 ]

Sets up a socket listener on the host interface specified (default: all
interfaces), on the given port number (default: 2525). When a connection
is made to the socket, begins processing data.
"""



import argparse

parser = argparse.ArgumentParser(description='Abaqus listener',
    add_help=True,
    usage=usage)

parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
                    help='interface IP address or name (default: empty string, i.e. all interfaces)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
                    help='port number of listener (default: 2525)')

args = parser.parse_args()

import SocketServer
import json

class AbqRequestHandler(SocketServer.BaseRequestHandler):
    """Request handler for our socket server.

    This class is instantiated whenever a new connection is made, and
    must override `handle(self)` in order to handle communicating with
    the client.
    """

    def do_work(self, data):
        "Do some work here. Call abaqus, whatever."
        print "DO_WORK: Doing work with data!"
        print data
        return { 'desc': 'low-precision natural constants','pi': 3, 'e': 3 }

    def handle(self):
        # Allow the client to send a 1kb message (file path?)
        self.data = self.request.recv(1024).strip()
        print "SERVER: {} wrote:".format(self.client_address[0])
        print self.data
        result = self.do_work(self.data)
        self.response = json.dumps(result)
        print "SERVER: response to {}:".format(self.client_address[0])
        print self.response
        self.request.sendall(self.response)


if __name__ == '__main__':
    print args
    server = SocketServer.TCPServer((args.host, args.port), AbqRequestHandler)
    print "Server starting. Press Ctrl+C to interrupt..."
    server.serve_forever()

And here's abqclient.py:

usage = """
python2.7 abqclient.py [--host HOST] [--port PORT]

Connect to abqlistener on HOST:PORT, send a message, wait for reply.
"""

import argparse

parser = argparse.ArgumentParser(description='Abaqus client',
    add_help=True,
    usage=usage)

parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
                    help='host of the listener to connect to (default: empty string)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
                    help='port number of listener (default: 2525)')

args = parser.parse_args()

import json
import socket

message = "I get all the best code from stackoverflow!"

print "CLIENT: Creating socket..."
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

print "CLIENT: Connecting to {}:{}.".format(args.host, args.port)
s.connect((args.host, args.port))

print "CLIENT: Sending message:", message
s.send(message)

print "CLIENT: Waiting for reply..."
data = s.recv(1024)

print "CLIENT: Got response:"
print json.loads(data)

print "CLIENT: Closing socket..."
s.close()

And here's what they print when I run them together:

$ python2.7 abqlistener.py --port 3434 &
[2] 44088
$ Namespace(host='', port=3434)
Server starting. Press Ctrl+C to interrupt...

$ python2.7 abqclient.py --port 3434
CLIENT: Creating socket...
CLIENT: Connecting to :3434.
CLIENT: Sending message: I get all the best code from stackoverflow!
CLIENT: Waiting for reply...
SERVER: 127.0.0.1 wrote:
I get all the best code from stackoverflow!
DO_WORK: Doing work with data!
I get all the best code from stackoverflow!
SERVER: response to 127.0.0.1:
{"pi": 3, "e": 3, "desc": "low-precision natural constants"}
CLIENT: Got response:
{u'pi': 3, u'e': 3, u'desc': u'low-precision natural constants'}
CLIENT: Closing socket...

References:

argparse, SocketServer, json, socket are all "standard" Python libraries.

aghast
  • Haven't tested it completely, but I can say at least the `import` statements work in `abaqus python`, which means it should work. As I'm not much of a programmer, are there any pitfalls I should expect when passing big data (1GB+) through the socket? Usually I want to pass a big dictionary or `numpy` array, should that work or should I send it as a text stream? – Daniel F Apr 12 '17 at 07:21
  • For that much data, you'd have to send it in smaller chunks - say 4kb blocks or 16kb or something. You should also consider writing the file out and passing the filename - I don't know which way you'll get better performance. – aghast Apr 12 '17 at 07:28
  • So still pickling and unpickling to disk, but at least there's a way for the main script and abaqus script to tell each other when their parts are finished. – Daniel F Apr 12 '17 at 07:42
  • Make sure you open your file as binary, and the `.tofile` method should directly dump the raw data. That might speed things up. Then, try using `.data` to get access to the raw buffer to pass through the socket. You REALLY don't want to be converting a 1gb array back and forth to any other format. – aghast Apr 12 '17 at 07:50
  • @AustinHastings: thank you very much! This solution was exactly what I was looking for. I would like to note a small error: the client part should have `host='localhost'` instead of `host=''`. And then a point of attention: when you want to send multiple messages over time, you have to reinitialize the socket, connect again, then send and receive, and afterwards close again. I did not find the reason why the connection times out. The server keeps running and does not have to be restarted. – mooisjken Apr 13 '17 at 21:21
  • For possible future Abaqus users: this is my current workflow: I start abaqus in my main script using `subprocess.Popen('abaqus cae script=C:\FCP\workspace\TestAbaqus.py', shell=True)`. Since starting up abaqus takes some time (around 10 seconds), we cannot immediately establish a connection in our main script. Therefore, I keep trying to make a connection in a `while`-`try`-`except` construction, until it succeeds. After that, the rest of the main script can run since I know that at that moment the abaqus environment is fully up and running. – mooisjken Apr 13 '17 at 21:22
  • Are you `close`ing the connection each time on the client side? That might be why you can't just interact... – aghast Apr 13 '17 at 21:25
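The two techniques discussed in these comments (retrying the connection until the slow-starting Abaqus server is up, and transferring large replies in fixed-size chunks) could be sketched like this; the function names, retry counts, and chunk size are illustrative, not from the thread:

```python
# Sketch of the comment thread's suggestions: retry until the server side
# is listening, then receive an arbitrarily large reply in chunks.
import socket
import time

def connect_with_retry(host, port, attempts=20, delay=0.5):
    """Keep trying to connect until the server is accepting connections."""
    for _ in range(attempts):
        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect((host, port))
            return s
        except socket.error:
            s.close()
            time.sleep(delay)
    raise RuntimeError("could not connect to %s:%d" % (host, port))

def recv_all(sock, chunk_size=4096):
    """Read until the peer closes the connection, chunk_size bytes at a time."""
    parts = []
    while True:
        block = sock.recv(chunk_size)
        if not block:
            break
        parts.append(block)
    return b"".join(parts)
```

`recv_all` assumes the server closes the connection after each reply, matching the one-request-per-connection pattern described in the comments.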

To be clear, my understanding is that you are running Abaqus/CAE via a Python script as an independent process (let's call it abq.py), which checks for, opens, and reads a trigger file to determine if it should run an analysis. The trigger file is created by a second Python process (let's call it main.py). Finally, main.py waits to read the output file created by abq.py. You want a more efficient way to signal abq.py to run an analysis, and you're open to different techniques to exchange data.

As you mentioned, subprocess or multiprocessing might be an option. However, I think a simpler solution is to combine your two scripts, and optionally use a callback function to monitor the solution and process your output. I'll assume there is no need to have abq.py constantly running as a separate process, and that all analyses can be started from main.py whenever it is appropriate.

Let main.py have access to the Abaqus Mdb. If it's already built, you open it with:

mdb = openMdb(FileName)

A trigger file is not needed if main.py starts all analyses. For example:

if SomeCondition:
    j = mdb.Job(name=MyJobName, model=MyModelName)
    j.submit()
    j.waitForCompletion()

Once complete, main.py can read the output file and continue. This is straightforward if the data file was generated by the analysis itself (e.g. .dat or .odb files). On the other hand, if the output file is generated by some code in your current abq.py, then you can probably just include that code in main.py instead.

If that doesn't provide enough control, instead of the waitForCompletion method you can add a callback function to the monitorManager object (which is automatically created when you import the abaqus module: from abaqus import *). This allows you to monitor and respond to various messages from the solver, such as COMPLETED, ITERATION, etc. The callback function is defined like:

def onMessage(jobName, messageType, data, userData):
    if messageType == COMPLETED:
        pass  # do stuff, e.g. read results and signal the rest of the workflow
    else:
        pass  # handle other message types

The callback is then added to the monitorManager before the job is submitted:

monitorManager.addMessageCallback(jobName=MyJobName,  
    messageType=ANY_MESSAGE_TYPE, callback=onMessage, userData=MyDataObj)
j = mdb.Job(name=MyJobName, model=MyModelName)
j.submit()

One of the benefits to this approach is that you can pass in a Python object as the userData argument. This could potentially be your output file, or some other data container. You could probably figure out how to process the output data within the callback function - for example, access the Odb and get the data, then do any manipulations as needed without needing the external file at all.

Matt P

I agree with the answer, except for some minor syntax problems.

Defining instance variables inside the handler is a no-no, not least because they are not being defined in any sort of `__init__()` method. Subclass `TCPServer` and define your instance variables in `TCPServer.__init__()`. Everything else will work the same.
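A minimal sketch of that restructuring, with shared state moved onto a `TCPServer` subclass; the class and attribute names here are illustrative, and the `try`/`except` import makes it runnable under both Python 2 (`SocketServer`) and 3 (`socketserver`):

```python
# Sketch: keep mutable state on the server object, set up in __init__,
# rather than attaching attributes to the handler inside handle().
try:
    import socketserver            # Python 3
except ImportError:
    import SocketServer as socketserver  # Python 2

class AbqServer(socketserver.TCPServer):
    def __init__(self, addr, handler_cls):
        socketserver.TCPServer.__init__(self, addr, handler_cls)
        self.jobs_handled = 0      # shared state, defined in __init__

class AbqHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024).strip()
        self.server.jobs_handled += 1   # handlers reach state via self.server
        self.request.sendall(data)      # echo back for demonstration
```

Each handler instance is created per connection, so per-request data can stay local to `handle()`, while anything that must survive across requests lives on `self.server`.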