I have created server.py and client.py with the intention of sending both text and binary files between the two. My code works for small text files and small binary files; however, large binary files fail.
In my testing, I can send a 1.5 KB .ZIP file without any problem. However, when I try to send a 44 MB .ZIP file, the transfer fails on the server side.
My client code works as follows:
- The client creates a dictionary containing metadata about the file to be sent.
- The binary file is base64 encoded and is added as a value to the "filecontent" key of the dictionary.
- The dictionary is JSON serialised.
- The length of the serialised dictionary is calculated and prepended to it as a fixed-length (10-character) header.
- The client sends the entire message to the server.
On the server:
- The server receives the fixed-length header and parses the message size from it.
- The server reads the message in chunks of MAXSIZE (for testing set to 500), storing them temporarily.
- Once the entire message is received, the server joins the chunks and parses the result as JSON.
- The server base64 decodes the value belonging to the "filecontent" key.
- Next, it writes the content of the file to disk.
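The client and server steps above can be sketched without any sockets, which helps to separate framing problems from network problems. This is only a minimal sketch: the function names build_message and parse_message are hypothetical and not part of the actual code, and unlike the client code it measures the length of the encoded bytes rather than of the string.

```python
import base64
import json

HEADER = 10  # width of the fixed-length size prefix, same value as in server.py


def build_message(filename, content):
    """Client side: metadata dict -> JSON -> length-prefixed bytes."""
    msg = {
        "filename": filename,
        "filecontent": base64.b64encode(content).decode("ascii"),
    }
    body = json.dumps(msg).encode("utf-8")
    # The prefix counts encoded bytes, left-aligned and space-padded to HEADER.
    header = "{:<{width}}".format(len(body), width=HEADER).encode("utf-8")
    return header + body


def parse_message(data):
    """Server side: read the size prefix, then decode exactly that many bytes."""
    size = int(data[:HEADER].decode("utf-8"))  # int() ignores the padding spaces
    body = data[HEADER:HEADER + size]
    msg = json.loads(body.decode("utf-8"))
    return msg["filename"], base64.b64decode(msg["filecontent"])
```

If a round trip through these two functions reproduces the file exactly, the framing itself is probably sound and the problem lies in how the bytes are moved over the socket.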
As I said, this works fine for my 1.5 KB .ZIP file, but for the 44 MB .ZIP file it breaks in the third step on the server (joining and parsing the message). The error is raised by json.decoder, which complains about "Unterminated string starting at..."
While troubleshooting, I found that the last part of the message did not arrive, which explains the complaint from json.decoder. I also found that the client sends 61841613 as the fixed-length header, whereas it should be 62279500, a difference of 437887.
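One way to sanity-check the header value on the client is to compare the character count of the serialised message with its encoded byte count, and to compare the base64 length with what the file size predicts. The sketch below assumes the default json.dumps settings (ensure_ascii=True), under which the output is pure ASCII; the helper names are hypothetical.

```python
import base64
import json


def header_value_and_wire_size(content):
    """Return (len of the JSON string, len of its UTF-8 encoding)."""
    msg = json.dumps({
        "filename": "test.zip",
        "filecontent": base64.b64encode(content).decode("ascii"),
    })
    return len(msg), len(msg.encode("utf-8"))


def base64_length(n):
    """Length of the padded base64 encoding of n raw bytes."""
    return 4 * ((n + 2) // 3)


# The two header values quoted above differ by exactly this amount.
difference = 62279500 - 61841613
```

If both numbers returned by header_value_and_wire_size agree for the real file, then the header the client computes matches the number of bytes it puts on the wire.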
When I do not let the client calculate the size of the message, but simply hardcode the size as 62279500, everything works as expected. That leads me to believe there is something wrong with the way the client calculates the message size for larger files. However, I cannot work out what is wrong.
Here are the relevant parts of the code:
    # client.py
    import base64
    import json

    connected = True
    while connected:
        # Actual dictionary contains more metadata
        msg = {"filename": "test.zip", "author": "marc", "filecontent": ""}
        # Read the whole file and base64 encode it
        with open("test.zip", "rb") as myfile:
            encoded = base64.b64encode(myfile.read())
        msg["filecontent"] = encoded.decode("ascii")
        msg = json.dumps(msg)
        # Fixed-length header: message length, left-aligned in 10 characters
        header = "{:<10}".format(len(msg))
        header_msg = header + msg
        client.sendall(header_msg.encode("utf-8"))
    # server.py
    import base64
    import json

    HEADER = 10
    MAXSIZE = 500
    connected = True
    while connected:
        # Read the fixed-length header and parse the message size
        msg = conn.recv(HEADER).decode("utf-8")
        SIZE = int(msg)
        totalmsg = []
        # Read the message in chunks of at most MAXSIZE
        while SIZE > 0:
            if SIZE > MAXSIZE:
                msg = conn.recv(MAXSIZE).decode("utf-8")
                totalmsg.append(msg)
                SIZE = SIZE - MAXSIZE
            else:
                msg = conn.recv(SIZE).decode("utf-8")
                totalmsg.append(msg)
                SIZE = 0
        # Join the chunks, parse the JSON and write the file to disk
        msg = json.loads("".join(totalmsg))
        decoded = base64.b64decode(msg["filecontent"])
        with open(msg["filename"], "wb") as myfile:
            myfile.write(decoded)
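For comparison, here is a minimal sketch of a receive loop that subtracts the number of bytes each recv call actually returned, instead of the number requested. It rests on one documented fact: socket.recv(n) may return fewer than n bytes. Whether that explains the behaviour above is an assumption that still needs to be checked. The names recv_exact and demo are hypothetical, and the demo uses socket.socketpair() in place of the real client and server connection.

```python
import socket
import threading


def recv_exact(conn, size, chunk=500):
    """Read exactly `size` bytes from conn, at most `chunk` per recv call."""
    parts = []
    remaining = size
    while remaining > 0:
        data = conn.recv(min(chunk, remaining))
        if not data:
            # An empty result means the peer closed the connection early.
            raise ConnectionError("socket closed with {} bytes missing".format(remaining))
        parts.append(data)
        remaining -= len(data)  # count what arrived, not what was requested
    # Join as bytes; decode once afterwards so no character is split mid-chunk.
    return b"".join(parts)


def demo(total=200_000):
    """Send `total` bytes through a local socket pair and read them back."""
    a, b = socket.socketpair()
    sender = threading.Thread(target=a.sendall, args=(b"x" * total,))
    sender.start()
    try:
        return recv_exact(b, total)
    finally:
        b.close()
        sender.join()
        a.close()
```

In server.py, the equivalent change would be to read HEADER bytes with such a loop, then SIZE bytes, and only then call .decode("utf-8") and json.loads on the joined result.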