
I send the file name and the file size first, as a UTF-8 encoded JSON string, so the server knows the size of the file in advance and can tell when all the data has arrived.

This works fine when only one client is sending a file, but when two or more clients send at the same time the server very often crashes (sometimes it works, which makes it more confusing) with 'utf-8' codec can't decode byte 0xff in position 36: invalid start byte, on the first line of the run() function.

I have no idea why this is happening, because each client is handled in its own thread, so there shouldn't be any conflict between them.

Client:

import socket, json

f = 'img.jpg'
f_bin = open(f, 'rb').read()
info = {'name': f, 'size': len(f_bin)}
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(('89.1.59.435', 9005))
    s.send(json.dumps(info).encode())
    total_sent = 0
    while total_sent < info['size']:
        try:
            sent = s.send(f_bin[total_sent:])
        except Exception as err:
            break
        total_sent += sent

Server:

import socket, threading, json

def get_file(conn, info):
    remaining = info['size'] # file size, our trigger to know that all packages arrived
    file_bin = b''
    progress = None
    while remaining > 0:
        try:
            package = conn.recv(1024)
        except Exception as err:
            return None
        file_bin += package
        remaining -= len(package)
    return file_bin

def run(conn):
    info = json.loads(conn.recv(1024).decode()) # 'utf-8' codec can't decode byte 0xff in position 36: invalid start byte
    file_bin = get_file(conn, info)
    if file_bin is not None:
        dest = 'files/{}'.format(info['name'])
        with open(dest, 'wb') as f:
            f.write(file_bin)
        print('success on receiving and saving {} for {}'.format(info['name'], conn.getpeername()))
    conn.close()

host, port = ('', 9005)
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    while True:
        conn, addr = sock.accept()
        print('conn', addr)
        threading.Thread(target=run, args=(conn,)).start()

I removed the prints so that only the relevant part is posted and the problem is clear.

Miguel
  • http://stackoverflow.com/questions/22216076/unicodedecodeerror-utf8-codec-cant-decode-byte-0xa5-in-position-0-invalid-s Seems like you have Non-ASCII characters in your file – Haifeng Zhang Feb 13 '17 at 20:44
  • Thanks, @HaifengZhang, I have already been to that link. I can send all types of files with just one client; when two or more are connected it breaks... Non-ASCII chars? How can I control that if the files are images/videos (pretty much all types)? I can send everything with just one client; with more it may or may not break (most of the time it does)... – Miguel Feb 13 '17 at 20:48

1 Answer


TCP is a streaming protocol, so there is no guarantee that recv returns exactly the JSON header. It may return only part of the JSON, or it may include some of the binary data that follows. In your case, with multiple connections active, the recv is likely delayed long enough for the extra data to arrive.
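To make the failure concrete, here is a small sketch that needs no network at all. It assumes a file size of 1234567 bytes (a made-up value) and simulates a single recv(1024) that returned the header plus the first bytes of the image. JPEG files begin with the bytes FF D8, which would explain the 0xff in the error message.

```python
import json

# Header as the client builds it; the size value is a made-up example.
header = json.dumps({'name': 'img.jpg', 'size': 1234567}).encode()

# JPEG files start with the bytes FF D8; 0xff is never valid in UTF-8.
jpeg_start = b'\xff\xd8\xff\xe0'

# Simulate a recv(1024) that returned the header AND the start of the file.
chunk = header + jpeg_start

try:
    json.loads(chunk.decode())
    error = None
except UnicodeDecodeError as err:
    error = err

# The decoder fails at the first byte after the header.
print(error)
print(error.start == len(header))
```

With a seven-digit size the header is 36 bytes long, so the first image byte sits at position 36, the same position reported in the question.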

You need some way to know the size of the header. One common way is to pick a character that marks the end. Serialized JSON never contains a raw NUL byte ('\x00'), so that works well. Write a NUL right after the header, and the server can scan for it to split the header out again. I changed the code a bit so it can run on my machine and added some rudimentary error handling in this example.
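Before the full listings, the delimiter idea on its own might look like the sketch below. The helper name read_header and the use of socket.socketpair() as a stand-in for a real client connection are assumptions made for illustration. Unlike a byte-by-byte read, this variant reads in chunks and hands back whatever file bytes arrived together with the header.

```python
import json
import socket

def read_header(conn, bufsize=1024):
    """Read until the first NUL; return (info, leftover_bytes).

    Hypothetical helper: leftover_bytes is any file data that arrived
    in the same recv as the header.
    """
    buf = b''
    while b'\x00' not in buf:
        chunk = conn.recv(bufsize)
        if not chunk:
            raise ConnectionError('connection closed before header ended')
        buf += chunk
    raw, _, leftover = buf.partition(b'\x00')
    return json.loads(raw.decode()), leftover

# Demo over a connected socket pair instead of a real client and server.
sender, receiver = socket.socketpair()
payload = b'\xff\xd8 pretend image bytes'
info = {'name': 'img.jpg', 'size': len(payload)}
sender.sendall(json.dumps(info).encode() + b'\x00' + payload)
sender.close()

received_info, data = read_header(receiver)
# Keep reading until the announced number of bytes has arrived.
while len(data) < received_info['size']:
    chunk = receiver.recv(1024)
    if not chunk:
        break
    data += chunk
receiver.close()

print(received_info, len(data))
```

The trade-off is that the caller has to carry the leftover bytes into the file-reading loop, which reading one byte at a time avoids at the cost of more recv calls.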

client.py

import socket, json

f = 'img.jpg'
f_bin = open(f, 'rb').read()
info = {'name': f, 'size': len(f_bin)}
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(('localhost', 9005))
    # sendall keeps writing until the whole header and its NUL terminator are sent
    s.sendall(json.dumps(info).encode() + b'\x00')
    total_sent = 0
    while total_sent < info['size']:
        try:
            sent = s.send(f_bin[total_sent:])
        except Exception as err:
            break
        total_sent += sent

server.py

import socket, threading, json, io, struct

def reset_conn(conn):
    """Reset tcp connection"""
    # linger zero
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
        struct.pack('ii', 1, 0)) # on/off, linger-timeout
    conn.shutdown(socket.SHUT_RDWR)
    conn.close()

def get_file(conn, info):
    remaining = info['size'] # file size, our trigger to know that all packages arrived
    file_bin = b''
    while remaining > 0:
        try:
            package = conn.recv(1024)
            if not package:
                print("Unexpected end of file")
                reset_conn(conn)
                return None
        except Exception as err:
            return None
        file_bin += package
        remaining -= len(package)
    return file_bin

def run(conn):
    # get header
    buf = io.BytesIO()
    while True:
        c = conn.recv(1)
        if not len(c):
            print("Error, no header received")
            reset_conn(conn)
            return
        if c == b'\x00':
            break
        buf.write(c)
    info = json.loads(buf.getvalue().decode())
    del buf
    file_bin = get_file(conn, info)
    if file_bin is not None:
        dest = 'files/{}'.format(info['name'])
        with open(dest, 'wb') as f:
            f.write(file_bin)
        print('success on receiving and saving {} for {}'.format(info['name'], conn.getpeername()))
    conn.close()

host, port = ('localhost', 9005)
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    while True:
        conn, addr = sock.accept()
        print('conn', addr)
        threading.Thread(target=run, args=(conn,)).start()
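
The reset_conn helper in server.py relies on SO_LINGER. As a rough sketch of what the struct.pack('ii', 1, 0) value means, the snippet below sets the option on a fresh socket and reads it back. It assumes a POSIX-style struct linger made of two native ints (on/off flag, then timeout in seconds); other platforms may lay the struct out differently. With lingering enabled and a timeout of zero, closing a connected TCP socket typically discards unsent data and sends a RST instead of the normal FIN.

```python
import socket
import struct

# struct linger { int l_onoff; int l_linger; } packed as two native ints
# (assumption: POSIX layout, as used in reset_conn).
linger_on_zero = struct.pack('ii', 1, 0)

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger_on_zero)
    # Read the option back to confirm what the kernel stored.
    raw = s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, len(linger_on_zero))
    onoff, timeout = struct.unpack('ii', raw)

# onoff is nonzero when lingering is enabled; timeout is the linger time.
print(bool(onoff), timeout)
```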
tdelaney
  • Hello, thanks for the answer, I understand now. Could you explain what is happening in the reset_conn function, please? Especially with regard to linger and pack. Thanks – Miguel Feb 13 '17 at 22:04
  • The underlying socket doesn't close just because you abandon the socket object. You need to close it specifically w/ the SHUT_RDWR call. The linger trick causes a RESET to be sent to the other side so it knows the session was aborted abnormally and the file was not sent. Without that code, you risk unused socket connections building up in your server. – tdelaney Feb 13 '17 at 22:14
  • It seems you are not using `buf`, or did you just forget to remove it after testing? – Miguel Feb 13 '17 at 22:15
  • What comes after the if? I have an idea, but I would like to see how you would do it. – Miguel Feb 13 '17 at 22:19
  • Cut/paste error on my part. I've added the missing code. – tdelaney Feb 13 '17 at 22:20
  • Thank you tdelaney – Miguel Feb 13 '17 at 22:48