My client sends data to me and every data point they send is essentially one row in a dataframe I am creating.

The problem is that I have to set a buffer size when receiving the data. Based on other solutions I have seen, I could either:

  1. Make the client send me the length of the data before sending the data itself
  2. Make the client send me the data in predefined chunk sizes (e.g. by padding with spaces)

However, I don't have the option to change what the client sends. So the client will send data as shown in sample_data below. Then my server receives it as shown below in the output.

Is there a way I can receive this data the way I intended, i.e. with each data point arriving individually?

CLIENT

# Client SENDs

# <Set up socket and connect to server>

sample_data = [
    'SNO_A17 80 ABC',   # 14 chars
    'SNO_D99 50 DEF',   # 14 chars
    'SNO_M2 90 GHI',    # 13 chars
    'SNO_J123 999 JKL'  # 16 chars
]

for item in sample_data:
    socket.send(item.encode('utf-8'))

MY SERVER

# My Server RECVs

# <Set up socket and accept connection from client>

while True:
    message = conn.recv(32).decode('utf-8')
    print(message)

    split_message_and_place_into_dateframe_function()
    print('Placed message into dataframe')

Output

> 'SNO_A17 80 ABC'    #This output is correct and as intended
> 'Placed message into dataframe'
> 'SNO_D99 50 DEFSNO_M2 90 GHISNO_J'
> ValueError: cannot set a row with mismatched columns

Edit: 2 possibly noteworthy points:

  • I know each point begins with 'SNO_'
  • While I don't know the size of each datapoint string, I know they cannot possibly exceed some high number like 50

I considered this and felt that one possible solution would be to store every message I receive, join them, and then split the result using 'SNO_' as the delimiter. However, given that I'll be receiving thousands of data points per second, I'm looking for a solution with fewer iterations.

codingray

1 Answer

Assuming you're using TCP, there is no way to force the TCP layer to send you a particular number of bytes in a given recv() call; you'll get as many (or as few) bytes as you get, and it's up to the receiving code to handle them appropriately regardless of the bytes-delivered-per-recv().
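
To see this behaviour without a network, here is a small sketch that uses socket.socketpair() as a stand-in for the real client and server (an assumption for illustration only; the original setup uses a listening socket and accept()). How the bytes are divided between recv() calls can differ between runs and platforms, which is exactly the point: only the concatenation of all chunks is guaranteed.

```python
import socket

# Two connected stream sockets in one process, standing in for the
# real client and server connection.
client, server = socket.socketpair()

sample_data = [
    'SNO_A17 80 ABC',
    'SNO_D99 50 DEF',
    'SNO_M2 90 GHI',
    'SNO_J123 999 JKL',
]

# Four separate sends on the client side...
for item in sample_data:
    client.sendall(item.encode('utf-8'))
client.close()  # so that recv() returns b'' once everything has been read

# ...do not imply four matching recv() results on the server side.
chunks = []
while True:
    chunk = server.recv(32)
    if not chunk:  # empty bytes object: the peer closed the connection
        break
    chunks.append(chunk)
server.close()

received = b''.join(chunks)
print(chunks)    # chunk boundaries are arbitrary
print(received)  # the byte stream itself arrives intact and in order
```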

I suggest that you call recv() with some reasonable, fixed maximum size, and append the resulting data to a buffer that you keep in memory. After each recv()-and-append, search the buffer for the next SNO_ prefix after the one at the very start; everything before that prefix is one complete data line, so parse it and remove it from the beginning of the buffer. Repeat until no further SNO_ prefix is found, then go on to the next recv() call. Note that the last data line in the buffer can't be treated as complete until the following SNO_ arrives (or the connection closes), since nothing else marks where it ends.

That's not the most efficient thing in the world, but it's probably the best you can do without changing the data format being sent.
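
The buffering approach described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the names PREFIX, extract_records and receive_records are made up for illustration, and it assumes that 'SNO_' never appears inside a data point. The search starts at index 1 so that the prefix opening the buffer is not mistaken for the end of a record, and the buffer holds raw bytes so that a multi-byte UTF-8 character split across two recv() calls is only decoded once it is whole.

```python
PREFIX = b'SNO_'  # assumption: every record starts with this and never contains it


def extract_records(buffer):
    """Remove and return every complete record at the front of `buffer`.

    `buffer` is a bytearray that is modified in place. A record counts as
    complete only once the next PREFIX has arrived, so the final (possibly
    partial) record is left in the buffer.
    """
    records = []
    while True:
        # Start at index 1 to skip the prefix that opens the buffer.
        end = buffer.find(PREFIX, 1)
        if end == -1:
            return records
        records.append(buffer[:end].decode('utf-8'))
        del buffer[:end]


def receive_records(conn, bufsize=4096):
    """Yield one decoded record at a time from a connected stream socket."""
    buffer = bytearray()
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
        yield from extract_records(buffer)
    if buffer:  # whatever remains at close is the last record
        yield buffer.decode('utf-8')
```

On the server this would be used as `for record in receive_records(conn): ...`, with the loop body handing each record to the existing dataframe function. Each record costs one find() call on a small buffer rather than a join and re-split of everything received so far.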

Jeremy Friesner