I'm currently working on a project that involves some very remote data gathering. At the end of each day, a very short summary is sent back to the server via a satellite connection.

Because sending data over the satellite is very expensive, I want the data to be as compact as possible. Additionally, the service I'm using only allows for sending data in the following two formats: ASCII, hexadecimal. Most of the data I will be sending consists of floats, where the precision should be as high as possible without taking up too much space.

Below I have a working version of what I'm currently using, but there should be a more efficient way to store the data. Any help would be much appreciated.

import ctypes

#------------------------------------------------------------------
#This part is known to the client as well as the server
class my_struct(ctypes.Structure):
    _pack_ = 1
    _fields_ = [('someText', ctypes.c_char * 12),
                ('underRange', ctypes.c_float),
                ('overRange', ctypes.c_float),
                ('TXPDO', ctypes.c_float)]

def print_struct(filled_struct):
    d = filled_struct
    for name,typ in d._fields_:
        value = getattr(d, name)
        print('{:20} {:20} {}'.format(name, str(value), str(typ)))

#------------------------------------------------------------------
# This part of the code is performed on the client side
#Filling the struct with some random data, real data will come from sensors
s = my_struct()
s.someText = 'Hello World!'.encode()
s.underRange = 4.01234
s.overRange = 4.012345
s.TXPDO = 1.23456789

#Rounding errors occurred when converting to ctypes.c_float
print('Data sent:')
print_struct(s)

data = bytes(s) #Total length is 24 bytes (12 for the string + 3x4 for the floats)
data_hex = data.hex() #Total length is 48 bytes in hexadecimal format

#Now the data is sent over a satellite connection, it should be as small as possible
print('\nLength of sent data: ',len(data_hex),'bytes\n') 

#------------------------------------------------------------------
# This part of the code is performed on the server side
def move_bytes_to_struct(struct_to_fill,bytes_to_move):
    adr = ctypes.addressof(struct_to_fill)
    struct_size = ctypes.sizeof(struct_to_fill)

    bytes_to_move = bytes_to_move[0:struct_size]
    ctypes.memmove(adr, bytes_to_move, struct_size)

    return struct_to_fill

#Data received can be assumed to be the same as data sent
data_hex_received = data_hex
data_bytes = bytes.fromhex(data_hex_received)
data_received = move_bytes_to_struct(my_struct(), data_bytes)

print('Data received:')
print_struct(data_received)
Alex
  • why not use zlib ? https://stackoverflow.com/questions/26753147/how-to-gzip-a-bytearray-in-python – hootnot Jul 09 '18 at 13:37
  • @hootnot If the total length of data is 24 bytes, it might not help much. There is some overhead to the zlib data. – Mattias Nilsson Jul 09 '18 at 13:40
  • It looks to me like you simply need to use the appropriate routine from the `binascii` module afterwards, e.g. `b2a_base64()`. – guidot Jul 09 '18 at 14:01
  • Since the size of the data is very small, compression won't do much here. In this case: len(data) = 24, compressed_data = zlib.compress(data), len(compressed_data) = 30 – Alex Jul 09 '18 at 14:20
  • @guidot Wouldn't `b2a_base64()` make strings longer? – Mattias Nilsson Jul 09 '18 at 14:20
  • @MattiasNilsson: Binary data always gets longer if the transmission format only allows printable characters. `hexlify()` doubles the number of bytes, while base64 imposes less overhead. – guidot Jul 09 '18 at 14:24
  • @guidot I might have misinterpreted, but I thought that the "hexadecimal" in the description meant binary data. – Mattias Nilsson Jul 09 '18 at 14:30
  • @Alex: It remains unclear whether the string length and the number of floats are constant, or whether the whole structure is variable. – guidot Jul 09 '18 at 15:10
  • @guidot The whole structure remains variable, I can add and remove floats and strings in my_struct() and as long as the client and the server are using the same class the system should keep working – Alex Jul 09 '18 at 15:16
  • @Alex But then you have a versioning problem I guess. If your structure changes, how do you know what the data means? If you really want to keep the message size to a minimum you can't for example pass along variable names. I guess a leading "version byte" might help there, but that still requires updates in your code as things change. – Mattias Nilsson Jul 10 '18 at 10:38
  • Maybe I misunderstood your question, let me clarify. The structure might change once every few months at most. The client and the server will have access to the same struct mentioned in the code. – Alex Jul 11 '18 at 07:15

1 Answer

You may be overcomplicating things a bit. The struct module will let you do pretty much what you are doing, but more simply:

import struct

# The repeat count goes before the format character: "12s" is a 12-byte string
struct.pack("fff12s", 4.01234, 4.012345, 1.23456789, 'Hello World!'.encode())

This of course depends on knowing the length of your string in advance. You can avoid that by packing only the floats and appending the string afterwards:

struct.pack("fff", 4.01234, 4.012345, 1.23456789) + 'Hello World!'.encode()
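
To see how this replaces the ctypes round trip from the question, here is a minimal sketch of both directions. The explicit little-endian prefix `<` and the variable names are my own assumptions, not something the question requires.

```python
import struct

# "<" selects little-endian with no padding, so client and server
# agree on the layout regardless of platform (an assumption made here).
FMT = "<3f12s"  # three 4-byte floats followed by a 12-byte string

# Client side: pack the values and hex-encode them for the satellite link
payload = struct.pack(FMT, 4.01234, 4.012345, 1.23456789, b"Hello World!")
payload_hex = payload.hex()

# Server side: decode the hex text and unpack with the same format string
under_range, over_range, txpdo, text = struct.unpack(FMT, bytes.fromhex(payload_hex))
```

The payload is still 24 bytes (48 hex characters), so this only simplifies the code. It does not shrink the message by itself.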

As for storing things more efficiently: the more you know about your data, the more shortcuts you can take. Is the string ASCII only? Then you could squeeze each character into 7 bits, or even 6 if 64 distinct characters are enough for you. At 6 bits per character, your 12-byte string shrinks to 9 bytes.
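
A rough sketch of the 6-bit idea follows. The 64-character `ALPHABET` and the helper names `pack6` and `unpack6` are hypothetical; any alphabet works as long as client and server share it, and the receiver has to know the string length.

```python
import string

# Hypothetical 64-symbol alphabet: letters, digits, space and "!".
# Characters outside it raise ValueError in pack6.
ALPHABET = string.ascii_letters + string.digits + " !"

def pack6(text):
    """Pack text into 6 bits per character, zero-padded to whole bytes."""
    bits = 0
    for ch in text:
        bits = (bits << 6) | ALPHABET.index(ch)
    nbits = 6 * len(text)
    pad = -nbits % 8  # padding bits needed to reach a byte boundary
    return (bits << pad).to_bytes((nbits + pad) // 8, "big")

def unpack6(data, length):
    """Reverse pack6; the character count must be known to the receiver."""
    bits = int.from_bytes(data, "big") >> (-6 * length % 8)  # drop padding
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[bits & 0x3F])  # lowest 6 bits = last character
        bits >>= 6
    return "".join(reversed(chars))

packed = pack6("Hello World!")
restored = unpack6(packed, 12)
```

For the 12-character sample string this gives 9 bytes instead of 12.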

If you know the range of your floats, you could perhaps trim those as well, for example by sending them as scaled integers that take fewer bytes.
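
One way this could look, assuming (purely for illustration) that every reading lies between 0.0 and 10.0: map the range onto a 16-bit unsigned integer, which halves the size of each float. `LO`, `HI`, `quantize` and `dequantize` are hypothetical names, not part of any library.

```python
import struct

LO, HI = 0.0, 10.0  # hypothetical sensor range, must be known on both ends
STEPS = 0xFFFF      # largest value of an unsigned 16-bit integer

def quantize(value):
    """Map a float in [LO, HI] to an integer in [0, STEPS]."""
    clamped = min(max(value, LO), HI)  # out-of-range readings are clipped
    return round((clamped - LO) / (HI - LO) * STEPS)

def dequantize(code):
    """Map the integer back to an approximate float."""
    return LO + code / STEPS * (HI - LO)

values = [4.01234, 4.012345, 1.23456789]

# Client side: three 2-byte integers instead of three 4-byte floats
payload = struct.pack("<3H", *(quantize(v) for v in values))

# Server side
decoded = [dequantize(code) for code in struct.unpack("<3H", payload)]
```

With this range the step size is about 0.00015, so values closer together than that, such as the first two sample values, can decode to the same number. A narrower range or more bits per value buys back precision.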

If you can send larger batches, compression might help as well.
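
As a sketch of how you might check that, the snippet below compresses a made-up batch of 30 records. The record layout (three floats plus a 12-byte string) follows the question; the readings themselves are invented, so measure with your real data before relying on it.

```python
import struct
import zlib

RECORD = "<3f12s"  # hypothetical layout: three floats plus a 12-byte string

# Invented batch: 30 daily records with slowly drifting readings
records = [
    struct.pack(RECORD, 4.0 + day * 0.01, 4.1 + day * 0.01, 1.2, b"Hello World!")
    for day in range(30)
]
raw = b"".join(records)
compressed = zlib.compress(raw, 9)  # 9 = highest compression level

# Compare the two sizes before committing to compression
print(len(raw), len(compressed))

# Server side: decompress and split back into fixed-size records
restored = zlib.decompress(compressed)
```

Keep in mind that the compressed bytes still have to be hex-encoded for the link, which doubles their length again.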

Mattias Nilsson
  • I assume that the real world requires more than `pack()`, since it only allows one fixed string length. Most likely some type/encoding overhead needs to be added, like *string with 12 bytes follows*, *n floats follow*, but the question is a bit short on details. – guidot Jul 09 '18 at 14:47
  • @guidot Depends, I guess. If you know that you have 3 floats and then everything after that is a string, you could get away with this. But like I wrote: The more you know about the data, the more shortcuts you can take. But like you say, the question doesn't really explain it all. – Mattias Nilsson Jul 09 '18 at 14:55