Background:
I am converting my data to binary because the server side expects a binary type.

Question:
How do I convert a list of numbers to a string and then reconstruct the list?

File content:

1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
1 6 1
1 7 1
1 8 1
1 9 1
1 10 1
1 11 1
1 12 1
1 13 1
1 14 1
1 15 1


In the client: I read the whole file and append each value to a list. The list is then converted to an array, which is converted to a string before the data is sent to the server.

In the server: I map the string back to a list of values. The list is then converted to a list of tuples (x, y, w) using grouper. Each (x, y, w) is passed to Point, and the newly constructed object is appended to a list.

Note: I can't use bytearray. This is an artificial data sample; the real data will contain numbers much greater than a byte can represent.

Code:

from itertools import izip_longest
import array

def grouper(iterable, n, fillvalue=None):
    #Collect data into fixed-length chunks or blocks
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

class Point:
    def __init__(self, x, y, w):
        self.x = x
        self.y = y
        self.w = w


if __name__ == "__main__":
    myList = []
    listOfObjects = []
    with open('data10mb.txt') as f:
        for line in f:
            s = line.split()
            x, y, w = [int(v) for v in s]
            myList.append(x)
            myList.append(y)
            myList.append(w)
    L = array.array('h', myList).tostring()


Data sent to server


Data received

    myList = list(map(ord, list(L)))
    myList = list(grouper(myList, 3))
    s = len(myList)
    for i in range (0, s):
        x, y, w = myList[i]
        obj = Point(x, y, w)
        listOfObjects.append(obj)

Expected Output:

1      <---- first line in file
1 
1
-------- 
1      <--- seventh line in file
7
1

Actual Output:

1
0
1
1
0
4

I am not sure what I've done wrong. I asked this question 4 days ago: "How to convert .txt file to a binary object".

The server specifies that the data to be sent is: binary: A byte array. I can't use a simple bytearray here because a Python bytearray can only hold numbers 0-255, and the numbers in my file are much bigger.

What should I use instead? From the output above, it's clear the data is being mixed up and I am not parsing it correctly on the server side, or perhaps I've done something wrong in the code and I don't see it.


EDIT: I've tried sending a list instead of a string, but the server doesn't accept it.

TypeError: write() argument 1 must be string or buffer, not list.

Thanks in advance!

Tony Tannous
  • Any reason you're not using the inverse of `.tostring` (eg: `.fromstring`) on the server? – Jon Clements Mar 15 '17 at 07:38
  • @JonClements str object has no attr `.fromstring`. – Tony Tannous Mar 15 '17 at 07:41
  • If `L` is your string transmitted from the client, then when you do `list(L)` you're breaking the byte string into single bytes. You want `array.array('h', L)` to keep each element as two bytes. – Jon Clements Mar 15 '17 at 07:44
  • Can't you just send an appropriately typed `array.array`? What's the range of your numbers? – juanpa.arrivillaga Mar 15 '17 at 07:44
  • @juanpa.arrivillaga range up to `1,000,000` – Tony Tannous Mar 15 '17 at 07:45
  • @JonClements this seems to work! Please write it as an answer. Thank you! – Tony Tannous Mar 15 '17 at 07:46
  • Then use an array of 32 bit unsigned ints and just send that, `array.array` can act as a buffer – juanpa.arrivillaga Mar 15 '17 at 07:54
  • @juanpa.arrivillaga sending the string finishes in 16 seconds, while sending the array.array takes more than 1 minute (it is still running and 2 minutes have already elapsed). – Tony Tannous Mar 15 '17 at 08:02
  • @TonyTannous I'm just wondering if in this case you can't just build a list of lists eg: `with open('data10mb.txt') as fin: data = [[int(el) for el in line.split()] for line in fin]`... Then send `repr(data)` to the server, and on the server side `import ast`, then build your `Point` objects, like: `objects = [Point(*args) for args in ast.literal_eval(received_data)]`... just an idea... (it avoids the grouping issue and having to worry about what byte representation to use) – Jon Clements Mar 15 '17 at 08:13
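
The `repr` / `ast.literal_eval` idea from the last comment can be sketched roughly as below. This is only an illustration under assumptions: it is written for Python 3, the `serialize` and `deserialize` helper names are hypothetical, the file is replaced by a short in-memory list of lines, and the transfer itself is left out (over a socket the text would also need `.encode()` and `.decode()`).

```python
import ast

class Point:
    def __init__(self, x, y, w):
        self.x = x
        self.y = y
        self.w = w

def serialize(lines):
    """Client side (hypothetical helper): one [x, y, w] list per input line."""
    data = [[int(el) for el in line.split()] for line in lines]
    return repr(data)

def deserialize(text):
    """Server side (hypothetical helper): parse the literal, build Points."""
    return [Point(*args) for args in ast.literal_eval(text)]

# Stand-in for the lines of data10mb.txt.
sample_lines = ["1 1 1", "1 2 1", "1 3 1"]

text = serialize(sample_lines)
objects = deserialize(text)

print(text)          # [[1, 1, 1], [1, 2, 1], [1, 3, 1]]
print(objects[1].y)  # 2
```

Because each row travels as its own nested list, no regrouping is needed on the server, at the cost of a larger payload than a packed binary array.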

1 Answer

In your code:

L = array.array('h', myList).tostring()

You are creating a bytestring in which every number is packed as a two-byte integer. On the server side, you then call list(L), which builds a list from the individual elements of L. Each element of a bytestring is a single byte, so the 2-byte packing is lost and every number is split in two, e.g.:

>>> import array
>>> a = array.array('h', [1234, 5678])
>>> s = a.tostring()
>>> s
b'\xd2\x04.\x16'
>>> list(s)
[210, 4, 46, 22] # oops - wrong!

So, rebuild the array from the source data to get back what you sent.

>>> array.array('h', s)
array('h', [1234, 5678]) # better!
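
Applied to data shaped like the sample file, the whole round trip might look like the sketch below. It is a minimal Python 3 illustration, not the original client/server code: `tobytes()` is the Python 3 name for `tostring()`, the `pack_points` and `unpack_points` helpers are hypothetical names, a namedtuple stands in for the `Point` class, and the network transfer is simulated by handing the bytes object straight to the "server" function.

```python
import array
from collections import namedtuple

# Stand-in for the Point class in the question.
Point = namedtuple("Point", ["x", "y", "w"])

def pack_points(rows, typecode="h"):
    """Flatten (x, y, w) rows and pack them into a byte string."""
    flat = [value for row in rows for value in row]
    return array.array(typecode, flat).tobytes()

def unpack_points(payload, typecode="h"):
    """Rebuild the array with the same type code, then regroup into Points."""
    values = array.array(typecode, payload)
    # Zipping one iterator with itself three times yields consecutive triples.
    it = iter(values)
    return [Point(x, y, w) for x, y, w in zip(it, it, it)]

rows = [(1, n, 1) for n in range(1, 16)]  # same shape as the sample file
payload = pack_points(rows)               # what the client would send

points = unpack_points(payload)           # what the server would do
print(points[0])  # Point(x=1, y=1, w=1)
print(points[6])  # Point(x=1, y=7, w=1)

# Splitting the payload into single bytes instead reproduces the wrong
# output from the question; on a little-endian machine this prints
# [1, 0, 1] [1, 0, 4].
wrong = list(payload)
print(wrong[0:3], wrong[18:21])
```

Both sides have to agree on the type code, and this sketch also assumes that client and server use the same byte order.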

Also note that in your comment you say the range is up to 1,000,000. The 'h' format is a 2-byte signed integer, which only covers -32,768 to 32,767, so you'll need to use 'l' for a signed long instead to represent those values (see the type codes in the array documentation for the available options):

to_send = array.array('l', [1000000, 12345]).tostring()
received = array.array('l', to_send)
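
One caveat worth checking: the size of 'l' follows the platform's C long, so it can differ between client and server machines. If that is a concern, a possible alternative (not what the answer above uses) is the `struct` module with an explicit byte order and standard sizes. The sketch below is written for Python 3 and assumes every value fits in a signed 32-bit integer, which holds for the stated range of up to 1,000,000.

```python
import array
import struct

values = [1000000, 12345]

# 'h' cannot hold 1,000,000, so array refuses to build it.
overflowed = False
try:
    array.array('h', values)
except OverflowError as exc:
    overflowed = True
    print("too big for 'h':", exc)

# The size of 'l' depends on the platform's C long.
print("itemsize of 'l':", array.array('l').itemsize)

# "<" selects little-endian with standard sizes, so "i" is always 4 bytes.
payload = struct.pack("<%di" % len(values), *values)

# Receiving side: derive the element count from the payload length.
count = len(payload) // struct.calcsize("<i")
received = list(struct.unpack("<%di" % count, payload))
print(received)  # [1000000, 12345]
```

With a fixed format such as "<i", the payload layout no longer depends on which machine produced it.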
Jon Clements