
I need to calculate the length of a dictionary in octets to conform to the BUFR standard, which states:

Total length of BUFR message, in octets (including Section 0)

I have been able to find out how to get the size in bytes, but no information on octets. To get bytes I would do:

sys.getsizeof(json_list)
gerrit
Jordan
  • https://en.wikipedia.org/wiki/Octet_(computing) `The octet is a unit of digital information in computing and telecommunications that consists of eight bits. The term is often used when the term byte might be ambiguous, as the byte has historically been used for storage units of a variety of sizes.` How is what you've already found not sufficient? – Sean Pianka Oct 03 '18 at 18:09
  • I just stumbled across that same information, for some reason I was thinking of bit not bytes. If your response is answer worthy you can reply and I'll accept it as correct. – Jordan Oct 03 '18 at 18:14
  • Um, what is the length of a dictionary in octets supposed to entail? Note, you are working with objects in Python that essentially contain lots of pointers to different objects, so you are going to have to think about exactly what that means vis-à-vis this binary format. – juanpa.arrivillaga Oct 03 '18 at 18:14
  • The dictionary contains lists of floats and lists of lists of floats. – Jordan Oct 03 '18 at 19:14

1 Answer


sys.getsizeof() will give you the size of an object in memory. But from your description, it sounds like you're looking for the length of some serialization (into a message) of the dictionary.
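
To illustrate the difference, here is a minimal sketch, assuming CPython and two made-up dictionaries (`small` and `large` are hypothetical names, not from the question). It compares the in-memory size of the dict object with the length of its JSON serialization:

```python
import json
import sys

# Two hypothetical dictionaries with the same key but very different payloads.
small = {"values": [1.0]}
large = {"values": [float(i) for i in range(1000)]}

# getsizeof measures only the dict object itself, not the list it references,
# so on CPython both dictionaries report the same in-memory size.
shallow_small = sys.getsizeof(small)
shallow_large = sys.getsizeof(large)

# The serialized form, by contrast, grows with the data it contains.
serialized_small = len(json.dumps(small))
serialized_large = len(json.dumps(large))

print(shallow_small, shallow_large)
print(serialized_small, serialized_large)
```

The exact `getsizeof` values depend on the Python version and platform, which is one more reason not to use them as a message length.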

It looks like you're using JSON, and that makes sense. For example, using json.dumps():

json_string = json.dumps(your_dict)

The next question is how do you get the length (in octets) of that string.

Well, len(json_string) will give you the number of characters, but depending on the encoding and the characters involved, the number of bytes required to transmit those characters may be different. (Docs)
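
A small sketch of that character/byte difference, using an invented sample string and UTF-8 as an assumed encoding. It also shows that `json.dumps()` escapes non-ASCII characters by default (`ensure_ascii=True`), so the two counts only diverge for JSON output when that default is turned off:

```python
import json

# A made-up string containing two non-ASCII characters.
text = "température: 20°C"

char_count = len(text)                   # number of characters
utf8_octets = len(text.encode("utf-8"))  # number of bytes in UTF-8

# By default json.dumps escapes non-ASCII characters as \uXXXX,
# so its output is pure ASCII and characters == bytes in UTF-8.
escaped = json.dumps({"unit": "°C"})

# With ensure_ascii=False the character is kept as-is and takes 2 bytes in UTF-8.
raw = json.dumps({"unit": "°C"}, ensure_ascii=False)

print(char_count, utf8_octets)
print(len(escaped), len(escaped.encode("utf-8")))
print(len(raw), len(raw.encode("utf-8")))
```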

So first you need to encode your string to bytes, then use the length of the resulting bytes object:

len(json_string.encode(<your encoding>))

This will give you the number of octets needed to transmit that dictionary.

Note: any other requirements of the message, such as headers, delimiters, escaping, formatting, etc., will be in addition to this number.
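
Putting the steps above together, a minimal sketch might look like the following. The helper name `serialized_octets`, the UTF-8 default, and the sample data are all assumptions for illustration. It counts only the JSON payload and is not a BUFR encoder:

```python
import json


def serialized_octets(data, encoding="utf-8"):
    """Return the number of octets in the JSON serialization of data.

    Hypothetical helper: it counts only the serialized payload, not any
    headers or section framing the message format may require.
    """
    json_string = json.dumps(data)
    return len(json_string.encode(encoding))


# Sample data shaped like the question describes: lists of floats
# and lists of lists of floats.
example = {"temps": [1.5, 2.5], "grid": [[0.0, 1.0], [2.0, 3.0]]}
payload_length = serialized_octets(example)
print(payload_length)
```

Any fixed-size sections of the target format would be added to `payload_length` to get the total message length.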

jedwards
  • `sys.getsizeof` does not give you an estimate (at least not for built in objects). You just need to understand what it means, it gives you the size of an object but not the objects referenced (internally) by the object. – juanpa.arrivillaga Oct 03 '18 at 18:15
  • For stdlib it may always give you the actual number, but it's implementation defined: *getsizeof() calls the object’s `__sizeof__` method* – jedwards Oct 03 '18 at 18:16
  • Yes, that is true. But it is guaranteed that built-in objects will return correct results. – juanpa.arrivillaga Oct 03 '18 at 18:16
  • @juanpa.arrivillaga took it out nonetheless -- no need to confuse on minutiae – jedwards Oct 03 '18 at 18:17