3

I'm trying to read the length of some metadata from a .lrf file. (Used with the program LoLReplay)

There isn't really any documentation on these files, but I have already figured out how to do this in C++. I'm trying to rewrite the project in Python for multiple reasons, but I've run into an error.

To first explain, the .lrf file has metadata immediately at the start of the file in this format:

  • first 4 bytes are for something I have no clue about.

  • next 4 bytes store the length of the metadata in hexadecimal; the metadata ends after that many bytes, and the actual contents of the replay follow it.

  • bytes after the initial 8 bytes are the metadata in JSON format

The problem I'm having is actually reading the metadata length. This is the current function I have:

def getMetaLength(self):
    try:
        file = open(self.file,"r")
    except IOError:
        print ("Failed to open file.")
        file.close()
    #We need to skip the first 4 bytes.
    file.read(4)
    mdlength = file.read(4)
    print(hex(mdlength))
    file.close()

When I call this function, the shell returns a traceback stating:

    Traceback (most recent call last):
    File "C:\Users\Donald\python\lolcogs\lolcogs_main.py", line 6, in <module>
    lolcogs.getMetaLength()
    File "C:\Users\Donald\python\lolcogs\LoLCogs.py", line 20, in getMetaLength
    file.read(4)
    File "C:\Python32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3648: character maps to <undefined>

My best guess is that read() is trying to decode the data as characters in some Unicode encoding, but these are definitely just bytes that I am attempting to read. Is there a way to read these as bytes? Also, is there a better way to skip bytes when reading a file?

Oleh Prypin
shadefinale
  • Try opening the file in _binary mode_: `f = open(self.file,"rb")`. Also, don't name it `file` since it will conflict with built in `file` type name. – Paulo Bu Feb 28 '14 at 23:35
  • @PauloBu There seems to be no such type anymore, though... – Oleh Prypin Feb 28 '14 at 23:36
  • In Python2.7 it is defined. In Python3 no. But reading the OP's code he's probably using Python 3 so ignore my comment :) – Paulo Bu Feb 28 '14 at 23:38
  • @PauloBu Thanks, I used "rb" instead of just "r" and now I get the error "TypeError: 'bytes' object cannot be interpreted as an integer" but the c++ version had to do some tricky stuff so I already have a general idea of what to do to fix this. – shadefinale Feb 28 '14 at 23:40

3 Answers

3

In Python 3, files are opened in text mode with the system's encoding by default. You need to open your file in binary mode:

file = open(self.file, 'rb')

Another problem you will run into is that file.read(4) returns a bytes object of length 4, which the hex function does not accept. You probably want an integer instead. For that, see int.from_bytes or, more generally, the struct module. You can then print that number in hexadecimal like so:

mdlength = int.from_bytes(file.read(4), byteorder='big')
print(hex(mdlength))
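
Whether 'big' or 'little' is the right byteorder depends on how the file was written, which isn't stated in the question. As a rough sketch with made-up bytes (not taken from a real .lrf file), this shows how the same 4 bytes decode under each byte order, so you can check which one gives a plausible length:

```python
# Hypothetical 4-byte length field; the real .lrf byte order is an unknown here.
raw = bytes([0x00, 0x00, 0x01, 0x2C])

as_big = int.from_bytes(raw, byteorder='big')        # most significant byte first
as_little = int.from_bytes(raw, byteorder='little')  # least significant byte first

print(as_big, hex(as_big))
print(as_little, hex(as_little))
```

If one of the two values is larger than the file itself, it is probably the wrong byte order.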
Oleh Prypin
  • Amazing! The int.from_bytes() function is exactly what I needed. In c++ I don't know if there is an equivalent function but I had to do this manually in c++ and was about to do it manually in python until I read your comment! Thanks! – shadefinale Feb 28 '14 at 23:54
3

Binary files should be handled in binary mode:

f = open(filename, 'rb')

To skip bytes, I typically use file seek (with SEEK_CUR or SEEK_SET), or I just do an arbitrary file.read(n) if I don't want to bother with formality. The only time I really use seeking is when I want to jump to a specific position.
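
A small sketch of both ways of skipping, using an in-memory io.BytesIO with made-up contents as a stand-in for a real file opened in 'rb' mode:

```python
import io
import os

# In-memory stand-in for a binary file (made-up contents).
f = io.BytesIO(b'\x00\x01\x02\x03\x04\x05\x06\x07')

f.seek(4, os.SEEK_CUR)      # skip 4 bytes relative to the current position
pos_after_seek = f.tell()
chunk = f.read(2)           # the two bytes that follow the skipped ones

f.seek(0, os.SEEK_SET)      # jump back to an absolute position (the start)
f.read(4)                   # skip 4 bytes by reading and discarding them
pos_after_read = f.tell()
```

Both approaches leave the file position at 4. For a skip this small, the difference is mostly a matter of taste.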

To interpret binary data, I just stick to the unpack function provided by the struct module, which makes it easy to specify whether a sequence of bytes should be interpreted as an int, float, char, etc. That's how I've been doing it for years, so there may be more efficient approaches, such as the from_bytes method described in other answers.

With the struct module you can do things like

struct.unpack("3I", f.read(12))

to read in 3 (unsigned) integers at once. So, for example, given the format you've reverse engineered, I would probably just write

unk, size = struct.unpack("2I", f.read(8))
data = f.read(size)
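
Putting that together, here is a minimal sketch of reading the header and then parsing the metadata. The function name read_metadata, the little-endian format '<II', and the UTF-8 encoding of the JSON are all my assumptions, not something confirmed for .lrf files. Note that a plain "2I" uses the machine's native byte order, while a leading '<' or '>' pins it down explicitly.

```python
import io
import json
import struct

def read_metadata(f, header_format='<II'):
    """Read the 8-byte header, then the JSON metadata that follows it.

    header_format is an assumption: '<II' means two little-endian
    unsigned 32-bit integers; use '>II' if the file is big-endian.
    """
    header_size = struct.calcsize(header_format)
    header = f.read(header_size)
    if len(header) != header_size:
        raise ValueError('file too short to contain the header')
    unk, size = struct.unpack(header_format, header)
    data = f.read(size)
    if len(data) != size:
        raise ValueError('metadata is shorter than its declared length')
    # Assumes the metadata is UTF-8 encoded JSON.
    return unk, json.loads(data.decode('utf-8'))

# Fake file built in memory for illustration; a real one would be open(path, 'rb').
meta = json.dumps({'example': 1}).encode('utf-8')
fake = io.BytesIO(struct.pack('<II', 0, len(meta)) + meta + b'replay bytes')

unk, metadata = read_metadata(fake)
rest = fake.read()  # whatever follows the metadata (the replay contents)
```

With a real file you would wrap the call in `with open(path, 'rb') as f:` so the file gets closed even if parsing fails.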
MxLDevs
1

You should open the file in binary mode: open(filename, 'rb').

Heikki Toivonen