How can I convert this language to actual numbers and text?

Question

I am working on a natural language processing project with deep learning, and I downloaded a word embedding file. The file has a .bin extension. I can open it with

file = open("cbow.bin", "rb")

But when I type

file.read(100)

I get

b'4347907 300\n</s> H\xe1\xae:0\x16\xc1:\xbfX\xa7\xbaR8\x8f\xba\xa0\xd3\xee9K\xfe\x83::m\xa49\xbc\xbb\x938\xa4p\x9d\xbat\xdaA:UU\xbe\xba\x93_\xda9\x82N\x83\xb9\xaeG\xa7\xb9\xde\xdd\x90\xbaww$\xba\xfdba:\x14.\x84:R\xb8\x81:0\x96\x0b:\x96\xfc\x06'

What is this language, and how can I convert it into actual numbers and text using Python?

This might be the machine executable. Where did you get it from? — none none, Mar 06 '22 at 12:41
"The file is in `.bin` format"—`.bin` isn't a format. It's a file extension. Lots of applications use `.bin` file extensions for arbitrary binary data, there's no standard. — ChrisGPT was on strike, Mar 06 '22 at 12:44
It's unusual to have a non-executable binary file that's not accompanied by some sort of documentation on how to interpret it. Have you tried contacting the authors of the file? — none none, Mar 06 '22 at 12:47
Best of luck. If you manage to get to the bottom of it, be sure to answer the question — none none, Mar 06 '22 at 12:53

score 1 · Accepted Answer · answered Mar 06 '22 at 13:11


This weird language you are referring to is a Python bytestring (a `bytes` object).

As @jolitti implied, you won't be able to convert this particular bytestring to readable text.

If the bytestring contained only characters you recognize, it would have been displayed like this:

b'Guido van Rossum'
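To illustrate how Python displays a bytestring, here is a minimal sketch. The variable names are made up for illustration, and `sample` is just the first 21 bytes of the output quoted in the question. Bytes that are printable ASCII show up as characters, and every other byte is shown as a `\xNN` escape.

```python
# Printable ASCII bytes are displayed as characters.
readable = "Guido van Rossum".encode("ascii")
print(readable)  # b'Guido van Rossum'

# First 21 bytes of the output quoted in the question:
# a readable text prefix followed by raw binary data.
sample = b"4347907 300\n</s> H\xe1\xae:"

# The readable prefix decodes cleanly as ASCII text.
prefix = sample[:17].decode("ascii")
print(repr(prefix))  # '4347907 300\n</s> '

# The last four bytes are plain integers in the range 0..255;
# 72 and 58 happen to be printable ('H' and ':'), 225 and 174 are not.
tail = list(sample[-4:])
print(tail)  # [72, 225, 174, 58]
```

So the mix of letters and `\x..` escapes is only a display convention. The underlying data is a sequence of integers, and whether they mean anything depends on the format the file was written in.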

answered Mar 06 '22 at 13:11

Nikhil Devadiga


So, do I need to contact the authors of the file? How would they help me? – floyd Mar 06 '22 at 13:18

Yes, please ask them. The file you have is just a stream of bytes and not something to be parsed as text. I don't know what a word embedding file is, but after reading a short summary on word embeddings, I would guess this is a trained model and that you would have to _load_ this file to use it. – Nikhil Devadiga Mar 06 '22 at 13:37
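As a sketch of what "loading" could look like: the bytes quoted in the question start with `4347907 300\n</s> `, which resembles the layout of the word2vec binary format (a text header with vocabulary size and vector size, then for each word its text, a space, and the vector as 4-byte floats). That this file uses that format is an assumption, not something confirmed by the file's authors. The helper names `read_header` and `read_entry` are hypothetical, and the little-endian float32 encoding is likewise assumed.

```python
import io
import struct


def read_header(stream):
    """Read the text header: b'<vocab_size> <vector_size>\\n'."""
    vocab_size, vector_size = (int(x) for x in stream.readline().split())
    return vocab_size, vector_size


def read_entry(stream, vector_size):
    """Read one word and its vector (assumed little-endian float32)."""
    word = bytearray()
    while True:
        ch = stream.read(1)
        if not ch:
            raise EOFError("file ended while reading a word")
        if ch == b" ":
            break
        if ch != b"\n":  # some writers put a newline after each vector
            word += ch
    raw = stream.read(4 * vector_size)
    if len(raw) != 4 * vector_size:
        raise EOFError("file ended while reading a vector")
    vector = struct.unpack(f"<{vector_size}f", raw)
    return word.decode("utf-8", errors="replace"), vector


# Tiny synthetic file in the assumed layout: 2 words, 3 numbers each.
demo = io.BytesIO(
    b"2 3\n"
    + b"hello " + struct.pack("<3f", 1.0, 2.0, 3.0) + b"\n"
    + b"world " + struct.pack("<3f", 0.5, -0.5, 0.25) + b"\n"
)
vocab_size, vector_size = read_header(demo)
for _ in range(vocab_size):
    print(read_entry(demo, vector_size))
```

For the real file, the same functions could be tried on `open("cbow.bin", "rb")`. If the assumption holds, the header would come back as `(4347907, 300)` and the first word as `</s>`. If the file really is in this format, gensim's `KeyedVectors.load_word2vec_format(path, binary=True)` is a ready-made loader and is probably a better choice than hand-written parsing.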
