Read a binary mp4 to a numpy array

Question

I read this answer on how to pass an mp4 file from client to server using Python's FastAPI. I can read the file into its binary form as suggested:

contents = file.file.read()
contents
Out[25]: b'\x00\x00\x00\x18ftypmp42\x00\x00\x00\x00isommp42..

Now, I want to load the content into a numpy array. I have looked at several answers on the web, but all of them read the file from disk, for example:

import skvideo.io  
videodata = skvideo.io.vread("video_file_name")

However, I want to avoid the disk round trip of writing the binary string to a file and deleting it afterwards.

Any help would be much appreciated.

`np.frombuffer` could help you, see https://numpy.org/doc/stable/reference/generated/numpy.frombuffer.html — Carlos Horn, Nov 29 '22 at 11:57
This looks awfully like an XY problem. What exactly are you trying to do? I can't imagine a situation where loading the bytes of a .mp4 into a numpy array makes sense. (It is very easy to do, though: `np.frombuffer(contents, dtype=np.uint8)` should do it. But what do you expect to do with the result?) – chrslg, Nov 29 '22 at 11:58
Quite recently, I commented similarly on an almost identical question about '.mp3'. It turned out (it took an awfully long while to get there, although I knew from the beginning where we would end up) that the asker wanted the sound samples, not the meaningless bytes of the file. Isn't this a similar situation? I can see why you might be interested in the bytes (e.g., just to save them, or maybe you are doing super sharp things on the .mp4 format). But why in the form of a numpy array? Numpy arrays are for when you want to do operations, additions and such, on the bytes. – chrslg, Nov 29 '22 at 12:02
No such operation makes sense for the bytes of a `.mp4`. If it is just for saving, storing, passing, etc., you are better off keeping them as the byte string you already have. – chrslg, Nov 29 '22 at 12:03
But, well, if you know what you are doing, then `np.frombuffer(contents, dtype=np.uint8)` is your answer. The dtype part is important, since frombuffer does not just iterate over bytes to create an array; it expects the buffer to hold the data representation as is. Without the `dtype` it will try to create an array of float64 from your buffer. That will fail 7 times out of 8, because an array of float64 can only be represented by a buffer whose length is a multiple of 8. And if the length of contents happens to be a multiple of 8, it will succeed and give you meaningless floats. – chrslg, Nov 29 '22 at 12:06
@chrslg, I am totally with you, that raw bytes are not too useful for most processing, but it is what the asker wanted. However, it could be some exercise on compression algorithms, and this is the preparation for some histogram analysis... who knows... — Carlos Horn, Nov 29 '22 at 12:07
@CarlosHorn My point exactly when I said "unless you are doing super sharp things on the .mp4 format". I have reinvented so many wheels myself. Once upon a time (that was long ago, and it was less stupid then than it would be now) I would never read any file format without implementing the parsing myself (that was before anything as sophisticated as mpeg existed, though). So I have no problem with that. But even in this kind of situation, I am not sure numpy arrays are the best tool. Anyway, I gave the answer just in case. My point is not to dismiss the question, just to be sure my answer is useful. – chrslg, Nov 29 '22 at 12:11
@CarlosHorn I don't want to say "I told you so" (especially since I admitted that you could indeed be right and the real question could really be about reading bytes). But well... I told you so :D – chrslg, Nov 30 '22 at 13:53

score 0 · Answer 1 · answered Nov 29 '22 at 12:22

So, I commented heavily about how unsure I am that this is really the thing to do, and how I suspect an XY problem here.

But, just to give a formal answer to the question, as is, I repeat here what I said in comments:

np.frombuffer(contents, dtype=np.uint8)

is the way to turn a byte string into a numpy array of bytes (that is, of uint8 integers).

The dtype part is important, since frombuffer does not just iterate over bytes to create an array; it expects the buffer to hold the data representation as is. Without the dtype it will try to create an array of float64 from your buffer. That will fail 7 times out of 8, because an array of float64 can only be represented by a buffer whose length is a multiple of 8. And if the length of contents happens to be a multiple of 8, it will succeed and give you meaningless floats.

For example, on a .mp4 of mine:

import numpy as np

with open('out.mp4', 'rb') as f:
    content = f.read()

len(content)
# 63047 - it is a very small mp4

# Default dtype is float64, so the length must be a multiple of 8
x = np.frombuffer(content)
# ValueError: buffer size must be a multiple of element size

# Truncated to a multiple of 8: it works, but the floats are meaningless
x = np.frombuffer(content[:63040])
x.shape
# (7880,)
x.dtype
# dtype('float64')
x[:10]
#array([ 6.32301702e+233,  2.78135139e-309,  9.33260821e-066,
#        1.15681581e-071,  2.78106620e+180,  3.98476928e+252,
#                    nan,  9.02529811e+042, -3.58729431e+222,
#        1.08615058e-153])

# With dtype=np.uint8: one array element per byte
x = np.frombuffer(content, dtype=np.uint8)
x.shape
# (63047,)
x.dtype
# dtype('uint8')
x[:10]
# array([  0,   0,   0,  32, 102, 116, 121, 112, 105, 115], dtype=uint8)
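A self-contained variant of the example above, using synthetic bytes instead of a real file (the byte string below is made up for illustration and is not a playable video), might look like the following sketch. It also shows two things the answer does not spell out: the array returned for a `bytes` object is a read-only view, and a byte-value histogram of the kind mentioned in the comments can be computed with `np.bincount`.

```python
import numpy as np

# Synthetic stand-in for the uploaded bytes (hypothetical data, not a real video).
contents = b"\x00\x00\x00\x18ftypmp42" + bytes(range(256)) * 2

# View the bytes as uint8 without copying them.
x = np.frombuffer(contents, dtype=np.uint8)

# The result is always 1-D, one element per byte: the raw file bytes
# carry no height/width layout that numpy could use.
print(x.shape)            # (524,)

# Because `bytes` is immutable, the view is read-only; copy before modifying.
print(x.flags.writeable)  # False
writable = x.copy()
writable[0] = 255

# Byte-value histogram: counts[v] is how often byte value v occurs.
counts = np.bincount(x, minlength=256)
print(counts[0])          # 5
```

The copy is only needed if the array must be modified; for read-only analysis the zero-copy view is enough.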

answered Nov 29 '22 at 12:22

chrslg


Thanks for the response. I have tried np.frombuffer, but it seems to give me a flattened array. As I said in the post, I'm trying to transfer an mp4 file from the client, so I cannot say anything about the height / width dimensions of the given mp4 (this is up to the client). Regarding this being an XY problem: I have already tried many solutions (I even posted the direct question on Stack Overflow at https://stackoverflow.com/questions/74451359/reading-video-uploaded-by-client-in-python-server but no one commented). – Yonatan Nov 30 '22 at 12:21
@Yonatan Your comment is the ultimate proof that it is an XY problem. Saying "I don't know the width×height of the .mp4 data" is exactly the equivalent of "I want the sound samples of the .mp3" in the analogy I was using. So, I go back to what I was saying before even answering: you don't want to read the bytes of the .mp4. They are meaningless to you. You won't find pixels in them. – chrslg Nov 30 '22 at 13:38
An XY problem is not related to what you've tried. It is related to what you chose not to disclose here. While people were debating how to read bytes from a .mp4, they were missing a crucial piece of information: in reality you don't want to read the bytes of the .mp4. You want to read pixels. And you wrongly believe that one step towards that is getting the bytes of the .mp4. So, instead of asking your real question (how to read pixels), you've asked about your supposed solution. – chrslg Nov 30 '22 at 13:40
@Yonatan. So please, explain what you really want to do. Because, contrary to your apparent belief, "As I said in the post I'm trying to transfer an mp4 file from the client" said nothing about the fact that you wanted pixels. Even though, as you've read, I almost knew from the beginning that you would end up revealing that in reality you don't care about the bytes of the .mp4 file, that sentence was not even one of the reasons I thought so. And now I understand that you want pixels. But I still don't know enough to be able to answer. Depending on what you want, there is a 2/3 chance that my answer – chrslg Nov 30 '22 at 13:46
would be unusable to you. For example, WHY do you want to transfer video from the client? If it is just to store it, you don't really need pixels; the .mp4 data is enough. If it is to display images from it to the client, it is way better to decode the video on the client side rather than on the server side (a heavy task). If it is to do some processing on the server side (detecting things in it, removing parts, cropping, adding color effects, etc.), then it does indeed need to be decoded on the server side, but even then it depends on what you want to do (and what about the soundtrack, etc.). – chrslg Nov 30 '22 at 13:49
So, explain all that, so that you can post another question about the real problem. (I would leave this question as is. The real question is too different from this one. And after all, this one has an answer, and, immodestly, I would say a good answer. I don't care whether you award me points or validation. But for the future, it may help people who want to solve the question as you asked it to see this question and my answer as is. So the best thing is to write a new question, once we have figured out what the right question is.) – chrslg Nov 30 '22 at 13:52
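To illustrate the point made in these comments, that the bytes of a .mp4 are container structure rather than pixels, here is a minimal sketch that walks the top-level boxes of an MP4 byte string using only the standard library. The helper name `iter_top_level_boxes` and the trailing `free` box in the sample are hypothetical; the first 24 bytes of the sample match the prefix shown in the question. This only lists the container layout, and getting frames would still require an actual video decoder.

```python
import struct


def iter_top_level_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level box in an MP4 byte string."""
    offset = 0
    while offset + 8 <= len(data):
        # Every box starts with a 4-byte big-endian size and a 4-byte type code.
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header_len = 8
        if size == 1:
            # Size 1 means the real size is a 64-bit integer after the type.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header_len = 16
        elif size == 0:
            # Size 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header_len:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


# Hypothetical sample: the 24-byte `ftyp` box from the question plus an empty `free` box.
sample = b"\x00\x00\x00\x18ftypmp42\x00\x00\x00\x00isommp42" + b"\x00\x00\x00\x08free"
boxes = list(iter_top_level_boxes(sample))
print(boxes)  # [('ftyp', 0, 24), ('free', 24, 8)]
```

The leading `0, 0, 0, 32, 102, 116, 121, 112` in the answer's uint8 output is the same kind of header: a box size of 32 followed by the ASCII codes for `ftyp`.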
