Currently I have some tracks that simulate people walking around in an area of 1280x720 pixels over 12 hours. The stored recordings are the x, y coordinates and the timestamp (in seconds) of each recording.

I want to create a movie sequence that shows how the people walk over the 12 hours. To do this I split the data into 43200 frames, which corresponds to one frame per second. My end goal is to use these data in a machine learning algorithm.

The idea is then simple. Initialize the frames, loop through all the x, y coordinates and add them to the frames array at their respective timestamps:

>>> frames = np.zeros((43200, 720, 1280, 1))
>>> for track in tracks:
...     for x, y, time in track:
...         frames[int(time), y, x] = 255  # to visualize the walking

This will, in theory, create 43200 frames that can be saved as an mp4, gif or some other format and played back. However, the problem occurs when I try to initialize the numpy array:

>>> np.zeros((43200, 720, 1280, 1))
MemoryError: Unable to allocate 297. GiB for an array with shape (43200, 720, 1280, 1) and data type float64

This makes sense, because I am trying to allocate:

>>> (43200 * 720 * 1280 * 8) / 1024**3  # bytes, converted to GiB
296.630859375

I then thought about saving each frame to a .npy file, but each file would be 7.4 MB, which sums up to about 320 GB.
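For reference, a quick back-of-the-envelope check of how much a smaller dtype or a sparse layout would buy here (the figure of 100 visible people per frame is just an assumption for illustration):

```python
n_frames, h, w = 43200, 720, 1280

# Dense storage: frames * height * width * bytes-per-pixel
size_f64 = n_frames * h * w * 8 / 1024**3   # float64, in GiB
size_u8  = n_frames * h * w * 1 / 1024**3   # uint8 is enough for 0/255, in GiB

# Sparse alternative: keep only the set pixels as (time, y, x) triples,
# three uint16 values each; assume ~100 visible people per frame
size_sparse = n_frames * 100 * 3 * 2 / 1024**2  # in MiB

print(round(size_f64, 1))     # 296.6 GiB
print(round(size_u8, 1))      # 37.1 GiB
print(round(size_sparse, 1))  # 24.7 MiB
```

So just switching to uint8 cuts the dense size by 8x, and storing only coordinates brings it down to megabytes.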

I also thought about splitting the frames into five different arrays:

>>> a = np.zeros((8640, 720, 1280, 1))
>>> b = np.zeros((8640, 720, 1280, 1))
>>> c = np.zeros((8640, 720, 1280, 1))
>>> d = np.zeros((8640, 720, 1280, 1))
>>> e = np.zeros((8640, 720, 1280, 1))

But that seems cumbersome and does not feel like the best solution; it would most likely also slow down the training of my machine learning algorithm. Is there a smarter way to do this?

PV8
Kalle
  • Can you please clarify what you are asking? The data you have described is *by definition* of the scale of 300GB – there is no way to make it smaller without it being different data. What kind of "different data" would you be fine with? For example, the array currently uses 8-byte float, but only stores 0 or 255 – which could be simplified to 1-*bit* 0 or 1. Would you be fine with compression? How about a different format, such as a sparse matrix / storing only set-or-unset coordinates? – MisterMiyagi Jul 29 '20 at 13:04
  • Are the frames (lossless) compressible and do you really need 64bit floats? eg. trying for example this: https://stackoverflow.com/a/56761075/4045774 (For your application this has to be adapted but it would be interesting which compression ratio is achievable) – max9111 Jul 30 '20 at 12:13

2 Answers


I would just build the video a few frames at a time, then join the frames together using ffmpeg. Based on the description of the use case, there should be no need to store the whole video in memory at once.
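A rough sketch of what I mean, assuming one-pixel-per-person tracks like in the question (`frame_chunks`, the chunk size and the toy track data are all made up for illustration); each yielded block could then be piped to ffmpeg as raw video:

```python
import numpy as np

H, W = 720, 1280

def frame_chunks(tracks, n_frames, chunk=600):
    """Yield dense uint8 blocks of at most `chunk` frames at a time."""
    for start in range(0, n_frames, chunk):
        block = np.zeros((min(chunk, n_frames - start), H, W), dtype=np.uint8)
        for track in tracks:
            for x, y, t in track:
                t = int(t)
                if start <= t < start + block.shape[0]:
                    block[t - start, y, x] = 255
        yield block

# Each block can then be written to ffmpeg's stdin as raw video, e.g.:
#   ffmpeg -f rawvideo -pix_fmt gray -s 1280x720 -r 1 -i - out.mp4
# via subprocess with stdin=PIPE, sending block.tobytes() per chunk.

# Toy track: one person seen at t=0, t=1 and t=15
tracks = [[(10, 20, 0.0), (11, 20, 1.0), (12, 21, 15.0)]]
blocks = list(frame_chunks(tracks, n_frames=20, chunk=10))
print(len(blocks), blocks[0].shape)  # 2 (10, 720, 1280)
```

This way only one chunk ever lives in memory, regardless of how long the video is.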

Jamie Counsell

I think you will have to split your data into different, smaller arrays, and that probably won't be an issue for machine learning purposes.

However, I don't know whether you will be able to create those five numpy arrays, as together they will still take 297 GiB of RAM.

I would probably :

  • save the numpy arrays as PNGs, using for instance matplotlib.pyplot.imsave, or
  • store them as short videos, since a person won't be visible for longer than that in your video anyway, or
  • reduce the fps or the resolution if you really want the whole video in one variable

Let me also add that:

  • The nested for loops in your snippet are very expensive in Python; they can be replaced by a single vectorized fancy-indexing assignment such as frames[t, y, x] = 255, where t, y and x are index arrays holding all the points at once
  • If you were to create an array by setting all of its coefficients one by one, it would be more efficient to initialize it with np.empty(shape), as that would spare you the time needed to set all the coefficients to zero only to overwrite them in your for loop
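To illustrate the fancy-indexing point with a toy example (reduced frame count and made-up coordinates, so it fits in memory):

```python
import numpy as np

# Toy stand-in for the real tracks: (x, y, time) triples
tracks = [[(5, 7, 0.0), (6, 7, 1.5)], [(100, 200, 3.0)]]

frames = np.zeros((10, 720, 1280), dtype=np.uint8)  # reduced frame count

# Flatten all triples, then set every pixel in one vectorized assignment
points = np.array([p for track in tracks for p in track])
t = points[:, 2].astype(int)
y = points[:, 1].astype(int)
x = points[:, 0].astype(int)
frames[t, y, x] = 255

print(int(frames.sum()) // 255)  # 3 pixels set, no Python-level loop over points
```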