0

There is a website that stores two videos as a list of thousands of PNGs, 31145 images in total. Is there a way to automate the downloading by generating the URLs? (I have no knowledge in coding.)

  1. Here's the 1st video's first frame and its last frame.
  2. Here's the 2nd video's first frame and its last frame.

I couldn't access the directory and batch download the files.

I looked at this answer, but it doesn't apply to me because I use Windows 10. I also checked this answer and tried to merge the two into the following command, which unsurprisingly did not work:

for /l %x in (1, 1, 19999) do (wget https://cf-images.eu-west-1.prod.boltdns.net/v1/jit/719509184001/570e9336-d36c-4d41-8cbe-a67fe3bdc2b6/main/1280x720/%%xms/match/image.png)

I then installed Python 3.11 to try this answer, but it doesn't work either. The answer is probably too old, as Python tells me urllib2 doesn't exist.

  • A frame every 1ms would mean a picture repeat rate of 1000/s (1000Hz). I would expect a "video" frame rate to be between 20 and 60 Hz. So that might explain those "frame duplicates". It might be enough to download each 16th to 50th frame. – Stephan Nov 26 '22 at 16:29
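
Stephan's suggestion could be sketched in Python roughly as follows. The 25 fps frame rate (one new frame every 40 ms) is only an assumption, not something the site states, and `xxx` stands in for the real base URL; `STEP_MS`, `LAST_MS` and `sampled_urls` are illustrative names.

```python
# Assumption: the video runs at 25 frames per second, i.e. one new frame every 40 ms.
TEMPLATE = 'xxx/%05dms/match/image.png'  # 'xxx' is a placeholder for the real base URL
STEP_MS = 40      # hypothetical spacing between distinct frames, in milliseconds
LAST_MS = 19999   # last timestamp to request, taken from the loop in the question

def sampled_urls(template=TEMPLATE, last_ms=LAST_MS, step_ms=STEP_MS):
    """Return one URL per step_ms milliseconds instead of one per millisecond."""
    return [template % ms for ms in range(0, last_ms + 1, step_ms)]

urls = sampled_urls()
print(len(urls), 'URLs instead of', LAST_MS + 1)
print(urls[1])
```

If the real frame rate differs, only `STEP_MS` would need to change (for example 20 for 50 fps).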

2 Answers

1

You need to do two things: generate the URLs of the images, then download them.

Generating the URLs can be done with a for loop and string formatting. Consider the following simple example:

template = 'xxx/%05dms/match/image.png'
for i in range(1,11): # limited for brevity's sake, adjust as required
    print(template % i)

This gives the output:

xxx/00001ms/match/image.png
xxx/00002ms/match/image.png
xxx/00003ms/match/image.png
xxx/00004ms/match/image.png
xxx/00005ms/match/image.png
xxx/00006ms/match/image.png
xxx/00007ms/match/image.png
xxx/00008ms/match/image.png
xxx/00009ms/match/image.png
xxx/00010ms/match/image.png

%05d means "put a decimal number here, padded with leading zeros to a width of 5 characters".
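
For comparison, here is a small sketch of the same zero padding written three ways. The variable names are illustrative only, and the f-string form needs Python 3.6 or later.

```python
i = 7
old_style = '%05d' % i            # printf-style formatting, as used in this answer
new_style = '{:05d}'.format(i)    # str.format equivalent
f_string = f'{i:05d}'             # f-string equivalent (Python 3.6+)
print(old_style, new_style, f_string)

# The width is a minimum: numbers longer than 5 digits are not truncated.
wide = '%05d' % 123456
print(wide)
```

All three produce the same text, so which one to use is mostly a matter of Python version and taste.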

For downloading you can use urllib.urlretrieve (Python 2), remembering to give each file a unique name. Consider the following simple example:

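import urllib  # Python 2: urlretrieve lives directly in urllib
template_url = 'xxx/%05dms/match/image.png'  # replace xxx with the real base URL
template_name = 'image%05d.png'  # unique local file name for each frame
for i in range(1,11):
    # On Python 3, use urllib.request.urlretrieve (after import urllib.request) instead
    urllib.urlretrieve(template_url % i, template_name % i)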

Once you set template_url to the real one, this should download the images to the current working directory as image00001.png and so on.

Note: since you are using xrange, I assume you have to use Python 2 at any price, so I use urllib.urlretrieve rather than urllib.request.urlretrieve, and the old %-style string formatting rather than f-strings.

Daweo
  • I'm already impressed that the url generation works, however when I tried running the full code it doesn't work. I tried on Python 3.11 then I saw you mention Python2 so I tried installing Python 2.0 but Windows said no, so I tried with Python 2.7 and it replied `IOError: [Errno socket error] [Errno 1] _ssl.c:499: error:140773E8:SSL routines:SSL23_GET_SERVER_HELLO:reason(1000)` – OUTEIRAL DIAS Esteban Nov 26 '22 at 16:59
  • I don't know what happened, nor why. I tried things and this seems to work (somewhat): `import urllib import urllib.request template_url = 'https://cf-images.eu-west-1.prod.boltdns.net/v1/jit/719509184001/570e9336-d36c-4d41-8cbe-a67fe3bdc2b6/main/1280x720/%05dms/match/image.png' template_name = 'image%05d.png' for i in range(1,11): urllib.request.urlretrieve(template_url % i, data=None)` - Problem 1: files are saved on a temp folder with lots of other random stuff - Problem 2: files are not named in a practical way and have no .png extension – OUTEIRAL DIAS Esteban Nov 26 '22 at 17:41
-1

This works on Python 3.11 on Windows 10 64-bit:

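import urllib.request  # Python 3 location of urlretrieve
template_url = 'https://cf-images.eu-west-1.prod.boltdns.net/v1/jit/719509184001/570e9336-d36c-4d41-8cbe-a67fe3bdc2b6/main/1280x720/%05dms/match/image.png'
template_name = 'image%05d.png'  # local file name, e.g. image00042.png
for i in range(0,20000):  # timestamps 0 ms to 19999 ms
    f = template_name % i
    # Passing a file name saves the image under that name instead of a temporary file
    urllib.request.urlretrieve(template_url % i, f)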

It is very slow: it took me at least 5 hours to download everything, because it fetches the images one by one, sometimes stops working (due to the website) and doesn't restart automatically. Almost 80% of the downloaded images are duplicates, so it's extremely impractical (7 GB). And all the images are downloaded to a temp folder. But the code works! I believe the video was made to replicate Minecraft in software like After Effects, so all the frames are just a teeny tiny bit different from each other...
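
Some of these problems (stopping midway, having to start over, exact duplicates, the temp folder) might be reduced with a variant like the sketch below. It is not what I actually ran: `download_frames`, `fetch_bytes`, the `frames` folder name and the retry settings are all hypothetical choices. It only detects byte-identical files, so frames that differ even slightly are still kept.

```python
import hashlib
import os
import time
import urllib.request

def fetch_bytes(url):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()

def download_frames(template_url, indices, out_dir='frames',
                    fetch=fetch_bytes, retries=3, wait_seconds=5):
    """Save frames into out_dir, skipping existing files and exact duplicates."""
    os.makedirs(out_dir, exist_ok=True)
    seen_hashes = set()
    saved = []
    for i in indices:
        path = os.path.join(out_dir, 'image%05d.png' % i)
        if os.path.exists(path):
            # Already downloaded in an earlier run: remember its hash and move on.
            with open(path, 'rb') as existing:
                seen_hashes.add(hashlib.sha256(existing.read()).hexdigest())
            continue
        for attempt in range(retries):
            try:
                data = fetch(template_url % i)
                break
            except OSError:  # urllib's URLError and socket timeouts are OSError subclasses
                time.sleep(wait_seconds)
        else:
            continue  # every attempt failed, so skip this frame
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen_hashes:
            continue  # byte-identical to a frame that is already on disk
        seen_hashes.add(digest)
        with open(path, 'wb') as out_file:
            out_file.write(data)
        saved.append(path)
    return saved

# Example call (commented out so nothing is downloaded by accident):
# download_frames(template_url, range(0, 20000))
```

Running it again after an interruption skips the files that are already in the folder. Frames that were dropped as duplicates are fetched again on a rerun, because no file was kept for them.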


Sources: Python 1 2 3 4 5 ; Stackoverflow 1 2 3 4 ; and @Daweo's help