0

I have a huge number of PNG files, which are named like this: PREFIX_00000000.png.

These files come from extracting frames from video files using ffmpeg -skip_frame nokey -i ORIGINALVIDEO.MP4 -vsync vfr -frame_pts true -r 1000 FRAMES_%d.png which outputs files like:

FRAMES_15416721.png
FRAMES_26708343.png
FRAMES_38000000.png  
FRAMES_49291677.png
FRAMES_60583335.png
FRAMES_71875085.png

(I came up with this based on this answer.)

PREFIX_ is a fixed string and is followed by eight digits. So far the video files are at most 24 hours long, which gives 86400000 ms (8 digits). The video files may get longer in the future, so I'm looking for a solution that can handle any number of digits. The numbers are milliseconds since the beginning of the video file, i.e. the time into the video.

Now I'm looking for a way to convert the filenames from the given format to the format PREFIX_HH:mm:ss.fff.png (string, followed by hours:minutes:seconds.milliseconds). So a file named FRAMES_49291677.png would become FRAMES_13:41:31.677.png.
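The arithmetic behind that example can be sketched as follows. This is only an illustration of the intended conversion, assuming the number is a plain decimal millisecond count without leading zeros; the variable names are made up for the example.

```bash
# Worked example: split 49291677 ms into hours, minutes, seconds, milliseconds.
# Assumes a decimal millisecond count without leading zeros.
total_ms=49291677

ms=$(( total_ms % 1000 ))            # 677
secs=$(( total_ms / 1000 % 60 ))     # 31
mins=$(( total_ms / 60000 % 60 ))    # 41
hrs=$(( total_ms / 3600000 ))        # 13 (not wrapped at 24)

stamp=$(printf '%02d:%02d:%02d.%03d' "$hrs" "$mins" "$secs" "$ms")
echo "FRAMES_${stamp}.png"           # FRAMES_13:41:31.677.png
```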

Sample values would be the following. (I added examples for >24h as well.)
Input:

PREFIX_1.png
PREFIX_000000001.png
PREFIX_1012.png
PREFIX_00001012.png
PREFIX_12106.png
PREFIX_000012106.png
PREFIX_725463.png
PREFIX_00725463.png
PREFIX_1204242.png
PREFIX_001204242.png
PREFIX_9864336.png
PREFIX_09864336.png
PREFIX_36012486.png
PREFIX_036012486.png
PREFIX_49291677.png
PREFIX_049291677.png
PREFIX_86400000.png
PREFIX_086400000.png
PREFIX_113744568.png
PREFIX_261736874.png

Output:

PREFIX_00:00:00.001.png
PREFIX_00:00:00.001.png
PREFIX_00:00:01.012.png
PREFIX_00:00:01.012.png
PREFIX_00:00:12.106.png
PREFIX_00:00:12.106.png
PREFIX_00:02:05.463.png
PREFIX_00:02:05.463.png
PREFIX_00:20:04.242.png
PREFIX_00:20:04.242.png
PREFIX_02:44:30.336.png
PREFIX_02:44:30.336.png
PREFIX_10:00:12.486.png
PREFIX_10:00:12.486.png
PREFIX_13:41:31.677.png
PREFIX_13:41:31.677.png
PREFIX_24:00:00.000.png
PREFIX_24:00:00.000.png
PREFIX_31:25:44.568.png
PREFIX_72:22:16.874.png

The catch? This should be done in bash (Ubuntu).

Has anyone done something like this, or does anyone have a solution?

Mark Setchell
Jafix
  • 4
    please provide a few actual examples and the expected results; also, please expand on what you mean by *`maybe anyday more`* ... does this mean there may be more than 8 numbers (and if so, then include such examples in the sample inputs and expected output) – markp-fuso May 19 '23 at 13:53
  • 2
    you state the 8-digit number represents milliseconds ... milliseconds from what base? milliseconds since start of day? start of week? start of month? start of year? since epoch? – markp-fuso May 19 '23 at 13:55
  • Added an "edit" to my original question to clarify things. Sorry. – Jafix May 20 '23 at 14:53
  • if the millisecs are more than 8 digits (ie, more than 24 hrs), what's the expected output .... add an entry for number of days, or expand hours to whatever is calculated (eg, 25 hrs, 38 hrs, 76 hrs, 235 hrs)? – markp-fuso May 20 '23 at 16:00
  • 1
    It would be best if the hours were expanded (like 25 hrs, 38 hrs, 76 hrs, 235 hrs, as you mentioned) – Jafix May 20 '23 at 19:15
  • Ok, thanks for your question and advice. I tried to edit the question so that it fulfills your request. Do you think it's ok now, or should I still change something? – Jafix May 21 '23 at 15:01
  • 1
    Mh, right. I saw it from the readability perspective. But makes more sense as you explained from the testing/diff perspective. I changed to normal codeblocks, hoping it's the easiest way to copy/paste data for testing – Jafix May 21 '23 at 15:45
  • 1
    I think your math is wrong in some cases. For example you say `PREFIX_725463.png` should become `PREFIX_00:02:05.463.png` but `(2*60 + 5) * 1000 + 463` equals `125463`, not `725463`. Please check the math on rows 7, 8, 11, 12, 19 and 20. See the bottom of [my answer](https://stackoverflow.com/a/76290137/1745001) for what I think the output should be instead and if any of that's wrong let us know how you get those values. – Ed Morton May 21 '23 at 19:23
  • I've got an `awk` solution similar to @EdMorton's that's generating the same output; I agree the expected output (in the question) is wrong for lines 7, 8, 11, 12, 19 and 20 – markp-fuso May 21 '23 at 19:50
  • after fixing a base10 bug, and modifying to handle fewer digits, my bash version also matches the awk results – jhnc May 21 '23 at 20:12
  • 1
    Yep, indeed. I did the calculations manually and it seems I made an error somewhere. Sorry, marked the answer as accepted as it solves my problem. – Jafix May 22 '23 at 11:11

2 Answers

3

Using any awk:

$ cat tst.sh
#!/usr/bin/env bash

shopt -s extglob

while read -r old new; do
    echo mv -- "$old" "$new"
done < <(
    printf '%s\n' PREFIX_+([[:digit:]]).png |
    awk -F '[_.]' '{
        hrs  = int( ($2 / (1000 * 60 * 60)) )
        mins = int( ($2 / (1000 * 60)) % 60 )
        secs = int( ($2 / 1000) % 60 )
        ms   = $2 % 1000
        printf "%s %s_%02d:%02d:%02d.%03d.%s\n", $0, $1, hrs, mins, secs, ms, $3
    }'
)

$ ./tst.sh
mv -- PREFIX_10799999.png PREFIX_02:59:59.999.png
mv -- PREFIX_12345678.png PREFIX_03:25:45.678.png
mv -- PREFIX_87654321.png PREFIX_24:20:54.321.png

Remove the echo once you're happy with the results.

Check the math and tweak if necessary, as I mostly just copied it from "From milliseconds to hour, minutes, seconds and milliseconds".
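One way to check the math, sketched here as a throwaway helper rather than as part of the script above, is to recombine the four fields into milliseconds and compare the result with the input. The function names `split_ms` and `join_hms` are hypothetical; `split_ms` uses the same formulas as the awk code.

```bash
# Hypothetical helpers for sanity-checking the conversion (not part of tst.sh).

# split_ms MS -> prints "H M S F" using the same formulas as the awk script
split_ms() {
    awk -v t="$1" 'BEGIN {
        hrs  = int( t / (1000 * 60 * 60) )
        mins = int( (t / (1000 * 60)) % 60 )
        secs = int( (t / 1000) % 60 )
        ms   = t % 1000
        print hrs, mins, secs, ms
    }'
}

# join_hms H M S F -> prints the total number of milliseconds
join_hms() {
    echo $(( (($1 * 60 + $2) * 60 + $3) * 1000 + $4 ))
}

# Round-trip a few values from the question; each should come back unchanged.
for t in 1 1012 725463 9864336 86400000 261736874; do
    back=$(join_hms $(split_ms "$t"))
    [ "$back" -eq "$t" ] && echo "ok   $t" || echo "FAIL $t -> $back"
done
```

If every line says `ok`, the split is at least self-consistent for those inputs.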


Modified to read input from a file and produce the expected output in the question for testing purposes:

$ cat tst.sh
#!/usr/bin/env bash

shopt -s extglob

while read -r old new; do
    #echo mv -- "$old" "$new"
    echo "$new"
done < <(
    # printf '%s\n' PREFIX_+([[:digit:]]).png |
    cat file |
    awk -F '[_.]' '{
        hrs  = int( ($2 / (1000 * 60 * 60)) )
        mins = int( ($2 / (1000 * 60)) % 60 )
        secs = int( ($2 / 1000) % 60 )
        ms   = $2 % 1000
        printf "%s %s_%02d:%02d:%02d.%03d.%s\n", $0, $1, hrs, mins, secs, ms, $3
    }'
)

$ ./tst.sh
PREFIX_00:00:00.001.png
PREFIX_00:00:00.001.png
PREFIX_00:00:01.012.png
PREFIX_00:00:01.012.png
PREFIX_00:00:12.106.png
PREFIX_00:00:12.106.png
PREFIX_00:12:05.463.png
PREFIX_00:12:05.463.png
PREFIX_00:20:04.242.png
PREFIX_00:20:04.242.png
PREFIX_02:44:24.336.png
PREFIX_02:44:24.336.png
PREFIX_10:00:12.486.png
PREFIX_10:00:12.486.png
PREFIX_13:41:31.677.png
PREFIX_13:41:31.677.png
PREFIX_24:00:00.000.png
PREFIX_24:00:00.000.png
PREFIX_31:35:44.568.png
PREFIX_72:42:16.874.png
Ed Morton
  • Math looks good, but when I run it over the given test data it works only partially. The results `mv -- PREFIX_00725463.png PREFIX_00:12:05.463.png`, `mv -- PREFIX_725463.png PREFIX_00:12:05.463.png`, `mv -- PREFIX_09864336.png PREFIX_02:44:24.336.png`, `mv -- PREFIX_9864336.png PREFIX_02:44:24.336.png`, `mv -- PREFIX_113744568.png PREFIX_31:35:44.568.png` and `mv -- PREFIX_261736874.png PREFIX_72:42:16.874.png` are wrong. Do you see the error? – Jafix May 21 '23 at 17:20
  • 1
    I think the script is working and it's your expected output that's wrong, – Ed Morton May 21 '23 at 19:18
2

A pure bash implementation:

ms2hms()(
    for i; do

        if ! [[ $i =~ ^(.+_)([0-9]+)(\.png)$ ]]; then
            echo 1>&2 bad filename: "$i"
            continue
        fi

        pre=${BASH_REMATCH[1]}
        sfx=${BASH_REMATCH[3]}

        t=10#${BASH_REMATCH[2]}
        ((
            f = t%1000,

            t = t/1000,
            s = t%60,

            t = t/60,
            m = t%60,

            h = t/60
        ))

        printf -v o '%s%02d:%02d:%02d.%03d%s' "$pre" $h $m $s $f "$sfx"

        echo mv -i "$i" "$o"

    done
)
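The `10#` prefix in `t=10#${BASH_REMATCH[2]}` is what keeps zero-padded names working: without it, bash arithmetic treats a number with a leading zero as octal. A small sketch of the difference (the function name `to_decimal` is made up for this illustration, and the syntax is bash-specific):

```bash
#!/usr/bin/env bash
# Why the 10# prefix matters: bash arithmetic reads a leading 0 as octal.

# to_decimal is a hypothetical helper that forces base-10 interpretation.
to_decimal() {
    echo $(( 10#$1 ))
}

to_decimal 00001012     # prints 1012
to_decimal 09864336     # prints 9864336

# Without the prefix, 00001012 is parsed as octal 1012, i.e. decimal 522:
echo $(( 00001012 ))    # prints 522

# ...and a digit 8 or 9 is an outright error in an octal literal, so
# $(( 09864336 )) would fail with "value too great for base".
```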

You used "milliseconds" but gave 4 digits after the decimal. If you really intended 4 digits, replace {3} with {4} and %03d with %04d (or append a fixed 0 to the printf format string if input actually is milliseconds but you still want 4 digits).
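The formatting options mentioned here could look roughly like this. The sample value is arbitrary; `f` holds the fractional part as in the function above.

```bash
# Sketch of the fraction-formatting options, with f as the fractional part.
f=77

printf '%03d\n'  "$f"    # milliseconds, three digits:            077
printf '%03d0\n' "$f"    # milliseconds padded to four digits:    0770
printf '%04d\n'  "$f"    # only if f really counts 1/10000 s:     0077
```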

jhnc
  • Added an "edit" to my original question to clarify things. Sorry. Will try this answer when I'm home again tomorrow. – Jafix May 20 '23 at 14:53
  • Note that this code will produce sensible output even for >99h - `%02d` only specifies minimum field width. – jhnc May 20 '23 at 16:12
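
A quick illustration of that last point about `%02d` (the values are arbitrary):

```bash
# %02d pads to at least two digits but never truncates wider values.
printf '%02d:%02d:%02d.%03d\n' 7 5 9 42       # 07:05:09.042
printf '%02d:%02d:%02d.%03d\n' 235 0 0 0      # 235:00:00.000
```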