I am using a Python script with the re module to process two files and create a final output file in the required format, but I am getting an error.
cat links.txt
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/XXXXJD8C-32313922.mp4.m3u8?hdnts=exp=1596554537~acl=*/bGxpJD8C-32313922.mp4.m3u8~hmac=2ac95222f1693d11e7fd8758eb0a18d6d2ee187bb10e3c27311e627785687bd5
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/XXXXkxI1-32313922.mp4.m3u8?hdnts=exp=1596554733~acl=*/bM07kxI1-32313922.mp4.m3u8~hmac=dd0fc6f433a8ac74c9eaa2a376fa4324a65ae7c410cdcf8e869c6961f1a5b5ea
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/XXXXpGKZ-32313922.mp4.m3u8?hdnts=exp=1596554748~acl=*/onhIpGKZ-32313922.mp4.m3u8~hmac=d4030cf7813cef02a58ca17127a0bc6b19dc93cccd6add4edc72a2ee5154f236
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/XXXXLbgy-32313922.mp4.m3u8?hdnts=exp=1596554871~acl=*/xGXCLbgy-32313922.mp4.m3u8~hmac=7c515306c033c88d32072d54ba1d6aa4abf1be23070d1bb14d1311e4e74cc1d7
cat name.txt
Introduction Lecture 1
Questions Lecture 1B
Theory Lecture 2
Labour Costing Lecture 352 (Classroom Lecture)
Expected output (final.txt)
https://cdn.jwplayer.com/videos/XXXXJD8C-32313922.mp4
out=Lecture 001- Introduction.mp4
https://cdn.jwplayer.com/videos/XXXXkxI1-32313922.mp4
out=Lecture 001B- Questions.mp4
https://cdn.jwplayer.com/videos/XXXXpGKZ-32313922.mp4
out=Lecture 002- Theory.mp4
https://cdn.jwplayer.com/videos/XXXXLbgy-32313922.mp4
out=Lecture 352- Labour Costing (Classroom Lecture).mp4
cat sort.py ( my existing script )
import re
final = open('final.txt','w')
a = open('links.txt','r')
b = open('name.txt','r')
base = 'https://cdn.jwplayer.com/videos/'
kek = re.compile(r'(?<=\/)[\w\-\.]+(?=.m3u8)')

# find max lecture number
n = None
for line in b:
    b_n = int(''.join([c for c in line.rpartition(' ')[2] if c in '1234567890']))
    if n is None or b_n > n:
        n = b_n
n = len(str(n)) # string len of the max lecture number
b = open('name.txt','r')
for line in a:
    final.write(base + kek.search(line).group() + '\n')
    b_line = b.readline().rstrip()
    line_before_lecture, _, lecture = b_line.partition('Lecture')
    line_before_lecture = line_before_lecture.strip()
    lecture_no = lecture.rpartition(' ')[2]
    lecture_str = lecture_no.rjust(n, '0') + '-' + " " + line_before_lecture
    final.write(' out=' + 'Lecture ' + lecture_str + '.mp4\n')
Traceback
Traceback (most recent call last):
  File "sort.py", line 11, in <module>
    b_n = int(''.join([c for c in line.rpartition(' ')[2] if c in '1234567890']))
ValueError: invalid literal for int() with base 10: ''
Edit: The error seems to come from the last line of name.txt, because my script assumes that every line in name.txt ends in the form Lecture X.
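The failure can be reproduced in isolation. This is a minimal sketch, assuming the fourth line of name.txt is the one that trips the script; last_token_digits is a hypothetical helper name that only mirrors the expression shown in the traceback.

```python
def last_token_digits(line):
    """Mirror the script's expression: keep only the digits of the last space-separated token."""
    return ''.join(c for c in line.rpartition(' ')[2] if c in '1234567890')

ok_line = 'Questions Lecture 1B\n'
bad_line = 'Labour Costing Lecture 352 (Classroom Lecture)\n'

print(repr(last_token_digits(ok_line)))   # '1'
print(repr(last_token_digits(bad_line)))  # '' because the last token is 'Lecture)\n'

try:
    int(last_token_digits(bad_line))
except ValueError as exc:
    # int('') is what raises the error reported in the traceback
    print('ValueError:', exc)
```

The last token of the fourth line is "Lecture)", which contains no digits, so int() receives an empty string.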
One way to fix it, I guess, is to add an if condition to the script as follows:
If a line in name.txt doesn't end in the form Lecture X, move the text that follows Lecture X to just before the word Lecture.
For example, the 4th line of name.txt
Labour Costing Lecture 352 (Classroom Lecture)
could be converted to
Labour Costing (Classroom Lecture) Lecture 352
and the line below in my script could then be edited to match only the last occurrence of "Lecture" in each line of name.txt:
line_before_lecture, _, lecture = b_line.partition('Lecture')
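The two steps described above could look roughly like this. It is only a sketch: move_trailing_text is a hypothetical helper, and the pattern assumes a lecture label is the word "Lecture", whitespace, digits, and optional trailing letters (as in 1B).

```python
import re

# Assumed label shape: "Lecture", whitespace, digits, optional letters (e.g. "1B").
LECTURE_RE = re.compile(r'Lecture\s+\d+[A-Za-z]*')

def move_trailing_text(line):
    """Move any text after 'Lecture X' in front of it, so the line ends in 'Lecture X'."""
    match = LECTURE_RE.search(line)
    if match is None:
        return line  # no lecture label found; leave the line alone
    before = line[:match.start()].strip()
    after = line[match.end():].strip()
    title = ' '.join(part for part in (before, after) if part)
    return f'{title} {match.group()}'.strip()

converted = move_trailing_text('Labour Costing Lecture 352 (Classroom Lecture)')
print(converted)  # Labour Costing (Classroom Lecture) Lecture 352

# partition() splits at the first 'Lecture', rpartition() at the last one.
print(repr(converted.partition('Lecture')[0]))  # 'Labour Costing (Classroom '
title, _, lecture_no = converted.rpartition('Lecture')
print(title.strip(), '|', lecture_no.strip())   # Labour Costing (Classroom Lecture) | 352
```

Note that rpartition('Lecture') only gives the right split after the rearrangement; on the original fourth line the last "Lecture" is the one inside "(Classroom Lecture)".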
I basically need the expected output (final.txt) from those two files (name.txt and links.txt) using the script. If there's a better or smarter way to do it, I would definitely be happy to use it. I have only suggested one approach in theory, and I have no idea how to implement it myself.
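For reference, one possible shape of a simpler approach is to pull the lecture label out with a single regex instead of relying on where "Lecture" sits in the line. This is a sketch under assumptions, not a solution verified against every input: video_name, split_name, and build_entries are hypothetical names, the label is assumed to be "Lecture" followed by digits and optional letters, each link is assumed to have a path ending in .m3u8, and the base URL and the leading space before out= are copied from my script.

```python
import posixpath
import re
from urllib.parse import urlsplit

BASE = 'https://cdn.jwplayer.com/videos/'  # same base as in sort.py
# Assumed label shape: "Lecture", whitespace, digits, optional letter suffix (e.g. "1B").
LECTURE_RE = re.compile(r'Lecture\s+(\d+)([A-Za-z]*)')

def video_name(link):
    """Return the last path segment of the link without its '.m3u8' suffix."""
    name = posixpath.basename(urlsplit(link).path)
    return name[:-len('.m3u8')] if name.endswith('.m3u8') else name

def split_name(line):
    """Split a name.txt line into (title, digits, suffix)."""
    match = LECTURE_RE.search(line)
    if match is None:
        raise ValueError(f'no lecture number in {line!r}')
    # Title is everything around the label, with whitespace normalised.
    title = ' '.join((line[:match.start()] + ' ' + line[match.end():]).split())
    return title, match.group(1), match.group(2)

def build_entries(links, names):
    """Yield the lines of final.txt: a URL line, then an indented out= line."""
    parsed = [split_name(name) for name in names]
    width = max(len(digits) for _, digits, _ in parsed)  # pad to the longest number
    for link, (title, digits, suffix) in zip(links, parsed):
        yield BASE + video_name(link)
        yield f' out=Lecture {digits.zfill(width)}{suffix}- {title}.mp4'

# Small in-memory demo using two lines from each file.
sample_links = [
    'https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/XXXXkxI1-32313922.mp4.m3u8?hdnts=exp=1596554733~acl=*/bM07kxI1-32313922.mp4.m3u8~hmac=dd0fc6f433a8ac74c9eaa2a376fa4324a65ae7c410cdcf8e869c6961f1a5b5ea',
    'https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/XXXXLbgy-32313922.mp4.m3u8?hdnts=exp=1596554871~acl=*/xGXCLbgy-32313922.mp4.m3u8~hmac=7c515306c033c88d32072d54ba1d6aa4abf1be23070d1bb14d1311e4e74cc1d7',
]
sample_names = ['Questions Lecture 1B', 'Labour Costing Lecture 352 (Classroom Lecture)']
for entry in build_entries(sample_links, sample_names):
    print(entry)
```

To run it on the real files, the stripped, non-empty lines of links.txt and name.txt could be passed to build_entries and the result written to final.txt joined with newlines. Padding the digits separately from the letter suffix is what turns 1B into 001B rather than 01B.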