I'm trying to extract the ImageNet labels from a .txt file whose contents look like this:
998: 'ear, spike, capitulum', 999: 'toilet tissue, toilet paper, bathroom tissue'}
I've tried
import re

label = []
txt = open("imagenet1000_clsid_to_human.txt").readlines()
# print(str(txt))
p = re.compile(r"'(.*?)'")
# print(txt)
for i in range(len(txt)):
    # print(txt[i])
    # print('\n')
    m = p.match(txt[i])
    if m:
        lis = list(m.group())[:-1]
        s = ''.join(lis)
        print(s)
        label.append(s)
to extract the substrings inside the single quotation marks, but it keeps giving me 'None'.
I tried the same pattern in an online regex tester, and it worked perfectly fine. Can anybody give some advice on this issue?
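
One possible explanation, offered as a sketch rather than a confirmed diagnosis: `re.match` only tries to match at the very start of the string, and each line here starts with a number such as `998:` rather than a quote, so the match returns `None`. Online testers usually scan the whole input, which is closer to `re.search` or `re.finditer`. The snippet below assumes the labels are single-quoted and contain no embedded apostrophes; the `sample` string and the `extract_labels` helper are hypothetical names used only for illustration.

```python
import re

# Sample text copied from the question. For the real file, the text would
# come from open("imagenet1000_clsid_to_human.txt").read() instead.
sample = "998: 'ear, spike, capitulum', 999: 'toilet tissue, toilet paper, bathroom tissue'}"

# The original pattern with match(): anchored at position 0, where the
# string starts with a digit, so nothing is found.
print(re.match(r"'(.*?)'", sample))

# Capture the numeric id and the text between the single quotes.
pattern = re.compile(r"(\d+):\s*'(.*?)'")

def extract_labels(text):
    """Return every quoted label found anywhere in text (hypothetical helper)."""
    return [m.group(2) for m in pattern.finditer(text)]

labels = extract_labels(sample)
print(labels)
```

Using `m.group(2)` (or `m.group(1)` with the original pattern) returns the captured text without the quotes, so the `list(...)[:-1]` and `join` steps are no longer needed. If the file is a complete Python dict literal, as the closing `}` suggests, parsing the whole file with `ast.literal_eval` may be more robust than a regex.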