from urllib.request import urlopen
import re

urlpath = urlopen("http://blablabla.com/file")
string = urlpath.read().decode('utf-8')

pattern = re.compile('*.docx"')
onlyfiles = pattern.findall(string)

print(onlyfiles)

Target output

['http://blablabla.com/file/1.docx','http://blablabla.com/file/2.docx']

But I got this

[]

I get this error message when I run it:

re.error: nothing to repeat at position 0
John Kugelman
Nurdin

1 Answer


The problem is the leading star in this line:

pattern = re.compile('*.docx"')

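In a regular expression the star is a quantifier that repeats whatever precedes it. At position 0 nothing precedes it, so re.compile raises "nothing to repeat". This is expected behaviour for this pattern rather than a bug, although some older Python versions did have a bug that produced the same message for other patterns.

See this related question: regex error - nothing to repeat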
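A minimal sketch of why the original pattern fails to compile while a pattern with something in front of the star succeeds. The helper name try_compile and the sample string are made up for illustration.

```python
import re

def try_compile(pattern_text):
    """Return the compiled pattern, or the error message if compilation fails."""
    try:
        return re.compile(pattern_text)
    except re.error as exc:
        return str(exc)

# A leading quantifier has no preceding element to repeat.
bad = try_compile('*.docx"')
# Putting something before the star makes the pattern valid.
good = try_compile(r'\w*\.docx"')

print(bad)
print(good.findall('<a href="1.docx">'))
```

Note that the valid pattern still includes the closing double quote in each match, because the quote is part of the pattern.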

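Try putting a word-character or alphanumeric class in front of the star:

# Raw strings keep \w intact; \. matches a literal dot, not any character.
pattern = re.compile(r'\w*\.docx"')
# or
pattern = re.compile(r'[a-zA-Z0-9]*\.docx"')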
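The patterns above return only the file name plus the trailing quote, not the full URLs from the target output. One possible way to get there is sketched below. It assumes the page lists the files as relative href attributes, which the question does not confirm. The sample_html string, the trailing slash on base_url, and the helper find_docx_urls are all hypothetical.

```python
import re
from urllib.parse import urljoin

# Hypothetical stand-in for the downloaded page; the real markup may differ.
sample_html = (
    '<a href="1.docx">one</a> '
    '<a href="2.docx">two</a> '
    '<a href="notes.txt">notes</a>'
)
# Assumed base URL, with a trailing slash so urljoin keeps the /file/ segment.
base_url = "http://blablabla.com/file/"

# Capture the href value only when it ends in .docx.
# The surrounding quotes stay outside the group, so they are not returned.
docx_link = re.compile(r'href="([^"]*\.docx)"')

def find_docx_urls(html, base):
    """Return absolute URLs for every .docx link found in html."""
    return [urljoin(base, name) for name in docx_link.findall(html)]

onlyfiles = find_docx_urls(sample_html, base_url)
print(onlyfiles)
```

With the real page, the string returned by urlpath.read().decode('utf-8') would take the place of sample_html. If the base URL has no trailing slash, urljoin replaces the last path segment instead of appending to it, so the results would lose the /file/ part.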
V. Sambor