Handling '/' in filenames

Question

I am trying to save HTML pages together with their category using Scrapy in Python. I would like each file to be named like 'WebCategory_http://whatever.com'. I tried this code:

def parse(self, response):
    content = response.body
    url = response.url
    cat = str(response.meta['cat'])
    filename = str(cat) + '_' + str(url)
    with open(filename, 'wb') as f:
        f.write(content)

When I run it, I get this error:

IOError: [Errno 2] No such file or directory: 'Arts_https://www.behindthevoiceactors.com/'
2018-11-19 15:43:15 [scrapy.extensions.logstats] INFO: Crawled 45 pages (at 45 pages/min), scraped 0 items (at 0 items/min)

My guess is that '/' is interpreted as a path separator instead of as part of the filename. Is there any way to keep using '/'?

score 0 · Answer 1 · answered Nov 19 '18 at 14:50

No, / is not a valid part of a filename in most filesystems. You need to replace it with a different character.
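To see why the name from the question fails, it can help to look at how a POSIX-style path gets split. This is only a small sketch, assuming Python 3; it uses the standard posixpath module so the result is the same on any platform:

```python
import posixpath

name = 'Arts_https://www.behindthevoiceactors.com/'

# Every '/' is treated as a directory separator, so the whole string is
# read as a directory path followed by an empty file name.
directory, filename = posixpath.split(name)

print(repr(directory))  # 'Arts_https://www.behindthevoiceactors.com'
print(repr(filename))   # ''
```

Since no such directory exists, open() reports "No such file or directory".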

Tordek


score 0 · Answer 2 · answered Nov 19 '18 at 14:51

No, you can't use / inside a file name. It is reserved as the path separator (on this system).

Replace the character with something else, for instance:

filename = str(cat) + '_' + str(url).replace('/', '_')
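If you need to recover the original URL from the file name later, percent-encoding is one possible alternative to a plain replace. The sketch below assumes Python 3; the helper name url_to_filename is made up for illustration and is not part of Scrapy. Passing safe='' makes quote() encode '/' as well, and ':' too, which matters if the files ever end up on Windows:

```python
from urllib.parse import quote, unquote


def url_to_filename(cat, url):
    # Hypothetical helper: percent-encode every reserved character in the
    # URL (including '/' and ':') so the result is a single path component.
    return '{}_{}'.format(cat, quote(url, safe=''))


name = url_to_filename('Arts', 'https://www.behindthevoiceactors.com/')
print(name)  # Arts_https%3A%2F%2Fwww.behindthevoiceactors.com%2F

# The encoding is reversible, unlike replacing '/' with '_'.
original_url = unquote(name.split('_', 1)[1])
print(original_url)  # https://www.behindthevoiceactors.com/
```

Recovering the URL this way assumes the category itself contains no underscore. Very long URLs can still exceed the file system's name length limit, so this is not a complete solution.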

Matthieu Brucher

