
I expected the content to be written to filename1 and filename2 on each iteration. But when I check the two files while the program is still running, both are empty. However, if I run only the first few iterations by uncommenting the 'i>5' check, the program finishes immediately and the files are written as expected.

# Fragment of a method; needs: import json, urllib.parse, urllib.request
with open(filename1, 'w') as f1, open(filename2, 'w') as f2:
    for i, term in enumerate(self.term_list):
        # if i>5:
        #     break
        # Build the request URL for this term
        para = urllib.parse.quote(term)
        url = urllib.parse.urljoin(self.base_url, para)
        url = urllib.parse.urljoin(url, '?baike=' + self.source.name)
        try:
            with urllib.request.urlopen(url) as response:
                html = response.read()
                html = html.decode('utf-8')
                html_json = json.loads(html)

                categories = self.get_category(html_json)
                f1.write(term + '\t' + str(categories) + '\n')
                self.term_categories[term] = categories
                print(term, str(categories))
        except Exception:  # a bare except would also swallow KeyboardInterrupt
            # Record terms that could not be fetched or parsed
            print("skip: ", term)
            f2.write(term + '\n')
            self.skip_terms.append(term)
            self.skip_count += 1

Does the program have to wait for all iterations to finish before writing everything to the files at the end? That doesn't sound right.

  • This is a performance feature. The file handle buffers what needs to be written and flushes its buffer when the buffer fills up or the file is closed. You can manually flush it at each iteration with `f1.flush()` – user2390182 Mar 26 '20 at 07:39
  • Did you take a look at [the docs](https://docs.python.org/3/library/functions.html#open)? The buffering defaults are very clearly described. – MisterMiyagi Mar 26 '20 at 07:43
  • That's a very good explanation. – Abigail Min NM Mar 26 '20 at 07:46
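
To make the buffering described in the comments concrete, here is a minimal sketch, not the code from the question. The helper names `sizes_before_and_after_flush` and `write_terms_line_buffered` and the sample lines are hypothetical, made up for illustration. It assumes a regular file on disk opened in text mode, where a short write stays in Python's buffer until `flush()` is called, the buffer fills, or the file is closed.

```python
import os
import tempfile


def sizes_before_and_after_flush(path, line):
    """Write one short line and report the on-disk size before and after flush()."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write(line)
        before = os.path.getsize(path)  # short write is still held in the buffer
        f.flush()                       # hand the buffered text to the OS
        after = os.path.getsize(path)
    return before, after


def write_terms_line_buffered(path, terms):
    """Alternative: buffering=1 (text mode only) flushes whenever a newline is written."""
    sizes = []
    with open(path, 'w', buffering=1, encoding='utf-8') as f:
        for term in terms:
            f.write(term + '\n')
            sizes.append(os.path.getsize(path))  # size as seen right after each write
    return sizes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = sizes_before_and_after_flush(
            os.path.join(tmp, 'flush_demo.txt'), 'term\t[]\n')
        print("before flush:", before, "after flush:", after)
        print("line-buffered sizes:",
              write_terms_line_buffered(os.path.join(tmp, 'line_demo.txt'), ['a', 'bb']))
```

Applied to the loop in the question, this would mean either calling `f1.flush()` and `f2.flush()` at the end of each iteration, or opening both files with `buffering=1`. Flushing on every line costs some throughput, which is likely negligible here next to the network request made in each iteration.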
