A few things that come to mind immediately:
- Use a `with` statement, rather than manually closing your file.
- Pass a generator expression to `f.writelines` rather than building up a 100000-row list over and over; let the standard library decide how much, if any, of the output to buffer (see the sketch just after this list).
- Or, better yet, use the `csv` module to handle writing your tab-separated output.
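If you only wanted to apply the second suggestion, a minimal sketch might look like the following. The column order and the `format_row` helper here are assumptions for illustration, not something from your original code:

```python
def flatten_rows_to_file(filename, rows):
    # Hypothetical per-row formatter; adapt it to however your rows are shaped.
    def format_row(row):
        fields = ('id', 'price', 'site_id', 'rating', 'shop_id')
        return '\t'.join(str(row[field]) for field in fields) + '\n'

    with open(filename, 'a') as f:
        # A generator expression lets writelines consume the rows lazily,
        # instead of building a 100000-row list in memory first.
        f.writelines(format_row(row) for row in rows)
```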
Here's a quick stab at some improved code using the `csv` module:
```python
from csv import DictWriter

def flatten_rows_to_file(filename, rows):
    with open(filename, 'ab') as f:
        writer = DictWriter(f, ['id', 'price', 'site_id', 'rating', 'shop_id'],
                            delimiter='\t')
        writer.writerows(rows)
```
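Used like this, for example (the sample rows are purely illustrative):

```python
rows = [
    {'id': 1, 'price': 9.99, 'site_id': 3, 'rating': 4.5, 'shop_id': 17},
    {'id': 2, 'price': 4.25, 'site_id': 3, 'rating': 3.0, 'shop_id': 17},
]
flatten_rows_to_file('products.tsv', rows)
```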
Note that if you're using Python 3, you need slightly different code for opening the file: use mode `'a'` rather than `'ab'`, and add the keyword argument `newline=''`. You didn't need the `+` in the mode you were using (you are only writing, not both writing and reading).
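So a Python 3 version of the same function might look like this (same assumed field names):

```python
from csv import DictWriter

def flatten_rows_to_file(filename, rows):
    # Text mode, append; newline='' lets the csv module manage line endings.
    with open(filename, 'a', newline='') as f:
        writer = DictWriter(f, ['id', 'price', 'site_id', 'rating', 'shop_id'],
                            delimiter='\t')
        writer.writerows(rows)
```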
If the dicts in your `rows` argument may contain extra keys beyond the ones you're writing, you'll also need to pass `extrasaction='ignore'` to the `DictWriter` constructor; otherwise it raises a `ValueError` when it encounters an unexpected key.
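For example, to silently drop any keys that aren't in your field list:

```python
writer = DictWriter(f, ['id', 'price', 'site_id', 'rating', 'shop_id'],
                    delimiter='\t', extrasaction='ignore')
```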