I need to generate a CSV file with ~500,000 unique records and the following column headers.
Example CSV file:
email,customerId,firstName,lastName
qa+d43e5efc-6b0f-46ce-a14e-1db63bb77882@example.com,0d981ae1be954ea7-b411-28a98e3ddba2,Daniel,Newton
I wrote the code below, but I would like to know whether there is a better or more efficient way to do this. It's my first time dealing with a large data set, and my code currently takes a really long time to run (more than an hour), so I'm looking for suggestions and feedback. Thanks.
import csv
from uuid import uuid4
from faker import Faker

with open('test_csv_file.csv', 'w') as csvf:
    writer = csv.writer(csvf)
    column_headers = ("email", "customerId", "firstName", "lastName")
    writer.writerow(column_headers)
    for _ in range(500000):
        # a new Faker instance is created on every iteration
        fake = Faker()
        row = (f'qa+{uuid4()}@example.com', uuid4(), fake.first_name(), fake.last_name())
        writer.writerow(row)
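
One change I could try is sketched below. It assumes most of the time goes into constructing Faker() on every iteration, which is a guess and not something I have measured. The helper name write_customers and its parameters are hypothetical. It takes the two name generators as zero-argument callables so that they are created once by the caller, and it passes a generator to writer.writerows instead of calling writer.writerow in a loop.

```python
import csv
from uuid import uuid4


def write_customers(path, count, first_name, last_name):
    """Write a header plus `count` rows to `path`.

    `first_name` and `last_name` are zero-argument callables that return
    a string (hypothetical parameters, not part of any library API).
    """
    # newline="" is what the csv module docs recommend for files passed to csv.writer
    with open(path, "w", newline="") as csvf:
        writer = csv.writer(csvf)
        writer.writerow(("email", "customerId", "firstName", "lastName"))
        # The generator produces rows one at a time, so nothing is held in memory
        writer.writerows(
            (f"qa+{uuid4()}@example.com", uuid4(), first_name(), last_name())
            for _ in range(count)
        )
```

With Faker this would mean calling fake = Faker() once, outside any loop, and then write_customers('test_csv_file.csv', 500000, fake.first_name, fake.last_name).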