I have data for 1500 people, and each person has about 10000 features, each with a value. There is a dictionary called dict_f whose keys are the feature names: dict_f = {'name': value, 'f1': value, 'f2': value, ...}. For example:
name   f1   f2   f3   f4   ...
name1  1    2    3    4
name2  1.1  2.1  3.1  4.1
...
I want to write these data to a csv file, and then, in another code file, read the csv file into a pandas DataFrame. But I found that writing each person's values (the dict_f; note that the feature values differ for each person) takes about 1.4 s, so 1500 people's data will take about 1500 * 1.4 s. That is too slow, and I want to reduce the time it takes to write the data to csv.
Part of the code follows (note that lst_field_names_0 is the list of feature names):
import csv

with open('data/feature_data_0_0.csv', mode='wt', encoding='utf-8') as outfile:
    fieldnames = lst_field_names_0
    writer = csv.DictWriter(outfile, fieldnames, restval='""', dialect=csv.unix_dialect)
    writer.writeheader()
    for _ in name_list:
        writer.writerow(dict_f)
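For comparison, one common way to reduce per-row overhead is to build all the row dicts first and write them in a single writerows() call instead of calling writerow() inside the loop. This is a minimal sketch with made-up data; build_dict_f, the small feature list, and the three names are placeholders standing in for however dict_f is actually computed per person:

```python
import csv
import io

# Hypothetical stand-ins for the real variables.
lst_field_names_0 = ['name', 'f1', 'f2', 'f3']
name_list = ['name1', 'name2', 'name3']

def build_dict_f(name):
    # Placeholder for however dict_f is computed for each person;
    # in the real code every person has different feature values.
    return {'name': name, 'f1': 1.0, 'f2': 2.0, 'f3': 3.0}

# Collect every person's row first, then write them all at once.
rows = [build_dict_f(name) for name in name_list]

buf = io.StringIO()  # in-memory file; a real script would open a .csv path
writer = csv.DictWriter(buf, lst_field_names_0, restval='""',
                        dialect=csv.unix_dialect)
writer.writeheader()
writer.writerows(rows)  # single call instead of a writerow() loop
```

If most of the 1.4 s per person is spent computing dict_f rather than writing it, batching the write alone will not help, so it is worth timing the two steps separately.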
Then I want to use pandas to read the csv file:
feature_dataframe = pd.read_csv('data/feature_data_0_0.csv')
Could you help me improve the speed of writing to the csv file?
Thanks in advance!