Description
My problem is about loading data from CSV files. I already wrote code that loads a fixed number of columns into arrays (see the example). Now I would like to improve it so that I can change the number of columns to read without modifying the code every time. In other words, I would like the code to adapt dynamically to the number of columns I choose. Here is an example of my current code.
Code example
Steps:
1. With Tkinter I select the files I want to load; this part of the code returns file_path, which contains the file paths.
2. Then I define the parameters needed for reading the CSV files, create the arrays that will hold the data, and load the data.
from numpy import zeros, loadtxt

# define the CSV parameters first, because size_data and loadtxt both need them
row_skip = 5
delim = ';'
col_to_read = (0, 1)  # <= This is where I choose the columns to be read
n = len(file_path)  # number of files
# determine the size of each file with a custom function; m is the maximum size
all_size, m = size_data(file_path, row_skip, col_to_read, delim)
# create the arrays
shape = (n, m)
time = zeros(shape)
CH1 = zeros(shape)
# load the arrays
for k in range(0, len(file_path)):
    end = all_size[k]  # this is the size of the array to be loaded
    # I slice with [:end] in order to avoid the annoying error
    # ValueError: could not broadcast input array from shape (20) into shape (50)
    time[k][:end], CH1[k][:end] = loadtxt(file_path[k],
                                          delimiter=delim,
                                          skiprows=row_skip,
                                          usecols=col_to_read,
                                          unpack=True)
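To make the broadcast error mentioned in the comment concrete, here is a minimal sketch with made-up data (the names padded and short are only for illustration, they are not part of my real code). It shows that assigning a short array to a full row fails, while assigning to the [:end] slice works and leaves the remaining entries at zero.

```python
import numpy as np

padded = np.zeros((2, 5))          # 2 files, the longest one has 5 rows
short = np.array([1.0, 2.0, 3.0])  # hypothetical file with only 3 rows

message = ""
try:
    padded[0] = short              # a length-3 array cannot fill a length-5 row
except ValueError as err:
    message = str(err)             # "could not broadcast input array ..."

end = short.size
padded[1][:end] = short            # only the first `end` slots are written
```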
My problem is that if each file has 3 columns, i.e. col_to_read = (0,1,2), I have to add a new array CH2 = zeros(shape) both where the arrays are created and where they are loaded. I would like a solution that adapts dynamically to the number of columns I want to load, so that only col_to_read has to be changed by hand. Ideally I would like to put this code inside a function, because I do a lot of data analysis and I don't want the same code pasted into every program.
First idea
I already found a way to dynamically create a given number of containers, one per key of a dictionary (see here). That part is quite direct.
dicty = {}
for i in file_path:
    dicty[i] = []
This seems good, but now I would like the last line of my loading code (the loadtxt call) to work whatever the number of variables. I believe there is a convenient way to adapt my code and use this dicty, but there's something I don't understand and I'm stuck.
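To make the goal concrete, here is a rough sketch of the kind of function I have in mind. It is only one possible direction, and several things in it are assumptions: the name load_columns is hypothetical, the dictionary is keyed by column index rather than by file path, and the file sizes are taken from the loaded data instead of from my size_data function.

```python
import numpy as np

def load_columns(file_paths, col_to_read, delim=";", row_skip=0):
    """Return {column index: (n_files, max_rows) array}, padded with zeros."""
    # Load every file once; ndmin=2 keeps a 2-D (rows, columns) shape
    # even when a file has a single row or a single column is requested.
    tables = [
        np.loadtxt(path, delimiter=delim, skiprows=row_skip,
                   usecols=col_to_read, ndmin=2)
        for path in file_paths
    ]
    n = len(tables)                      # number of files
    m = max(t.shape[0] for t in tables)  # longest file
    # One zero-filled array per requested column, created dynamically.
    data = {col: np.zeros((n, m)) for col in col_to_read}
    for k, table in enumerate(tables):
        end = table.shape[0]             # number of rows in file k
        for j, col in enumerate(col_to_read):
            data[col][k, :end] = table[:, j]
    return data
```

With something like this, only col_to_read would change between programs, for example data = load_columns(file_path, (0, 1, 2), delim, row_skip), and the old names could be recovered with time, CH1, CH2 = (data[c] for c in (0, 1, 2)).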
I would appreciate any help.