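lines = [line for line in file(filename)] stores each line of your data file in a list of strings. Note that the fields on each line are separated by \t and that each line still ends with \n, as is always the case when you iterate with for line in file. (file() exists only in Python 2; in Python 3 use open(filename) instead.)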
Calling lines[0].strip().split('\t')[1:] means: from the list of strings lines, take the first line (which I assume contains information about your data and not the actual data). strip() removes the \n at the end, split('\t') separates the line into a list of fields, and [1:] keeps the 2nd through last elements, which are stored in cols.
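
As a small sketch of that header step, assuming a made-up header line (the field names below are placeholders, not taken from your file):

```python
# Hypothetical header line, as it would come out of the file.
header = "Something\tname1\tname2\tname3\n"

fields = header.strip().split('\t')  # drop the trailing \n, then split on tabs
cols = fields[1:]                    # skip the first field, keep the column names

print(fields)  # ['Something', 'name1', 'name2', 'name3']
print(cols)    # ['name1', 'name2', 'name3']
```
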
rows = [] creates a list to store information about your data row-wise, just as cols is a list storing information column-wise.
Since the first line, lines[0], has already been parsed, you want to process the rest, so you loop over lines[1:]. Earlier, [1:] was used to skip the first field within the first line; here it skips the first line itself and goes over the remaining lines.
p = line.strip().split('\t'), as before, strips the \n from your line and splits it on \t, giving all of the row's info and data. p[0], I assume, is the row-wise info about your data and is stored in the rows list, while the remainder is the actual data (in your case integers), stored in data, which is a list of lists of floats.
Even though your data is all integers, for line in file reads everything as strings. float() converts each value so that you can use the data in mathematical operations later if you need to.
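
The row loop described above could look roughly like this. It is only a sketch: the lines list is filled in by hand with hypothetical values instead of being read from your file.

```python
# Hypothetical file contents, already read into a list of lines.
lines = [
    "Something\tname1\tname2\n",
    "condition1\t1\t2\n",
    "condition2\t3\t4\n",
]

rows = []
data = []
for line in lines[1:]:                      # skip the header line
    p = line.strip().split('\t')            # remove \n, split on tabs
    rows.append(p[0])                       # first field is the row label
    data.append([float(x) for x in p[1:]])  # remaining fields become floats

print(rows)  # ['condition1', 'condition2']
print(data)  # [[1.0, 2.0], [3.0, 4.0]]
```
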
In the end, if your file looks like this (fields separated by tabs):
Something name1 name2 name3 ...
condition1 data1.1 data1.2 data1.3 ...
condition2 data2.1 data2.2 data2.3 ...
condition3 data3.1 data3.2 data3.3 ...
then your output will be:
cols = ['name1', 'name2', 'name3', ...]
rows = ['condition1', 'condition2', 'condition3', ...]
data = [[data1.1, data1.2, data1.3, ...], [data2.1, data2.2, data2.3, ...], [data3.1, data3.2, data3.3, ...], ...]
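
Putting the pieces together, here is one way the whole thing might be wrapped up and tried out. The function name read_table, the temporary file, and its contents are all made up for illustration, and open() is used in place of file() so the sketch also runs on Python 3.

```python
import os
import tempfile

def read_table(filename):
    """Read a tab-separated file with a header line and a label per row."""
    with open(filename) as f:
        lines = [line for line in f]
    cols = lines[0].strip().split('\t')[1:]   # column names from the header
    rows, data = [], []
    for line in lines[1:]:
        p = line.strip().split('\t')
        rows.append(p[0])                      # row label
        data.append([float(x) for x in p[1:]]) # numeric values
    return rows, cols, data

# Write a small hypothetical file to try it on.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Something\tname1\tname2\n"
              "condition1\t1\t2\n"
              "condition2\t3\t4\n")
    path = tmp.name

rows, cols, data = read_table(path)
os.remove(path)

# Look up a value by row and column name.
value = data[rows.index('condition2')][cols.index('name1')]
print(value)  # 3.0
```

Because rows and cols keep the same order as data, their indexes can be used to find a value by name, as in the last lines.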