
Please explain the commented lines below:

def readfile(filename):
    lines = [line for line in file(filename)]
    cols = lines[0].strip().split('\t')[1:] #why [1:] here? what is it doing?
    rows = [] #whats the difference between rows = [] and rows = {}
    data=[]
    for line in lines[1:]: #what lines[1:] is doing?
        p=line.strip().split('\t')
        rows.append(p[0])
        #why we used float below if my file contains only integer numbers?
        data.append([float(x) for x in p[1:]])    
    return rows,cols,data
khelwood
Mobassir Hossen
  • I would suggest looking at https://stackoverflow.com/questions/509211/understanding-pythons-slice-notation – Andrej Kesely Jul 12 '18 at 09:21
  • `lines[1:]` creates a sublist (a slice) of your `lines` list, starting at index 1. Keep in mind that Python lists are indexed from 0. – Chiheb Nexus Jul 12 '18 at 09:21
  • `rows = []` assigns an empty list to `rows`, while `rows = {}` assigns an empty dictionary. `lines[1:]` creates a sublist starting from the second element of `lines`. – Vasilis G. Jul 12 '18 at 09:22
  • There are four different questions here, making this eligible for close as "too broad" if it *weren't* already a duplicate. Please see the guidance at https://stackoverflow.com/help/on-topic -- each SO question should be about one *specific* programming question that hasn't already been asked on the site. – Charles Duffy Jul 12 '18 at 11:28

1 Answer


`lines = [line for line in file(filename)]` reads every line of your data file into a list of strings. (`file()` is a Python 2 built-in; in Python 3 use `open(filename)` instead.) Each string still ends with `\n`, and the rest of the code assumes the fields within a line are separated by `\t`.

`lines[0].strip().split('\t')[1:]` works on the first line, which I assume is a header describing your data rather than the data itself. `strip()` removes leading and trailing whitespace, including the `\n` at the end; `split('\t')` breaks the line into a list of fields; and `[1:]` keeps everything from the second field (index 1) through the last, dropping the first one. The result is stored in `cols`.
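As a small sketch of that header step, here is the same chain of calls applied to a made-up header line (the field names are placeholders, not taken from your file):

```python
# Hypothetical header line as it would come out of the file:
# tab-separated fields, terminated by a newline.
header = "Something\tname1\tname2\tname3\n"

fields = header.strip().split("\t")  # strip() drops the newline, split() cuts on tabs
cols = fields[1:]                    # slice from index 1 to the end, skipping the first field

print(fields)  # ['Something', 'name1', 'name2', 'name3']
print(cols)    # ['name1', 'name2', 'name3']
```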

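`rows = []` creates an empty list that will hold one label per data row, just as `cols` holds one label per column. `rows = {}` would create an empty dictionary instead; a dictionary maps keys to values and has no `append()` method, so `rows.append(p[0])` further down would fail.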
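On the question of `rows = []` versus `rows = {}`, a quick illustration may help. The variable names below are only for the demonstration:

```python
rows_list = []   # empty list: ordered, grows with append()
rows_dict = {}   # empty dict: maps keys to values, has no append()

rows_list.append("condition1")
print(rows_list)                     # ['condition1']
print(type(rows_list).__name__)      # list
print(type(rows_dict).__name__)      # dict
print(hasattr(rows_dict, "append"))  # False
```

Because the function calls `rows.append(...)`, it needs the list form.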

The first line, `lines[0]`, has already been parsed, so the loop runs over the rest: `lines[1:]`. It is the same slice as before, applied to a different list: earlier `[1:]` skipped the first field of the header line, and here it skips the first line of the file.
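To make the header/body split concrete, here is a sketch with a hypothetical three-line file already read into a list:

```python
# Hypothetical file contents, one string per line.
lines = [
    "Something\tname1\tname2\n",
    "condition1\t1\t2\n",
    "condition2\t3\t4\n",
]

header = lines[0]   # first line only
body = lines[1:]    # new list holding every line after the first

print(len(lines))  # 3
print(len(body))   # 2
print(body[0])     # the "condition1" line
```

Slicing builds a new list and leaves `lines` itself unchanged.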

`p = line.strip().split('\t')` strips the `\n` from the line and splits it on `\t`, as before. `p[0]`, which I assume is the label for that row, is appended to `rows`. The remaining fields, `p[1:]`, are the actual values (integers in your case); they are converted to floats and appended to `data`, which ends up as a list of lists of floats.

Even though your file contains only integers, everything read from a file arrives as a string. `float()` converts each string into a number you can do arithmetic on later. `int()` would also work for integer-only data; `float()` additionally accepts values with a decimal point.
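A short sketch of why the conversion matters, using a made-up data line. Why the original author picked `float()` over `int()` is my guess; the code only shows what each one does:

```python
# Hypothetical data line with integer values.
p = "condition1\t1\t2\t3\n".strip().split("\t")
raw = p[1:]                          # ['1', '2', '3'], still strings

print(raw[0] + raw[1])               # 12 (string concatenation, not addition)

as_float = [float(x) for x in raw]   # [1.0, 2.0, 3.0]
as_int = [int(x) for x in raw]       # [1, 2, 3], fine for integer-only data

print(as_float[0] + as_float[1])     # 3.0
```

If a value such as `"1.5"` ever appears in the file, `float("1.5")` works while `int("1.5")` raises a `ValueError`.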

In the end, if your file looks like this (fields separated by tabs):

Something name1 name2 name3 ...
condition1 data1.1 data1.2 data1.3 ...
condition2 data2.1 data2.2 data2.3 ...
condition3 data3.1 data3.2 data3.3 ...

the function returns:

cols = ['name1', 'name2', 'name3', ...]
rows = ['condition1', 'condition2', 'condition3', ...]
data = [[data1.1, data1.2, data1.3, ...], [data2.1, data2.2, data2.3, ...], [data3.1, data3.2, data3.3, ...], ...]
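For anyone running this under Python 3, here is a sketch of the same function with `open()` in place of `file()`, exercised on a small temporary file. The sample contents and the temp-file handling are my own additions for the demonstration:

```python
import os
import tempfile


def readfile(filename):
    """Python 3 sketch of the function from the question."""
    with open(filename) as f:          # open() replaces Python 2's file()
        lines = f.readlines()
    cols = lines[0].strip().split("\t")[1:]   # header fields, minus the first
    rows = []
    data = []
    for line in lines[1:]:                    # every line after the header
        p = line.strip().split("\t")
        rows.append(p[0])                     # row label
        data.append([float(x) for x in p[1:]])  # row values as floats
    return rows, cols, data


# Hypothetical sample file written to a temporary location.
sample = "Something\tname1\tname2\ncondition1\t1\t2\ncondition2\t3\t4\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

rows, cols, data = readfile(path)
os.remove(path)

print(rows)  # ['condition1', 'condition2']
print(cols)  # ['name1', 'name2']
print(data)  # [[1.0, 2.0], [3.0, 4.0]]
```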
Hadi Farah
  • Welcome to the site, and thank you for putting in the effort to build a comprehensive answer! That said, sometimes it's worth trying to get a more specific and unique question first -- a question about a single person's very specific code is likely only to be helpful to them, whereas a question about a single, more general problem will be helpful to everyone with that same issue. See "Answer Well-Asked Questions" in [How to Answer](https://stackoverflow.com/help/how-to-answer), and [How to handle “Explain how this ${code dump} works” questions](https://meta.stackoverflow.com/questions/253894). – Charles Duffy Jul 12 '18 at 11:35
  • @CharlesDuffy thank you for your clarification. If this needs to be taken down let me know. I would link him the duplicates but everyone seems to send him links to only his first question about `[1:]` however in his post he asks 3 separate questions and no one tackled all 3. One approach would be to link him 3 references or more but truthfully I started with a simple answer and then kept editing into this long and elaborate answer. – Hadi Farah Jul 12 '18 at 11:47
  • No need to delete the answer -- it *is* clearly likely to be helpful to the OP. The "3 separate questions" thing is part of why the question is problematic -- but since you already put in the effort to build a good answer, we certainly wouldn't want to waste the time you spent. – Charles Duffy Jul 12 '18 at 12:40