I'm trying to build a very simple program to turn data that looks like this:

ID  Freq
1   2
2   3
3   4
4   5
5   1
6   1
7   1
8   1
9   1
10  1
11  2
12  2
13  2
14  3
15  3
16  3
17  4
18  5
19  5
20  5
21  5
22  5
23  5
24  5

into two lists in Python. This is the for loop I've written:

newlist = []
ID = []

for line in f:
    if len(line.strip())>0:
        l=line.strip().split("\t")
        for i in l[1]:
            newlist+=[i]
        for i in l[0]:
            ID+=[i]

print(newlist)
print(ID)

The problem is it outputs each digit in the multiple digit numbers (e.g. 10 and over in the variable "ID") as a separate element.

e.g:

['1', '2', '3', '4', '5', '6', '7', '8', '9', '1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9', '2', '0', '2', '1', '2', '2', '2', '3', '2', '4']

Instead of this:

['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24']

I've looked at unzipping tuples with zip, but this is different because the data are not tuples. Rather, the issue is getting Python to treat each two-digit number as one element, rather than each digit as a separate element.

Cole Robertson

1 Answer

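You don't need those interior for loops. l[0] and l[1] are already complete strings, and looping over a string visits it one character at a time, which is what splits "10" into "1" and "0". Just add the items of l directly. Also, use append instead of +=.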

newlist = []
ID = []

for line in f:
    if len(line.strip())>0:
        l=line.strip().split("\t")
        newlist.append(l[1])
        ID.append(l[0])
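
To try this without the original file, here is a minimal self-contained sketch. It assumes tab-separated lines with no header row; `io.StringIO` and the `sample` text are hypothetical stand-ins for the asker's open file `f`, and `fields` is just a more readable name for `l`.

```python
import io

# Hypothetical stand-in for the asker's open file `f` (tab-separated, no header).
sample = "1\t2\n2\t3\n\n10\t1\n24\t5\n"
f = io.StringIO(sample)

newlist = []
ID = []

for line in f:
    stripped = line.strip()
    if stripped:  # skip blank lines
        fields = stripped.split("\t")
        ID.append(fields[0])       # the whole field, e.g. "10"
        newlist.append(fields[1])

print(ID)
print(newlist)

# For comparison: looping over a string visits each character.
digits = [ch for ch in "10"]
print(digits)
```

If integers are wanted rather than strings, wrapping each field in `int(...)` before appending should work for input like this.
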
Kevin
  • Thanks. Why append instead of +=? – Cole Robertson Jul 24 '15 at 17:00
  • Every time you do `some_list += [item]`, it creates a brand new copy of `some_list`, adds the item to it, and updates the variable with that value. This operation runs in O(n) time - the longer the list is, the longer it takes. `append` does not create a new copy of the list. It takes O(1) time - no matter how long the list is, it takes the same amount of time. – Kevin Jul 24 '15 at 17:11
  • That makes sense. However, if I were to use an internal for loop (e.g. using a counter) is there a simple syntax change to make it count each data point rather than each digit? – Cole Robertson Jul 25 '15 at 12:25
  • @Kevin: uhm, `+=` augmented assignment does **not** create a new copy of `some_list`. The whole *point* of adding augmented assignments to the language was to allow for in-place editing of mutables. – Martijn Pieters Jul 25 '15 at 13:43
  • @Kevin: `newList += [l[i]]` is exactly equivalent to `newList.append(l[i])`. – Martijn Pieters Jul 25 '15 at 13:46
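
The point made in the last two comments can be checked with a short sketch. The names `items` and `alias` are made up for illustration. It compares object identities to show that `+=` and `append` both mutate the existing list, while plain `+` builds a new one. The one small difference is that `+= [x]` creates a temporary one-element list first; the effect on the target list is the same.

```python
items = [1, 2, 3]
alias = items            # second name for the same list object
before = id(items)

items += [4]             # augmented assignment: extends the list in place
same_after_iadd = id(items) == before

items.append(5)          # append also mutates in place
same_after_append = id(items) == before

items = items + [6]      # plain + builds a new list and rebinds the name
same_after_plus = id(items) == before

print(same_after_iadd, same_after_append, same_after_plus)
print(alias)   # still the original object, which saw the += and the append
print(items)   # the new list created by +
```
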