I am hoping to find a way to append only the unique items to numlookup and wholetoken. Is there a good way to do this?

import csv
import re
from collections import defaultdict

numlookup = defaultdict(list)
wholetoken = defaultdict(list)

# mydata is the file containing the mutation descriptions
mydata = open('/mutation_summary.txt')
for line in csv.reader(mydata, delimiter='\t'):
    code = re.match(r'[a-z](\d+)[a-z]', line[-1], re.I)
    if code:
        numlookup[line[-2]].append(code.group(1))
        wholetoken[line[-2]].append(code.group(0))

When I tried to use a set instead, I got this error when calling lookup(id) and wholelookup(id): TypeError: 'set' object is not callable

lookup = set()
wholelookup = set()

with open('mutation_summary.txt') as mydata:
    for line in csv.reader(mydata, delimiter='\t'):
        code = re.match(r'[a-z](\d+)[a-z]', line[-1], re.I)
        if code:
            lookup.add(code.group(1))
            wholelookup.add(code.group(0))
Chad D

1 Answer

Why not turn it into a defaultdict of sets? It only keeps the uniques.
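A minimal sketch of the defaultdict-of-sets idea, assuming Python 3. The in-memory `sample` text and the `geneA`/`geneB` keys are made up here to stand in for the real mutation_summary.txt, so treat them as placeholders rather than the actual data layout:

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical rows standing in for mutation_summary.txt (tab-separated);
# the second-to-last column is the id, the last column is the mutation.
sample = "geneA\tA123T\ngeneA\tA123T\ngeneA\tG45C\ngeneB\tno match here\n"

numlookup = defaultdict(set)   # id -> set of position numbers
wholetoken = defaultdict(set)  # id -> set of whole mutation tokens

for line in csv.reader(io.StringIO(sample), delimiter="\t"):
    code = re.match(r"[a-z](\d+)[a-z]", line[-1], re.I)
    if code:
        # set.add() silently ignores values that are already present
        numlookup[line[-2]].add(code.group(1))
        wholetoken[line[-2]].add(code.group(0))

# Look values up with square brackets, exactly as with defaultdict(list)
positions = sorted(numlookup["geneA"])
tokens = sorted(wholetoken["geneA"])
```

The lookup stays `numlookup[some_id]` with square brackets. Writing `numlookup(some_id)` with parentheses tries to call the object, which is what produces a "not callable" TypeError. If a list is needed later, `sorted(...)` or `list(...)` converts the set.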

If that is not an option, then you could try:

if code:
    if code.group(1) not in numlookup[line[-2]]:
        numlookup[line[-2]].append(code.group(1))
    if code.group(0) not in wholetoken[line[-2]]:
        wholetoken[line[-2]].append(code.group(0))
inspectorG4dget
  • I am not sure how to access the set from another part of my code. With a list I know how to do it, e.g. list[ids]. Would I access the set with particular ids the same way? – Chad D Jul 25 '12 at 18:52
  • That would work. You could also iterate over a `set`, though you can't index into it – inspectorG4dget Jul 25 '12 at 18:54
  • I don't really get that. Would it work if I pass something in, for example wholetoken(id)? – Chad D Jul 25 '12 at 18:57
  • If you have a set containing the numbers `1,2,3,4,5` and you tried to insert `5`, it will just ignore the insert. If you try to insert `6`, it will insert `6`. You can iterate over a set (`for num in mySet: print num`). You cannot index into a set (`print mySet[4]` is not allowed) – inspectorG4dget Jul 25 '12 at 19:00
  • Nononono! I meant use `defaultdict(set)` – inspectorG4dget Jul 25 '12 at 19:09
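
The set behaviour described in these comments can be sketched as follows, assuming Python 3. The name `my_set` is only illustrative, and the last part reproduces the error from the question by calling a set with parentheses:

```python
my_set = {1, 2, 3, 4, 5}

my_set.add(5)  # 5 is already present, so the set is unchanged
my_set.add(6)  # 6 is new, so it is inserted

# Iterating over a set is fine (sorted() is used only for a stable order)
seen = [num for num in sorted(my_set)]

# Indexing into a set is not allowed and raises TypeError
try:
    my_set[4]
    index_failed = False
except TypeError:
    index_failed = True

# Calling a set with parentheses is what triggers the error in the question
try:
    my_set(4)
    call_error = ""
except TypeError as exc:
    call_error = str(exc)
```

A plain `set()` therefore has no per-id lookup at all, which is why `defaultdict(set)` is the suggestion: the dict handles the id lookup with square brackets, and each value is a set that keeps only unique entries.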