
I'm working through the DNA problem in CS50's pset6. I'm stuck on the last part of the program, where I compare a dictionary against the entries in a list of dictionaries.

code:

import csv
import sys


#ensure csv input file
if len(sys.argv) != 3:
    sys.exit("Include csv and txt file")

dna=[]
#open and read csv
file = open(sys.argv[1],"r")
reader = csv.DictReader(file)
for row in reader:
    dna.append(row)

sequence = ""
#open and read txt
file = open(sys.argv[2], "r")
sequence = file.read()

str_list = []
str_list = reader.fieldnames[1:]
seq_count_list = []
seq_count = 0
seq_master_list = {}

#count consecutive STR in sequence
for STR in str_list:
    STR_original = STR
    str_length = len(STR)
    while STR in sequence:
        seq_count += 1
        STR += STR_original
        seq_count_list.append(seq_count)
    seq_master_list[STR[0:str_length]] = str(seq_count)
    seq_count = 0

#unsure how to use this list of names
dna_names = []
for i in dna:
    dna_names.append(i['name'])

#check if sequence matches anyone in dna csv
for row in dna:
    print(row)
    if seq_master_list in row:
        print("FOUND")
    

print(seq_master_list)

Specifically the code:

for row in dna:
    print(row)
    if seq_master_list in row:
        print("FOUND")

Which tries to match dictionary seq_master_list:

{'AGATC': '22', 'TTTTTTCT': '33', 'AATG': '43', 'TCTAG': '12', 'GATA': '26', 'TATC': '18', 'GAAA': '47', 'TCTG': '41'}

with a list of dictionaries dna that looks like:

[
...
{
'name': 'Kingsley', 'AGATC': '7', 'TTTTTTCT': '11', 'AATG': '18', 'TCTAG': '33', 'GATA': '39', 'TATC': '31', 'GAAA': '23', 'TCTG': '14'
},
{
'name': 'Lavender', 'AGATC': '22', 'TTTTTTCT': '33', 'AATG': '43', 'TCTAG': '12', 'GATA': '26', 'TATC': '18', 'GAAA': '47', 'TCTG': '41'
},
{
'name': 'Lily', 'AGATC': '42', 'TTTTTTCT': '47', 'AATG': '48', 'TCTAG': '18', 'GATA': '35', 'TATC': '46', 'GAAA': '48', 'TCTG': '50'
},
...
]

Right now I'm getting this error:

TypeError: unhashable type: 'dict'

I want it to spit out the 'name', in this case 'Lavender'.

Edit:

As per the link from John Gordon:

for row in dna:
    print(row)
    if seq_master_list.items() in row.items():
        print("FOUND")

I don't get any errors, but it doesn't seem to find the match (it never prints "FOUND").

Thanks to Chris Charley for his solution

Mitch
  • You can't use `in` to check that one dictionary is a subset of another. See this answer https://stackoverflow.com/a/41579450/494134 – John Gordon Oct 16 '21 at 21:42
  • Your while loop is counting total appearances of an STR in the sequence instead of the consecutive occurrences of the STR and finding the longest consecutive run for that STR. – Chris Charley Oct 16 '21 at 22:05
  • @ChrisCharley the while loop updates itself on every pass and appends the STR to the value it's checking, so it keeps looking for a longer and longer substring. It matches the expected results. I just don't know how to compare it at the end and have it spit out the right 'name' – Mitch Oct 16 '21 at 22:08
  • Right, I missed that! Sorry – Chris Charley Oct 16 '21 at 22:12
  • @JohnGordon so I can only check if a dict is == to another one? How would I check part of a dict with another? – Mitch Oct 16 '21 at 22:34
  • You can't use `in`, but there is another way. See the answer I linked. – John Gordon Oct 17 '21 at 00:36
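
The subset comparison the comments point to can be sketched as follows. This is a minimal illustration, assuming every value is hashable (true here, since `csv.DictReader` yields strings). Dictionary item views behave like sets, so `<=` asks whether every key/value pair on the left also appears on the right. The names `profile` and `people` are illustrative, and the data is a shortened copy of the dictionaries in the question.

```python
# Shortened sample data modelled on the question (three STRs instead of eight).
profile = {'AGATC': '22', 'TTTTTTCT': '33', 'AATG': '43'}
people = [
    {'name': 'Kingsley', 'AGATC': '7', 'TTTTTTCT': '11', 'AATG': '18'},
    {'name': 'Lavender', 'AGATC': '22', 'TTTTTTCT': '33', 'AATG': '43'},
    {'name': 'Lily', 'AGATC': '42', 'TTTTTTCT': '47', 'AATG': '48'},
]

# `in` on an items view looks for a single (key, value) tuple, so passing
# a whole items view evaluates to False instead of raising an error.
in_result = profile.items() in people[1].items()

# `<=` on items views is a subset test: every pair in profile is also in row.
matches = [row['name'] for row in people if profile.items() <= row.items()]

print(in_result)
print(matches)
```

The `in` result also suggests why the edited loop in the question never printed "FOUND" even though no error was raised.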

2 Answers

for STR in str_list:
    STR_original = STR  # I think this part should be changed. Why do you define STR_original here and then use STR on the lines below?
    str_length = len(STR)
    while STR in sequence:
        seq_count += 1
        STR += STR_original
        seq_count_list.append(seq_count)
    seq_master_list[STR[0:str_length]] = str(seq_count)
    seq_count = 0
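
For context on the `STR_original` question, the loop needs two values: the unchanged repeat unit and a candidate string that grows by one copy of the unit on each pass. A rough sketch of the same idea as a helper function follows; the name `longest_run` and the sample sequence are hypothetical and not part of the original code.

```python
def longest_run(sequence, unit):
    """Count the longest run of back-to-back copies of unit in sequence."""
    count = 0
    candidate = unit  # grows by one copy of unit per iteration
    while candidate in sequence:
        count += 1
        candidate += unit  # unit plays the role of STR_original
    return count


# Made-up sequence: two consecutive copies, then a separate single copy.
print(longest_run("AGATCAGATCTTAGATC", "AGATC"))
```

Keeping the unit separate from the growing candidate is also what lets the original code recover the dictionary key with `STR[0:str_length]`.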
ugur

For the dictionaries to compare equal, you first need to take the 'name' key and its value out of the row dictionary. While 'name' is still in the row dictionary, it will never == the seq_master_list dictionary.

found = False
for row in dna:
    name = row.pop('name') # remove this from the 'row' dict and save the name
    if row == seq_master_list:
        found = True
        print(name)
        break

if not found:
    print('no match')
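
Note that `row.pop('name')` removes the name from each row of `dna` as the loop runs. If the rows need to stay intact, one possible variation builds a copy without the 'name' key before comparing. This is only a sketch: `find_match`, `people`, and `profile` are hypothetical names, and the data is a shortened copy of the question's.

```python
def find_match(people, profile):
    """Return the name whose STR counts equal profile, or None if no row matches."""
    for row in people:
        # Copy the row without 'name' so the original dict is left untouched.
        counts = {key: value for key, value in row.items() if key != 'name'}
        if counts == profile:
            return row['name']
    return None


people = [
    {'name': 'Kingsley', 'AGATC': '7', 'AATG': '18'},
    {'name': 'Lavender', 'AGATC': '22', 'AATG': '43'},
]
profile = {'AGATC': '22', 'AATG': '43'}

name = find_match(people, profile)
print(name if name is not None else 'no match')
```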
Chris Charley