I have two 2D lists. Both are the same size, but the size is not known in advance (it differs from one pair of lists to another).

For instance:

A = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Teacher'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

B = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Police'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

I want to compare the lists element-wise (e.g. a[0][1] == b[0][1]) for all elements and print each difference along with its element index.

I would like the output to look something like this:

a[1][2] = Teacher <> b[1][2] = Police

It would be great if I could also compare the lists using a primary key (ID), in case the rows are not in the same order, with output like this:

Profession of ID = 1 does not match, i.e Teacher <> Police

Note: the file may be very large (a matrix of 100*10000).

Thanks.

Raphael
Thoufeeque

4 Answers

1

You can do this:

A = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Teacher'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

B = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Police'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

A = {a[0]: {'Name': a[1], 'Profession': a[2]} for a in A[1:]}
B = {b[0]: {'Name': b[1], 'Profession': b[2]} for b in B[1:]}

for a_id, a_content in A.items():
    a_profession = a_content['Profession']
    b_profession = B[a_id]['Profession']
    equal_profession = a_profession == b_profession
    match = 'matches' if equal_profession else 'does not match'
    diff_profession = f", i.e {a_profession} <> {b_profession}" if not equal_profession else ''
    print(f"Profession of ID = {a_id} {match}{diff_profession}")

Which outputs:

Profession of ID = 1 does not match, i.e Teacher <> Police
Profession of ID = 2 matches
Profession of ID = 3 matches
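
The approach above checks only the Profession column and expects every ID in A to exist in B. As a rough sketch of how it could be generalized, the hypothetical helper below (diff_by_id is not part of the original answer) compares every column and reports IDs that are missing from the second list. It assumes both lists share the same header row, in the same column order, and that the key values are unique.

```python
def diff_by_id(a, b, key="ID"):
    """Return one message per differing cell, matching rows by the key column.

    Assumption: a and b share the same header row (same column order).
    """
    header = a[0]
    key_pos = header.index(key)
    # Index the rows of b by key so each lookup is a dictionary access.
    b_rows = {row[key_pos]: row for row in b[1:]}
    messages = []
    for row in a[1:]:
        row_id = row[key_pos]
        other = b_rows.get(row_id)
        if other is None:
            messages.append(f"{key} = {row_id} is missing from the second list")
            continue
        # Walk the columns of both rows in parallel, with their header names.
        for name, left, right in zip(header, row, other):
            if left != right:
                messages.append(
                    f"{name} of {key} = {row_id} does not match, i.e {left} <> {right}"
                )
    return messages


A = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Teacher'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]
# B is deliberately out of order and lacks ID 2 to exercise both cases.
B = [['ID', 'Name', 'Profession'], [3, 'Harry', 'Lawyer'], [1, 'Tom', 'Police']]

for message in diff_by_id(A, B):
    print(message)
```

Because the rows of B are looked up through a dictionary, the row order in the two lists does not matter.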

marcos
0

Try this:

A = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Teacher'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

B = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Police'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

# Assumes A and B have the same number of rows
for k in range(len(A)):
    i = A[k]
    j = B[k]

    # Assumes row k has the same length in both A and B;
    # walk the columns from last to first
    l = len(i) - 1
    while l >= 0:
        # compare by value (!=), not by identity (is)
        if i[l] != j[l]:
            print(f"a[{k}][{l}] = {i[l]} <> b[{k}][{l}] = {j[l]}")
        l -= 1
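
For what it's worth, the same positional comparison can also be sketched with enumerate and zip, which avoids the manual index bookkeeping. The generator name positional_diffs is made up for this illustration. It compares with != so that equal values match even when they are distinct objects, and it assumes both lists have the same shape (zip silently stops at the shorter input).

```python
def positional_diffs(a, b):
    """Yield (row, col, left, right) for every cell that differs.

    Assumption: a and b have the same number of rows and columns.
    """
    for r, (row_a, row_b) in enumerate(zip(a, b)):
        for c, (left, right) in enumerate(zip(row_a, row_b)):
            if left != right:  # value comparison, not identity
                yield r, c, left, right


A = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Teacher'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]
B = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Police'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

for r, c, left, right in positional_diffs(A, B):
    print(f"a[{r}][{c}] = {left} <> b[{r}][{c}] = {right}")
```

Since it is a generator, differences are produced one at a time rather than collected first, which may help with a large matrix.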
mnislam01
  • Nothing should be printed when the elements match. With the code you provided, I get the output below: a[0][2] = Profession <> b[0][2] = Profession a[0][1] = Name <> b[0][1] = Name a[0][0] = ID <> b[0][0] = ID a[1][2] = Teacher <> b[1][2] = Police a[1][1] = Tom <> b[1][1] = Tom a[2][2] = Actor <> b[2][2] = Actor a[2][1] = Dick <> b[2][1] = Dick But it should only be: a[1][2] = Teacher <> b[1][2] = Police – Thoufeeque Jan 07 '20 at 20:57
  • Try again now; this should work, as I have tested it with the lists you provided. If the lists you are testing with are different, can you show them? – mnislam01 Jan 08 '20 at 10:28
0

The following code should do the work:

for i in range(len(A)):
    # skip rows that are identical
    if A[i] == B[i]:
        continue
    print(f'Differences in row{i}:')
    for j in range(len(A[i])):
        if A[i][j] != B[i][j]:
            print(f'    in col {j}: A = {A[i][j]}, B = {B[i][j]}')

For given A, B it will print:

Differences in row1:
    in col 2: A = Teacher, B = Police

This should work for any number of rows and columns, as long as A and B have the same shape. Please note that f-strings were only introduced in Python 3.6, so if you get a syntax error on an older version you can switch to str.format.
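
As a minimal sketch of that fallback, here is the same loop written with str.format instead of f-strings. The function name format_differences is only for illustration; it collects the lines in a list so they can be printed or inspected.

```python
def format_differences(a, b):
    """Build the report lines with str.format (works before Python 3.6)."""
    lines = []
    for i in range(len(a)):
        if a[i] == b[i]:
            continue  # identical rows produce no output
        lines.append('Differences in row{}:'.format(i))
        for j in range(len(a[i])):
            if a[i][j] != b[i][j]:
                lines.append('    in col {}: A = {}, B = {}'.format(j, a[i][j], b[i][j]))
    return lines


A = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Teacher'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]
B = [['ID', 'Name', 'Profession'], [1, 'Tom', 'Police'], [2, 'Dick', 'Actor'], [3, 'Harry', 'Lawyer']]

for line in format_differences(A, B):
    print(line)
```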

Isdj
0

To compare your list elements using their primary key, let's index them by key using a dictionary:

# index the data rows by ID, skipping the header row
A_dict = {a[0]: a[1:] for a in A[1:]}
B_dict = {b[0]: b[1:] for b in B[1:]}

Then, iterate on the keys and columns and print the differences:

column_names = A[0][1:]
# row_id is used instead of id to avoid shadowing the built-in id()
for row_id in A_dict:
    for column in range(len(column_names)):
        if A_dict[row_id][column] != B_dict[row_id][column]:
            print(f"{column_names[column]} of ID = {row_id} does not match, "
                  f"i.e {A_dict[row_id][column]} <> {B_dict[row_id][column]}")

Which gives the output you wanted:

Profession of ID = 1 does not match, i.e Teacher <> Police

Edit: To answer your comment, if your ID is not necessarily in the first column, it is a little more complicated:

# get the number of the 'ID' column
column_names = A[0]
column_id = column_names.index('ID')

# get the column names without 'ID'
values_name = column_names[0:column_id] + column_names[column_id+1:]

# create a dictionary with keys in column `column_id`
# and values the list of the other column values
A_dict = {a[column_id]: a[0:column_id] + a[column_id+1:] for a in A}
B_dict = {b[column_id]: b[0:column_id] + b[column_id+1:] for b in B}

# iterate on the keys and on the other columns and print the differences
# (row_id is used instead of id to avoid shadowing the built-in id())
for row_id in A_dict:
    for column in range(len(values_name)):
        if A_dict[row_id][column] != B_dict[row_id][column]:
            print(f"{values_name[column]} of ID = {row_id} does not match, "
                  f"i.e {A_dict[row_id][column]} <> {B_dict[row_id][column]}")
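
Since the comments mention that the data originally came from CSV files, here is a rough sketch of the same keyed comparison done directly on CSV input with the standard csv module. The helper names load_by_key and diff_csv are hypothetical, and the in-memory io.StringIO objects only stand in for real files. Note that csv.DictReader yields every value as a string and looks columns up by header name, so the position of the ID column does not matter.

```python
import csv
import io


def load_by_key(csv_file, key="ID"):
    """Read an open CSV file into {key value: {column name: value}}."""
    return {row[key]: row for row in csv.DictReader(csv_file)}


def diff_csv(file_a, file_b, key="ID"):
    """Return one message per differing cell for rows present in both files."""
    rows_a = load_by_key(file_a, key)
    rows_b = load_by_key(file_b, key)
    messages = []
    for row_id, row_a in rows_a.items():
        row_b = rows_b.get(row_id)
        if row_b is None:
            continue  # this sketch skips IDs that exist in only one file
        for column, value_a in row_a.items():
            if column != key and value_a != row_b.get(column):
                messages.append(
                    f"{column} of {key} = {row_id} does not match, "
                    f"i.e {value_a} <> {row_b.get(column)}"
                )
    return messages


# In-memory stand-ins for two CSV files; with real files one would
# typically pass objects from open(path, newline="") instead.
csv_a = io.StringIO("Name,ID,Profession\nTom,1,Teacher\nDick,2,Actor\n")
csv_b = io.StringIO("Name,ID,Profession\nTom,1,Police\nDick,2,Actor\n")

for message in diff_csv(csv_a, csv_b):
    print(message)
```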
marc_s
Raphael
  • Do not hesitate to ask for details if I used a syntax you are unfamiliar with. – Raphael Jan 07 '20 at 18:07
  • I am a newbie to coding. This was not a list initially; it was in fact a CSV file, which I managed to open and somehow convert to a nested list. Could you please help me get a similar result for a comparison of CSV files? You can refer to the link below for the detailed question: https://stackoverflow.com/questions/59636059/how-to-compare-2-csv-files-in-python-value-by-value-and-print-the-difference – Thoufeeque Jan 07 '20 at 21:07
  • @Thoufeeque No problem, but what I don't understand is: if you managed to open and convert your CSVs to nested lists, why can't you compare the lists using this answer? Do you get an error? – Raphael Jan 07 '20 at 22:02
  • No your code is working perfectly fine. The problem is I've got the list line by line for only 2 lines which already has so many steps which I feel are irrelevant.. Hope you understand. Please reply on https://stackoverflow.com/questions/59636059/how-to-compare-2-CSV-files-in-python-value-by-value-and-print-the-difference if you have a solution – Thoufeeque Jan 08 '20 at 06:37
  • Also please note that the primary key could be anywhere. Not necessarily at the 1st position. For example: A = [['Name', 'ID', 'Profession'], ['Tom', 1, 'Teacher'], ['Dick', 2, 'Actor']] B = [['Name', 'ID', 'Profession'], ['Tom', 1, 'Police'], ['Dick', 2, 'Actor']] – Thoufeeque Jan 08 '20 at 07:50
  • I'm not sure I understand... Can you mark this answer as accepted and edit the other question to show the list you currently have, and the list you want to make? And also include how you got the list (your code)? Thanks – Raphael Jan 08 '20 at 08:08
  • OK, I will adapt my answer for the primary key being anywhere – Raphael Jan 08 '20 at 08:09
  • I have updated the other question: https://stackoverflow.com/questions/59636059/how-to-compare-2-csv-files-in-python-value-by-value-and-print-the-difference – Thoufeeque Jan 08 '20 at 09:14
  • The updated code is working fine for all the columns before the primary key, but it is taking column - 1 for all the columns after the primary key. For the Name column it is working fine, but for the Profession column it is printing 'id'. Output example: Name of ID = 1 does not match, i.e Tom <> Thomas id of ID = 1 does not match, i.e Teacher <> Police – Thoufeeque Jan 08 '20 at 11:10
  • You are right. I updated the answer to correct this mistake – Raphael Jan 08 '20 at 11:43