I am comparing two CSV files in R/RStudio. I would like to compare them line by line, checking the columns in a specific order. If my first CSV looks like:
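# strip.white = TRUE drops the space after each comma;
# reading "number" as character keeps the leading zeros in 00001
first <- read.csv(text = "
name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
D123, 12000, another, 2.0, NA", strip.white = TRUE, colClasses = c(number = "character"))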
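and my second CSV looks like:
# same options as above so that both files are read the same way
second <- read.csv(text = "
name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
E456, 45678, third, 2.0, ", strip.white = TRUE, colClasses = c(number = "character"))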
I'd like to have a for loop that looks something like:
for line in csv1:
    if number exists in csv2:
        if csv1$name == csv2$name:
            if csv1$description == csv2$description:
                if csv1$manufacturer == csv2$manufacturer:
                    continue  # nothing changed, move on to the next line
                else:
                    add line to csv called changed, append a value for "changed" column to manufacturer
            else:
                add line to csv called changed, append a value for "changed" column to description
and so on, so that the output looks like:
name  number  description  version  manufacturer      changed
A123  12345   first piece  1.0      fakemanufacturer  number
B107  00001   second       1.0      abcde parts       no change
C204  20000   third                 newmanufacturer   number, manufacturer
D123  12000   another      2.0                        removed
E456  45678   third        2.0                        added
If something doesn't match at any point in this loop, I'd like to know where the mismatch was. Lines can be matched on number OR description. For example, given the two lines above, I would be able to tell that number changed between the two CSV files. Thanks in advance for any help!
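
A minimal sketch of how the loop described above might be written in R, under some assumptions that are not in the post: rows are matched on number only (the fallback match on description is not handled), each number appears at most once per file, and every column is read as character so that 00001 keeps its zeros and NA compares equal to NA. The helper names read_parts and compare_files are hypothetical.

```r
# Hypothetical helper: read everything as character, trim the spaces after
# commas, and treat both "NA" and empty fields as missing.
read_parts <- function(txt) {
  read.csv(text = txt, strip.white = TRUE, colClasses = "character",
           na.strings = c("NA", ""))
}

first <- read_parts("name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
D123, 12000, another, 2.0, NA")

second <- read_parts("name, number, description, version, manufacturer
A123, 12345, first piece, 1.0, fakemanufacturer
B107, 00001, second, 1.0, abcde parts
C203, 20000, third, NA, efgh parts
E456, 45678, third, 2.0, ")

# Hypothetical helper: for every key seen in either file, report which of
# the `check` columns differ, or whether the row was added or removed.
compare_files <- function(old, new, key = "number",
                          check = c("name", "description", "manufacturer")) {
  keys <- union(old[[key]], new[[key]])
  rows <- lapply(keys, function(k) {
    o <- old[which(old[[key]] == k), , drop = FALSE]
    n <- new[which(new[[key]] == k), , drop = FALSE]
    if (nrow(n) == 0) {
      status <- "removed"
    } else if (nrow(o) == 0) {
      status <- "added"
    } else {
      # identical() counts NA vs NA as a match, unlike `==`
      differs <- vapply(check, function(col) !identical(o[[col]], n[[col]]),
                        logical(1))
      status <- if (any(differs)) paste(check[differs], collapse = ", ") else "no change"
    }
    row <- if (nrow(n) > 0) n else o  # prefer the newer version of the row
    row$changed <- status
    row
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

result <- compare_files(first, second)
print(result)
```

With the sample data exactly as posted, the first three rows are identical in both files, so this sketch reports "no change" for them, "removed" for number 12000 and "added" for number 45678. Because the matching key is number, a changed number shows up as one removed row plus one added row rather than as a single "number" change; detecting that case would need the fallback match on description.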