there are many questions asked about rearranging CSV files with python but I could not find an answer to the following problem:
I want to "merge" csv files on a specific column in such way that the data is joined. This reminds me of an SQL join by the way. Lets say:
There is a number of csv files which all look the same. For simplicity, lets assume there are only four items per row - three data items and a category. So one file looks like:
a,b,c,category1
e,f,g,category2
a,c,c,category3
Another one like:
0,0,0,category1
1,1,1,category1
5,5,5,category2
6,6,6,category2
9,9,9,category4
Now, I want to merge these files to have the following result:
a,b,c,0,0,0,category1
e,f,g,5,5,5,category2
So, the rows need to be joined in such way that the items of every lines are combined while the "category" item is the joining key and that data may be dropped if there are not enough matching rows. Best would be to have a variable number of files so that not only two but X files can be merged (or joined) this way.
Is there a pythonic way to do this?