I have a dataset containing historical transaction records for real estate properties. Each property has an ID number. To check if the data is complete, for each property I am identifying a "transaction chain": I take the original buyer, and go through all intermediate buyer/seller combinations until I reach the final buyer of record. So for data that looks like this:
Buyer|Seller|propertyID
Bob|Jane|23
Tim|Bob|23
Karl|Tim|23
The transaction chain will look like: [Jane, Bob, Tim, Karl]
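To make the goal concrete, here is a minimal sketch of how that chain falls out of the sample rows above. It assumes the Buyer|Seller|propertyID column order shown in the sample and that each person sells a given property at most once; `build_chain` is a made-up helper name for illustration, not the code I run on the full data.

```python
# Sample rows in the Buyer|Seller|propertyID layout shown above.
rows = [
    "Bob|Jane|23",
    "Tim|Bob|23",
    "Karl|Tim|23",
]

def build_chain(rows):
    """Return the owners from first seller to final buyer (hypothetical helper)."""
    # Map each seller to the buyer they sold to.
    sold_to = {}
    for row in rows:
        buyer, seller, _property_id = row.split("|")
        sold_to[seller] = buyer

    # The first seller is the only seller who never appears as a buyer.
    buyers = set(sold_to.values())
    starts = [s for s in sold_to if s not in buyers]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting seller")

    # Follow seller -> buyer links with a loop; stop at the final buyer.
    chain = [starts[0]]
    while chain[-1] in sold_to:
        nxt = sold_to[chain[-1]]
        if nxt in chain:  # guard against a cycle in bad data
            raise ValueError("cycle detected")
        chain.append(nxt)
    return chain

print(build_chain(rows))  # ['Jane', 'Bob', 'Tim', 'Karl']
```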
I am using three datasets to do this. The first contains the names of only the first buyer of each property. The second contains the names of all intermediate buyers and sellers, and the third contains only the final buyer for each property. I use three datasets so I can follow the process given by vikramls's answer here.
In my version of the graph dictionary, each seller is a key to its corresponding buyer, and the oft-cited find_path function finds the path from first seller to last buyer. The problem is that the dataset is very large, so I get a "maximum recursion depth exceeded" error. I think I can solve this by nesting the graph dictionary inside another dictionary where the key is the property ID number, and then searching for the path within each ID group. However, when I tried:
graph = {}
propertyIDgraph = {}
with open('buyersAndSellers.txt','r') as f:
    for row in f:
        propertyid, seller, buyer = row.strip('\n').split('|')
        graph.setdefault(seller, []).append(buyer)
        propertyIDgraph.setdefault(propertyid, []).append(graph)
f.close()
It assigned every buyer/seller combination to every property id. I would like it to assign the buyers and sellers to only their corresponding property ID.
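To be concrete about the result I want, here is a minimal sketch of the target structure. I suspect the cause is that the loop above appends the same shared `graph` object for every row, so each property ends up pointing at one dictionary holding all pairs. The sketch assumes lines in the propertyid|seller|buyer order that the loop unpacks; `group_by_property` is a made-up helper name, and property 24 is invented sample data to show the separation.

```python
# Hypothetical sample lines in the propertyid|seller|buyer order that the
# loop above unpacks; property 24 is invented to show the separation.
lines = [
    "23|Jane|Bob\n",
    "23|Bob|Tim\n",
    "23|Tim|Karl\n",
    "24|Ann|Bob\n",
]

def group_by_property(lines):
    """Build {propertyid: {seller: [buyers]}} (hypothetical helper)."""
    graphs = {}
    for line in lines:
        propertyid, seller, buyer = line.rstrip("\n").split("|")
        # Fetch (or create) the graph that belongs to this property only,
        # then record the sale inside it.
        graph = graphs.setdefault(propertyid, {})
        graph.setdefault(seller, []).append(buyer)
    return graphs

property_graphs = group_by_property(lines)
print(property_graphs["23"])  # {'Jane': ['Bob'], 'Bob': ['Tim'], 'Tim': ['Karl']}
print(property_graphs["24"])  # {'Ann': ['Bob']}
```

With a structure like this, a path search would only ever walk the sellers and buyers of one property, so the depth of any search is bounded by the length of that property's chain.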