I have a dataframe containing variables on trade flows between pairs of countries, one country being the exporter and one being the importer for each row.
I want to create an ID number variable which identifies each unordered country pair, giving the same ID number to each pair regardless of which is the exporter and which is the importer. So Australia-United States would have the same ID as United States-Australia but a different ID to Australia-Great Britain.
This is an example of what the data with the ID variable would look like.
YEAR ISO_EXP ISO_IMP UNORD_PAIR_ID
1970 AUS GBR 1
1970 AUS USA 2
1970 AUS ZIM 3
1970 GBR AUS 1
1970 GBR USA 4
1970 GBR ZIM 5
1970 USA AUS 2
1970 USA GBR 4
1970 USA ZIM 6
1970 ZIM AUS 3
1970 ZIM GBR 5
1970 ZIM USA 6
My dataset has around 2 million rows, comprising around 44,000 country pairs over 47 years.
I have used the following code to create an ID for each ordered country pair.
library(dplyr)
data$ORD_PAIR_ID <- data %>% group_indices(ISO_EXP, ISO_IMP)
But I have not been able to work out how to create an ID for unordered pairs.
Any help greatly appreciated.
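One possible approach, offered as a minimal sketch rather than a definitive solution: order the two country codes within each row alphabetically, so that both directions of a pair collapse to the same key, and then number the distinct keys. The sketch below uses base R only and rebuilds the 12-row example as toy data. The names `first`, `second` and `pair_key` are made up for illustration, and it assumes ISO_EXP and ISO_IMP are character columns (factor columns would need `as.character()` first).

```r
# Toy data in the same layout as the example in the question.
data <- data.frame(
  YEAR = 1970,
  ISO_EXP = c("AUS", "AUS", "AUS", "GBR", "GBR", "GBR",
              "USA", "USA", "USA", "ZIM", "ZIM", "ZIM"),
  ISO_IMP = c("GBR", "USA", "ZIM", "AUS", "USA", "ZIM",
              "AUS", "GBR", "ZIM", "AUS", "GBR", "USA"),
  stringsAsFactors = FALSE
)

# Put the alphabetically smaller code first, so AUS-USA and USA-AUS
# produce the same key (pmin/pmax work element-wise on character vectors).
first  <- pmin(data$ISO_EXP, data$ISO_IMP)
second <- pmax(data$ISO_EXP, data$ISO_IMP)
pair_key <- paste(first, second, sep = "_")

# Number the distinct keys. Factor levels are sorted, so the IDs follow
# the alphabetical order of the keys.
data$UNORD_PAIR_ID <- as.integer(factor(pair_key))

print(data)
```

Everything here is vectorised, so it should scale to a couple of million rows, though that has not been tested on the real dataset. The same `pmin`/`pmax` key could also be built inside a dplyr pipeline if that fits the rest of the code better.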