Unfortunately, I don't know how to produce a hypothetical dataset to illustrate my question, so I'll just describe what I'm trying to do and hope someone understands.
I have two datasets.
df1: it records which political party is in power in each country in each year. For example: the Republican Party was in power in the US from 2017 to 2021.
df2: it records how political parties change their profiles over time. For example: it records that party A is ideologically leftist in 1970 and center-leftist in 1980. This df does not have observations for every year. For example, there is no data on party A's ideology in 1971; the record jumps from 1970 directly to 1980.
What I'm trying to do is bring each party's ideology from df2 into df1.
For example, df1 looks like this:
Country | Year | Government's Political Party ID
X | 1990 | 340
X | 1991 | 340
X | 1992 | 340
X | 1993 | 340
df2 looks like this:
Country | Year | Political Party ID | Ideology
X | 1970 | 340 | center
X | 1985 | 340 | center
X | 1992 | 340 | center-left
X | 1999 | 340 | center-left
df1 and df2 use the same coding for political parties. So you may be thinking: well, why not just perform dplyr::left_join()?
If I do this, my data will look like this:
Country | Year | Government's Political Party ID | Ideology
X | 1990 | 340 | NA
X | 1991 | 340 | NA
X | 1992 | 340 | center-left
X | 1993 | 340 | NA
It matches on country, year and ID, but only the years that appear in both datasets get an ideology, which is an unsatisfactory result.
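For reference, the exact-match behaviour can be reproduced with a small mock version of the two tables. This is only a sketch: the column name `PartyID` is shortened shorthand for the party ID columns, not the real name in either dataset.

```r
library(dplyr)

# Mock data mirroring the example tables; PartyID is a hypothetical column name.
df1 <- tibble(
  Country = "X",
  Year    = c(1990, 1991, 1992, 1993),
  PartyID = 340
)

df2 <- tibble(
  Country  = "X",
  Year     = c(1970, 1985, 1992, 1999),
  PartyID  = 340,
  Ideology = c("center", "center", "center-left", "center-left")
)

# Exact-match join: a df1 row gets an ideology only if df2 has the same year.
exact <- left_join(df1, df2, by = c("Country", "Year", "PartyID"))
print(exact)
```

Every df1 year without an identical year in df2 comes back with a missing `Ideology`.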
Instead, I want my result to look like this:
Country | Year | Government's Political Party ID | Ideology
X | 1990 | 340 | center
X | 1991 | 340 | center
X | 1992 | 340 | center-left
X | 1993 | 340 | center-left
df2 does not have an X-1990 observation, but since the most recent earlier observation (1985) is center, it can be inferred that 340's ideology was still center in 1990.
How do I perform this?
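One possible approach, sketched under assumptions rather than offered as a definitive answer, is a rolling join: for each df1 row, take the most recent df2 observation at or before that year for the same country and party. The sketch below assumes dplyr 1.1.0 or later (for `join_by()` and `closest()`), uses the hypothetical column name `PartyID`, and renames df2's year to a hypothetical `ObsYear` so the two year columns stay distinct after the join.

```r
library(dplyr)  # join_by() and closest() require dplyr >= 1.1.0

# Mock data mirroring the example tables; PartyID and ObsYear are hypothetical names.
df1 <- tibble(
  Country = "X",
  Year    = c(1990, 1991, 1992, 1993),
  PartyID = 340
)

df2 <- tibble(
  Country  = "X",
  ObsYear  = c(1970, 1985, 1992, 1999),  # year the ideology was observed
  PartyID  = 340,
  Ideology = c("center", "center", "center-left", "center-left")
)

# Rolling join: match on country and party, then keep only the df2 row
# with the latest ObsYear that is not after the df1 Year.
rolled <- left_join(
  df1, df2,
  by = join_by(Country, PartyID, closest(Year >= ObsYear))
)
print(rolled)
```

Keeping `ObsYear` in the result shows which df2 observation each ideology was carried forward from. A df1 year earlier than a party's first df2 observation would still get a missing `Ideology`, since there is nothing to carry forward.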