R - Problem Setup of Dataframe - 3 categorical variables

Question

This might be a silly question, but my brain is just stuck. I have a dataset with the variables country (6 levels), relationship themes (10 levels), and cultural dimensions (6 levels).

In the picture you can see the first 4 relationship themes/levels; the other variables/sub-categories follow further to the right. I already transmuted the table etc. for some small tests, where I defined the subcategories for one of them to test on the other. For the basic table it looked like this:

But what I really want is to combine all 3 categorical variables within one table, so that I can use each of them as a grouping factor. My brain just can't visualize it atm. So I want to have country as a variable with 6 levels, relationship themes as a variable with 10 levels, and cultural dimensions as a variable with 6 levels, all represented in one table so I can do histograms that show their relationship, for example.
Since the data is super small I doubt anything beyond correlations etc. makes sense to test, but it would be the goal to figure out the relationship of the frequency of relationship themes with the cultural dimensions.

I hope this makes sense somehow and thanks in advance!

Just a minor comment: a nicer way to write `(Country == "GER" | Country == "CH" | Country == "UK"| Country == "US"| Country == "IN"| Country == "JP")` is `Country %in% c("GER", "CH", "UK", "US", "IN", "JP")` — Gregor Thomas, Nov 23 '21 at 15:49
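
To illustrate the equivalence mentioned in the comment above, here is a minimal base R sketch. The `Country` vector is a made-up sample (the real column is not shown in the thread), and `long_form` and `short_form` are hypothetical names used only for this comparison.

```r
# Hypothetical sample values; the asker's actual Country column is not available.
Country <- c("GER", "CH", "FR", "JP", "US", "BR")

# The chained comparison from the original code.
long_form <- Country == "GER" | Country == "CH" | Country == "UK" |
  Country == "US" | Country == "IN" | Country == "JP"

# The shorter membership test suggested in the comment.
short_form <- Country %in% c("GER", "CH", "UK", "US", "IN", "JP")

# Both give the same logical vector when Country has no missing values.
identical(long_form, short_form)
```

One difference worth knowing: if `Country` contains `NA`, the chained `==` version returns `NA` for that element, whereas `%in%` returns `FALSE`.
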
As for the main part of your question, you want to pivot your data from "wide" to "long" format. See the marked dupe for several methods. I'd focus on the answer using `tidyr`'s `pivot_longer` function. If you need help with the transformation, I'd suggest posting a new question with copy/pasteable data (`dput(your_data[1:10, ])` will make a copy/pasteable version of the first 10 rows of your data) and specifying exactly which columns are "themes" and which are "cultural". — Gregor Thomas, Nov 23 '21 at 15:50
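
A rough sketch of the wide-to-long reshaping described above, using `tidyr::pivot_longer`. Everything about the data here is an assumption: the thread never shows the real columns, so the `theme_` and `dim_` prefixes, the column names, and all values are invented for illustration. The idea is to pivot the theme columns first and the cultural-dimension columns second, which leaves one row per country, theme, and dimension combination.

```r
library(tidyr)

# Invented wide data: one row per country, one column per theme or dimension.
# The "theme_" / "dim_" prefixes are an assumption, not the asker's real names.
wide <- data.frame(
  Country = c("GER", "CH", "UK"),
  theme_trust = c(4, 2, 5),
  theme_conflict = c(1, 3, 2),
  dim_individualism = c(10, 20, 30),
  dim_power_distance = c(7, 8, 9)
)

# Step 1: stack the theme columns into a theme name plus its count.
themes_long <- pivot_longer(
  wide,
  cols = starts_with("theme_"),
  names_to = "theme",
  names_prefix = "theme_",
  values_to = "theme_count"
)

# Step 2: stack the dimension columns into a dimension name plus its score.
long <- pivot_longer(
  themes_long,
  cols = starts_with("dim_"),
  names_to = "dimension",
  names_prefix = "dim_",
  values_to = "dimension_score"
)

long
```

In the resulting table, `Country`, `theme`, and `dimension` are each a single column, so any of them can serve as a grouping variable. Whether this crossed layout is the right one depends on how the real data are structured.
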
Hi Gregor, thanks for that! I did pivot the table a couple of times to be able to have the cultural dimensions and social themes as variables, but this way I only ever managed to use one of them at a time. I also struggled because, when I have the countries at the top, each of them is a single variable/row and then I can't make them a grouping variable anymore. Is there a solution for that? :) — Jannike, Nov 25 '21 at 14:25
I don't know how to explain better or help more without some sample data I can import and work with. I can't import your picture of a data frame into R, so I am stuck until you share something like `dput(your_data[1:6, ])` (or a similar sample). I also can't help more unless you explain which of your pictured variables are "relationship themes" and which are "cultural dimensions". Your picture shows 5 non-country columns but I don't know which are which. — Gregor Thomas, Nov 25 '21 at 15:31
I also think you're somewhat misunderstanding... when you say "when I have the countries at the top, each of them is a single variable/row". I don't think you should have the countries at the top - I think your country column looks good. I also don't think each country should be a single row, I think you probably need several rows per country. And generally when I say "variable" I mean "column", so when you write "variable/row" it makes me think you are using a different definition - variables should be columns, not rows. — Gregor Thomas, Nov 25 '21 at 15:34
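
As a small illustration of "variables are columns, with several rows per country", here is a base R sketch. The data frame, its column names, and its counts are all hypothetical, since no sample data was shared in the thread.

```r
# Hypothetical long-format data: every country/theme pair gets its own row.
long <- expand.grid(
  Country = c("GER", "CH"),
  theme = c("trust", "conflict", "support"),
  stringsAsFactors = TRUE
)
long$theme_count <- c(4, 2, 1, 3, 0, 5)  # invented counts

# Each country now appears on several rows (one per theme).
table(long$Country)

# Because Country is an ordinary column, it works directly as a grouping variable.
totals <- aggregate(theme_count ~ Country, data = long, FUN = sum)
totals
```

The same pattern applies to `theme`, or to any other categorical column: group by whichever column is of interest without reshaping the table again.
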

