0

I have three dataframes:

cities_df, which contains the name of a city amongst other fields

cities_df <- data.frame(
  city_name = c("London", "Newcastle Upon Tyne", "Gateshead"),
  city_population = c(8673713L, 289835L, 120046L),
  city_area = c(1572L, 114L, NA)
)

states_df, which contains the name of a state amongst other fields

states_df <- data.frame(
  state_name = c("Greater London", "Tyne and Wear"),
  state_population = c(123, 456)
)

dictionary_df, which contains the whole list of cities and their corresponding state.

dictionary_df <- data.frame(
  city_name = c("London", "Newcastle Upon Tyne", "Gateshead"),
  state = c("Greater London", "Tyne and Wear", "Tyne and Wear")
)

Is there any way to relate/link cities_df and states_df dataframes so I can have an easy way to get all the cities' fields that belong to a certain state?

ccamara
  • 1,141
  • 1
  • 12
  • 32
  • Do you have a common key? Can you provide sample data? – Michael Griffiths Oct 10 '16 at 17:42
  • No, I don't have any common date field. Let me work on the sample dataframes, as in fact I am working with different data but for sake of clarity I renamed it to cities and states, which everyone understands the type of relationship between both DF. – ccamara Oct 10 '16 at 17:45
  • I have uploaded two sample dataframes with few rows, just to get an idea of the data structure. – ccamara Oct 10 '16 at 17:56
  • 2
    I think it could be both easier for you and us if you just added the data into the question. Why would you put it into those links? Unless you are planning to remove the data evantually – David Arenburg Oct 10 '16 at 18:02
  • Agreed, but I did it the best way I could. I would be able to add code to download that dataframe, but that's not what you are asking for. If you think there's a better way, could you please tell me how to do it? – ccamara Oct 10 '16 at 18:13
  • Hmm... Simply `cities_df <- read.csv(text = "city_name,city_population,city_area London,8673713,1572 Newcastle Upon Tyne,289835,114 Gateshead,120046,NA")` and `states_df <- read.csv(text = "state_name Greater London Tyne and Wear")`? – David Arenburg Oct 10 '16 at 18:15
  • 2
    [How to make a great R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Rich Scriven Oct 10 '16 at 18:15
  • Thanks for the link, @RichScriven, for the tip (David Arenburg) and thanks for the edit proposal (Michael Griffiths). Hope it's clearer now. – ccamara Oct 10 '16 at 18:23
  • 1
    I have provided more details and added a third dataframe – ccamara Oct 10 '16 at 18:50
  • I'm assuming that your answer to my question is, then, to merge those dataframes. Thank you for it and thanks for the link. – ccamara Oct 10 '16 at 18:56
  • Maybe one comment when merging by columns that have different names but corresponding values you can set `by.x` and `by.y` in the merge command. – Daniel Winkler Oct 10 '16 at 22:46

1 Answers1

1

Using merge, see linked post for more options:

# tidy up column name to match with other column names
colnames(dictionary_df)[2] <- "state_name"

# merge to get state names
x <- merge(cities_df, dictionary_df, by = "city_name")

# merge to get city names
y <- merge(states_df, dictionary_df, by = "state_name")

# merge by city and state
result <- merge(x, y, by = c("state_name", "city_name"))
result
#       state_name           city_name city_population city_area state_population
# 1 Greater London              London         8673713      1572              123
# 2  Tyne and Wear           Gateshead          120046        NA              456
# 3  Tyne and Wear Newcastle Upon Tyne          289835       114              456
Community
  • 1
  • 1
zx8754
  • 52,746
  • 12
  • 114
  • 209