3

I'm trying to combine two columns of a SpatialDataFrame (read from a shapefile) into one column in R. Both columns contain missing values, so when I join them the result is the name plus "NA". I would like the NAs not to appear in my new column. I used the paste function.

  This is the structure of my SpatialDataFrame:


  ID           city                city2
1  1      saõ paulo                 <NA>
2  2      Rio de Janeiro            <NA>
3  3           <NA>            Belo Horizonte
4  4           <NA>            Curitiba

Note: this is not my original data, which has more columns.

I used this:

data$newCity <- paste(data$city, data$city2) # I don't want NA to appear in the result

This is what I get:

ID          city          city2                newCity
  1      saõ paulo         <NA>            saõ paulo NA
  2  Rio de Janeiro        <NA>            Rio de Janeiro NA
  3        <NA>       Belo Horizonte       NA Belo Horizonte
  4        <NA>       Curitiba             NA Curitiba

In fact this would be the desired result:

ID          city          city2                 newCity
 1      saõ paulo         <NA>                saõ paulo
 2    Rio de Janeiro      <NA>               Rio de Janeiro
 3        <NA>         Belo Horizonte         Belo Horizonte
 4        <NA>          Curitiba              Curitiba

4 Answers

2

Another base R option could be:

with(df, pmax(city, city2, na.rm = TRUE))

[1] "sao paulo"      "rio de janeiro" "Belo Horizonte" "Curitiba" 
tmfmnk
1

paste glues the character columns together, separated by a space (the default is sep = " "), and converts each NA to the string "NA". Try this instead:

data$newCity <- ifelse(is.na(data$city), data$city2, data$city)
stefan
  • Hi, thank you. I tried this code but it didn't work. Here is what it returned: ID city city2 newCity 1 1 saõ paulo 2 2 2 Rio de Janeiro 1 3 4 Belo Horizonte 1 4 5 Curitiba 2 – Kledson Lemes Mar 04 '20 at 19:47
  • It looks like the character columns are actually factors. You can check this with `str(data)`, which shows the type of each variable. With factors, this worked for me: `df$newCity <- ifelse(is.na(as.character(df$city)), as.character(df$city2), as.character(df$city))`. A preferable solution would be to convert the factors to character columns right after loading the data. – stefan Mar 04 '20 at 20:12
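
The factor behaviour described in the comments above can be reproduced with a small made-up data frame. This is a minimal sketch, assuming the two columns were read in as factors; `df`, `codes`, and `is_fac` are illustrative names and the values are not the asker's real data.

```r
# Illustrative data; city and city2 are deliberately stored as factors
df <- data.frame(
  ID    = 1:4,
  city  = factor(c("sao paulo", "rio de janeiro", NA, NA)),
  city2 = factor(c(NA, NA, "Belo Horizonte", "Curitiba"))
)

# On factors, ifelse() returns the underlying integer codes, not the labels
codes <- ifelse(is.na(df$city), df$city2, df$city)

# Convert every factor column to character once, right after loading
is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], as.character)

# With character columns the original ifelse() returns the city names
df$newCity <- ifelse(is.na(df$city), df$city2, df$city)
```

Converting once up front keeps later calls free of repeated `as.character()` wrappers, which is the approach the last comment recommends.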
1

You can use unite() in tidyr:

library(tidyr)

df %>%
  unite(newCity, city:city2, remove = FALSE, na.rm = TRUE)

The argument na.rm = TRUE works only on character columns.

Darren Tsai
0

You can use the function coalesce from dplyr package:

df <- data.frame(ID = 1:4,
                 city = c("sao paulo", "rio de janeiro", NA, NA),
                 city2 = c(NA, NA, "Belo Horizonte", "Curitiba"), stringsAsFactors = FALSE)


library(dplyr)
df %>% mutate(City = coalesce(city, city2))
  ID           city          city2           City
1  1      sao paulo           <NA>      sao paulo
2  2 rio de janeiro           <NA> rio de janeiro
3  3           <NA> Belo Horizonte Belo Horizonte
4  4           <NA>       Curitiba       Curitiba
dc37
  • It returned the following error: `Error in UseMethod("mutate_") : no applicable method for 'mutate_' applied to an object of class "c('SpatialPolygonsDataFrame', 'SpatialPolygons', 'Spatial', 'SpatialVector')"`. I think it's because my real data is a SpatialDataFrame. – Kledson Lemes Mar 04 '20 at 20:11
  • I presume it is indeed related to your `SpatialPolygonsDataFrame`. Can you edit your question to provide the output of `head(NameofYourSpatialDataframe)`? – dc37 Mar 04 '20 at 21:41
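
One possible way around the `mutate_` error, sketched under assumptions: build the new column from plain vectors and assign it with `$`, as the `paste()` call in the question already does, instead of passing the whole spatial object to a dplyr verb. The sketch below uses an ordinary data frame as a stand-in, and `shp` is a hypothetical name. It has not been tested against a real SpatialPolygonsDataFrame.

```r
# Stand-in for the attribute table of the spatial object (hypothetical name)
shp <- data.frame(
  ID    = 1:4,
  city  = c("sao paulo", "rio de janeiro", NA, NA),
  city2 = c(NA, NA, "Belo Horizonte", "Curitiba"),
  stringsAsFactors = TRUE  # mimic columns that were loaded as factors
)

# Pull the columns out as plain character vectors; as.character() guards
# against factor columns
city  <- as.character(shp$city)
city2 <- as.character(shp$city2)

# Take city where it is present, otherwise city2, and assign with `$`,
# the same assignment style as the paste() call in the question
shp$newCity <- ifelse(is.na(city), city2, city)
```

If dplyr is loaded, `coalesce(city, city2)` on the two character vectors should give the same result, since only `mutate()` was applied to the spatial object in the failing call.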