After concatenating the variables using the `unite` function, some rows contain empty entries, which I need to delete in order to analyze the data.

thanks!!

I tried using a paste function to remove the empty entries directly while concatenating, but it didn't work.

Luise
  • Without seeing the full data, I'd suggest replacing the blanks with `NA` and then running `unite(..., na.rm = TRUE)` – thelatemail May 11 '23 at 00:39
  • By empty spaces, do you mean the instances where there is nothing between two commas? i.e., `a,b,,,e` should be `a,b,e`? – jpsmith May 11 '23 at 00:47
  • Welcome to Stack Overflow. We cannot read data into R from images. Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including a small representative dataset in a plain text format - for example the output from `dput(yourdata)`, if that is not too large. – neilfws May 11 '23 at 00:54
  • @jpsmith yes, that's what I mean – Luise May 11 '23 at 01:27
  • @thelatemail I tried what you said, but it didn't work – Luise May 11 '23 at 01:28
  • @Luise - without your data or an example, we can only guess what might have happened. The method absolutely *does* work (see diomedesdata's example code in their answer below). – thelatemail May 11 '23 at 01:49

2 Answers

Just to start off: please don't post photos of data or code! It's much more useful to share something like `dput(head(data, 10))`.

One option might be to use `str_replace_all()`, but it could be slow if your data is really big.

library(dplyr)
library(stringr)

df |>
  mutate(Productos = str_replace_all(Productos, ",{2,}", ",")) |> # collapse runs of commas into one
  mutate(Productos = str_replace_all(Productos, "^,|,$", ""))     # drop a leading or trailing comma

That said, it looks like `unite(..., remove = FALSE, na.rm = TRUE)` is going to be better. From the examples:

library(tidyr)

df <- expand_grid(x = c("a", NA), y = c("b", NA))

# To remove missing values:
df %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
#> # A tibble: 4 × 3
#>   z     x     y    
#>   <chr> <chr> <chr>
#> 1 "a_b" a     b    
#> 2 "a"   a     NA   
#> 3 "b"   NA    b    
#> 4 ""    NA    NA   
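
One possible reason the `unite(..., na.rm = TRUE)` approach above can seem not to work: `na.rm` only drops real `NA` values, so blanks stored as empty strings (`""`) still get pasted in and leave extra commas. Below is a minimal sketch of converting them first, assuming the blanks are exactly `""`. The column names `p1` to `p3` and the sample values are made up for illustration, since the original data was not posted.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: three product columns where blanks are empty strings
df <- tibble(
  p1 = c("Cervezas", "Cervezas", ""),
  p2 = c("Vinos", "", ""),
  p3 = c("", "Ginebras", "Rones")
)

result <- df |>
  mutate(across(everything(), ~ na_if(.x, ""))) |>   # turn "" into NA
  unite("Productos", p1:p3, sep = ",", na.rm = TRUE) # NA values are skipped

result$Productos
#> [1] "Cervezas,Vinos"    "Cervezas,Ginebras" "Rones"
```

If the blanks contain whitespace (for example `" "`), they would need trimming before `na_if()` can match them.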
diomedesdata

If your data look like this:

df <- data.frame(Productos = c("Cervezas,Vinos,,Tequilas,Aguardientes,,,Rones,Tabaqueria,Alimentos,Bebidas",
                               "Cervezas,,Ginebras,Tequilas,Aguardientes,,,Rones,Tabaqueria,,"))

You can replace runs of two or more commas with a single comma, then remove any leading or trailing comma, in base R using `gsub`:

gsub("^,|,$", "", gsub(",{2,}", ",", df$Productos))

Output:

[1] "Cervezas,Vinos,Tequilas,Aguardientes,Rones,Tabaqueria,Alimentos,Bebidas"
[2] "Cervezas,Ginebras,Tequilas,Aguardientes,Rones,Tabaqueria" 
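
To write the cleaned values back into the data frame, the two `gsub` calls could be wrapped in a small helper. This is only a sketch: `clean_commas` is a hypothetical name, and the sample strings are shortened versions of the ones above, plus one with leading commas.

```r
# Hypothetical helper wrapping the two gsub() calls shown above
clean_commas <- function(x) {
  x <- gsub(",{2,}", ",", x)  # collapse runs of commas into a single comma
  gsub("^,|,$", "", x)        # then drop one leading or trailing comma
}

df <- data.frame(Productos = c("Cervezas,Vinos,,Tequilas",
                               ",,Cervezas,,Ginebras,,"))

df$Productos <- clean_commas(df$Productos)
df$Productos
#> [1] "Cervezas,Vinos,Tequilas" "Cervezas,Ginebras"
```

The order matters: collapsing first guarantees that at most one comma remains at each end, so the second pattern can remove it. Stripping first would leave `",,Cervezas"` as `",Cervezas"`.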
jpsmith