Using R Separate_Rows doesn't work with a "|"

Question

I have a CSV file with a column that holds a variable-length list of items separated by a |.

I use the code below:

violations <- inspections %>% head(100) %>% 
  select(`Inspection ID`,Violations) %>% 
  separate_rows(Violations,sep = "|")

But this creates a new row for every character in the field (including spaces).

What am I missing here on how to separate this column?

Please add data using `dput` and show the expected output for the same. Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). — Ronak Shah, Aug 24 '20 at 00:18
Possible duplicate https://stackoverflow.com/questions/27721008/how-do-i-deal-with-special-characters-like-in-my-regex — Ronak Shah, Aug 24 '20 at 00:18
It's probably because separate_rows treats sep as a regex; try `separate_rows(Violations, sep = '\\|')` — Abdessabour Mtk, Aug 24 '20 at 00:21
Appreciate the feedback on how my question can be better and will look to increase the information presented in any future ones. — dfaberjob, Aug 24 '20 at 00:24
It seems the `'\\|'` suggestion solved my problem. Appreciate the suggestion, Abdessabour. — dfaberjob, Aug 24 '20 at 00:24

score 4 · Accepted Answer · answered Aug 23 '20 at 20:15

It's hard to help without a better description of your data and an example of what the correct output would look like. That said, I think part of your confusion is due to the documentation in separate_rows. A similar function, separate, documents its sep argument as:

If character, sep is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric values.

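The documentation for the sep argument of separate_rows doesn't say the same thing, though I think it behaves the same way. In a regular expression, | is the alternation operator, so an unescaped "|" matches the empty string between characters, which would explain why you get one row per character. To match a literal pipe it must be escaped, which in an R string is written as "\\|".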

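library(tidyr)  # provides separate_rows() and re-exports tibble()

# Small stand-in for the real data: one pipe-delimited column
df <- tibble(
  Inspection_ID = c(1, 2, 3),
  Violations = c("A", "A|B", "A|B|C"))
# Escape the pipe so it is matched literally rather than as regex alternation
separate_rows(df, Violations, sep = "\\|")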

Yields

# A tibble: 6 x 2
  Inspection_ID Violations
          <dbl> <chr>     
1             1 A         
2             2 A         
3             2 B         
4             3 A         
5             3 B         
6             3 C
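
To make the regex behaviour concrete, here is a minimal base R sketch. It uses strsplit rather than separate_rows, on the assumption that both treat the separator as a regular expression in the same way; the variable names are only illustrative.

```r
# Base R only: strsplit() interprets its split argument as a regex by default.
x <- "A|B|C"

# Unescaped: "|" is alternation between two empty patterns, so it matches
# the empty string and the input is split at every character.
unescaped <- strsplit(x, "|")[[1]]

# Escaped: "\\|" in R source is the regex \| , i.e. a literal pipe.
escaped <- strsplit(x, "\\|")[[1]]

# A character class is another way to match a literal pipe.
char_class <- strsplit(x, "[|]")[[1]]

# fixed = TRUE turns off regex interpretation altogether.
literal <- strsplit(x, "|", fixed = TRUE)[[1]]

print(unescaped)
print(escaped)
```

Either regex form, "\\|" or "[|]", can be passed as sep to separate_rows; the character class avoids the double backslash if you find that easier to read.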

Great to hear. Could you please mark your question as answered when you get a chance? — amoeba, Aug 24 '20 at 07:05

score 0 · Answer 2 · answered Aug 23 '20 at 19:53

Not sure what your data looks like, but you may want to replace sep = "|" with sep = "\\|". Good luck!

BellmanEqn

score 0 · Answer 3 · answered Aug 25 '20 at 01:06

Using sep = '\\|' with the separate_rows function allowed me to separate the pipe-delimited values.

dfaberjob
