What is the purpose of "\\." and why is it quoted? This is the code:

library(tidyr)

iris.tidy <- iris %>%
  gather(key, Value, -Species) %>%
  separate(key, c("Part", "Measure"), "\\.")

It is for the iris dataset.

zx8754
  • iris dataset https://gist.github.com/curran/a08a1080b88344b0c8a7#file-iris-csv – Ahmed Talib Jan 30 '20 at 08:38
  • You don't need the meaning of `"\\."` in R in general, but its meaning in `separate` specifically. Read `?separate`. – Ronak Shah Jan 30 '20 at 08:39
  • Related post: https://stackoverflow.com/q/6638072/680068 We need to escape the escape character. – zx8754 Jan 30 '20 at 08:51
  • I don't understand the name key in the separate call. Since we're using a pipe, the data argument comes from the left-hand side and isn't written as the first argument, so the first argument we pass to separate() is col, the column name or position. But the iris dataset does NOT have a column named key! – Ahmed Talib Jan 30 '20 at 08:58
  • The column name `key` comes from `gather`, where you do `gather(key, Value, -Species)`. – Ronak Shah Jan 30 '20 at 09:14

2 Answers

In a regular expression, . matches any character. If you actually want a literal "." (the character itself), you need to "escape" it with a \. The backslash, however, is also the escape character in R string literals, so it needs to be escaped as well, which gives "\\.".
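The difference can be checked in base R without tidyr. The following is a small sketch using strsplit, which, like the sep argument of separate, treats its pattern as a regular expression by default; the variable names are only illustrative.

```r
# The string literal "\\." contains two characters: a backslash and a dot.
pattern_length <- nchar("\\.")

# Unescaped "." matches any character, so every character is treated as a
# separator and only empty strings remain.
any_char <- strsplit("Sepal.Length", ".")[[1]]

# Escaped "\\." matches only a literal dot.
literal <- strsplit("Sepal.Length", "\\.")[[1]]

# fixed = TRUE turns off regex interpretation, so "." is taken literally.
fixed <- strsplit("Sepal.Length", ".", fixed = TRUE)[[1]]

print(pattern_length)
print(any_char)
print(literal)
print(fixed)
```

Only the escaped and the fixed = TRUE versions split the name into "Sepal" and "Length".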

Georgery

It would be easier to understand if you run the code step by step.

gather brings the data into long format: the key column holds the original column names and the Value column holds the values of those columns.

library(tidyr)

iris %>% gather(key, Value, -Species) %>%  head

#  Species          key Value
#1  setosa Sepal.Length   5.1
#2  setosa Sepal.Length   4.9
#3  setosa Sepal.Length   4.7
#4  setosa Sepal.Length   4.6
#5  setosa Sepal.Length   5.0
#6  setosa Sepal.Length   5.4

We then use separate to split the key column into two columns at the "." in its text.

iris %>%
  gather(key, Value, -Species) %>%
  separate(key, c("Part", "Measure"), "\\.") %>% head

#  Species  Part Measure Value
#1  setosa Sepal  Length   5.1
#2  setosa Sepal  Length   4.9
#3  setosa Sepal  Length   4.7
#4  setosa Sepal  Length   4.6
#5  setosa Sepal  Length   5.0
#6  setosa Sepal  Length   5.4

Since the sep argument in separate accepts a regex and . has a special meaning in regex, we need to escape it to specify a literal ., hence "\\.". Also note that gather has been superseded by pivot_longer in newer versions of tidyr.
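For reference, here is a sketch of how the same reshaping might look with pivot_longer, assuming tidyr 1.0.0 or later is installed. Its names_sep argument also takes a regular expression, so the same "\\." escape applies. Note that the row order differs from the gather version.

```r
library(tidyr)

# Reshape and split the column names in one step.
# names_sep is a regex, so the dot has to be escaped here as well.
iris.tidy <- iris %>%
  pivot_longer(-Species,
               names_to = c("Part", "Measure"),
               names_sep = "\\.",
               values_to = "Value")

head(iris.tidy)
```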

Ronak Shah