I am just learning R and am having trouble replicating the use of the separate() function.

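I have some data (sample below) in which one column holds several values separated by a | character, and I'd like to split that column on the delimiter. My code looks like this: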

separate(DF, col ="PARAM_2",paste0("x",1:257),sep="|")

Here is a sample of the raw data:

                                    PARAM_2 TRANSACTION_ID REVENUE
1                             16522337|10086236     3812351327  449.97
2                             21106549|24390750     3851589288   67.98
3                                      23475149     3804446998   54.99
4                                      19397324     3866373678  224.97
5                             23317326|23825351     3820764147  109.99
6                    20433128|20433140|20433165     4962022906  369.94
7                                      19506902     3835040778   10.50
8  24095014|25029701|24244086|24244271|16803155     3910007218  142.97
9                                      24036073     3887666318   22.49
10                   19972354|14519726|18168381     3757376277   98.89

I am not quite sure why, but the code is placing one character per column instead of splitting on the | separator. Here's what the output of my flawed code looks like:

     x1    x2    x3    x4    x5    x6    x7    x8    x9   x10   x11   x12   x13   x14   x15   x16   x17
   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1            1     6     5     2     2     3     3     7     |     1     0     0     8     6     2     3
    The `sep` parameter takes regex, so you need to escape the pipe if you mean it literally: `sep = '\\|'`. Or just don't specify, and it will separate on the pipes anyway. Also, `separate_rows` will work better when you have an uneven number of splits. – alistaire Feb 14 '17 at 19:18
    Thanks alistaire, adding \\ in front of the pipe worked! – Alan Feb 14 '17 at 19:27
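
The fix from the comments can be sketched as follows. This is a minimal example on a three-row subset of the sample data, not the asker's full script; the column count of 3 in `into` and the `fill = "right"` choice are assumptions made for the subset. An unescaped `|` is the regex alternation operator, which appears to match the empty string here, and that would explain the one-character-per-column output.

```r
library(tidyr)

# Three rows taken from the sample data in the question
DF <- data.frame(
  PARAM_2 = c("16522337|10086236", "23475149", "20433128|20433140|20433165"),
  TRANSACTION_ID = c(3812351327, 3804446998, 4962022906),
  REVENUE = c(449.97, 54.99, 369.94),
  stringsAsFactors = FALSE
)

# Escape the pipe so it is matched literally.
# fill = "right" pads rows that have fewer pieces with NA (assumed preference).
wide <- separate(DF, PARAM_2, into = paste0("x", 1:3), sep = "\\|", fill = "right")

# separate_rows() gives one row per value, so the maximum number
# of pieces does not have to be known in advance.
long <- separate_rows(DF, PARAM_2, sep = "\\|")
```

With an uneven number of values per row, the long form from `separate_rows()` is usually easier to work with than a wide table that is mostly `NA`.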

1 Answer

Instead of specifying the column names manually when the number of delimiters differs from row to row, we can use cSplit from the splitstackshape package, which creates the new columns automatically:

library(splitstackshape)
cSplit(DF, "PARAM_2", "|")
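
A self-contained version of the call above, again on a three-row subset of the sample data, might look like the sketch below. It assumes `cSplit`'s default of `fixed = TRUE`, under which `"|"` is treated as a literal string and needs no escaping; the `direction = "long"` variant is an addition for comparison and not part of the original answer.

```r
library(splitstackshape)

# Three rows taken from the sample data in the question
DF <- data.frame(
  PARAM_2 = c("16522337|10086236", "23475149", "20433128|20433140|20433165"),
  TRANSACTION_ID = c(3812351327, 3804446998, 4962022906),
  REVENUE = c(449.97, 54.99, 369.94),
  stringsAsFactors = FALSE
)

# Wide form: one new column per piece, short rows padded with NA
wide_dt <- cSplit(DF, "PARAM_2", "|")

# Long form: one row per piece (added here for comparison)
long_dt <- cSplit(DF, "PARAM_2", "|", direction = "long")
```

Both calls return a data.table rather than a plain data frame.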
    thanks for this, will look into splitstackshape next time. Just trying to grasp basic R functions one at a time. – Alan Feb 14 '17 at 19:28