
I want to apply a diff to a data frame column, accessing it by its name. What I do is the following:

abscissa <- "distance"
data.op[, abscissa]

I get this output

# A tibble: 15 x 1
   distance
        <dbl>
 1     0.0426
 2     0.0409
 3     0.0412
 4     0.0406
 5     0.0406
 6     0.0407
 7     0.0402
 8     0.0403
 9     0.103 
10     0.0402
11     0.0395
12     0.0407
13     0.0406
14     0.0405
15     0.0404

Then I simply try:

 diff(data.op[, abscissa])

But the output is:

 # A tibble: 15 x 0

I also tried data.op[, abscissa] %>% diff and data.op %>% select(abscissa) %>% diff, both with the same zero-column result.

However, if I do

diff(as.data.frame(data.op)[, abscissa])

It works:

[1] -0.00169560  0.00024120 -0.00061200  0.00000000  0.00013320 -0.00045360  0.00003240  0.06299047 -0.06306967 -0.00071640  0.00120960 -0.00007920
[13] -0.00010440 -0.00005400

When I type str(data.op), I get:

Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 15 obs. of  28 variables:
...

What I don't understand is:

  • Why is my data frame a tibble? OK, I installed tidyverse, but I never used it on this data frame before.

Edit: Not true, I did use the mapvalues() function to create the data frame, so I guess that is why it is a tibble and not a plain data frame.

  • My data.op is also a data frame, so why doesn't diff(data.op[, abscissa]) work?

  • Why don't data.op[, abscissa] %>% diff and data.op %>% select(abscissa) %>% diff work either?

  • Do I really need to convert it to a data frame to compute a simple diff? This doesn't help readability...

Sorry I can't provide a more reproducible example. I tried with mtcars and everything works as expected (but mtcars is a data frame, not a tibble). At some point, my data.op data frame was converted to a tibble, but I have no idea why.

Ben
    Related: [Why does subsetting a column from a data frame vs. a tibble give different results](https://stackoverflow.com/questions/39918774/why-does-subsetting-a-column-from-a-data-frame-vs-a-tibble-give-different-resul) – Henrik Feb 15 '19 at 10:54

1 Answer

To provide a reproducible example with mtcars:

library(tidyverse)
df <- as_tibble(mtcars)  # as.tibble() is deprecated in favour of as_tibble()
abscissa <- "mpg"

Now, when you do

diff(df[, abscissa])
# A tibble: 32 x 0

but

diff(mtcars[, abscissa])
#[1]   0.0   1.8  -1.4  -2.7  -0.6  -3.8  10.1  -1.6  -3.6  -1.4  -1.4   0.9
#[13]  -2.1  -4.8   0.0   4.3  17.7  -2.0   3.5 -12.4  -6.0  -0.3  -1.9   5.9
#[25]   8.1  -1.3   4.4 -14.6   3.9  -4.7   6.4

works fine.

That is because

class(df[, abscissa])
#[1] "tbl_df"     "tbl"        "data.frame"

whereas

class(mtcars[, abscissa])
#[1] "numeric"

Now from ?diff

x - a numeric vector or matrix containing the values to be differenced

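Hence, it does not work with tibbles. Unlike a plain data frame, a tibble never drops to a vector when you select a single column with [, so df[, abscissa] is still a one-column tibble. diff() then treats it as a list of length 1, finds nothing to difference, and returns the input with zero elements, which prints as a tibble with zero columns.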
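The same behaviour can be reproduced without any tibble at all, which may help show that the cause is the one-column data frame rather than tidyverse itself. This is a minimal sketch using only base R; `x` and `res` are names chosen here for illustration.

```r
# Keep the single column as a data frame instead of dropping to a vector,
# which is what a tibble does by default.
x <- mtcars[, "mpg", drop = FALSE]

# A data frame is a list of columns, so its length is the number of columns.
length(x)

# diff() sees a length-1 object, has nothing to difference, and returns x[0L]:
# all 32 rows but zero columns.
res <- diff(x)
dim(res)
```

So the empty result comes from how diff() measures the length of its input, and any one-column data frame triggers it.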

You could do

df %>% pull(abscissa) %>% diff
# [1]   0.0   1.8  -1.4  -2.7  -0.6  -3.8  10.1  -1.6  -3.6  -1.4  -1.4   0.9
#[13]  -2.1  -4.8   0.0   4.3  17.7  -2.0   3.5 -12.4  -6.0  -0.3  -1.9   5.9
#[25]   8.1  -1.3   4.4 -14.6   3.9  -4.7   6.4

since

df %>% pull(abscissa) %>% class
#[1] "numeric"
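If you would rather not use a pipe, base-style extraction also returns a plain vector from a tibble. A short sketch, assuming the tibble package is installed; `d1` and `d2` are illustrative names, not part of the original question.

```r
library(tibble)

df <- as_tibble(mtcars)
abscissa <- "mpg"

# [[ always extracts the column itself, for tibbles and data frames alike.
d1 <- diff(df[[abscissa]])

# Tibbles also accept drop = TRUE to get the data.frame-style vector result.
d2 <- diff(df[, abscissa, drop = TRUE])

# Both agree with the result computed on the original data frame.
identical(d1, d2)
identical(d1, diff(mtcars$mpg))
```

Of the two, df[[abscissa]] is probably the safer habit, since it behaves the same way whether the object is a tibble or a plain data frame.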
Ronak Shah