I want to standardize changesize within each url using the code below. The code looks correct to me, so why does it give the following error?

Error in summarise_impl(.data, dots) : expecting a single value

str(pricechange_0.5_2)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':1907600 obs. of  13 variables:
$ url           : chr  "http://item.yhd.com/item/1242267" 
$ time          : chr  "2016-01-02" "2016-01-02" "2016-01-02" "2016-01-02" ...
$ changPrice    : num  0.997 1 1 1 1 ...
$ classify      : Factor w/ 251 levels "","Clothing"....
$ changesize    : num  -0.334 0 0 0 0 ...
$ abs_changesize: num  0.334 0 0 0 0 ...

library(dplyr)
by_url <- group_by(pricechange_0.5_2,url)
url_datad <- summarise(by_url,url_sd_chasize=(changesize -
                                              mean(changesize))/sd(changesize))

Here is a sample of my data.

A tibble: 10 × 3
                             url changesize abs_changesize
                           <chr>      <dbl>          <dbl>
http://item.yhd.com/item/1242267 -0.3343999      0.3343999
  http://item.jd.com/418657.html  0.0000000      0.0000000
...

Is there another way to standardize changesize by url?

hrbrmstr
Sophia Jiang
  • Can you give us a sample of your data? You can run `dput(head(pricechange_0.5_2))` and paste the result; it will make it easier to solve your problem. – Derek Corcoran Dec 26 '16 at 12:46
  • You have more than a single value of `changesize` per `pricechange_0.5_2, url` combination and `summarise` isn't designed to handle it. That's about it. – David Arenburg Dec 26 '16 at 12:54
  • Good suggestion. Thank you very much. – Sophia Jiang Dec 26 '16 at 12:56
  • @DavidArenburg I thought the same as you, but using mtcars I tried `by_am <- group_by(mtcars, am)` followed by `url_datad <- summarise(by_am, sd_chasize = mean(mpg)/sd(mpg))` and it works. Is it the minus that screws it up? – Derek Corcoran Dec 26 '16 at 12:58
  • No, you didn't understand what I said. If you want to reproduce this with `mtcars`, try `mtcars %>% group_by(cyl) %>% summarise(mpg)` for instance- maybe that will help you understand – David Arenburg Dec 26 '16 at 13:02
  • @Derek Corcoran Yeah, you're right. `url_datad <- summarise(by_am, sd_chasize = mean(mpg)/sd(mpg))` works for my data. It is the minus that screws it up. – Sophia Jiang Dec 26 '16 at 13:17
  • @David Arenburg Thanks for your help. There are 9834 urls and 2 million obs in total, so it is not a single value. But I want to standardize the changesize of each obs according to the variable url. What else can I do instead of summarise? – Sophia Jiang Dec 26 '16 at 13:21
  • Again, you don't have a single value **per group**, not in total (I understand you have more than a single value in total). Please follow the examples [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) in order to provide a **minimal** working example (by that I mean that **we can reproduce the error on your example data set**) – David Arenburg Dec 26 '16 at 13:52
  • @SophiaJiang let me know if my answer was what you were looking for – Derek Corcoran Dec 26 '16 at 14:00
  • I think you want `mutate` rather than `summarise` – Richard Telford Dec 26 '16 at 15:38
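
To illustrate the comments above with a minimal sketch: `summarise` expects one value per group, while the expression in the question returns one value per row, which is what `mutate` is for. The data frame `toy` below is made up for illustration and only mimics the `url` and `changesize` columns; it is not the asker's data.

```r
library(dplyr)

# Hypothetical toy data standing in for pricechange_0.5_2
toy <- data.frame(
  url = c("a", "a", "a", "b", "b"),
  changesize = c(1, 2, 3, 10, 20),
  stringsAsFactors = FALSE
)

# mutate() keeps every row and evaluates mean() and sd() within each url group
url_datad <- toy %>%
  group_by(url) %>%
  mutate(url_sd_chasize = (changesize - mean(changesize)) / sd(changesize)) %>%
  ungroup()

# summarise() is meant for one value per group, e.g. the per-url mean and sd
url_stats <- toy %>%
  group_by(url) %>%
  summarise(mean_chasize = mean(changesize), sd_chasize = sd(changesize))
```

Note that a url with only one row has an `sd` of `NA`, so its standardized value will be `NA` as well.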

1 Answer

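This is an option:

library(dplyr)

url_unique <- unique(pricechange_0.5_2$url)

by_url <- list()

for (i in seq_along(url_unique)) {
  # Keep only the rows for this url
  by_url[[i]] <- filter(pricechange_0.5_2, url == url_unique[i])
  # scale() needs numeric input, so standardize only the changesize column
  by_url[[i]]$url_sd_chasize <- as.numeric(scale(by_url[[i]]$changesize))
}

# Stack the per-url pieces back into one data frame
by_url <- do.call("rbind", by_url)

This standardizes changesize within each url and stores the result in a new url_sd_chasize column.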
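
A loop-free sketch of the same per-url standardization, using base R's `ave()`. The `toy` data frame is hypothetical and only mimics the `url` and `changesize` columns from the question.

```r
# Hypothetical toy data standing in for pricechange_0.5_2
toy <- data.frame(
  url = c("a", "a", "a", "b", "b"),
  changesize = c(1, 2, 3, 10, 20),
  stringsAsFactors = FALSE
)

# ave() applies FUN to changesize within each url and returns a vector
# aligned with the original rows
toy$url_sd_chasize <- ave(
  toy$changesize, toy$url,
  FUN = function(v) (v - mean(v)) / sd(v)
)
```

This avoids filtering the full data once per url, which may matter with thousands of urls.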

Derek Corcoran