I have a large data frame with around 190,000 rows. It has a label column storing 12 nominal categories. I want to set the weight column value of each row based on that row's label. For example, if the label of a row is "Res", I want to change its weight to 0.5, and if it is "Condo", I want to change its weight to 2.

I know this is easy to implement with an if-else statement, but given the number of rows, the processing takes far too long. I wanted to use cut(), but it seems that cut() categorizes by intervals, not nominal categories. I would appreciate any suggestion that could reduce the processing time.

Hamid Z
  • The most I can give you with the paucity of information you've provided is to try one of these three alternatives: `x$wt <- ifelse(rownames(x) == 'Res', 0.5, 2)`, `x$wt <- sapply(rownames(x), switch, Res=0.5, Condo=2, TownHouse=1.3, 1)` (`switch` itself is not vectorized), or perhaps even `lst <- list(Res=0.5, Condo=2); x$wt <- lst[rownames(x)]`. If you need more, please [provide](http://stackoverflow.com/help/mcve) [more](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) [information](https://en.wikipedia.org/wiki/Minimal_Working_Example). – r2evans Sep 25 '15 at 22:52
  • If you create a table of your 12 labels and weights you can use `merge` to combine them on the label column (you might need to set `all.x=TRUE`). The `plyr` package also has the `join` function which provides similar functionality. – Branden Murray Sep 25 '15 at 23:02
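
The lookup-table idea from the comments can be sketched roughly as follows. This is only an illustration under assumptions: the column names `label` and `weight`, the toy data, and the choice to keep the old weight for labels missing from the table are all made up here, since the question gives no reproducible example. The "TownHouse" value of 1.3 is taken from the first comment.

```r
# Toy stand-in for the real data frame (assumed column names).
x <- data.frame(
  label  = c("Res", "Condo", "TownHouse", "Res", "Other"),
  weight = 1,
  stringsAsFactors = FALSE
)

# Named vector acting as the label -> weight table.
w <- c(Res = 0.5, Condo = 2, TownHouse = 1.3)

# Option 1: vectorized lookup by name, no loop over rows.
# as.character() matters if `label` is a factor; otherwise the
# factor's integer codes would be used as positions.
new_wt <- unname(w[as.character(x$label)])

# Labels absent from the table come back as NA; here we assume
# the existing weight should be kept for those rows.
x$weight <- ifelse(is.na(new_wt), x$weight, new_wt)

# Option 2: the same table as a data frame, combined with merge().
# all.x = TRUE keeps rows whose label has no match.
# Note that merge() sorts the result by the key column.
lookup <- data.frame(
  label      = names(w),
  new_weight = unname(w),
  stringsAsFactors = FALSE
)
merged <- merge(x, lookup, by = "label", all.x = TRUE)
```

Both options do a single vectorized pass over the label column, so they should scale to a few hundred thousand rows far better than a row-by-row if-else loop. If the original row order matters, the named-vector lookup is the simpler choice because it leaves the rows where they are.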

0 Answers