I'm stuck again on an error I don't understand. I have a big data.frame with multiple stock parameters, such as the price-earnings ratio. Now I want to do calculations between columns, e.g.:

MyDataFrame$NewColumn = MyDataFrame$Column1/MyDataFrame$Column2

This worked. However, the following doesn't work and generates warnings:

Index$ValuationScore = 0.3*Index$PE1_Score + 0.2*Index$PE2_Score + 0.1*Index$PE3_Score

The scores are values between 1 and 6. The first 60 rows of my table contain only NAs, since an earlier step calculates an average that looks back 60 periods. The message I get is:

Warning messages:
1: In Ops.factor(0.3, Index$PE1_Score) : * not meaningful for factors
2: In Ops.factor(0.2, Index$PE2_Score) : * not meaningful for factors
3: In Ops.factor(0.1, Index$PE3_Score) : * not meaningful for factors
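The `Ops.factor` in the warnings indicates that the score columns are stored as factors rather than numbers. A minimal sketch that reproduces the same warning, assuming a small made-up `Index` table (the values are hypothetical and not the asker's data):

```r
# Hypothetical stand-in: one score column stored as a factor, with leading NAs
Index <- data.frame(PE1_Score = factor(c(NA, NA, "3", "5", "6")))

# str() reports the column type; here it shows a Factor, not num
str(Index)

# Arithmetic on a factor triggers "'*' not meaningful for factors"
# and yields NA for every row instead of a number
result <- 0.3 * Index$PE1_Score
print(result)
```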
John Paul
MichiZH
  • Might be helpful when posting here in the future to use `Sys.setenv(LANGUAGE="en")` before your code so that errors/warnings come out in English. I've edited your question accordingly. – Thomas Oct 07 '13 at 12:03
  • You need to convert your variables to something other than factor, perhaps numeric. – Thomas Oct 07 '13 at 12:04
  • Thanks a lot to both, learned something again :-) This works. However, what's the difference between factor and numeric? In the first case the division worked. This is strange, since MyDataFrame$NewColumn in my example is actually how I calculated these PE_Scores. So they must already be numeric, mustn't they? – MichiZH Oct 07 '13 at 12:08
  • Hard to say without seeing `str(Index)`. – Thomas Oct 07 '13 at 12:11
  • Looked at it, and the division is done with two num columns, whereas the multiplication was tried on a factor column. So the factor column is a result of the division of two num columns. Does this make sense? – MichiZH Oct 07 '13 at 12:14
  • It would be helpful to have a whole reproducible example: http://stackoverflow.com/a/5963610/2588184. Otherwise it's hard to figure out how that column became a factor. – mrip Oct 07 '13 at 12:32
  • I don't know how to do that since the data is huge and in some parts confidential. But well the solution works so I'm happy, thx :-) – MichiZH Oct 07 '13 at 12:34
  • Always take a look at your data with the function `str` (just like `head`). Beware, `factors` are sneaky! You cannot do numerical calculations or string operations on them (sometimes you can, but results are not what you expect), so *always* convert factors to numerics or characters. – MrGumble Oct 07 '13 at 12:42
  • You could try `dput(head(Index[, c('var1','var2')]))`, where `var1` and `var2` are your two original numeric columns that you're dividing. Then use that as the basis for a reproducible example. – Thomas Oct 07 '13 at 12:42
  • Be very careful when converting factors to numeric that you get the original values and not the factor's internal integer codes. For example, in binary classification it's common to have your dependent variable be a factor with levels "0" and "1". R actually stores these as the first and second levels of a factor, so if you go straight to numeric your "0"s and "1"s will convert to 1s and 2s. For example, `as.numeric(as.factor(c("0","0","1")))` won't give you the results you'd think. Convert to character first and then use as.numeric: `as.numeric(as.character(as.factor(c("0","0","1"))))` – TomR Jan 31 '14 at 20:11
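
The advice in the comments (inspect the column types, then convert factors via character) can be sketched as follows. The `Index` table and its values here are made up for illustration; only the column names and weights come from the question.

```r
# Hypothetical stand-in for the real table: scores ended up as factors
Index <- data.frame(
  PE1_Score = factor(c(NA, "2", "4", "6")),
  PE2_Score = factor(c(NA, "1", "3", "5")),
  PE3_Score = factor(c(NA, "6", "6", "2"))
)

# Find which columns are factors (str(Index) shows the same information)
is_fac <- sapply(Index, is.factor)

# Pitfall: as.numeric() on a factor returns the internal level codes
codes <- as.numeric(Index$PE1_Score)

# Safer: go through character so the printed values are recovered
Index[is_fac] <- lapply(Index[is_fac], function(x) as.numeric(as.character(x)))

# The weighted score from the question now computes; NA rows stay NA
Index$ValuationScore <- 0.3 * Index$PE1_Score +
  0.2 * Index$PE2_Score +
  0.1 * Index$PE3_Score
```

Comparing `codes` with the converted `Index$PE1_Score` shows the difference between level codes and actual values. This only treats the symptom; why the columns became factors in the first place (for example, non-numeric entries in the source data) would still need checking against the real data.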
