0

The raw data is presented as below,

Year   Price    Volume   P1    P2    P3   V1    V2    V3
2009    46       125     25    50    75   200   400   600
2009    65       800     25    50    75   200   400   600
2010    20       560     30    55    90   250   500   800
2010    15       990     30    55    90   250   500   800
2011    89       350     35    70    120  250   500   800
2012    23       100     35    70    120  250   500   800
...     ...      ...     ...   ...   ...  ...   ...   ...

I try to create a new column named as "Portfolio". If Price and Volume are smaller than P1 and V1, respectively, Portfolio is equal to 11. Then, if else Price is smaller than P1 but Volume is smaller than V2, Portfolio is equal to 12, and so on. There are 3 breakpoints for Price and also Volume. Therefore, 16 Portfolios are created, which are named 11, 12, 13, 14, 21, 22, 23, 24,...,44. The result would be as the table below,

Year   Price    Volume   P1    P2    P3   V1    V2    V3   Portfolio
2009    46       125     25    50    75   200   400   600    21
2009    65       800     25    50    75   200   400   600    34
2010    20       560     30    55    90   250   500   800    13
2010    15       990     30    55    90   250   500   800    14
2011    89       350     35    70    120  250   500   800    32
2012    23       100     35    70    120  250   500   800    11
...     ...      ...     ...   ...   ...  ...   ...   ...    ...

Could you please help me to solve this issue. I tried if(){} and else if(){} functions. However, I did not get the result as the second table. That is why I post raw data here. Thank you so much.

The code I tried was as the following,

if ((Price<P1)&&(Volume<V1)){data$Portfolio=11}
else if ((Price<P1)&&(Volume<V2)){data$Portfolio=12}
else if((Price<P1)&&(Volume<V3)){data$Portfolio=13}
else if(Price<P1){data$Portfolio=14}
else if((Price<P2)&&(Volume<V1)){Fin_Ret$port=21}
...
else if(Price>P3){data$Portfolio=44}

The output was,

> if ((Price<P1)&&(Volume<V1)){data$Portfolio=11}
> else if ((Price<P1)&&(Volume<V2)){data$Portfolio=12}
Error: unexpected 'else' in "else"
...

When I tried "&" instead of &&", the result showed,

> if ((mkvalt<MV20)&(BM<BM20)){Fin_Ret$port=11}
Warning message:
In if ((mkvalt < MV20) & (BM < BM20)) { :
the condition has length > 1 and only the first element will be used

I am confused maybe I don't understand fundamental things in R.

Weber Chen
  • 19
  • 3
  • 6
  • 1
    could you share your code in which you tried with if() – joel.wilson Nov 14 '16 at 07:10
  • prefer using & instead of &&, because && evaluates just the first row and recycles it for the remaining values – joel.wilson Nov 14 '16 at 07:19
  • 1
    You want `ifelse()` not `if(){} else{}`. Possible duplicate: [Vectorized if in R](http://stackoverflow.com/q/4042413/903061). @joel.wilson `&&` is preferable with `if()` because they both evaluate only one element. Only switching to `&` won't do anything unless you also switch to `ifelse()` -- both need to change. – Gregor Thomas Nov 14 '16 at 07:21
  • Also related - [alternative to nested `ifelse`](http://stackoverflow.com/q/18012222/903061) – Gregor Thomas Nov 14 '16 at 07:25
  • @Gregor thank you for pointing out that.. i do agree – joel.wilson Nov 14 '16 at 07:26
  • joel.wilson and Gregor, Thanks for your teaching I will keep working on R :) I appreciate that. – Weber Chen Nov 14 '16 at 07:51
  • From what I can see from your (incomplete) code snippet I suspect that the edge cases where `Price` equals `P3` might not be covered. Your last line `else if(Price>P3){data$Portfolio=44}` should read `else {data$Portfolio=44}`. – Uwe Nov 14 '16 at 12:20

3 Answers3

1

You can use:

df$Portfolio[(df$Price<df$P1)&(df$Volume<df$V1)] <- 11
df$Portfolio[(df$Price<df$P1)&(df$Volume<df$V2) & is.na(df$Portfolio)] <- 12

or using dplyr::mutate

library(dplyr)
df <- df %>% 
        mutate(Portfolio=ifelse((Price<P1)&(Volume<V1),11,NA)) %>% 
        mutate(Portfolio=ifelse((Price<P1)&(Volume<V2)& is.na(Portfolio),12,Portfolio))
OmaymaS
  • 1,671
  • 1
  • 14
  • 18
0

In the code you have given,

else if(Price<P1){data$Portfolio=14}
else if((Price<P2)&&(Volume<V1)){Fin_Ret$port=21}
...
else if(Price>P3){data$Portfolio=44}

Remove if after else in the last line. You should be able to get the expected result.

timat
  • 1,480
  • 13
  • 17
duvvurum
  • 337
  • 2
  • 4
  • 9
0

Here is a different and concise approach using findInterval and data.table. It is based on the observation that the Portfolio id consists of two digits where the first digit is determined solely by the price category and the second digit solely by the volume category.

library(data.table)
dt[, Portfolio := paste0(findInterval(Price, c(-Inf, P1, P2, P3)), 
                         findInterval(Volume, c(-Inf, V1, V2, V3))),
   by = .(P1, P2, P3, V1, V2, V3)]

print(dt)
#   Year Price Volume P1 P2  P3  V1  V2  V3 Portfolio
#1: 2009    46    125 25 50  75 200 400 600        21
#2: 2009    65    800 25 50  75 200 400 600        34
#3: 2010    20    560 30 55  90 250 500 800        13
#4: 2010    15    990 30 55  90 250 500 800        14
#5: 2011    89    350 35 70 120 250 500 800        32
#6: 2012    23    100 35 70 120 250 500 800        11

findInterval uses right open intervals by default which is in line with the conditions (Price<P1), etc in the code of the OP.

Data

To make it a reproducible example

dt <- fread("Year   Price    Volume   P1    P2    P3   V1    V2    V3
            2009    46       125     25    50    75   200   400   600
            2009    65       800     25    50    75   200   400   600
            2010    20       560     30    55    90   250   500   800
            2010    15       990     30    55    90   250   500   800
            2011    89       350     35    70    120  250   500   800
            2012    23       100     35    70    120  250   500   800")
Uwe
  • 41,420
  • 11
  • 90
  • 134