I have an input column (symbols) with more than 10,000 rows. It contains operator symbols and text values such as ("", ">", "<", "-", "****", "inv", "MOD", "seen"), as shown in the code below. The column contains no numbers, only the values listed in the code.

I would like to map these symbols ('<', '>', etc.) to two different sets of codes, 1) Operator_codes and 2) Value_codes, and store the two sets of codes in separate columns.

I already have working code, but it is not very efficient: as you can see, I repeat the same operation twice, once for Operator_codes and once for Value_codes. I am sure there must be a more efficient way to write this. I am new to R and not very familiar with other approaches.

oper_val_concepts = function(DF){
operators_source = as.character(DF$Symbols)  # read the Symbols column from the DF argument
operators_source = as.data.frame(operators_source)
colnames(operators_source) <- c("Symbol")
operator_list = c("",">","<","-","****","inv","MOD","seen")
operator_codes = c(123L,14L,16L,13L,0L,0L,0L,0L)
value_codes = c(14L,12L,32L,123L,16L,41L,116L,186L)
operator_code_map = map2(operator_list,operator_codes,function(x,y) c(x,y)) %>%
  data.frame()
value_code_map = map2(operator_list,value_codes,function(x,y) c(x,y)) %>% 
  data.frame()
operator_code_map = t(operator_code_map)
value_code_map = t(value_code_map)
colnames(operator_code_map) <- c("Symbol","Code")
colnames(value_code_map) <- c("Symbol","Code")
rownames(operator_code_map) = NULL
rownames(value_code_map) = NULL
dfm<-merge(x=operators_source,y=operator_code_map,by="Symbol",all.x = TRUE)
dfm1<-merge(x=operators_source,y=value_code_map,by="Symbol",all.x = TRUE)
 }

 t1 = oper_val_concepts(test)

The dput output of the test data frame is:

structure(list(Symbols = structure(c(2L, 3L, 1L, 4L, 2L, 3L, 
5L, 4L, 6L), .Label = c("****", "<", ">", "inv", "mod", "seen"
), class = "factor")), .Names = "Symbols", row.names = c(NA,-9L), class = 
"data.frame")

I expect the output to be a data frame with two code columns, as shown below.

[Image: screenshot of the expected output data frame]

The Great
  • Can anyone help me with this? – The Great Apr 02 '19 at 13:26
  • Is this your full function? I don't see a closing }. Also check out how to make a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample data. – StephenK Apr 02 '19 at 15:18
  • That's my full function. I have updated the code above with the closing brace. – The Great Apr 02 '19 at 15:20
  • DF is nothing but the argument which takes the 'test' dataframe when I call the function. At the bottom of the code, I call the function with 'test' dataframe as the value for argument. I am trying to prepare the data – The Great Apr 02 '19 at 15:31
  • I added the dput output to the code section above, so you should be able to see it now. The input table has no columns other than 'Symbols'. I look up the mapping codes online manually and try to create an output data frame as shown in the screenshot above. – The Great Apr 02 '19 at 15:36
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/191108/discussion-between-stephenk-and-selva). – StephenK Apr 02 '19 at 15:39
  • Did the solution below fix your problem? If not, would you explain why? – StephenK Apr 04 '19 at 14:22
  • Hi StephenK, I have yet to try this. I will definitely update this post and mark your solution as the answer if it works. I am currently traveling and haven't tried it yet. – The Great Apr 05 '19 at 05:47

1 Answer

As I understand it, you want a data frame that acts as a lookup key (see key below), with one row per symbol and one column per code type. Once you have it, you can join the data frame that contains only the symbols to this key data frame, which fills in both code columns in a single step.

library(dplyr)  # provides left_join() and the %>% pipe

df <- structure(list(Symbols = structure(c(2L, 3L, 1L, 4L, 2L, 3L, 5L, 4L, 6L),
    .Label = c("****", "<", ">", "inv", "mod", "seen"), class = "factor")),
    .Names = "Symbols", row.names = c(NA, -9L), class = "data.frame")

# Lookup table: one row per symbol, one column per code type
key <- data.frame(Symbols = c("",">","<","-","****","inv","mod","seen"),
           Operator_code_map = c(123L,14L,16L,13L,0L,0L,0L,0L),
           value_code_map = c(14L,12L,32L,123L,16L,41L,116L,186L))

# Keep every row of df and attach both code columns from key
df %>% left_join(key, by = "Symbols")

Output:

  Symbols Operator_code_map value_code_map
1       <                16             32
2       >                14             12
3    ****                 0             16
4     inv                 0             41
5       <                16             32
6       >                14             12
7     mod                 0            116
8     inv                 0             41
9    seen                 0            186
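
If dplyr is not available, the same single-lookup idea can be sketched in base R with match(). This is only an illustration under a few assumptions: the column names Operator_code and Value_code are hypothetical, the sample data is retyped as plain character strings rather than factors, and "mod" is lowercase as in the dput output.

```r
# Sample data with the same values as the dput output in the question
df <- data.frame(
  Symbols = c("<", ">", "****", "inv", "<", ">", "mod", "inv", "seen"),
  stringsAsFactors = FALSE
)

# Lookup table: one row per symbol, one column per code type
# (Operator_code and Value_code are hypothetical column names)
key <- data.frame(
  Symbols = c("", ">", "<", "-", "****", "inv", "mod", "seen"),
  Operator_code = c(123L, 14L, 16L, 13L, 0L, 0L, 0L, 0L),
  Value_code = c(14L, 12L, 32L, 123L, 16L, 41L, 116L, 186L),
  stringsAsFactors = FALSE
)

# match() gives, for each symbol in df, its row position in key (NA if absent)
idx <- match(df$Symbols, key$Symbols)

# Index both code columns with the same positions, so the lookup runs only once
result <- cbind(df, key[idx, c("Operator_code", "Value_code")])
rownames(result) <- NULL  # drop the row names inherited from key
```

Symbols that do not appear in key (for example "MOD" in a different case) would come back as NA in both code columns, so checking anyNA(idx) after the match is a cheap way to spot unmapped values.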
StephenK