1

Suppose if I consider two tables Transaction and account where transaction$account_id contains factors and unique values in account$account_id

> transaction
      trans_id account_id amount
             1        100    500
             2        101    200
             3        102    100
             4        100    600
             5        100    700
             6        100    900
             7        101   1000
             8        101  10000
             9        102  20000
            10        101   5000
> account
  account_id Total.amnt notrans transavg
         100       2700       4      675
         101      16200       4     4050
         102      20100       2    10050

Now my question is, how to find whether amount in transaction table is greater than transavg in account table for each account_id.

And store it in a variable with factors as 1 if greater than transavg and 0 if lesser than transavg. And what packages do I need to use.

Nikhil Kumar
  • 455
  • 1
  • 4
  • 11
  • Possible duplicate of http://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right – akrun Mar 24 '17 at 08:41

2 Answers2

4

We can use match to compare the account_id and then get the corresponding amount from the account table and then compare it with amount in transaction table. This will return a boolean output which can be converted to integer using as.integer.

transaction$flag <- as.integer(transaction$amount > 
          account$transavg[match(transaction$account_id, account$account_id)])

transaction

#   trans_id account_id amount flag
#1         1        100    500    0
#2         2        101    200    0
#3         3        102    100    0
#4         4        100    600    0
#5         5        100    700    1
#6         6        100    900    1
#7         7        101   1000    0
#8         8        101  10000    1
#9         9        102  20000    1
#10       10        101   5000    1
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
2

We can use a non-equi join with data.table. Convert the 'data.frame' to 'data.table' (setDT(transaction)), create the 'flag' as 0, join with 'amount' on 'account_id' and amount > transavg, and assign (:=) 'flag' as 1.

library(data.table)
setDT(transaction)[, flag := 0][amount, flag := 1, on =.(account_id, amount > transavg)]
transaction
#     trans_id account_id amount flag
# 1:        1        100    500    0
# 2:        2        101    200    0
# 3:        3        102    100    0
# 4:        4        100    600    0
# 5:        5        100    700    1
# 6:        6        100    900    1
# 7:        7        101   1000    0
# 8:        8        101  10000    1
# 9:        9        102  20000    1
#10:       10        101   5000    1
akrun
  • 874,273
  • 37
  • 540
  • 662