-1

I faced problems when categorizing data by time. Here's how it looks like:

I have a list of data which specifies starting time and ending time of each period:

   Period    Start      END
1      A1 11:08:58 11:11:58
2      A2 12:08:58 12:11:58
3      A3 13:08:58 13:11:58
4      A4 14:08:58 14:11:58
5      A5 15:08:58 15:11:58
6      A6 16:08:58 16:11:58
7      A7 17:08:58 17:11:58
8      A8 18:08:58 18:11:58
9      A9 19:08:58 19:11:58
10    A10 20:08:58 20:11:58
11    A11 21:08:58 21:11:58
12    A12 22:08:58 22:11:58
13    A13 23:08:58 23:11:58
14    A14 00:08:58 00:11:58
15    A15 01:08:58 01:11:58
16    A16 02:08:58 02:11:58
17    A17 03:08:58 03:11:58
18    A18 04:08:58 04:11:58
19    A19 05:08:58 05:11:58
20    A20 06:08:58 06:11:58

I have also another list which specifies time which every transaction occurs:

Transaction Transaction.Time
1        TR015         12:10:58
2        TR008         18:10:58
3        TR009         13:10:58
4        TR019         14:10:58
5        TR001         15:10:58
6        TR011         16:10:58
7        TR018         17:10:58
8        TR005         11:10:58
9        TR013         19:10:58
10       TR012         20:10:58
11       TR014         21:10:58
12       TR004         22:10:58
13       TR020         23:10:58
14       TR010         00:10:58
15       TR016         01:10:58
16       TR007         02:10:58
17       TR017         03:10:58
18       TR006         04:10:58
19       TR003         05:10:58
20       TR002         06:10:58

What I tried to do is to merge these two list to know at which period each transaction occurs, like:

   Transaction Transaction.Time Period    Start      END
1        TR015         12:10:58 A2        12:08:58 12:11:58
2        TR008         18:10:58 A8        18:08:58 18:11:58
3        TR009         13:10:58 A3        13:08:58 13:11:58
郭傳誠
  • 3
  • 3

2 Answers2

0

I assume these names for your data frames: df_period: for data frame of the time periods transaction_occurs: for data frame of the transactions (it has to columns)

df <- data.frame(transaction_occurs, period=0, start=0, end=0)

for(i in 1:nrow(transaction_occurs)){
  row_number <- which(df[i,2]>df_period$Start & df[i,2]<df_period$END)
  df[i,c(3,4,5)] <- df_period[row_number,]
}
Mahdi Baghbanzadeh
  • 540
  • 1
  • 4
  • 10
0

I would create a reproducible example like this:

period = read_delim('No Period Start END
1 A1 11:08:58 11:11:58
2 A2 12:08:58 12:11:58
3 A3 13:08:58 13:11:58
4 A4 14:08:58 14:11:58
5 A5 15:08:58 15:11:58
6 A6 16:08:58 16:11:58
7 A7 17:08:58 17:11:58
8 A8 18:08:58 18:11:58
9 A9 19:08:58 19:11:58
10 A10 20:08:58 20:11:58
11 A11 21:08:58 21:11:58
12 A12 22:08:58 22:11:58
13 A13 23:08:58 23:11:58
14 A14 00:08:58 00:11:58
15 A15 01:08:58 01:11:58
16 A16 02:08:58 02:11:58
17 A17 03:08:58 03:11:58
18 A18 04:08:58 04:11:58
19 A19 05:08:58 05:11:58
20 A20 06:08:58 06:11:58', delim = ' ')

tnx = read_delim('No Transaction Time
1 TR015 12:10:58
2 TR008 18:10:58
3 TR009 13:10:58
4 TR019 14:10:58
5 TR001 15:10:58
6 TR011 16:10:58
7 TR018 17:10:58
8 TR005 11:10:58
9 TR013 19:10:58
10 TR012 20:10:58
11 TR014 21:10:58
12 TR004 22:10:58 
13 TR020 23:10:58
14 TR010 00:10:58
15 TR016 01:10:58
16 TR007 02:10:58
17 TR017 03:10:58
18 TR006 04:10:58
19 TR003 05:10:58
20 TR002 06:10:58', delim = ' ')

To locate a time period, you must be able to turn it into some date time format, here date is irrelevant so you can leave the function parsing it in any date, still works. Also it must be sure that the period data is complete, means no time stamp is outside the range.

require(lubridate)
require(tidyverse)
period = period %>% mutate(Start = as_datetime(Start),
                           END = as_datetime(END))

tnx = tnx %>% mutate(Time = as_datetime(Time))

locate_period = function(time_stamp, period_data) {
  period_data = period_data %>% filter(Start <= time_stamp) %>% filter(END >= time_stamp)
  period_data$Period[[1]]
}

tnx$Period = ''

for (i in 1:nrow(tnx)) {
  tnx$Period[[i]] = locate_period(tnx$Time[[i]], period)
}

tnx = left_join(tnx, period, by = 'Period')
minhsphuc12
  • 68
  • 1
  • 8