0

Given

Table1 (based on 5 minute timeframe, with boolean values in column "b" and "s"):

timestamp (=index)   open  close  b  s
2022-10-08 18:00:00  4.29   4.29  0  1
2022-10-08 18:05:00  4.30   4.36  1  0
2022-10-08 18:10:00  4.36   4.35  0  0
2022-10-08 18:15:00  4.34   4.37  0  0
2022-10-08 18:20:00  4.37   4.37  0  1
2022-10-08 18:25:00  4.36   4.37  0  0
2022-10-08 18:30:00  4.37   4.37  1  0
2022-10-08 18:35:00  4.38   4.39  0  0
2022-10-08 18:40:00  4.39   4.38  0  1
2022-10-08 18:45:00  4.39   4.38  0  0

Table2 (has no column "b" and "s" initial, but must be created with the value "1" at the same timestamp as in table1 and the rest filled with "0". The value in other columns needs to be kept and are different in both tables):

timestamp (=index)   b  s
2022-10-08 18:00:00  4.29   4.29  0  1 (-> from table1)
2022-10-08 18:01:00  4.29   4.29  0  0
2022-10-08 18:02:00  4.29   4.29  0  0
2022-10-08 18:03:00  4.29   4.29  0  0
2022-10-08 18:04:00  4.29   4.29  0  0
2022-10-08 18:05:00  4.29   4.29  1  0 (-> from table1)
2022-10-08 18:06:00  4.30   4.39  0  0
2022-10-08 18:07:00  4.38   4.40  0  0
2022-10-08 18:08:00  4.40   4.40  0  0
2022-10-08 18:09:00  4.41   4.36  0  0
2022-10-08 18:10:00  4.36   4.33  0  0
2022-10-08 18:11:00  4.34   4.34  0  0
2022-10-08 18:12:00  4.33   4.32  0  0
2022-10-08 18:13:00  4.32   4.33  0  0
2022-10-08 18:14:00  4.33   4.35  0  0
2022-10-08 18:15:00  4.34   4.34  0  0
2022-10-08 18:16:00  4.34   4.35  0  0
2022-10-08 18:17:00  4.35   4.36  0  0
2022-10-08 18:18:00  4.36   4.36  0  0
2022-10-08 18:19:00  4.36   4.37  0  0
2022-10-08 18:20:00  4.37   4.37  0  1 (-> from table1)
2022-10-08 18:21:00  4.38   4.37  0  0
2022-10-08 18:22:00  4.37   4.38  0  0
2022-10-08 18:23:00  4.38   4.38  0  0
2022-10-08 18:24:00  4.38   4.37  0  0
2022-10-08 18:25:00  4.36   4.37  0  0
2022-10-08 18:26:00  4.37   4.38  0  0
2022-10-08 18:27:00  4.38   4.37  0  0
2022-10-08 18:28:00  4.37   4.37  0  0
2022-10-08 18:29:00  4.37   4.37  0  0
2022-10-08 18:30:00  4.37   4.37  1  0 (-> from table1)
2022-10-08 18:31:00  4.37   4.37  0  0
2022-10-08 18:32:00  4.37   4.37  0  0
2022-10-08 18:33:00  4.37   4.37  0  0
2022-10-08 18:34:00  4.36   4.37  0  0
2022-10-08 18:35:00  4.37   4.37  0  0
2022-10-08 18:36:00  4.38   4.38  0  0
2022-10-08 18:37:00  4.39   4.39  0  0
2022-10-08 18:38:00  4.39   4.39  0  0
2022-10-08 18:39:00  4.39   4.39  0  0
2022-10-08 18:40:00  4.39   4.39  0  1 (-> from table1)
2022-10-08 18:41:00  4.39   4.39  0  0
2022-10-08 18:42:00  4.38   4.38  0  0
2022-10-08 18:43:00  4.39   4.37  0  0
2022-10-08 18:44:00  4.38   4.38  0  0
2022-10-08 18:45:00  4.39   4.38  0  0

So I need a solution to find the values (from b and s) from the 5m timeframe and "plot" them on the same timestamp on table2. I guess the fastest way is numpy?

Quinten
  • 35,235
  • 5
  • 20
  • 53
wkamer
  • 75
  • 7
  • Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) Merge on timestamp with outer join and fill nan if needed – My Work Oct 08 '22 at 16:08

2 Answers2

0

You could do update

tab2.update(tab1)
BENY
  • 317,841
  • 20
  • 164
  • 234
  • I have added two columns with value, this because the values of table2 (open and close) needs to be kept. And with your solution is resetting values to 0. Sorry, the those value were not given in the first place. – wkamer Oct 08 '22 at 16:56
  • @wkamer you can try update – BENY Oct 08 '22 at 17:00
  • The result is that the columns "b" and "s" are not created. Do I have to add something to the update function when columns does not exist? – wkamer Oct 08 '22 at 17:27
  • @wkamer yes you have to create the column first – BENY Oct 08 '22 at 18:07
  • This simple solution give me what I want after indeed I added the 2 empty columns first. – wkamer Oct 10 '22 at 13:59
0

As I wrote in my comments, use pd.merge on your timestamp or on all the columns and fill the nans with 0:

table1 = pd.DataFrame({'timestamp': ['10/8/22 18:00',
  '10/8/22 18:05',
  '10/8/22 18:10',
  '10/8/22 18:15',
  '10/8/22 18:20',
  '10/8/22 18:25',
  '10/8/22 18:30',
  '10/8/22 18:35',
  '10/8/22 18:40',
  '10/8/22 18:45'],
 'open': [4.29, 4.3, 4.36, 4.34, 4.37, 4.36, 4.37, 4.38, 4.39, 4.39],
 'close': [4.29, 4.36, 4.35, 4.37, 4.37, 4.37, 4.37, 4.39, 4.38, 4.38],
 'b': [0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
 's': [1, 0, 0, 0, 1, 0, 0, 0, 1, 0]})


table2 = pd.DataFrame({'timestamp': ['10/8/22 18:00', '10/8/22 18:01', '10/8/22 18:02', '10/8/22 18:03',
       '10/8/22 18:04', '10/8/22 18:05', '10/8/22 18:06', '10/8/22 18:07',
       '10/8/22 18:08', '10/8/22 18:09', '10/8/22 18:10', '10/8/22 18:11',
       '10/8/22 18:12', '10/8/22 18:13', '10/8/22 18:14', '10/8/22 18:15',
       '10/8/22 18:16', '10/8/22 18:17', '10/8/22 18:18', '10/8/22 18:19',
       '10/8/22 18:20', '10/8/22 18:21', '10/8/22 18:22', '10/8/22 18:23',
       '10/8/22 18:24', '10/8/22 18:25', '10/8/22 18:26', '10/8/22 18:27',
       '10/8/22 18:28', '10/8/22 18:29', '10/8/22 18:30', '10/8/22 18:31',
       '10/8/22 18:32', '10/8/22 18:33', '10/8/22 18:34', '10/8/22 18:35',
       '10/8/22 18:36', '10/8/22 18:37', '10/8/22 18:38', '10/8/22 18:39',
       '10/8/22 18:40', '10/8/22 18:41', '10/8/22 18:42', '10/8/22 18:43',
       '10/8/22 18:44', '10/8/22 18:45'],
 'open': [4.29, 4.29, 4.29, 4.29, 4.29, 4.29, 4.3 , 4.38, 4.4 , 4.41, 4.36,
       4.34, 4.33, 4.32, 4.33, 4.34, 4.34, 4.35, 4.36, 4.36, 4.37, 4.38,
       4.37, 4.38, 4.38, 4.36, 4.37, 4.38, 4.37, 4.37, 4.37, 4.37, 4.37,
       4.37, 4.36, 4.37, 4.38, 4.39, 4.39, 4.39, 4.39, 4.39, 4.38, 4.39,
       4.38, 4.39],
 'close': [4.29, 4.29, 4.29, 4.29, 4.29, 4.29, 4.39, 4.4 , 4.4 , 4.36, 4.33,
       4.34, 4.32, 4.33, 4.35, 4.34, 4.35, 4.36, 4.36, 4.37, 4.37, 4.37,
       4.38, 4.38, 4.37, 4.37, 4.38, 4.37, 4.37, 4.37, 4.37, 4.37, 4.37,
       4.37, 4.37, 4.37, 4.38, 4.39, 4.39, 4.39, 4.39, 4.39, 4.38, 4.37,
       4.38, 4.38]})


pd.merge(table1, table2, on=['timestamp', 'open', 'close'], how='outer').sort_values(by='timestamp').fillna(0)

or

pd.merge(table2, table1[['timestamp', 'b', 's']], on=['timestamp'], how='outer').sort_values(by='timestamp').fillna(0)

and you get:

    timestamp   open    close   b   s
0   10/8/22 18:00   4.29    4.29    0.0 1.0
10  10/8/22 18:01   4.29    4.29    0.0 0.0
11  10/8/22 18:02   4.29    4.29    0.0 0.0
12  10/8/22 18:03   4.29    4.29    0.0 0.0
13  10/8/22 18:04   4.29    4.29    0.0 0.0
1   10/8/22 18:05   4.3     4.36    1.0 0.0
14  10/8/22 18:05   4.29    4.29    0.0 0.0
15  10/8/22 18:06   4.3     4.39    0.0 0.0
16  10/8/22 18:07   4.38    4.4     0.0 0.0
17  10/8/22 18:08   4.4     4.4     0.0 0.0
18  10/8/22 18:09   4.41    4.36    0.0 0.0
2   10/8/22 18:10   4.36    4.35    0.0 0.0
19  10/8/22 18:10   4.36    4.33    0.0 0.0
20  10/8/22 18:11   4.34    4.34    0.0 0.0
21  10/8/22 18:12   4.33    4.32    0.0 0.0
22  10/8/22 18:13   4.32    4.33    0.0 0.0
23  10/8/22 18:14   4.33    4.35    0.0 0.0
24  10/8/22 18:15   4.34    4.34    0.0 0.0
3   10/8/22 18:15   4.34    4.37    0.0 0.0
25  10/8/22 18:16   4.34    4.35    0.0 0.0
26  10/8/22 18:17   4.35    4.36    0.0 0.0
27  10/8/22 18:18   4.36    4.36    0.0 0.0
28  10/8/22 18:19   4.36    4.37    0.0 0.0
4   10/8/22 18:20   4.37    4.37    0.0 1.0
29  10/8/22 18:21   4.38    4.37    0.0 0.0
30  10/8/22 18:22   4.37    4.38    0.0 0.0
31  10/8/22 18:23   4.38    4.38    0.0 0.0
32  10/8/22 18:24   4.38    4.37    0.0 0.0
5   10/8/22 18:25   4.36    4.37    0.0 0.0
33  10/8/22 18:26   4.37    4.38    0.0 0.0
34  10/8/22 18:27   4.38    4.37    0.0 0.0
35  10/8/22 18:28   4.37    4.37    0.0 0.0
36  10/8/22 18:29   4.37    4.37    0.0 0.0
6   10/8/22 18:30   4.37    4.37    1.0 0.0
37  10/8/22 18:31   4.37    4.37    0.0 0.0
38  10/8/22 18:32   4.37    4.37    0.0 0.0
39  10/8/22 18:33   4.37    4.37    0.0 0.0
40  10/8/22 18:34   4.36    4.37    0.0 0.0
7   10/8/22 18:35   4.38    4.39    0.0 0.0
41  10/8/22 18:35   4.37    4.37    0.0 0.0
42  10/8/22 18:36   4.38    4.38    0.0 0.0
43  10/8/22 18:37   4.39    4.39    0.0 0.0
44  10/8/22 18:38   4.39    4.39    0.0 0.0
45  10/8/22 18:39   4.39    4.39    0.0 0.0
46  10/8/22 18:40   4.39    4.39    0.0 0.0
8   10/8/22 18:40   4.39    4.38    0.0 1.0
47  10/8/22 18:41   4.39    4.39    0.0 0.0
48  10/8/22 18:42   4.38    4.38    0.0 0.0
49  10/8/22 18:43   4.39    4.37    0.0 0.0
50  10/8/22 18:44   4.38    4.38    0.0 0.0
9   10/8/22 18:45   4.39    4.38    0.0 0.0

The rest is just to .reset_index() and convert the type of b and s to int if you care about either of those.

My Work
  • 2,143
  • 2
  • 19
  • 47