1

I've got the following dataframe:

activity_level2
                          Date_and_time  ...  walking_frame
Date_and_time                            ...               
2020-07-24 23:00:00 2020-07-24 23:00:00  ...              0
2020-07-24 23:01:00 2020-07-24 23:01:00  ...              0
2020-07-24 23:02:00 2020-07-24 23:02:00  ...              0
2020-07-24 23:03:00 2020-07-24 23:03:00  ...              0
2020-07-24 23:04:00 2020-07-24 23:04:00  ...              0
2020-07-24 23:05:00 2020-07-24 23:05:00  ...              0
2020-07-24 23:06:00 2020-07-24 23:06:00  ...              0
2020-07-24 23:07:00 2020-07-24 23:07:00  ...              0
2020-07-24 23:08:00 2020-07-24 23:08:00  ...              0
2020-07-24 23:09:00 2020-07-24 23:09:00  ...              0
2020-07-24 23:10:00 2020-07-24 23:10:00  ...              0
2020-07-24 23:11:00 2020-07-24 23:11:00  ...              0
2020-07-24 23:12:00 2020-07-24 23:12:00  ...              0
2020-07-24 23:13:00 2020-07-24 23:13:00  ...              0
2020-07-24 23:14:00 2020-07-24 23:14:00  ...              0
2020-07-24 23:15:00 2020-07-24 23:15:00  ...              0
2020-07-24 23:16:00 2020-07-24 23:16:00  ...              0
2020-07-24 23:17:00 2020-07-24 23:17:00  ...              0
2020-07-24 23:18:00 2020-07-24 23:18:00  ...              0
2020-07-24 23:19:00 2020-07-24 23:19:00  ...              0
2020-07-24 23:20:00 2020-07-24 23:20:00  ...              0
2020-07-24 23:21:00 2020-07-24 23:21:00  ...              0
2020-07-24 23:22:00 2020-07-24 23:22:00  ...              0
2020-07-24 23:23:00 2020-07-24 23:23:00  ...              0
2020-07-24 23:24:00 2020-07-24 23:24:00  ...              0
2020-07-24 23:25:00 2020-07-24 23:25:00  ...              0
2020-07-24 23:26:00 2020-07-24 23:26:00  ...              0
2020-07-24 23:27:00 2020-07-24 23:27:00  ...              1
2020-07-24 23:28:00 2020-07-24 23:28:00  ...              1
2020-07-24 23:29:00 2020-07-24 23:29:00  ...              1
2020-07-24 23:30:00 2020-07-24 23:30:00  ...              1
2020-07-24 23:31:00 2020-07-24 23:31:00  ...              1
2020-07-24 23:32:00 2020-07-24 23:32:00  ...              1
2020-07-24 23:33:00 2020-07-24 23:33:00  ...              1
2020-07-24 23:34:00 2020-07-24 23:34:00  ...              1
2020-07-24 23:35:00 2020-07-24 23:35:00  ...              1
2020-07-24 23:36:00 2020-07-24 23:36:00  ...              1
2020-07-24 23:37:00 2020-07-24 23:37:00  ...              1
2020-07-24 23:38:00 2020-07-24 23:38:00  ...              1
2020-07-24 23:39:00 2020-07-24 23:39:00  ...              1
2020-07-24 23:40:00 2020-07-24 23:40:00  ...              1
2020-07-24 23:41:00 2020-07-24 23:41:00  ...              1
2020-07-24 23:42:00 2020-07-24 23:42:00  ...              1
2020-07-24 23:43:00 2020-07-24 23:43:00  ...              1
2020-07-24 23:44:00 2020-07-24 23:44:00  ...              1
2020-07-24 23:45:00 2020-07-24 23:45:00  ...              1
2020-07-24 23:46:00 2020-07-24 23:46:00  ...              1
2020-07-24 23:47:00 2020-07-24 23:47:00  ...              1
2020-07-24 23:48:00 2020-07-24 23:48:00  ...              1
2020-07-24 23:49:00 2020-07-24 23:49:00  ...              1
2020-07-24 23:50:00 2020-07-24 23:50:00  ...              1
2020-07-24 23:51:00 2020-07-24 23:51:00  ...              1
2020-07-24 23:52:00 2020-07-24 23:52:00  ...              1
2020-07-24 23:53:00 2020-07-24 23:53:00  ...              1
2020-07-24 23:54:00 2020-07-24 23:54:00  ...              1
2020-07-24 23:55:00 2020-07-24 23:55:00  ...              1
2020-07-24 23:56:00 2020-07-24 23:56:00  ...              1
2020-07-24 23:57:00 2020-07-24 23:57:00  ...              1
2020-07-24 23:58:00 2020-07-24 23:58:00  ...              1
2020-07-24 23:59:00 2020-07-24 23:59:00  ...              1

[60 rows x 7 columns]

I want to select specifics rows in another dataframe 'dfcont2':

dfcont2
                               waddling_count    MP  waddling_frame
Date_and_time                                                      
2020-07-24 23:00:01.065838656           943.0   0.0             0.0
2020-07-24 23:00:01.132505322           943.0   0.0             0.0
2020-07-24 23:00:01.199171988           943.0   0.0             0.0
2020-07-24 23:00:01.265838654           943.0   0.0             0.0
2020-07-24 23:00:01.332505320           943.0   0.0             0.0
                                      ...   ...             ...
2020-07-24 23:59:58.399136016          2160.0   0.0             0.0
2020-07-24 23:59:58.465802682          2160.0   0.0             0.0
2020-07-24 23:59:58.532469348          2160.0   0.0             0.0
2020-07-24 23:59:58.599136014          2160.0   0.0             0.0
2020-07-24 23:59:58.665802680          2160.0  21.0             0.0

[53965 rows x 3 columns]

I want to select those rows in dfcont2 which meet the following condition:

activity_level2['walking_frame'] = 0

and I want all the rows between the 2 specific timestamps in 'activity_level2' (so for 1 full minute) I hope this is clear this way... I don't have any idea how to do this... Any help is very much appreciated.

Geox
  • 89
  • 7
  • :1: UserWarning: Boolean Series key will be reindexed to match DataFrame index. *** pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match). – Geox Apr 03 '21 at 13:34
  • ohh...so you have df's of different length – Anurag Dabas Apr 03 '21 at 13:35
  • yes, activity_level2 has 60 rows (60 minutes), dfcont2 has >50.000 rows (15 frames per second) – Geox Apr 03 '21 at 13:41
  • Please repost your sample data in pandas.DataFrame format, dictionary format, or at least without the elipses. – semblable Apr 03 '21 at 14:27
  • Does this answer your question? [Pandas: select DF rows based on another DF](https://stackoverflow.com/questions/41020978/pandas-select-df-rows-based-on-another-df) – semblable Apr 03 '21 at 14:37

1 Answers1

0

I would suggest to try this:

# In case "Date_and_time" column is not already of type 'datetime' in both dfs:
activity_level2["Date_and_time"] = pd.to_datetime(
    activity_level2["Date_and_time"], format="%Y-%m-%d %H:%M:%S"
)
dfcont2["Date_and_time"] = pd.to_datetime(
    dfcont2["Date_and_time"], format="%Y-%m-%d %H:%M:%S"
)

# Rows of activity_level2 for which 'walking_frame' is equal to 0
filtered_activity = activity_level2.loc[activity_level2['walking_frame'] == 0, :]

# Rows of dfcont2 between the 2 specific timestamps in 'filtered_activity'
mask = dfcont2["Date_and_time"].isin(filtered_activity["Date_and_time"].values)
newdf = dfcont2.loc[mask, :]
Laurent
  • 12,287
  • 7
  • 21
  • 37