I have this dataframe:

                                     columnA     columnB
symbol  timestamp                         
AAPL    2022-08-17 16:49:38.857000
AAPL    2022-08-17 16:49:48.869000
AAPL    2022-08-17 16:50:10.524000

TSLA    2022-08-17 16:49:40.575000
TSLA    2022-08-17 16:49:50.346000
TSLA    2022-08-17 16:50:10.728000

How can I remove all rows before 16:50:00 for only one symbol, say AAPL?

I tried this:

multi = pd.MultiIndex.from_product([[symbol], list_of_timestamps])
df.drop(multi,inplace=True)

But it sometimes throws the following error:

File "pandas/_libs/index.pyx", line 623, in pandas._libs.index.BaseMultiIndexCodesEngine.get_indexer
  File "pandas/_libs/index.pyx", line 603, in pandas._libs.index.BaseMultiIndexCodesEngine._extract_level_codes
  File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3442, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Is there a simpler way to remove all rows before 16:50:00 for AAPL?

alexx0186

1 Answer

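You can use .query, which can refer to index levels by name. Keep every row that either is not AAPL or is at or after the cutoff, and assign the result back:

df = df.query("symbol != 'AAPL' or timestamp >= '2022-08-17 16:50:00'")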
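If you prefer not to build a query string, the same filter can be written as a boolean mask over the index levels. This is only a sketch: the sample values in columnA and columnB and the names cutoff, to_drop and filtered are made up for illustration, and it assumes the index levels are named symbol and timestamp as in the question. Because a mask selects rows by position rather than by label lookup, it should not depend on the index being unique, which is what the drop approach appeared to trip over.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout shown in the question.
index = pd.MultiIndex.from_tuples(
    [
        ("AAPL", pd.Timestamp("2022-08-17 16:49:38.857")),
        ("AAPL", pd.Timestamp("2022-08-17 16:49:48.869")),
        ("AAPL", pd.Timestamp("2022-08-17 16:50:10.524")),
        ("TSLA", pd.Timestamp("2022-08-17 16:49:40.575")),
        ("TSLA", pd.Timestamp("2022-08-17 16:49:50.346")),
        ("TSLA", pd.Timestamp("2022-08-17 16:50:10.728")),
    ],
    names=["symbol", "timestamp"],
)
df = pd.DataFrame({"columnA": range(6), "columnB": range(6)}, index=index)

cutoff = pd.Timestamp("2022-08-17 16:50:00")

# Pull each index level out as a flat Index so it can be compared element-wise.
symbols = df.index.get_level_values("symbol")
timestamps = df.index.get_level_values("timestamp")

# Rows to remove: AAPL rows strictly before the cutoff.
to_drop = (symbols == "AAPL") & (timestamps < cutoff)

# Keep everything else; other symbols are left untouched.
filtered = df[~to_drop]
print(filtered)
```

The mask keeps the cutoff as a real Timestamp rather than a string, which may be easier to parameterize if the symbol or the time changes between calls.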
Vladimir Fokow