How can I optimise the following operation:
df[(df.start <= x) & (df.end >= y)]
I tried using a MultiIndex on both columns, but saw no significant speedup:
df = df.set_index(['start', 'end'])
df[(df.index.get_level_values('start') <= x) & (df.index.get_level_values('end') >= y)]
Sample data:
    start                       end
0   2018-11-13 10:28:30.304287  2018-11-13 10:46:28.663868
1   2018-11-13 12:27:32.226550  2018-11-13 13:09:02.723869
2   2018-11-13 13:29:29.981659  2018-11-13 13:54:01.138963
3   2018-11-13 14:30:49.380554  2018-11-13 14:48:50.627830
4   2018-11-13 14:59:26.799017  2018-11-13 15:24:00.453983
5   2018-11-13 16:30:16.824188  2018-11-13 16:48:35.346318
6   2018-11-13 17:15:25.486287  2018-11-13 17:59:30.774629
7   2018-11-13 18:27:41.915379  2018-11-13 18:47:26.528320
8   2018-11-13 19:28:12.835576  2018-11-13 19:52:15.448146
9   2018-11-13 20:41:41.210849  2018-11-13 21:07:52.249831
10  2018-11-13 21:11:23.529623  2018-11-13 21:42:10.106951
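To make this reproducible, here is a minimal sketch that rebuilds the sample data and runs both variants of the filter. The values of x and y are placeholders I picked for illustration (they are not from my real workload), and the name indexed is only used here to keep the MultiIndex copy separate from df.

```python
import pandas as pd

# Sample data from the table above.
df = pd.DataFrame({
    "start": pd.to_datetime([
        "2018-11-13 10:28:30.304287", "2018-11-13 12:27:32.226550",
        "2018-11-13 13:29:29.981659", "2018-11-13 14:30:49.380554",
        "2018-11-13 14:59:26.799017", "2018-11-13 16:30:16.824188",
        "2018-11-13 17:15:25.486287", "2018-11-13 18:27:41.915379",
        "2018-11-13 19:28:12.835576", "2018-11-13 20:41:41.210849",
        "2018-11-13 21:11:23.529623",
    ]),
    "end": pd.to_datetime([
        "2018-11-13 10:46:28.663868", "2018-11-13 13:09:02.723869",
        "2018-11-13 13:54:01.138963", "2018-11-13 14:48:50.627830",
        "2018-11-13 15:24:00.453983", "2018-11-13 16:48:35.346318",
        "2018-11-13 17:59:30.774629", "2018-11-13 18:47:26.528320",
        "2018-11-13 19:52:15.448146", "2018-11-13 21:07:52.249831",
        "2018-11-13 21:42:10.106951",
    ]),
})

# Placeholder query bounds (assumed values, chosen only for this example).
x = pd.Timestamp("2018-11-13 15:00:00")
y = pd.Timestamp("2018-11-13 13:00:00")

# Variant 1: boolean mask on the plain columns.
result = df[(df.start <= x) & (df.end >= y)]

# Variant 2: the same mask built from MultiIndex levels.
indexed = df.set_index(["start", "end"])
mask = (
    (indexed.index.get_level_values("start") <= x)
    & (indexed.index.get_level_values("end") >= y)
)
result_mi = indexed[mask]

print(result)
```

Both variants select the same rows; in each case every row is compared against x and y, which is probably why the MultiIndex version is no faster.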