Suppose I have parsed a log file into a pandas.DataFrame.
I want to create a new boolean column that is True only if the current line contains the EXPRESSION_1 string and the next line contains the EXPRESSION_2 string.
I can do this for a single expression, as shown in Example 1 below:
Example 1:
import pandas as pd
EXPRESSION_1 = 'Starts streaming the stream rtspsrc'
EXPRESSION_2 = 'initializing gst pipeline'
df = pd.DataFrame(
    {
        'message': [
            'Some log text',
            'Some log text',
            'Starts streaming the stream rtspsrc',
            'Some log text',
            'Some log text',
            'Starts streaming the stream rtspsrc',
            'initializing gst pipeline',
            'Some log text',
        ]
    }
)
df.loc[:, 'process_started'] = df.loc[:, 'message'].apply(lambda msg: True if msg.find(EXPRESSION_1) > -1 else False)
df
Output of Example 1:
                               message  process_started
0                        Some log text            False
1                        Some log text            False
2  Starts streaming the stream rtspsrc             True
3                        Some log text            False
4                        Some log text            False
5  Starts streaming the stream rtspsrc             True
6            initializing gst pipeline            False
7                        Some log text            False
Desired Output:
                               message  process_started
0                        Some log text            False
1                        Some log text            False
2  Starts streaming the stream rtspsrc            False  # <= Note the False here
3                        Some log text            False
4                        Some log text            False
5  Starts streaming the stream rtspsrc             True
6            initializing gst pipeline            False
7                        Some log text            False
Thanks in advance for any suggestions.
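One possible way to get the desired column, offered as a sketch rather than the only approach: build one boolean mask for "this row contains EXPRESSION_1", build a second mask for "this row contains EXPRESSION_2" and shift it up by one row so it describes the next line, then combine the two with a logical AND. The sample data below is assumed to match the messages shown in the output tables (row 3 is 'Some log text'), and the names has_first and next_has_second are illustrative only.

```python
import pandas as pd

EXPRESSION_1 = 'Starts streaming the stream rtspsrc'
EXPRESSION_2 = 'initializing gst pipeline'

# Sample data assumed to match the rows shown in the output tables.
df = pd.DataFrame(
    {
        'message': [
            'Some log text',
            'Some log text',
            'Starts streaming the stream rtspsrc',
            'Some log text',
            'Some log text',
            'Starts streaming the stream rtspsrc',
            'initializing gst pipeline',
            'Some log text',
        ]
    }
)

# True where the current message contains EXPRESSION_1.
# regex=False treats the expression as a plain substring, like str.find does.
has_first = df['message'].str.contains(EXPRESSION_1, regex=False)

# True where the NEXT message contains EXPRESSION_2.
# shift(-1) moves every value up one row; the last row has no successor,
# so fill_value=False marks it as "no match".
next_has_second = (
    df['message']
    .str.contains(EXPRESSION_2, regex=False)
    .shift(-1, fill_value=False)
)

# Both conditions must hold on the same row.
df['process_started'] = has_first & next_has_second
print(df)
```

Because both masks are computed on whole columns, this avoids a row-by-row apply. If the message column can contain missing values, str.contains would likely need its na argument set (for example na=False) before the masks are combined.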