In Python, I'm trying to extract a single value from a Pandas dataframe. I know exactly what pattern the value follows; I just need to find it anywhere in the dataframe and extract it.
For example, in the dataframe below:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {0: ['BA1234', 'CA:1234', 'DA', 'DA1234', 'EX DA', 'CA1234'],
     1: ['BA1234', 'CA:1234', 'DA', 'CA1234', 'EX DA', 'CA1234'],
     2: ['BA1234', 'CA:1234', 'DA', 'CA1234', 'EX DA', 'CA1234']})
I want to extract the string that consists of the two letters 'DA' followed by exactly 4 digits (here, 'DA1234').
I've been trying this using a mask:
mask = pd.DataFrame(np.column_stack([df[col].str.contains(r'^DA\d{4}$', na=False) for col in df]))
This seems to work:
da_value = df[mask]
da_value
        0    1    2
0     NaN  NaN  NaN
1     NaN  NaN  NaN
2     NaN  NaN  NaN
3  DA1234  NaN  NaN
4     NaN  NaN  NaN
5     NaN  NaN  NaN
However, how do I extract the value itself from the dataframe as a plain string? Is there a better or easier way of doing this?
Edit: The output I actually want is
da_value = 'DA1234'
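
For reference, here is a minimal sketch of one way this could be done, assuming the frame holds only strings (or NaN) and that taking the first matching cell is acceptable. It flattens the frame into a single Series instead of building a mask; the names values and matches are only illustrative, and this is not necessarily the best approach:

```python
import pandas as pd

df = pd.DataFrame(
    {0: ['BA1234', 'CA:1234', 'DA', 'DA1234', 'EX DA', 'CA1234'],
     1: ['BA1234', 'CA:1234', 'DA', 'CA1234', 'EX DA', 'CA1234'],
     2: ['BA1234', 'CA:1234', 'DA', 'CA1234', 'EX DA', 'CA1234']})

# Flatten every cell of the frame into one Series (row by row).
values = pd.Series(df.to_numpy().ravel())

# Keep only cells that are exactly 'DA' followed by four digits.
# The '$' anchor rejects longer strings such as 'DA12345'.
matches = values[values.str.contains(r'^DA\d{4}$', na=False)]

# Take the first match as a plain Python string, or None if nothing matched.
da_value = matches.iloc[0] if not matches.empty else None
print(da_value)
```

If more than one cell could match, matches.tolist() would return all of them instead of just the first.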