How to make Python regex extract work?

Asked Feb 15 '20 at 18:06

Active Feb 15 '20 at 18:24

Viewed 30 times

I have a pandas DataFrame in which one of the columns (df['data']) contains the following data:

[{'validFrom': '2009-02-16', 'validTo': None, 'country': ['NL', 'BE', 'US'],
'model': ['Free']}]

I tried to extract the different values using regex:

df['data'].str.extract(r"\'validFrom\': \'(.*?)\',")

When I test this in an online regex tester it works, but when I run it in my script it returns NaN.
I basically want to extract the values of all fields (validFrom, validTo, country and model).

Example DataFrame; the [..] stands for the data shown above.

|----------------|-------------|-------------|------------------|
|      code      |     name    |      type   |     data         |
|----------------|-------------|-------------|------------------|
|      003       |     WMG     |      other  |      [..]        |
|----------------|-------------|-------------|------------------|

What am I doing wrong?

edited Feb 15 '20 at 18:24

asked Feb 15 '20 at 18:06

Claudine

Can you show an example of what the dataframe looks like? It looks like you have a dict in the df, not a string. – LeoE Feb 15 '20 at 18:09
Are you trying to apply a regex to a dictionary? – kpie Feb 15 '20 at 18:13
@LeoE I've added the table; I wasn't sure how to format it. For now the dataframe has just 1 row. – Claudine Feb 15 '20 at 18:25
The important part is missing... What exactly is `'data'`? Is it a dict or a string? – LeoE Feb 15 '20 at 18:27
Thanks for the reference to the other question. I didn't consider it as a dictionary. By using `json_normalize` I found a solution :) – Claudine Feb 15 '20 at 18:38
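
For readers landing here: the comments suggest the cells in `data` hold Python objects (a list containing a dict) rather than strings, which would explain why a string regex finds nothing to match. The exact code behind the `json_normalize` fix is not shown in the thread, so the following is only a minimal sketch under that assumption. The sample frame is rebuilt from the question, and the variable names (`exploded`, `fields`, `result`) are made up for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question; the 'data' cell is assumed to be
# a real list holding one dict, not a string representation of it.
df = pd.DataFrame({
    "code": ["003"],
    "name": ["WMG"],
    "type": ["other"],
    "data": [[{"validFrom": "2009-02-16", "validTo": None,
               "country": ["NL", "BE", "US"], "model": ["Free"]}]],
})

# Check what the cell actually contains before reaching for a regex.
cell_type = type(df.loc[0, "data"])  # list here, so there is no text to match

# Unpack each list so every row holds a single dict.
exploded = df.explode("data").reset_index(drop=True)

# Turn the dicts into columns (validFrom, validTo, country, model).
fields = pd.json_normalize(exploded["data"].tolist())

# Put the new columns next to the original ones.
result = pd.concat([exploded.drop(columns="data"), fields], axis=1)
print(result)
```

If the cells turned out to be strings after all (for example after reading the frame from a CSV file), they would first need to be parsed into objects, e.g. with `ast.literal_eval`, before this approach applies.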
