
Below is the data I am working on:

df_raw = pd.DataFrame({'Summary':['|ro-rd4_ae20|Issue-backfgound', '|20:36|site1_shutdown'], 'User':[r'UPC\User',r'UPC\Ankita'], 'Name':['Generic User', 'CSD']})

I want to check the 'Summary' value against a regular expression, separately for rows where 'Name' is 'CSD' and rows where 'Name' is 'Generic User', and create a new column that holds True or False depending on whether the pattern matches.

If df_raw.Name == 'CSD', then apply the regular expression (df_raw['Summary'].str.findall(r'(([?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z-._]+)', expand=False))

and if df_raw.Name == 'Generic User', then apply the regular expression (df_raw['Summary'].str.findall(r'(([?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z-._]+)', expand=False))

I have tried storing the regular expression in a variable and using apply, but that does not produce any output.

Please help with this.

Wiktor Stribiżew

1 Answer

You can use

import pandas as pd
import re

df_raw = pd.DataFrame({'Summary':['|ro-rd4_ae20|Issue-backfgound', '|20:36|site1_shutdown'], 'User':[r'UPC\User',r'UPC\Ankita'], 'Name':['Generic User', 'CSD']})

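def extract(r):
    # Choose the pattern by the row's "Name"; both branches currently use the same pattern
    if r["Name"] == "Generic User":
        return bool(re.search(r'(?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z._-]+', r["Summary"]))
    elif r["Name"] == "CSD":
        return bool(re.search(r'(?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z._-]+', r["Summary"]))
    # Any other "Name" is reported as not matching
    return False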

df_raw["Valid"] = df_raw.apply(extract, axis=1)

Output:

>>> df_raw
                         Summary        User          Name  Valid
0  |ro-rd4_ae20|Issue-backfgound    UPC\User  Generic User   True
1          |20:36|site1_shutdown  UPC\Ankita           CSD   True

The df_raw["Valid"] column will contain True or False for each row.

Note that I removed a stray [ at the start of your patterns; it looked like a typo.
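If the two names later need different patterns, the branches can be replaced by a lookup table. The following is only a sketch of that idea: the names PATTERN, PATTERNS and is_valid are made up for illustration, and both entries reuse the pattern from the code above.

```python
import re
import pandas as pd

df_raw = pd.DataFrame({
    'Summary': ['|ro-rd4_ae20|Issue-backfgound', '|20:36|site1_shutdown'],
    'User': [r'UPC\User', r'UPC\Ankita'],
    'Name': ['Generic User', 'CSD'],
})

# Hypothetical lookup table: one compiled pattern per 'Name' value.
# Both entries reuse the same pattern here; replace either one as needed.
PATTERN = r'(?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z._-]+'
PATTERNS = {
    'Generic User': re.compile(PATTERN),
    'CSD': re.compile(PATTERN),
}

def is_valid(row):
    # Look up the pattern for this row's 'Name'; unknown names count as no match.
    pattern = PATTERNS.get(row['Name'])
    if pattern is None:
        return False
    return bool(pattern.search(row['Summary']))

df_raw['Valid'] = df_raw.apply(is_valid, axis=1)
```

Adding a new 'Name' then only means adding one dictionary entry, and the patterns are compiled once instead of on every row.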

If you need to extract the first match instead of a True/False flag, use

def extract(r):
    if r["Name"] == "Generic User":
        m = re.search(r'(?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z._-]+', r["Summary"])
        if m:
            return m.group()  # first match only
    elif r["Name"] == "CSD":
        m = re.search(r'(?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z._-]+', r["Summary"])
        if m:
            return m.group()  # first match only
    return ''  # no match, or a "Name" that is not handled

>>> df_raw["Valid"] = df_raw.apply(extract, axis=1)
>>> df_raw
                         Summary        User          Name        Valid
0  |ro-rd4_ae20|Issue-backfgound    UPC\User  Generic User  ro-rd4_ae20
1          |20:36|site1_shutdown  UPC\Ankita           CSD        20:36
>>> 
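Since the question used str.findall, here is a hedged sketch of collecting every match per row rather than only the first one. It assumes a single pattern is enough for both names (as in the code above), and the column name Matches is made up for illustration. Series.str.findall takes a pattern and optional flags; it has no expand argument.

```python
import pandas as pd

df_raw = pd.DataFrame({
    'Summary': ['|ro-rd4_ae20|Issue-backfgound', '|20:36|site1_shutdown'],
    'User': [r'UPC\User', r'UPC\Ankita'],
    'Name': ['Generic User', 'CSD'],
})

# Same pattern as in the answer. It contains no capturing groups,
# so findall returns the full text of each match.
pattern = r'(?:[01]?\d|2[0-9]):[0-9]\d|[a-z0-9A-Z._-]+'

# One list of matches per row ('Matches' is an illustrative column name).
df_raw['Matches'] = df_raw['Summary'].str.findall(pattern)

print(df_raw[['Summary', 'Matches']])
```

Each cell of the new column holds a list, so a row with no match gets an empty list rather than an empty string.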
Wiktor Stribiżew