
Hi, I'm trying, without success, to use a regular expression to select the string before the \r\n. Ideally, in the majority of cases, it is a word followed by a comma. But as shown, the \r\n and some other obstacles appear. Example below:

    var
Sao Paulo , Brazil \r\n Details Description ....
Rio de Janeiro , Brazil  ... Pending funding.  

  expected (result)
Sao Paulo , Brazil
Ian_De_Oliveira

2 Answers


How about this?

    # using the string method .find()
    a = 'Sao Paulo , Brazil \r\n Details Description ....'
    a[0:a.find('\r')].strip()

    'Sao Paulo , Brazil'
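
One caveat with `.find()`: it returns -1 when the substring is absent, so the slice above silently drops the last character of any row that has no `\r` (the second sample row in the question, for instance). A minimal sketch of a variant that avoids this with `str.partition`; the helper name `before_cr` is hypothetical, and it assumes the data holds a real carriage-return character:

```python
def before_cr(text: str) -> str:
    # str.partition returns (head, sep, tail); when the separator is
    # missing, head is the whole string, so nothing gets cut off.
    head, _, _ = text.partition('\r')
    return head.strip()

# Sample rows taken from the question.
rows = [
    'Sao Paulo , Brazil \r\n Details Description ....',
    'Rio de Janeiro , Brazil  ... Pending funding.',
]
results = [before_cr(row) for row in rows]
print(results)
```

The second row comes back unchanged here, because this sketch only cuts at `\r`; trimming the "... Pending funding." part would need a separate rule.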

Edit:

Let's say your data frame is df. The name of your column is 'text'. We'll create a new column say 'new_text'. Now, do the following:

    # pandas version: keep the text before the first '\r', then trim whitespace
    df['new_text'] = df['text'].str.split('\r').str[0].str.strip()
YOLO

You can use either

    df["var"].str.extract("(.*)\\\\r")

or

    df["var"].str.extract(r"(.*)\\r")

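Notice the r before the opening quotation mark: it makes the string a raw string, so Python does not treat the backslashes as escape characters. You can read more at "Python regex - r prefix".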
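
As a rough check of what these two patterns do, here is a small sketch using only the standard `re` module (the variable names are illustrative, not from the original answer). It suggests that both spellings are the same pattern, and that the pattern matches a literal backslash followed by the letter r, not a carriage-return character. If the column holds real carriage returns, a pattern such as `r"(.*)\r"` would be needed instead.

```python
import re

escaped = "(.*)\\\\r"   # ordinary string: each escaped pair becomes one backslash
raw = r"(.*)\\r"        # raw string: backslashes are kept as typed

same_pattern = escaped == raw

# Text that contains the two characters backslash and r.
literal_text = r"Sao Paulo , Brazil \r\n Details Description ...."
literal_match = re.search(raw, literal_text)

# Text that contains a real carriage return and newline.
control_text = "Sao Paulo , Brazil \r\n Details Description ...."
control_match = re.search(raw, control_text)   # no backslash in the text
cr_match = re.search(r"(.*)\r", control_text)  # regex \r matches the CR

print(same_pattern)
print(literal_match.group(1).strip())
print(control_match)
print(cr_match.group(1).strip())
```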

Tai