Regex to find all backslashes and the immediately following character

Question

I am working with email data, and after extracting it from Outlook the body of each email still contains all of the escape characters within the string.

I'm using the re package in Python to remove them, but to no avail.

Here's an example of the text I'm trying to remove the escape characters from:

I am completely in agreement with that. \r\n\r\n\rbest regards.

Expected:

I'd like this to read: "I am completely in agreement with that. best regards."

I've tried the following to extract the unwanted text:

re.findall(r'\\\w+', string)
re.findall(r'\\*\w+', string)
re.findall(r'\\[a-z]+', string)

None of these are doing the trick. I'd appreciate any help!

Thanks!

score 3 · Answer 1 · answered Sep 06 '19 at 13:59

You can try this:

re.sub(r'\n|\r','', string)


'I am completely in agreement with that. best regards.'
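As a self-contained check of the substitution above, here is a minimal sketch. It assumes the email body holds real carriage-return and line-feed characters (which Python only displays as \r and \n); the variable names `body` and `cleaned` are illustrative, and a character class is used in place of the alternation, which should behave the same here.

```python
import re

# Assumption: the body contains actual CR/LF control characters,
# not a backslash followed by a letter.
body = "I am completely in agreement with that. \r\n\r\n\rbest regards."

# The character class removes either character wherever it occurs.
cleaned = re.sub(r"[\r\n]", "", body)
print(cleaned)
```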

Billy Bonaros

For Python 2.x and unicode strings it may be necessary to first compile the pattern with flag `re.UNICODE` for this to work. – sophros Sep 06 '19 at 14:20

score 0 · Answer 2 · answered Sep 06 '19 at 13:59

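You are confusing the escaped representation of whitespace characters with the characters themselves. \r and \n are each a single character (carriage return and line feed), so there is no backslash in the string for your patterns to match.

You should instead look for the \r and \n characters directly: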

re.findall(r'\n\w+', string)

or

re.findall(r'\r\w+', string)
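To make the difference concrete, the sketch below compares a string with real control characters against one with literal backslashes. Both sample strings and all variable names are made up for illustration; which of the two your Outlook export actually produces is something you would need to check, for example with len() or repr().

```python
import re

# Real control characters: "\r" and "\n" are one character each.
real = "agreement with that. \r\n\r\n\rbest regards."
# Literal text: a raw string keeps the backslash and the letter as two characters.
literal = r"agreement with that. \r\n\r\n\rbest regards."

# A pattern for a literal backslash finds nothing in the first string ...
matches_real = re.findall(r"\\\w+", real)
# ... and in the second it finds each backslash sequence, but \w+ is greedy,
# so the last match also swallows the word "best".
matches_literal = re.findall(r"\\\w+", literal)

print(len("\r"), len(r"\r"))
print(matches_real)
print(matches_literal)
```

If the second list is what you see on your data, the backslashes are literal and a narrower pattern such as r'\\.' avoids eating the following word.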

sophros

score 0 · Answer 3 · answered Sep 06 '19 at 13:59

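It seems you want to get rid of the line breaks. If so, you don't need the re module; just remove both characters (replacing only "\r\n" would leave the trailing lone \r from your example behind):

string = string.replace("\r", "").replace("\n", "")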
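Another way to do this without re, sketched here under the assumption that the string contains real CR/LF characters: str.translate is swapped in as an alternative to replace, since it deletes every \r and \n in one pass, including a lone \r that is not part of a "\r\n" pair. The names `body`, `drop_line_breaks` and `cleaned` are illustrative.

```python
body = "I am completely in agreement with that. \r\n\r\n\rbest regards."

# Build a table that maps CR and LF to None, so translate() deletes them.
drop_line_breaks = str.maketrans("", "", "\r\n")
cleaned = body.translate(drop_line_breaks)
print(cleaned)
```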

Guillaume Adam

score 0 · Answer 4 · answered Sep 06 '19 at 13:59

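If the string really contains literal backslashes, you can write a function yourself:

def strip_escapes(string):
    # Repeatedly cut out the first backslash and the character right after it.
    while '\\' in string:
        ind = string.find('\\')
        string = string[:ind] + string[ind+2:]

    return string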
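The same idea can be written as a single substitution, which is closer to what the question title asks for. This is a sketch that assumes the backslashes are literal characters in the string; the function name `strip_backslash_pairs` and the sample string are hypothetical.

```python
import re

def strip_backslash_pairs(text):
    # Remove every literal backslash plus the one character after it.
    # re.DOTALL lets "." also match a newline that follows the backslash.
    return re.sub(r"\\.", "", text, flags=re.DOTALL)

# Raw string, so each \r and \n below is a backslash followed by a letter.
escaped = r"I am completely in agreement with that. \r\n\r\n\rbest regards."
print(strip_backslash_pairs(escaped))
```

One difference from the loop: a backslash that is the very last character of the string has nothing after it, so this pattern leaves it in place, whereas the loop removes it.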

ARD
