With this command it is possible to remove nbsp:
str_replace_all(df$text, 'nbsp', '')
What kind of regex can I use to remove all numbers with the same command?
If by "nbsp" you mean a non-breaking space, you can match it by writing its Unicode escape explicitly.
The non-breaking space is code point U+00A0 in Unicode, so in R you can express it as "\U00A0".
For example:
> "This is a strange\U00A0 character"
[1] "This is a strange character"
This may be clearer with a character that is visible when printed:
> "This is a strange \U00A1 character"
[1] "This is a strange ¡ character"
And this can be removed as you would expect.
> str_remove("This is a strange \U00A1 character", "\U00A1")
[1] "This is a strange  character"
> str_remove("This is a strange\U00A0 character", "\U00A0")
[1] "This is a strange character"
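Note that str_remove() only drops the first match in each string. For a whole column like the one in the question, str_replace_all() (or str_remove_all()) is probably what you want. A minimal sketch, assuming stringr is installed and using a made-up data frame in place of the asker's df:

```r
library(stringr)

# Hypothetical data frame standing in for the df from the question
df <- data.frame(text = c("a\u00A0b\u00A0c", "no special spaces"),
                 stringsAsFactors = FALSE)

# str_remove() removes only the first non-breaking space per string
first_only <- str_remove(df$text, "\u00A0")

# str_replace_all() removes every non-breaking space per string
cleaned <- str_replace_all(df$text, "\u00A0", "")
```

Here first_only still contains one non-breaking space in its first element, while cleaned has none.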
This also works if you supply the code point in decimal (160 is 0xA0):
> str_remove("This is a strange\U00A0 character", intToUtf8(160))
Note, this works on my computer, but there might be variations with locale settings and fonts installed.
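As for the second part of the question, removing numbers: a character class such as "[0-9]+" should do it with the same command. This is a sketch on a made-up vector x rather than the asker's actual data:

```r
library(stringr)

# Hypothetical example strings
x <- c("abc123def", "room 42, floor 7", "no digits")

# "[0-9]+" matches each run of ASCII digits; replace every run with ""
no_numbers <- str_replace_all(x, "[0-9]+", "")
```

You could also use "\\d+", but as far as I know stringr's regex engine (ICU) lets \d match digits from other scripts as well, so "[0-9]" is the safer choice if you only mean 0 to 9. The surrounding spaces and punctuation are left untouched, so "room 42, floor 7" becomes "room , floor ".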