
I have a table with a string column. The column contains a lot of text, but for some unknown reason some values contain nul characters, for example '\0sult'. I would like to clean the column by removing these nul characters, but I don't know how.

If I try:

grepl(pattern = "\0", x = "blabla \0sults")

I get:

Error in parse(text = x, srcfile = src): nul character not allowed (line 1)

How can I detect and remove those nul characters?

PAC
  • As far as I know, nuls are not allowed in strings. What happens if you type "blabla \0ults" in the console? – Ric Feb 24 '23 at 21:58
  • See, for example, what happens with `rawToChar(as.raw(c(97,98,99)))` and `rawToChar(as.raw(c(97, 0,99)))` – Ric Feb 24 '23 at 22:05
  • If using `read.table` or `readLines` add the argument `skipNul=TRUE` – G. Grothendieck Feb 24 '23 at 22:06
  • @ric-villalba I get the same error: "nul character not allowed" – PAC Feb 24 '23 at 22:13
  • If you have the table in R, you need to share some of the data using `dput`. Otherwise we are not in a position to help, since we cannot reproduce your table – Onyambu Feb 24 '23 at 22:21
  • Note that `"blabla \0sults"` is not a valid R string, so you need to provide us with the data as it exists in R, i.e. check where the nul is (probably row 4, column 5), then run `dput(df[4,5])` and copy and paste the output – Onyambu Feb 24 '23 at 22:23
  • How exactly did you import the data? It shouldn't be possible to have an embedded nul in a string. Perhaps it's an encoding issue? It's hard to help without some kind of [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to test with – MrFlick Feb 24 '23 at 22:30
  • @g-grothendieck: Thanks for the tip, but it doesn't work on my data. – PAC Feb 26 '23 at 21:11

1 Answer


I suspect that (for whatever reason) your table doesn't actually contain the nul character; strictly speaking, the strings contain the two characters "\" and "0". To match this pair, you have to escape the backslash twice: once for the regular expression and once for the R string literal.
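As a minimal sketch of that double escaping (the names `x`, `pattern`, `n_chars` and `found` are illustrative, not from the question), the R literal `"\\\\0"` holds only three characters, which the regex engine then reads as an escaped backslash followed by `0`:

```r
# Illustrative string containing a literal backslash followed by "0".
# It is written "\\0" in R source, because "\0" itself is rejected by the parser.
x <- "blabla \\0sults"

# Four backslashes in the source become two in the string: the regex \\0.
pattern <- "\\\\0"
n_chars <- nchar(pattern)   # 3: backslash, backslash, "0"

# The regex engine reads \\ as one literal backslash, so the pattern matches.
found <- grepl(pattern, x)  # TRUE
```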

For example, if I have the file "data.csv":

key,value
key1,blue
key2,\0sults
key3,blabla \0sults
key4,brown

Then you would match the pairs of characters like this:

myData <- read.csv("data.csv")
grepl(pattern = "\\\\0", myData$value)
#> [1] FALSE  TRUE  TRUE FALSE
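The question also asks how to remove the characters. Assuming, as above, that the values contain a literal backslash followed by "0" rather than a real nul byte, a sketch with `gsub` could look like this. The vector `value` is a hypothetical stand-in for `myData$value`, and `fixed = TRUE` is used so that only the R-level escape is needed:

```r
# Hypothetical stand-in for myData$value as read from data.csv
value <- c("blue", "\\0sults", "blabla \\0sults", "brown")

# With fixed = TRUE the pattern is a plain string, not a regex,
# so "\\0" (backslash + "0") is matched literally.
has_marker <- grepl("\\0", value, fixed = TRUE)

# Replace every literal backslash-0 pair with the empty string.
cleaned <- gsub("\\0", "", value, fixed = TRUE)
cleaned
#> [1] "blue"         "sults"        "blabla sults" "brown"
```

The same call with the regex form, `gsub("\\\\0", "", value)`, should give the same result; `fixed = TRUE` just avoids the second layer of escaping.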
Marcus