I am reading some very old files created by C code that consist of a header (ASCII) and then data. I use readBin() to get the header data. When I try to convert the header to a string it fails because there are 3 'bad' bytes. Two of them are binary 0 and the other binary 17 (IIRC).

How do I convert the bad bytes to ASCII spaces? I've tried several variations of the code below, but they fail.

    hd[hd == as.raw(0) | hd == as.raw(0x17)] <- as.raw(32)

I'd like to replace each bad value with a space so I don't have to recompute all the fixed data locations in parsing the string derived from hd.

Nate Lockwood
It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. How exactly does your attempt fail? Does it return an error? If so, what is the message? – MrFlick Mar 05 '22 at 00:22

1 Answer

I normally just go through a conversion to integer.

Suppose we have this raw vector:

raw_with_null <- as.raw(c(0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x00, 
                          0x57, 0x6f, 0x72, 0x6c, 0x64, 0x21))

We get an error if we try to convert it to character because of the null byte:

rawToChar(raw_with_null)
#> Error in rawToChar(raw_with_null): embedded nul in string: 'Hello\0World!'
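Before replacing anything, it can help to see which positions hold the offending bytes. The following is a minimal sketch rather than part of the original reprex; the name `hd` and the "printable ASCII" range 0x20 to 0x7e are assumptions chosen for illustration.

```r
# Hypothetical header bytes: "Hello", a nul byte, then "World!"
hd <- as.raw(c(0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x00,
               0x57, 0x6f, 0x72, 0x6c, 0x64, 0x21))

# Work on integer codes so the comparison is unambiguous
codes <- as.integer(hd)

# Positions of bytes outside printable ASCII (0x20..0x7e, i.e. 32..126)
bad <- which(codes < 32L | codes > 126L)

print(bad)      # positions of the bad bytes
print(hd[bad])  # the bad byte values themselves
```

Listing the positions first makes it easy to confirm that only the expected bytes are being overwritten.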

It's easy to convert to integer and replace any 0s or 23s (0x17) with 32 (ASCII space):

nums <- as.integer(raw_with_null)

nums[nums == 0 | nums == 23] <- 32

We can then convert nums back to raw and then to character:

rawToChar(as.raw(nums))
#> [1] "Hello World!"
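The same idea can be generalised to any byte that is not printable ASCII, which may be useful if other stray bytes turn up in the header. This is only a sketch: `clean_header` is a hypothetical helper name, and treating everything outside 0x20 to 0x7e as "bad" is an assumption, not something stated in the question.

```r
# Hypothetical helper (not from the original answer): replace every byte
# outside printable ASCII (0x20..0x7e) with a space, one byte for one byte,
# so fixed field positions in the resulting string are unchanged.
clean_header <- function(bytes) {
  stopifnot(is.raw(bytes))
  codes <- as.integer(bytes)
  bytes[codes < 32L | codes > 126L] <- as.raw(0x20)
  rawToChar(bytes)
}

# Made-up sample bytes containing 0x00, 0x17 and 0xff
hd <- as.raw(c(0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x00,
               0x57, 0x6f, 0x72, 0x17, 0x64, 0xff))

txt <- clean_header(hd)

# The string has exactly as many characters as there were bytes
nchar(txt) == length(hd)

# A fixed-position field can still be read at its original offsets
substr(txt, 7, 9)
```

Because each bad byte is swapped for exactly one space, offsets computed against the original file layout keep working with `substr()`.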

Created on 2022-03-05 by the reprex package (v2.0.1)

Allan Cameron
  • I figured out how to do this a little differently from your approach, and used your technique to convert each non-ASCII character to a space. – Nate Lockwood Mar 07 '22 at 18:45