5

I have imported a string into R from a database. The db column type is BYTEA (Postgres). In order for me to use it as intended, it should be of type raw. Instead, it is of type character. I want to convert it to raw in the following sense:

The string representation is

\x1f8b080000000000

If I use charToRaw, it is converted to the array

5c 78 31 66 38 62 30 38 

Instead I need it to be the array

1f 8b 08 00 00 00 00 00

How do I acheive this.

Edit #1 Reply to Chris

library(RPostgreSQL)
conn <- dbConnect(dbDriver("PostgreSQL"), dbname = "somename",
                  host = "1.2.3.4", port = 5432,
                  user = "someuser", password = pw)
some_value <- dbGetQuery(conn, "select value from schema.key_value where key like '%somekey%' limit 1")

some_value$value
# [1] "\\x1f8b080000000000000
MichaelChirico
  • 33,841
  • 14
  • 113
  • 198
artdv
  • 774
  • 1
  • 8
  • 23
  • can you share the code you used to import from the database? – Chris Mar 31 '16 at 21:06
  • 1
    maybe cast to string in Postgres and then `charToRaw` once back in R? A better way would be to cast directly to raw in Postgres, not sure if that is possible though – Chris Mar 31 '16 at 21:29
  • Tried `select convert_to(value::text, 'utf-8') as value from schema.key_value ...` then `charToRaw()` without success. Valiant effort though. – artdv Mar 31 '16 at 21:44

1 Answers1

4

This works for converting a single character string of the type you've described to a vector of raws.

## The string I think you're talking about
dat <- "\\x1f8b080000000000"
cat(dat, "\n")
## \x1f8b080000000000

## A function to convert one string to an array of raw
f <- function(x)  {
    ## Break into two-character segments
    x <- strsplit(x, "(?<=.{2})", perl=TRUE)[[1]]
    ## Remove the first element, "\\x"
    x <- x[-1]
    ## Complete the conversion
    as.raw(as.hexmode(x))
}

## Check that it works
f(dat)
##  [1] 1f 8b 08 00 00 00 00 00
Josh O'Brien
  • 159,210
  • 26
  • 366
  • 455
  • I'm a little shocked that I can't do this in a native way. Seems really inefficient. I'm leaning toward handling this in python or bash. – artdv Apr 01 '16 at 01:19
  • @artdv Shocked that you can't more simply convert strings like `"\\x1f...."` to raw vectors? Or shocked that you can't get data directly from a database into a raw vector? In any case, yeah, I'd definitely suggest using something you like and are more comfortable with, if you've got it at the ready. – Josh O'Brien Apr 01 '16 at 01:29
  • The latter more than the former. If I want to store binary versions of R objects in a database, what is the desired approach? – artdv Apr 01 '16 at 03:06
  • @artdv -- I haven't a clue what's the best approach, esp. since I don't know your use case. Perhaps as paths to `"*.rds"` files (containing serialized R objects, as written to disk by `saveRDS()` (and readable by `readRDS()`))? – Josh O'Brien Apr 01 '16 at 03:30
  • `1f 8b` suggests this is a gzipped byte array. How do you decompress the result? – Brian D Jun 25 '18 at 17:15
  • @BrianD Does this help? https://stackoverflow.com/questions/5764499/decompress-gz-file-using-r – Josh O'Brien Jun 25 '18 at 17:24
  • not as much as this one: https://stackoverflow.com/questions/50996686/how-to-decode-postgresql-bytea-column-hex-to-int16-uint16-in-r ;) – Brian D Jun 25 '18 at 18:03