I am working with a dataset that has many values listed as "<.1". We treat these values by entering half of the value: 0.05, in this case. However, not every cell in the dataset has a < sign, so I can't just cut every value in the dataset in half. How do I conditionally edit the dataset to make this change? It seems like an easy fix, but I can't figure it out. Any help is much appreciated!

df <- as.data.frame(matrix(c("0.09", "<.1", "40", "<.07", ".2", "376", "<0.075", "<0.01", "14"), ncol = 3, byrow = TRUE))
df

I want my data to look like this:

df1 <- as.data.frame(matrix(c("0.09", "0.05", "40", "0.035", ".2", "376", "0.0375", "0.005", "14"), ncol = 3, byrow = TRUE))
df1

EDIT: I have done this:

x <- data[2]

data[2:23] <- lapply(data[2:23], function(x) {
  dohalf <- grepl("^<", x)
  vec2 <- as.numeric(gsub("^<", "", x))
  vec2[dohalf] <- vec2[dohalf]/2
})

but I get this error:

Error: Assigned data `lapply(...)` must be compatible with existing data.
x Existing data has 241 rows.
x Element 1 of assigned data has 71 rows.
i Only vectors of size 1 are recycled.

Is it because I have NA's? There are only 71 values that have this "<" in the first column out of 241 observations.

klarathon
  • I understand the need to change `<.1` to `0.05`. But how do you know that `.1` really means `<.1`? To me, they are very different things. Regardless, please provide sample data and expected output, making this a reproducible self-contained question. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. Thanks! – r2evans Apr 23 '21 at 16:01
  • These are obviously strings, are you intending to convert them to `numeric` with this refactoring? – r2evans Apr 23 '21 at 16:03
  • please provide the data and the code you tried and desired output – GuedesBF Apr 23 '21 at 18:52

1 Answer

Try this:

vec <- c("<.1", ".1", ".5")
dohalf <- grepl("^<", vec)
dohalf
# [1]  TRUE FALSE FALSE
vec2 <- as.numeric(gsub("^<", "", vec))
vec2[dohalf] <- vec2[dohalf]/2
vec2
# [1] 0.05 0.10 0.50

If you want to keep them as strings, then

as.character(vec2)
# [1] "0.05" "0.1"  "0.5" 

Or you can convert, halve, and turn back into strings only the entries that start with "<", leaving the others untouched:

vec
# [1] "<.1" ".1"  ".5" 
dohalf
# [1]  TRUE FALSE FALSE
vec[dohalf] <- as.character(as.numeric(gsub("^<", "", vec[dohalf]))/2)
vec
# [1] "0.05" ".1"   ".5"  
r2evans
  • Perfect! Is there any way to do this for multiple vectors at a time or is it only just one at a time? – klarathon Apr 23 '21 at 17:28
  • How are these multiple vectors stored? If they are columns of a `dat`aframe, then `dat[,1:3] <- lapply(dat[,1:3], function(vec) {...})` (use your preferred version in place of the `...`) will work. If they are a list of vectors not in a frame, similar steps work using `lapply`. – r2evans Apr 23 '21 at 17:32
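
Regarding the error in the question's EDIT: the NAs are probably not the cause. The function passed to `lapply` ends with an assignment, and in R the value of `vec2[dohalf] <- vec2[dohalf]/2` is its right-hand side, so the function returns only the halved values (71 of them) instead of the full column (241). A minimal sketch of the `lapply` approach from the comment above, applied to the sample `df` from the question, with the full vector returned on the last line. The helper name `halve_censored` is hypothetical and used only for illustration; this version converts every column to numeric.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps the
# columns as character on older R versions too.
df <- as.data.frame(
  matrix(c("0.09", "<.1", "40", "<.07", ".2", "376", "<0.075", "<0.01", "14"),
         ncol = 3, byrow = TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper (name chosen for illustration): halve values that
# start with "<" and return the whole column as numeric.
halve_censored <- function(x) {
  dohalf <- grepl("^<", x)                # TRUE where the value starts with "<"; FALSE for NA
  vec2 <- as.numeric(gsub("^<", "", x))   # strip the "<" and convert; NA stays NA
  vec2[dohalf] <- vec2[dohalf] / 2        # halve only the flagged values
  vec2                                    # return the full-length vector
}

# df[] keeps the data frame structure while replacing every column.
df[] <- lapply(df, halve_censored)
df
```

For the data in the question, the same helper should slot into the original call, e.g. `data[2:23] <- lapply(data[2:23], halve_censored)`, assuming columns 2 to 23 contain only numbers, NAs, and "<"-prefixed numbers.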