
I have two vectors that I would like to combine into one data frame. The values of one vector, `value`, need to be split across two columns. The second vector, `nc`, gives the number of values for each observation. If `nc` is 1, only one value is given in `value` (which goes into `val1`) and 999 is to be written in the second column (`val2`).

What is an R-ish way to split the vector `value` and populate the two columns of `df`? I suspect I am missing something very obvious, but I can't make progress at the moment. Many thanks!

set.seed(123)
nc <- sample(1:2, 10, replace = TRUE)
value <- sample(1:6, sum(nc), replace = TRUE)



# result by hand
df <- data.frame(nc = nc, 
               val1 = c(6, 3, 4, 1, 2, 2, 6, 5, 6, 5), 
               val2 = c(999, 5, 999, 6, 1, 999, 6, 4, 4, 999))  
joran
Fritzbrause
  • _"If nc is 1, only one value is given in values (which goes into val1) and 999 is to be written in the second column (val2)."_ <=> `df$val2 <- ifelse(df$nc == 1, 999, df$val2)` and `df$val1 <- ifelse(df$nc == 1, df$nc, df$val1)`? – lukeA Mar 10 '15 at 11:48
  • No, I don't think that will work: if, e.g., nc[1] == 2, then you need to pick the first two values from `value`; if nc[1] == 1, then only the first value from `value`. To find out which observation a value belongs to, I think one needs to iterate through the entire vector. – Fritzbrause Mar 10 '15 at 12:44
  • To clarify what exactly you need, please add the data frame with the expected output to your post. – lukeA Mar 10 '15 at 12:48

2 Answers


I think this is what you are looking for. I'm not sure it is the fastest way, but it should do the trick.

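# Start from a data frame holding nc; val1 and val2 are filled in the loop
df <- data.frame(nc = nc, val1 = NA, val2 = NA)
count <- 0  # position in value of the last entry consumed so far
for (i in seq_along(nc)) {
    count <- count + nc[i]
    if (nc[i] == 1) {
        # single value: it goes into val1, val2 gets the placeholder 999
        df$val1[i] <- value[count]
        df$val2[i] <- 999
    } else {
        # two values: the last two entries consumed belong to observation i
        df$val1[i] <- value[count - 1]
        df$val2[i] <- value[count]
    }
}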
RHA
  • Although this code possibly does return the correct result, it certainly is not the "r-ish way". – Roland Mar 10 '15 at 13:09
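The running counter in the loop above is simply `cumsum(nc)`, so the same indexing can be done without a loop. The following is a minimal sketch of that idea, not code from either answer. The inputs are written out explicitly (reconstructed from the hand-built result in the question) because `sample()` output for a given seed can differ between R versions; the names `first`, `last` and `df_vec` are made up for illustration.

```r
# Inputs written out explicitly, reconstructed from the question's hand-built result
nc    <- c(1, 2, 1, 2, 2, 1, 2, 2, 2, 1)
value <- c(6, 3, 5, 4, 1, 6, 2, 1, 2, 6, 6, 5, 4, 6, 4, 5)

last  <- cumsum(nc)      # position in value of each observation's last entry
first <- last - nc + 1   # position in value of each observation's first entry

df_vec <- data.frame(
  nc   = nc,
  val1 = value[first],
  # when nc is 2 the second value sits at `last`; otherwise use the placeholder
  val2 = ifelse(nc == 2, value[last], 999)
)
```

This assumes `nc` only ever contains 1 or 2, as in the question; a third value per observation would need an extra column.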

Here's an approach based on this answer:

set.seed(123)
nc <- sample(1:2, 10, replace = TRUE)
value <- sample(1:6, sum(nc), replace = TRUE)

splitUsing <- function(x, pos) {
    unname(split(x, cumsum(seq_along(x) %in% cumsum(replace(pos, 1, pos[1] + 1)))))
}

combineValues <- function(vals, nums) {
    mydf <- data.frame(cbind(nums, do.call(rbind, splitUsing(vals, nums))))
    mydf$V3[mydf$nums == 1] <- 999
    return(mydf)
}

df <- combineValues(value, nc)
Community
stuwest
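The grouping step in the answer above can also be written with `rep()`: repeating each observation index `nc[i]` times gives a grouping vector for `split()` directly. The sketch below is one possible variant under that assumption, not stuwest's code. It pads short groups with 999 explicitly rather than relying on `rbind()` recycling, and the names `groups`, `padded` and `df_split` are hypothetical.

```r
# Inputs written out explicitly, reconstructed from the question's hand-built result
nc    <- c(1, 2, 1, 2, 2, 1, 2, 2, 2, 1)
value <- c(6, 3, 5, 4, 1, 6, 2, 1, 2, 6, 6, 5, 4, 6, 4, 5)

# rep(seq_along(nc), nc) labels every entry of value with its observation number
groups <- split(value, rep(seq_along(nc), nc))

# Pad each group to length 2 with 999, then turn the 2 x n result into n x 2
padded <- t(vapply(
  groups,
  function(v) c(as.numeric(v), rep(999, 2 - length(v))),
  numeric(2)
))

df_split <- data.frame(
  nc   = nc,
  val1 = unname(padded[, 1]),
  val2 = unname(padded[, 2])
)
```

As with the other approaches, this assumes every entry of `nc` is 1 or 2 and that `sum(nc)` equals `length(value)`.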