Specific sequence creation in R

Question

I want to create the following sequences in a smart way instead of hard-coding them:

'0-0-0-0-0-0'
'0-1-0-0-0-0'
'0-0-1-0-0-0'
'0-0-0-1-0-0'
'0-0-0-0-1-0'
'0-0-0-0-0-1'
'1-0-0-0-0-0'
'1-1-0-0-0-0'
'1-0-1-0-0-0'
'1-0-0-1-0-0'
'1-0-0-0-1-0'
'1-0-0-0-0-1'
'2-0-0-0-0-0'
'2-1-0-0-0-0'
'2-0-1-0-0-0'
'2-0-0-1-0-0'
'2-0-0-0-1-0'
'2-0-0-0-0-1'
'3-0-0-0-0-0'
'3-1-0-0-0-0'
'3-0-1-0-0-0'
'3-0-0-1-0-0'
'3-0-0-0-1-0'
'3-0-0-0-0-1'
'0-2-0-0-0-0'
'0-0-2-0-0-0'
'0-0-0-2-0-0'
'0-0-0-0-2-0'
'0-0-0-0-0-2'
 and so on...

Elaborating more on the details of the pattern that presents: I have 4 states {0,1,2,3} and I want to find all the possible combinations for sequences of length=6 starting with any of the states and allowing only one intermediate position of the sequence to be present in any of the next positions.

I'm not quite clear on what "and so on..." means. For example, is `3-0-2-0-0-0` one of the things covered? How many elements are you trying to generate all together? — John Coleman, Jun 04 '19 at 17:14
@JohnColeman Yes, it is one of those elements I want to create. I do not know how many will be created in total, but only one of the positions after the first one can be in another state than 0 given all different states at position one. — azal, Jun 04 '19 at 17:16
If I understand correctly, that should yield `4*(3*5+1) = 64` possibilities. — John Coleman, Jun 04 '19 at 17:18
Could you clarify what you mean by *intermediate position*? Do you just mean "at most one other non-zero", or are there restrictions (like perhaps the one other non-zero must be less than the first value). — Gregor Thomas, Jun 04 '19 at 17:24
Intermediate positions = positions after the first one. In these positions, I want each status to appear only once, in each position, given all possible statuses at the first position — azal, Jun 04 '19 at 17:26
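
The count of 64 from the comments can be checked by brute force. The following is a minimal sketch, assuming the rule means "at most one non-zero value after the first position" (the reading the commenters settled on); the variable names are illustrative and not from the original posts.

```r
states <- 0:3

# Every length-6 sequence over the four states (4^6 = 4096 rows)
all_seqs <- as.matrix(expand.grid(rep(list(states), 6)))

# Number of non-zero entries in positions 2..6 of each sequence
n_nonzero_tail <- rowSums(all_seqs[, -1] != 0)

# Keep the sequences with at most one non-zero intermediate position
valid <- all_seqs[n_nonzero_tail <= 1, , drop = FALSE]

nrow(valid)
# Closed form: 4 first states * (3 non-zero values * 5 positions + 1 all-zero tail)
length(states) * ((length(states) - 1) * 5 + 1)
```

Both expressions should evaluate to 64, which agrees with `4*(3*5+1)`.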

Gregor Thomas · Accepted Answer · 2019-06-05T13:01:03.503

3

Here's one method. I generate a simple description of each sequence, then build the sequences (and de-duplicate, which is needed because of the all-intermediate-0 items).

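# One row per description: first state, intermediate value, and its position
dd = expand.grid(first = 0:3, inter_value = 0:3, inter_position = 2:6)

# Build each sequence: the first state followed by zeros, with one intermediate position set
result = t(apply(dd, 1, function(x) {
  z = c(x["first"], rep(0L, 5))
  z[x["inter_position"]] = x["inter_value"]
  z
}))

# inter_value = 0 gives the same row for every inter_position, so drop the duplicates
result = result[!duplicated(result), ]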

dim(result)
# [1] 64  6
head(result, 10)
#       first          
#  [1,]     0 0 0 0 0 0
#  [2,]     1 0 0 0 0 0
#  [3,]     2 0 0 0 0 0
#  [4,]     3 0 0 0 0 0
#  [5,]     0 1 0 0 0 0
#  [6,]     1 1 0 0 0 0
#  [7,]     2 1 0 0 0 0
#  [8,]     3 1 0 0 0 0
#  [9,]     0 2 0 0 0 0
# [10,]     1 2 0 0 0 0

Getting the dashes:

apply(result, 1, paste, collapse = "-")
#  [1] "0-0-0-0-0-0" "1-0-0-0-0-0" "2-0-0-0-0-0" "3-0-0-0-0-0" "0-1-0-0-0-0" "1-1-0-0-0-0" "2-1-0-0-0-0"
#  [8] "3-1-0-0-0-0" "0-2-0-0-0-0" "1-2-0-0-0-0" "2-2-0-0-0-0" "3-2-0-0-0-0" "0-3-0-0-0-0" "1-3-0-0-0-0"
# [15] "2-3-0-0-0-0" "3-3-0-0-0-0" "0-0-1-0-0-0" "1-0-1-0-0-0" "2-0-1-0-0-0" "3-0-1-0-0-0" "0-0-2-0-0-0"
# [22] "1-0-2-0-0-0" "2-0-2-0-0-0" "3-0-2-0-0-0" "0-0-3-0-0-0" "1-0-3-0-0-0" "2-0-3-0-0-0" "3-0-3-0-0-0"
# [29] "0-0-0-1-0-0" "1-0-0-1-0-0" "2-0-0-1-0-0" "3-0-0-1-0-0" "0-0-0-2-0-0" "1-0-0-2-0-0" "2-0-0-2-0-0"
# [36] "3-0-0-2-0-0" "0-0-0-3-0-0" "1-0-0-3-0-0" "2-0-0-3-0-0" "3-0-0-3-0-0" "0-0-0-0-1-0" "1-0-0-0-1-0"
# [43] "2-0-0-0-1-0" "3-0-0-0-1-0" "0-0-0-0-2-0" "1-0-0-0-2-0" "2-0-0-0-2-0" "3-0-0-0-2-0" "0-0-0-0-3-0"
# [50] "1-0-0-0-3-0" "2-0-0-0-3-0" "3-0-0-0-3-0" "0-0-0-0-0-1" "1-0-0-0-0-1" "2-0-0-0-0-1" "3-0-0-0-0-1"
# [57] "0-0-0-0-0-2" "1-0-0-0-0-2" "2-0-0-0-0-2" "3-0-0-0-0-2" "0-0-0-0-0-3" "1-0-0-0-0-3" "2-0-0-0-0-3"
# [64] "3-0-0-0-0-3"
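
The same idea can be wrapped in a reusable function that avoids the de-duplication step by enumerating only non-baseline intermediate values. This is a sketch under stated assumptions, not part of the original answer: the function name `one_off_sequences` and its arguments are hypothetical, `baseline` is assumed to be one of `states`, and `len` is assumed to be at least 2.

```r
# Hypothetical helper: all sequences with at most one non-baseline
# value after the first position, returned as dash-separated strings.
one_off_sequences <- function(states = 0:3, len = 6, baseline = 0) {
  # Descriptions of sequences with exactly one non-baseline intermediate value
  dd <- expand.grid(first = states,
                    inter_value = setdiff(states, baseline),
                    inter_position = 2:len)

  # Start from all-baseline rows, then fill in the first state
  m <- matrix(baseline, nrow = nrow(dd), ncol = len)
  m[, 1] <- dd$first
  # Matrix indexing: set one (row, column) cell per description
  m[cbind(seq_len(nrow(dd)), dd$inter_position)] <- dd$inter_value

  # Sequences whose intermediate positions are all baseline
  base_rows <- cbind(states, matrix(baseline, length(states), len - 1))

  out <- rbind(base_rows, m)
  dimnames(out) <- NULL
  apply(out, 1, paste, collapse = "-")
}

res <- one_off_sequences()
length(res)
"3-0-2-0-0-0" %in% res
```

With the defaults this should produce the same 64 strings as above, though in a different order.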

edited Jun 05 '19 at 13:01

answered Jun 04 '19 at 17:32

Gregor Thomas


How can I get the output in the format of "0-0-0-0-0-0" (string with dashes) instead of the current one? – azal Jun 05 '19 at 10:04
Added the dashes at the bottom. – Gregor Thomas Jun 05 '19 at 13:01
I was expecting more statuses. E.g. statuses 0-1-1-0-0-0, 0-1-1-1-0-0 etc. are missing. I'd like if possible to include all possible combinations – azal Jun 12 '19 at 12:12
I don't understand. You stated "only one intermediate position of the sequence to be present", and defined "intermediate position" as *all positions after the first*. Maybe the issue is the definition of "present", but my answer, the other answer, and the comments all assumed "present" to mean *not zero*. You also said "I want each status to appear only once, in each position". To me, 0-1-1-0-0-0 has 2 intermediate positions present and the 1 appears twice, therefore it is doubly invalid. If that's not what you want, I would recommend asking a new question with a clearer explanation. – Gregor Thomas Jun 12 '19 at 13:16
Generally, over a week after multiple answers were posted is late to be fundamentally changing the question's requirements. I think you'll have more success getting the answer you want by considering this question complete and asking a new question, perhaps using this one as a reference, than by editing this one so late. – Gregor Thomas Jun 12 '19 at 13:20
You're absolutely right; my mistake. I just want to see the difference between the two snippets in order to better understand the code, so as to generate any future sequences without asking a question. – azal Jun 12 '19 at 13:25
We'd take a much simpler approach to generate all combinations, something like `all_combi = do.call(expand.grid, rep(list(0:3), 6))` in base R (which is just a fancy way to write `expand.grid(0:3, 0:3, 0:3, 0:3, 0:3, 0:3)`). Or using non-base packages, `RcppAlgos::permuteGeneral(0:3, 6, repetition = TRUE)` would work. – Gregor Thomas Jun 12 '19 at 13:41
For general combinations and permutations, [this answer](https://stackoverflow.com/a/47983855/903061) is extremely thorough. – Gregor Thomas Jun 12 '19 at 13:42
Brilliant! Thanks : ) – azal Jun 12 '19 at 13:45
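
For comparison with the restricted set, the unrestricted `expand.grid` approach from the comment above can be sketched as follows. The variable name `all_strings` is illustrative, and `do.call(paste, ...)` is just one common base-R way to join the columns row by row.

```r
# All 4^6 = 4096 length-6 sequences over the states 0..3 (no restriction)
all_combi <- do.call(expand.grid, rep(list(0:3), 6))

# Paste the six columns together row by row, separated by dashes
all_strings <- do.call(paste, c(all_combi, sep = "-"))

length(all_strings)
# The sequences mentioned as "missing" are part of this larger set
c("0-1-1-0-0-0", "0-1-1-1-0-0") %in% all_strings
```

The difference between the two snippets is the rule being enumerated: the answer describes each sequence by three choices (first state, one intermediate value, its position), while this version lets every position vary freely.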

score 1 · Answer 2 · answered Jun 04 '19 at 18:56

Here's a general nested for-loop solution. It's not the most efficient approach, but it produces the desired result (note: you can change `states` and/or `sequence_len`, and the sequences will be generated automatically):

states <- 0:3
states_len <- length(states)
sequence_len <- 6
sequence_mat <- matrix(0, states_len * ((states_len - 1) * (sequence_len - 1) + 1), sequence_len)
rw <- 1
for(ii in states){
  for(jj in states){
    for(kk in 2:sequence_len){
      if(jj != 0){
        rw = rw + 1
      }
      sequence_mat[rw, 1] <- ii
      sequence_mat[rw, kk] <- jj
      if(jj == rev(states)[1] && kk == sequence_len){
        rw = rw + 1
      }
    }
  }
}

Output:

> head(sequence_mat, 20)
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    0    0    0    0    0    0
 [2,]    0    1    0    0    0    0
 [3,]    0    0    1    0    0    0
 [4,]    0    0    0    1    0    0
 [5,]    0    0    0    0    1    0
 [6,]    0    0    0    0    0    1
 [7,]    0    2    0    0    0    0
 [8,]    0    0    2    0    0    0
 [9,]    0    0    0    2    0    0
[10,]    0    0    0    0    2    0
[11,]    0    0    0    0    0    2
[12,]    0    3    0    0    0    0
[13,]    0    0    3    0    0    0
[14,]    0    0    0    3    0    0
[15,]    0    0    0    0    3    0
[16,]    0    0    0    0    0    3
[17,]    1    0    0    0    0    0
[18,]    1    1    0    0    0    0
[19,]    1    0    1    0    0    0
[20,]    1    0    0    1    0    0
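
A loop-based variant that avoids the manual row counter is sketched below. It is an alternative formulation rather than the answer's own code, and it assumes, as the loop above does, that 0 is the baseline state. Each sequence is built from a base vector and collected in a list.

```r
states <- 0:3
sequence_len <- 6

rows <- list()
for (first in states) {
  # Base sequence: the first state followed by zeros
  base <- c(first, rep(0, sequence_len - 1))
  rows[[length(rows) + 1]] <- base
  # One non-zero value placed at each intermediate position in turn
  for (value in states[states != 0]) {
    for (pos in 2:sequence_len) {
      s <- base
      s[pos] <- value
      rows[[length(rows) + 1]] <- s
    }
  }
}

# Stack the collected sequences into a matrix, one sequence per row
sequence_mat <- do.call(rbind, rows)
dim(sequence_mat)
```

Growing a list inside a loop is acceptable at this size; for much larger state spaces, preallocating the matrix as in the answer would likely be faster.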
