strsplit(rquote, split = "")[[1]] in R

Question

rquote <- "r's internals are irrefutably intriguing"
chars <- strsplit(rquote, split = "")[[1]]

This question has been asked before on this forum and has one answer on it but I couldn't understand anything from that answer, so here I am asking this question again.

In the above code what is the meaning of [[1]] ?

The program that I'm trying to run:

rquote <- "r's internals are irrefutably intriguing"
chars <- strsplit(rquote, split = "")[[1]]

rcount <- 0

for (char in chars) {
  if (char == "r") {
    rcount <- rcount + 1
  }
  if (char == "u") {
    break
  }
}

print(rcount)

When I don't use [[1]], I get the following warning message in the for loop, and rcount comes out as 1 instead of the expected 5:

Warning message: the condition has length > 1 and only the first element will be used

Does this answer your question? [The difference between bracket \[ \] and double bracket \[\[ \]\] for accessing the elements of a list or dataframe](https://stackoverflow.com/questions/1169456/the-difference-between-bracket-and-double-bracket-for-accessing-the-el) — user438383, Jun 25 '22 at 13:18
@user438383 No, it doesn't answer my question. The answer in the link you provided is about selecting/subsetting elements of data frames, lists, etc. — salman, Jun 25 '22 at 13:23
What do you think `strsplit(rquote, split = "")` returns then? — user438383, Jun 25 '22 at 13:25
@user438383 It splits the string and returns characters. That means that with `[[1]]`, `chars` should display only the first of those characters. But it doesn't; it displays all of them, just as it does when I leave out `[[1]]`. Without `[[1]]`, though, I run into an error when I use `chars` in the `for` loop. Other than that, the output of `chars` is the same with or without `[[1]]`. — salman, Jun 25 '22 at 13:32
@Salman it returns a **list**. You can access the first element of the list using [[1]] and then [[1]][2] for e.g. the second character in the first element of the list. — user438383, Jun 25 '22 at 13:37
@Chris I'm getting the same results for `char` with or without `[[1]]`. But if I leave out `[[1]]`, I get an error when I use `char` in the `for` loop. — salman, Jun 25 '22 at 13:39
@user438383 let me update the question with the complete program I'm trying to run and the errors I get without `[[1]]` — salman, Jun 25 '22 at 13:43
Without `[[1]]`, `strsplit()` returns a list: it prints as `[[1]]` followed by the contents of that slot (`[1]`, `[2]`, ...). Adding `[[1]]` after the closing `)` extracts that slot, which here effectively `unlist`s the returned object to `character`. This then determines what you should use in your loop: `[[` for a list, `[` for a vector. How granular the result is (each character vs. words) is determined by what you select as your split string. — Chris, Jun 25 '22 at 13:44
@Chris can you post it in the answer and explain it a little bit more. Thank you. — salman, Jun 25 '22 at 13:52

Ivana · Accepted Answer · 2022-06-25T14:09:41.640

strsplit is vectorized. That means it splits each element of a vector into a vector of pieces. To handle this vector of vectors, it returns a list in which each slot (indexed by [[) corresponds to one element of the input vector.

If you use the function on a one element vector (single string as you do), you get a one-slot list. Using [[1]] right after strsplit() selects the first slot of the list - the anticipated vector.

Unfortunately, a list also works in a for loop, so you get no error: the loop runs exactly once, with char set to the one slot, i.e. the whole vector of letters. In the if, you then compare that entire vector against "r", which throws the warning. Since the first element of the comparison is TRUE ("r" is the first letter), the condition holds and rcount is raised by 1, which is your result. Because you are iterating over the one phrase rather than over its letters, the loop stops there.
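To make the "one iteration" point concrete, here is a minimal sketch. The names `chars_list`, `iterations`, `n_in_char` and `first_is_r` are made up for this illustration and are not from the question. Note that from R 4.2.0 onward a condition of length greater than one in `if` is an error rather than a warning, so the sketch inspects the comparison without passing the whole vector to `if`.

```r
rquote <- "r's internals are irrefutably intriguing"
chars_list <- strsplit(rquote, split = "")   # no [[1]]: a list with one slot

iterations <- 0
for (char in chars_list) {
  iterations <- iterations + 1
  # Here char is the whole character vector, not a single letter.
  n_in_char  <- length(char)
  # Only the first element of the comparison was used by the old if().
  first_is_r <- (char == "r")[1]
}

iterations   # 1: the loop body ran once
n_in_char    # 40: char held all 40 characters
first_is_r   # TRUE: which is why rcount ended up as 1
```

With `[[1]]` appended, the same loop would run 40 times, once per letter.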

Maybe if you run something like strsplit(c("one", "two"), split=""), the outcome will be clearer.

> strsplit(c("one", "two"), split="")
[[1]]
[1] "o" "n" "e"

[[2]]
[1] "t" "w" "o"

> strsplit(c("one", "two"), split="")[[1]] 
[1] "o" "n" "e"

> strsplit(c("one"), split="")[[1]][2] 
[1] "n"
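Printing alone can make the list and the vector look alike, so it may help to check the types directly. This is a small sketch using only base R; `res` is just an illustrative variable name.

```r
res <- strsplit("one", split = "")

class(res)        # "list"
class(res[1])     # "list": single bracket keeps the list wrapper
class(res[[1]])   # "character": double bracket extracts the slot itself
length(res)       # 1: one slot, because the input had one string
length(res[[1]])  # 3: the letters "o", "n", "e"

# For a one-slot list, [[1]] and unlist() give the same character vector.
identical(res[[1]], unlist(res))  # TRUE
```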

Thanks, that clears up a lot of things. Can you explain this part a bit more: "Since you are not indexing the letters but the one phrase, the cycle stops there." Does leaving out `[[1]]` make the loop treat the whole phrase as one item rather than going element by element? — salman, Jun 25 '22 at 14:35
When you add `[[1]]` after `strsplit()`, your `chars` will be a vector of letters (as in my 2nd example) and the loop will go letter by letter. — Ivana, Jun 25 '22 at 15:56

Chris · Answer 2 · 2022-06-25T15:21:39.820

We'll start with the below as data, without [[1]]:

rquote <- "r's internals are irrefutably intriguing"
chars2 <- strsplit(rquote, split = "")
class(chars2)
[1] "list"

It is always good to have an estimate of your expected return value, here your '5'. To inspect the object we have both length() and lengths().

length(chars2)
[1] 1     # number of slots in our list (one per input string)
lengths(chars2)
[1] 40    # number of characters inside that slot

We'll use lengths() as the upper bound of our for loop (this works here because the list has a single slot) and, as you did, set up the counter outside the loop:

rcount2 <- 0
for (i in 1:lengths(chars2)) {
  if (chars2[[1]][i] == 'r') {
    rcount2 <- rcount2 + 1
  }
  if (chars2[[1]][i] == 'u') {
    break
  }
}
print(rcount2)
[1] 6
length(which(chars2[[1]] == 'r')) # total number of 'r' in the whole string (no break applied)
[1] 6

Now suppose that, rather than a list, we have a character vector:

chars1 <- strsplit(rquote, split = '')[[1]]
length(chars1)
[1] 40
rcount1 <- 0
for (i in 1:length(chars1)) {
  if (chars1[i] == 'r') {
    rcount1 <- rcount1 + 1
  }
  if (chars1[i] == 'u') {
    break
  }
}
print(rcount1)
[1] 5
length(which(chars1 == 'r'))
[1] 6

Hey, there's your '5'. What's going on here? Head scratch...

all.equal(chars1, unlist(chars2))
[1] TRUE

That break should give us just the 5 'r's that occur before the first 'u' is encountered (position 24); the sixth 'r' comes later, in "intriguing". What's happening when it's a list (or does that matter...?), and how does the final r make it into rcount2?

And this is where the fun begins. Jeez. After a break for coffee and some thinking, I re-ran the list version and it runs okay: rcount2 is 5 as well, so the 6 above was the usual morning hallucination. They come and go. But, as a final note, when you really want to torture yourself, put browser() inside your for loop and step through.

Browse[1]> i
[1] 24
Browse[1]> n
debug at #7: break
Browse[1]> chars2[[1]][i] == 'u'
[1] TRUE
Browse[1]> n
> rcount2
[1] 5
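
As a cross-check that avoids the loop altogether, here is a minimal sketch. `count_r_before_u` is a hypothetical helper name introduced only for this illustration; it is not from the question or from base R. It finds the first 'u' with match() and counts the 'r's in front of it, once for the vector and once for the list slot.

```r
rquote <- "r's internals are irrefutably intriguing"
chars2 <- strsplit(rquote, split = "")   # list with one slot
chars1 <- chars2[[1]]                    # character vector of length 40

# Hypothetical helper: count "r" up to, but not including, the first "u".
count_r_before_u <- function(x) {
  first_u <- match("u", x)               # position of first "u", NA if none
  if (is.na(first_u)) return(sum(x == "r"))
  sum(x[seq_len(first_u - 1)] == "r")
}

count_r_before_u(chars1)        # 5
count_r_before_u(chars2[[1]])   # 5: same data, so same answer
match("u", chars1)              # 24: where the loop hits break
sum(chars1 == "r")              # 6: every "r", including the one after the "u"
```

Both the list slot and the vector hold the same 40 characters, so list vs. vector makes no difference to the count; only the break does.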
