I have a text document from which I would like to extract substrings. I'm using the following code:

substr(text, start,start+end)

start is a vector of 60 elements. However, the above code only returns the equivalent of substr(text,11009,11009+199). How do I get it to return all 60 substrings, namely

  • substr(text,11009,11009+199)
  • substr(text,11590,11590+199)
  • ....

Sample data

start

[1] 11009 11590 11972 15674 16274 16659 19866 20541 20963 24787 25376
[12] 25746 29458 30011 30363 34086 34702 35087 38643 39095 39416 42626
[23] 43188 43545 46731 47367 47757 51029 51673 52072 55444 56076 56470
[34] 59794 60445 60851 64267 64877 65276 68659 69200 69547 72747 73303
[45] 73657 76896 77648 78103 81541 82050 82391 85277 85848 86227 89128
[56] 89656 90010 92830 93329 93656

end

[1] 199 199 199 201 201 201 218 218 218 186 186 186 177 177 177 192 192
[18] 192 160 160 160 178 178 178 194 194 194 200 200 200 197 197 197 205
[35] 205 205 200 200 200 174 174 174 178 178 178 235 235 235 171 171 171
[52] 190 190 190 179 179 179 169 169 169
Pop
Paolo
  • Can you provide a minimal [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) example? – johannes Jul 12 '12 at 07:05
  • Maybe using mapply? `mapply(FUN = function(x, y) substr(text, x, x+y), x = start, y = end)` (not tested). – Roman Luštrik Jul 12 '12 at 07:22
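
The `mapply` suggestion in the comment above can be tried on a small example. This is only a sketch: `text`, `start`, and `end` below are made-up toy values standing in for the real document and the 60 offsets, and `pieces` is a name introduced here for illustration.

```r
# Toy stand-ins for the real data (assumptions, not the asker's values)
text  <- "abcdefghijklmnopqrstuvwxyz"
start <- c(1, 5, 10)
end   <- c(2, 3, 4)   # added to start, as in the question

# Call substr() once per (start, end) pair and collect the results
pieces <- mapply(function(x, y) substr(text, x, x + y), start, end)
pieces
# [1] "abc"   "efgh"  "jklmn"
```

Because the stop position is `start + end` and both ends are inclusive, each piece has `end + 1` characters.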

1 Answer

Instead of substr, you could use substring:

substring(your_text,first=start,last=(start+end))
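
As a rough illustration of the difference, here is a small sketch using a made-up sentence and made-up offsets (not the asker's data). The names `single` and `every` are introduced only for this example.

```r
# Toy input: one string, three start positions, three offsets
text  <- "The quick brown fox jumps over the lazy dog"
start <- c(1, 5, 11)
end   <- c(2, 4, 4)

# substr() returns length(text) results, so only start[1] and end[1] are used
single <- substr(text, start, start + end)
single
# [1] "The"

# substring() recycles text to match the longest of first and last
every <- substring(text, first = start, last = start + end)
every
# [1] "The"   "quick" "brown"
```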
Paul Hiemstra
Pop
  • hey this works! how did you know the difference between substr and substring? – Paolo Jul 12 '12 at 07:29
  • I just browsed through the examples in the help page of substr and substring. – Pop Jul 12 '12 at 07:35
  • i do that but i never understand it. thanks for helping me out. – Paolo Jul 12 '12 at 07:43
  • `substr` takes one start value and one end value, and returns one string. `substring` takes `n` start values and the corresponding `n` end values, returning `n` strings, such that string `i` begins at `first[i]` and ends at `last[i]`, where `i` is in `1:n`. – Omar Wagih Aug 05 '14 at 11:37
  • @by0 that's not strictly true. For `substr(x, start, stop)`, you can pass `length(x)` `start` values and `length(x)` `stop` values. See the examples in the documentation. – De Novo Sep 27 '17 at 21:28
  • To clarify, the difference here is that `substring(x, first, last)` will recycle vector `x` to meet the length of the longest of the vectors `first` and `last`. `substr(x, start, stop)` does not recycle `x`. So yes, in the special case where `length(x)` is 1, only the first element of `start` and the first element of `stop` are used. – De Novo Sep 27 '17 at 22:43
  • Another difference is that `substring()` has a default value for `last` (its counterpart to `stop` in `substr()`), so you do not need to specify it if you want everything from the start position to the end of the string. – s_baldur Jul 09 '18 at 07:53
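
The points raised in these comments can be checked with a short sketch. The strings and positions below are invented for illustration, and `per_element` and `to_end` are names introduced here.

```r
# When x itself is a vector, substr() pairs each element with its own
# start and stop, returning length(x) strings
x <- c("alpha", "bravo", "charlie")
per_element <- substr(x, c(1, 2, 3), c(2, 4, 6))
per_element
# [1] "al"   "rav"  "arli"

# substring() can omit last; each result then runs to the end of the string
to_end <- substring("charlie", c(1, 3, 5))
to_end
# [1] "charlie" "arlie"   "lie"
```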