
Suppose I have a long continuous string like this:

x='aattggcccagtgtgtacaatcagtgcaggagctaatcccggactccttcgtcccctgtgtcgctgcgctgtgcagcgacgaagccgagcggctcactcgtctcaatcacctcagcttcgcggagctgcttaagcccttctcccgcctcacttccgaggttcacatgagagatcctaataatcaacttcacgtaattaaaaatttgaagatagcagtaagcaacattgtcacccagccacctcagcctggagccatccggaagcttttgaatgatgttgtttctggcagtcagcctgcagaaggattagtagctaatgtgattacagcaggagattatgaccttaacatcagtgAAAAAGCAAAGGACAAAAGATCTTTCTCGGGTGTTTCATTCTTACAGTCCATATGATCACAAGATTGGTGAAAGACAACAAGTGTTAGTAACAGAAGAATCTTTTGATTCCAAGTTTTATGTTGCACACAATCAATTCTATGAGCAGGTTTTAGTGCCAAAGAACCCTGCGTTCATGGGGAAGATGGTTGAAGTGGACATCTATGAATCAGGCAAACATTTTATGAAAGGGCAGCCAGTATCTGATGCCAAAGTGTACACGCCCTCCATCAGCAAACCGCTAGCAAAGGGAGAAGTCTCGGGTTTGACAAAGGACTTCAGAAATGGGCTTGGGAACCAGCTGAGTTCAGGATCCCACACCTCTGCTGCATCTCAGTGTGACTCAGCGAGTTCCAGAATGGTGCTGCCCATGCCAAGGCTACATCAAGACTGTGCGCTGAGGATGTCCGTGGGCTTGGCTCTGCTGGGTCTTCTTTTTGCTTTTTTTGTCAAGGTCTATAATTAGGGA'

I would like to split this string into chunks of 10 characters, separated by spaces. So far I have tried the following, but it is a bit clumsy and doesn't always bin the characters into groups of 10:

regmatches(x, gregexpr(".{10}|.{9}|.{8}|.{7}|.{6}|.{5}|.{4}|.{3}|.{2}|.{1}", x))[[1]]

The issue with the above code is that when the string's length is not a multiple of 10, the shorter lengths for the final chunk (9, 8, ..., 1) have to be hard-coded.

user438383
Ahdee

1 Answer


This is a little bit cheesy, but:

c(                                  ## drop the "na.action" attribute left by na.omit()
  na.omit(                          ## drop NA values (from end)
    unlist(                         ## collapse from data frame to vector
      read.fwf(                     ## read fixed-width "file"
        textConnection(x),          ## treat string as a file
        widths = rep(10,            ## string width
                     1000)          ## a 'big enough' value
      ))))

Or, if you prefer, using the native pipe |> (available in R 4.1.0 and later):

(x
   |> textConnection()
   |> read.fwf(widths = rep(10, 1000))
   |> unlist()
   |> na.omit()
   |> c()
)
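
For comparison, here is a minimal base-R sketch that avoids guessing a 'big enough' number of columns. It is not part of the answer above: the helper name split_fixed and the short test string y are made up for illustration, and the approach assumes a single string. It computes the start position of each chunk and lets substring() truncate the final one.

```r
# Hypothetical helper (the name is illustrative, not from the answer above):
# split one string into chunks of `width` characters; the last chunk may be shorter.
split_fixed <- function(s, width = 10) {
  n <- nchar(s)
  if (n == 0) return(character(0))      # seq() below would fail on an empty string
  starts <- seq(1, n, by = width)       # 1, 11, 21, ...
  substring(s, starts, starts + width - 1)
}

y <- "aattggcccagtgtgtacaatcag"          # first 24 characters of x, as a stand-in
chunks <- split_fixed(y)
spaced <- paste(chunks, collapse = " ")  # chunks joined with single spaces

# A greedy quantifier does the same job as the long alternation in the question:
chunks_re <- regmatches(y, gregexpr(".{1,10}", y))[[1]]
```

Here chunks holds "aattggccca", "gtgtgtacaa" and "tcag", and spaced is the space-separated string asked for in the question. Calling split_fixed(x) on the full sequence should work the same way.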
Ben Bolker