-1

I have a huge csv file containing text which I want to break in to line 80 characters long. The small fragment of the file is as follows:

ATTTATGAAGGAGAGGGGTCAGGGTTGATTCGGGAGGATCCTATTGGTGCGGGGGCTTTGTATGATTATGGGCGTTGATTAGTAGTAGTTACTGGTTGAACATTGTTTGTTGGTGTATATATTGTAATTGAGATTGCTCGGGGGAATAGGATGATGTATGCTTTGTTTCTGTTGAGTGTGGGTTTAGTAATGGGGTTTGTGGGGTTTTCTTCTAAGCCTTCTCCTATTTATGGGGGTTTAGTATTGATTGTTAGCGGTGTGGTCGGGTGTGTTATTATTCTGAATTTTGGGGGAGGTTATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTATTCCTCATCACCCAACTAAAAATATTAAACACAAACTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAATTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCGCTTCATTCATTGCCCCCACAATCCTAGATGCCCCAACTAAATACTACCGTATGGCCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAACACAAACTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAATTATAACAAACCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTAACCTGACTAGAAAAGCTATTACCTAAAACAATTTCACAGCACCAAATCTCCACCTCCATCATCACCTCAACCCAAAAAGGCATAATTAAACTTTACTTCCTCTCTTTCTTCTTCCCACTCATCCTAACCCTACTCCTAATCACATAAATAACCATGCACACTACTATAACCACCCTAACCCTGACTTCCCTAATTCCCCCCATCCTTACCACCCTCGTTAACCCTAACAAAAAAAACTCATACCCCCATTATGTAAAATCCATTGTCGCATCCACCTTTATTATCAGTCTCTTCCCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATCTAGAAATTGCCCTCCTTTTACCCCTACCATGAGCCCTACAAACAACTAACCTGCCACTAATAGTTATGTCATCCCTCTTATTAATCATCATCCTAGCCCTAAGTCTGGCCTATGAGTGACTACAAAAAGGATTAGACTGAACCGAATATAAACTTCGCCTTAATTTTAATAATCAACACCCTCCTAGCCTTACTACTAATAATTATTACATTTTGACTACCACAACTCAACGGCTAC

How do I do this in R?

hrbrmstr
  • 77,368
  • 11
  • 139
  • 205
Pete
  • 81
  • 1
  • 8
  • 1
    If you're on linux or OS X you can do a `fold -w 80 FILENAME > newfile.txt` from a terminal/shell which will wrap any file at 80 chars. A side note that you really don't have a CSV file if it's just a long string of gene sequences. – hrbrmstr Dec 11 '14 at 12:13

1 Answers1

2

Try

 lines <- readLines('bigline.txt') 
 v1 <- strsplit(lines,'(?<=[A-Z]{80})', perl=TRUE)[[1]]

  nchar(v1)
 #[1] 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 10

 identical(v1[1], substr(lines,1,80))
 #[1] TRUE
 identical(v1[2], substr(lines,81,160))
 #[1] TRUE
akrun
  • 874,273
  • 37
  • 540
  • 662