4

I have a file containing lines that look like this:

GTTCAGAGTTCTACAGTCCGACGATCGGATGAGNNNNNN
GTTCAGAGTTCTACAGTCCGACGATCTCCGAGTNNNNNN
GTTCAGAGTTCTACAGTCCGACGATCCTTATATNNNNNN
GTTCAGAGTTCTACAGTCCGACGATCGAAGTGCNNNNNN
GTTCAGAGTTCTACAGTCCGACGATCAAGTTTTNNNNNN
GTTCAGAGTTCTACAGTCCGACGATCCGACGAANNNNNN

I want to remove the first 26 and final 6 characters from each line. I haven't been able to write a good regular expression to accomplish that using vi, but I'm not sure what else to do. Any suggestions?

Thanks!

PaulProgrammer
Forest
  • Great question, it forces you to think non-linearly about regex – Tonio Dec 03 '13 at 23:12
  • 2
    Regex is not suitable for all problems. It's a heavy hammer for a small problem like grabbing a static set of bytes from a list of strings. – PaulProgrammer Dec 03 '13 at 23:16
  • 1
    possible duplicate of [What linux shell command returns a part of a string?](http://stackoverflow.com/questions/219402/what-linux-shell-command-returns-a-part-of-a-string), or why would you use regex when you know exactly how many characters you want to cut?!? – dmckee --- ex-moderator kitten Dec 04 '13 at 00:31

3 Answers

3

Try with grep.

This keeps the last 13 characters of each line and then the first 7 of those, printing only the matched text (-o) and using Perl-compatible regular expressions (-P):

grep -oP ".{13}$" foo.txt | grep -oP ".{7}"
Federico Giorgi
  • don't forget to mark it the correct one when you can, Forest. – Plasmarob Dec 03 '13 at 23:15
  • His example lines are 39 characters. What if some were 50 characters? Given the known conditions, `unknown length = 26 + unknown segment + 6`. That's 1 equation, 2 unknowns. Yet you solve it. –  Dec 03 '13 at 23:31
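The comment above asks what happens when lines are not all 39 characters long. A minimal sketch of a length-independent version, assuming sed with extended regular expressions (-E) is available; the helper name trim_ends and the longer second sample line are made up for illustration:

```shell
# Hypothetical helper: strip the first 26 and the last 6 characters of each line,
# whatever the line length. Lines shorter than 32 characters do not match the
# pattern and pass through unchanged.
trim_ends() {
  sed -E 's/^.{26}(.*).{6}$/\1/' "$@"
}

# One line from the question plus a longer, made-up line.
printf '%s\n' \
  'GTTCAGAGTTCTACAGTCCGACGATCGGATGAGNNNNNN' \
  'GTTCAGAGTTCTACAGTCCGACGATCGGATGAGACGTNNNNNN' | trim_ends
```

Since the question mentions vi, the same substitution can probably be run there as `:%s/^.\{26\}\(.*\).\{6\}$/\1/`, assuming the editor supports POSIX interval syntax in its patterns.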
2

If your file is named foo, you can use cut to extract the range of characters you want:

$ cut -c27-33 foo

This produces:

GGATGAG
TCCGAGT
CTTATAT
GAAGTGC
AAGTTTT
CGACGAA
PaulProgrammer
1

cut can take a character range if the lines are a fixed length (each appears to be 39 characters):

cut -c27-33 file.txt
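Because the cut approach depends on every line having the same length, it may be worth checking that first. A minimal sketch, assuming awk and sort are available; the helper name line_lengths is made up for illustration:

```shell
# Hypothetical helper: print each distinct line length found in the input,
# one per line, in ascending numeric order.
line_lengths() {
  awk '{ print length($0) }' "$@" | sort -un
}

# Two sample lines from the question; both are 39 characters long.
printf '%s\n' \
  'GTTCAGAGTTCTACAGTCCGACGATCGGATGAGNNNNNN' \
  'GTTCAGAGTTCTACAGTCCGACGATCTCCGAGTNNNNNN' | line_lengths
```

If running line_lengths on the real file prints a single value of 39, the fixed range 27-33 covers exactly the characters between the 26-character prefix and the 6-character suffix.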
chepner