Assume that I have this text:

eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg] : 20.4453125 eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+Hm[A1sg] : 21.7978515625

I want to remove everything after the second space. Output should be:

eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg]
JayGatsby

3 Answers

If you are absolutely certain that the format (in particular the spacing) will always be exactly as shown in the question, a simpler solution may be appropriate, but I would look more closely at the semantics of your data to find a more robust one.

1) If spacing could possibly vary but you definitely want only the first two non-space-containing sequences, use awk '{print $1,$2}'.

2) If the : is significant and guaranteed to be present, I would use that rather than spaces to delimit what you are after: awk -F: '{print $1}'.

3) I would not recommend any sed/regex solution unless there can be more than one sequential space and it is critical to preserve the exact amount of such space.
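As a rough sketch of options 1 and 2 applied to the sample line from the question: the variable names (`line`, `by_fields`, `by_colon`) are only illustrative, and the ` : ` separator (with its surrounding spaces) is a variation on the `-F:` form above, chosen so that no trailing blank is left before the colon.

```shell
# Sample line from the question.
line='eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg] : 20.4453125 eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+Hm[A1sg] : 21.7978515625'

# Option 1: first two whitespace-separated fields, rejoined with one space.
by_fields=$(printf '%s\n' "$line" | awk '{print $1, $2}')

# Option 2: everything before the first " : ".
# A multi-character -F value is treated by awk as a regular expression;
# including the spaces keeps a trailing blank out of the result.
by_colon=$(printf '%s\n' "$line" | awk -F' : ' '{print $1}')

printf '%s\n' "$by_fields"
printf '%s\n' "$by_colon"
```

With plain `-F:` the first field would still end in the space that precedes the colon, which may or may not matter for your data.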

Jeff Y
You could use a capturing group to capture everything before the second space:

(.*?\s.*?)\s.*

And then replace everything with the first capturing group match.

So (.*?\s.*?)\s.* replaced with \1 would output:

eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg]

Alternatively, you could also replace . with \S:

(\S*\s\S*)\s.*

Same output.
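If the replacement is done with sed, note that the lazy `.*?` form needs a PCRE-style engine and that `\S` and `\s` are GNU extensions. A minimal sketch that sticks to POSIX basic regular expressions, assuming the separators are single literal spaces as in the sample (the variable names `line` and `trimmed` are only illustrative):

```shell
# Sample line from the question.
line='eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg] : 20.4453125 eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+Hm[A1sg] : 21.7978515625'

# Capture "non-spaces, one space, non-spaces" at the start of the line,
# then drop the second space and everything after it.
# [^ ]* cannot cross a space, so no lazy quantifier is needed.
trimmed=$(printf '%s\n' "$line" | sed 's/^\([^ ]* [^ ]*\) .*$/\1/')

printf '%s\n' "$trimmed"
```

A line with fewer than two spaces does not match the pattern and is passed through unchanged.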

Josh Crozier
  • @JayGatsby See: http://stackoverflow.com/questions/13043344/search-and-replace-in-bash-using-regular-expressions – Josh Crozier Dec 20 '15 at 20:51
  • @JayGatsby `echo $string | sed 's/\(\S*\s\S*\)\s.*/\1/g'` worked for me. – Josh Crozier Dec 20 '15 at 21:11
  • @JoshCrozier: `echo $string | sed -r 's/^(\S+\s\S+)\s.*/\1/'`: the `-r` switch enables ERE (parentheses and plus don't need escaping), and there is no reason to use the global `g` flag – Giuseppe Ricupero Dec 20 '15 at 21:23

You can also use a simple cut to do the job:

~$ echo 'eskitirim ... ' | cut -d' ' -f-2        # or -f1,2
# eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg]

~$ echo 'eskitirim ... ' | cut -d':' -f1
# eskitirim eski[Verb]-t[Verb+Caus]+[Pos]+Hr[Aor]+YHm[A1sg]
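
One caveat, sketched below with a made-up input (`spaced` and the other variable names are hypothetical): cut treats every single space as a delimiter, so a run of two spaces produces an empty field. If the spacing can vary, squeezing repeated spaces with `tr -s` first is one possible workaround.

```shell
# Hypothetical input with two spaces between the first two words.
spaced='one  two three'

# cut sees the fields "one", "", "two", "three", so fields 1-2 give "one ".
with_cut=$(printf '%s\n' "$spaced" | cut -d' ' -f1,2)

# Squeezing runs of spaces first gives the first two words.
squeezed=$(printf '%s\n' "$spaced" | tr -s ' ' | cut -d' ' -f1,2)

printf '[%s]\n' "$with_cut"
printf '[%s]\n' "$squeezed"
```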
Giuseppe Ricupero