
I have a file (test.dat) which contains data like this:

459|199811047|a |b |shan
kar|ooty|
460|199811047|a |b |guru|cbe|

but I need it like:

459|199811047|a |b |shankar|ooty|
460|199811047|a |b |guru|cbe|

When reading the data from this file, I don't want to remove the newline at the end of each record. I only want to remove the \n that falls inside a field between two pipe symbols (as in shan\nkar).

In the actual .dat file on Unix, each record is 500 characters long. The first 300 characters appear on one line, then a newline breaks the record and the remaining 200 characters appear on the next line. The 500 characters should be treated as a single line, so I am trying to rejoin the pieces that were split by the newline.

jcrshankar
  • The last paragraph is clear as mud. Is the file 500 characters long in total, with just two lines in it? Or does it consist of many lines up to 500 characters long, but lines that are more than 300 characters long have been mutilated by inserting a newline after the 300th character? Are the pipe symbols mentioned still relevant in this? How can you tell when the line has been mutilated? And please learn to use the shift key and write complete English words - 'bec' is not an acceptable abbreviation. – Jonathan Leffler Feb 21 '11 at 15:15
  • The test.dat file consists of many lines, and every record is exactly 500 characters long. After the 300th character of each record a newline has been inserted, so the remaining 200 characters appear on the next line. The pipe symbol matters only because I use it as the field delimiter. – jcrshankar Feb 21 '11 at 15:25

4 Answers

1

It isn't really clear what the criteria for joining two lines are. However, this will probably do the trick on the data shown:

sed -e '/|shan$/N;s/|shan\nkar|/|shankar|/' test.dat

Tested with sed on MacOS X 10.6.6.

If the criterion is 'if the line does not end with a pipe, join it with the next line', then this works:

sed -e '/[^|]$/{N;s/\n//;}' test.dat

The search says 'if the line does not end with a pipe'; '{' starts a group of operations; N concatenates the next line with a newline in between; the s/\n// deletes the newline; '}' ends the group of operations.
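
As a quick check of the second command, the sketch below feeds the sample data from the question through it. The function name join_once is a hypothetical wrapper introduced only for this illustration; it assumes a record is complete exactly when it ends with a pipe.

```shell
# join_once: hypothetical wrapper around the command above; reads stdin.
# A line that does not end with '|' is joined with the line that follows it.
join_once() {
    sed -e '/[^|]$/{N;s/\n//;}'
}

# Sample data from the question, with record 459 broken inside a field.
printf '%s\n' \
    '459|199811047|a |b |shan' \
    'kar|ooty|' \
    '460|199811047|a |b |guru|cbe|' | join_once
# prints:
# 459|199811047|a |b |shankar|ooty|
# 460|199811047|a |b |guru|cbe|
```

Note that N runs only once per cycle, so a record broken in more than one place would need a loop rather than this single join.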

Jonathan Leffler
  • hi jonathan, I have a file which contains data like this 459,|1998-11-047|a |b |c \n efg | d|e | \N 459,|1998-11-047|a \n c|b |c \n efg | d|e | \N Basically what I have to do is , I have to remove all \n which is coming ( enclosed ) in between two pipes ( | ).. i dont want to remove \N... just to make dif i put it in \n and \N... my need is to remove \n between any string that resides inside the pipe... – jcrshankar Feb 21 '11 at 14:26
  • @jcrshankar: as you've just found out, comments do not preserve newlines. Clarify your question with additional examples. I am not clear what the criteria for 'between two pipes' is. I suspect @Raymond has it about correct; lines which do not end in a pipe need to be joined to the next, with the newline deleted (rather than replaced with a blank). – Jonathan Leffler Feb 21 '11 at 14:38
  • hi jonathan, actually inside the unix my dat file... consist of 500 character.. so the first 300 character appear in the first line and got break(newline)for the next 200 character... but the 500 should be treated like single line.. so am trying to append the characters which has got break bec of newline – jcrshankar Feb 21 '11 at 14:53
  • @jcrshankar: fix the process that mutilates your data. If you absolutely can't alter the mutilator, then you need to ask your question explaining what the problem is clearly. You don't need to describe how you think it should be fixed; you just describe what the result should be. So far, you've changed the requirements twice. See [How to ask questions the smart way](http://www.catb.org/~esr/faqs/smart-questions.html). – Jonathan Leffler Feb 21 '11 at 15:01
  • now my question may be clear i think. sorry for the over burden – jcrshankar Feb 21 '11 at 15:08
1

Inspired by How can I replace a newline (\n) using sed?

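sed ':a;/|$/!{N;ba;};s/\n//g' test.dat

Explanation (of the difference from the inspiration):

  1. If the pattern space does not end with '|', append the next line and branch back to the label a: /|$/!{N;ba;}. The test runs before N so that two consecutive complete records are not merged.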
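
Some sed implementations read everything after ':' up to the end of the line as the label name, so a loop of this kind can also be written with one command per -e argument. The following is a minimal sketch, assuming a record is complete exactly when it ends with '|'; join_all is a hypothetical name. It tests for the trailing pipe before reading the next line, so consecutive complete records are left alone, and it uses t instead of b so that an incomplete final line cannot loop forever.

```shell
# join_all: hypothetical helper; reads stdin.
# While the pattern space does not end with '|', append the next line
# (unless we are on the last one), delete the newline, and loop.
join_all() {
    sed -e ':a' -e '/[^|]$/{' -e '$!N' -e 's/\n//' -e 'ta' -e '}'
}

# One record broken in two places, followed by two complete records.
printf '%s\n' '459|x|shan' 'kar|oo' 'ty|' '460|a|' '461|b|' | join_all
# prints:
# 459|x|shankar|ooty|
# 460|a|
# 461|b|
```

A break that happens to fall immediately after a pipe cannot be detected by this rule, because the first piece then looks like a complete record.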
Community
Raymond Tau
0
Print the newline only after a line that ends with '|':

awk '{ ORS = /\|$/ ? "\n" : "" } 1' file

ruby -ne 'print($_ =~ /\|$/ ? $_ : $_.chomp)' file
kurumi
0

A slightly different approach:

sed '/^.\{300\}$/{N;s/\n//;}' inputfile

If a line consists of exactly 300 characters, append the next line to it and delete the newline between them.
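
The length-based rule can be checked against a synthetic record. The sketch below assumes the layout described in the question's comments (500-character records broken after character 300); the function name join_at_300 and the filler characters are made up for the illustration.

```shell
# join_at_300: hypothetical wrapper; reads stdin.
# A line of exactly 300 characters is joined with the line that follows it.
join_at_300() {
    sed -e '/^.\{300\}$/{N;s/\n//;}'
}

# Build one synthetic 500-character record, broken after character 300.
first=$(printf '%300s' '' | tr ' ' 'x')
rest=$(printf '%200s' '' | tr ' ' 'y')

# Rejoin the two pieces and report the length of each resulting line.
printf '%s\n%s\n' "$first" "$rest" | join_at_300 | awk '{ print length($0) }'
# prints:
# 500
```

This rule does not depend on the pipe delimiter at all, but it would also join any genuine record that happens to be exactly 300 characters long.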

Dennis Williamson