I have looked at many questions at StackOverflow and uncle Google, but somehow I still can't crack it.

I have a CSV file that is automatically exported by SSRS. Unfortunately the export plugins are old, and they put two carriage return/line feed pairs at the end of the file:

00000c0: 6b7c 3230 2d46 6562 2d31 360d 0a0d 0a k|20-Feb-16....

I tried many sed replacements, but each one seems to remove only one line.

For example the simplistic

sed -i '/^\s*$/d'

I also tried replacing \s with [[:space:]], which works too, but again on one line only.

After which the last line of the hex dump looks like below:

00000c0: 6b7c 3230 2d46 6562 2d31 360d 0a k|20-Feb-16..

I've tried things like:

sed -i 's/\x0D\X0A//g' <file>

However, this didn't remove both 0d0a pairs at the end.

Any help would be appreciated

Moseleyi

2 Answers

1

The following command should work for you:

sed 's/\x0d//;/^$/d'

It removes the carriage return from each line and then deletes any line that is left empty.

Try it, like this:

echo -e "foo\x0a\x0d\x0a" | sed 's/\x0d//;/^$/d' | xxd
00000000: 666f 6f0a                                foo.
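
To check this against a file shaped like the one in the question, here is a minimal sketch. The sample contents and the `sample` variable are made up for illustration, and it assumes GNU sed, since `\x0d` is a GNU extension.

```shell
# Build a two-record sample that ends in CR LF CR LF, like the SSRS export.
sample=$(mktemp)
printf 'a|b\r\nk|20-Feb-16\r\n\r\n' > "$sample"

# Strip the CR from every line, then drop lines that are now empty.
# sed writes an LF after every line it prints, so the output still ends in 0a.
sed 's/\x0d//;/^$/d' "$sample" | od -An -tx1
```

The dump should show no `0d` bytes at all and a single `0a` after the last record.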
hek2mgl
  • That's interesting, I used it and the last line of hexdump now looks like : `00000f0: 2043 6f6f 6b7c 3033 2d4d 6172 2d31 360a Cook|03-Mar-16.` Why would the first x0a disappear but not the second? – Moseleyi Mar 04 '16 at 23:20
  • That's because `sed` will (usually) "not see" the linefeeds. It operates on a per-line basis. It is also fine and correct that the last line ends with a line delimiter. The carriage return character, on the other hand, is not recognized as a line delimiter on UNIX-like systems. That's why it can be removed by sed. – hek2mgl Mar 05 '16 at 09:00
  • But if the final destination of the file is Windows-based, the process that runs there will see LF as new line right? – Moseleyi Mar 05 '16 at 09:28
  • No, if the final destination of the file is Windows you need all line endings to be `\x0d\x0a`. In that case `sed '/^\x0d$/d'` should work. It simply removes the trailing empty line but keeps the Windows carriage returns intact. – hek2mgl Mar 05 '16 at 09:30
  • The thing is I don't want those line endings at the end of the file, so basically the part `\x0d\x0a\x0d\x0a` has to be removed from the end of the file. – Moseleyi Mar 05 '16 at 09:33
  • Apologies if I'm not getting it... this still leaves one \x0d\x0a at the end of the file – Moseleyi Mar 05 '16 at 15:19
  • That's fine! Every text file should end with a newline. Since it is a windows file, it ends with `\x0d\x0a`. A Unix file would end with `\x0a`. – hek2mgl Mar 05 '16 at 15:24
  • That's what I thought but the receiver of the file, opens it on Windows with their tool and they still see that line end. They keep telling me their "software" can't be changed and it can fail data import if it finds empty line, hence I've been trying to get rid of it completely to meet their demands – Moseleyi Mar 07 '16 at 08:10
  • There is no empty line. `\x0d\x0a` is the *end* of the last *non-empty* (!) line. I can't tell why *they* say it is wrong. It isn't. I fear I can't do more. – hek2mgl Mar 07 '16 at 08:46
  • Thanks for all your explanations.. They helped a lot, and I passed the information over. I think everything is correct now so thank you – Moseleyi Mar 07 '16 at 11:04
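
For the case discussed in these comments, where the receiving tool rejects even the final `\x0d\x0a`, one option is to cut the trailing bytes off directly instead of going through a line-oriented tool. This is only a sketch: `strip_trailing_eol` is a hypothetical helper name, and it assumes GNU coreutils for `truncate`.

```shell
# Hypothetical helper: remove every trailing CR or LF byte from a file, in place.
strip_trailing_eol() {
    file=$1
    size=$(( $(wc -c < "$file") ))    # current size in bytes
    while [ "$size" -gt 0 ]; do
        # Hex value of the last byte of the file.
        last=$(tail -c 1 "$file" | od -An -tx1 | tr -d ' \n')
        case $last in
            0a|0d) size=$((size - 1)); truncate -s "$size" "$file" ;;
            *)     break ;;
        esac
    done
}

# Usage: strip_trailing_eol foo.csv
```

The result is a file whose last line has no terminator at all, which many Unix tools treat as incomplete, so it is best applied as the very last step before handing the file over.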
1

"Forget about the last two lines."

# GNU head only: a negative count drops that many lines from the end.
head -n -2 foo.csv > foo.csv.new

"Oh, ed (or ex/vi/vim), kill the last two lines."

ed foo.csv << 'EOF'
$
-1,$d
w foo.csv.new
q
EOF

# ex/vi/vim: change this to vi -c "the whole trunk".

"I love sed. And I love sed." (I don't)

sed -i -e '$d' foo.csv; sed -i -e '$d' foo.csv

"Should I kill this?"

# $(...) strips trailing newlines, so compare the hex of the last four bytes instead.
[[ $(tail -c 4 foo.csv | xxd -p) == 0d0a0d0a ]]
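
"Sed, can you check before you kill?"

A guarded variant, as a rough sketch: it deletes the last line only when that line is empty or holds nothing but a carriage return, so a file that is already clean is left alone. `drop_trailing_blank_line` is a made-up name, and it assumes GNU sed (`-i` without a suffix, `\r` in the regex).

```shell
# Hypothetical helper: drop the final line only if it is blank (empty or a lone CR).
drop_trailing_blank_line() {
    # $ addresses the last line; the inner regex matches "" or a single CR.
    sed -i -e '${/^\r\{0,1\}$/d;}' "$1"
}

# Usage: drop_trailing_blank_line foo.csv
```

Run on the file from the question, this should leave the last record ending in a single `0d 0a`.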

"Vim, can you test for that yourself?"

# I don't write vimscript.

"PERL?"

# I don't write Perl.
Mingye Wang