1

I need to remove all the control characters displayed as "^@" from a list of files.

In the vi editor, we can type the control character ^@ by pressing CTRL+V+@, but the same does not work at the shell prompt. Please help me with this.

Note: I tried this on a Debian Linux machine.

Abu

3 Answers

3

What you are looking at in Vim are "null" bytes, i.e. bytes with the numeric value zero.

You can check that by putting the cursor on top of that ^@ and typing ga. This displays the numeric value of the character under the cursor.
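The same check can be done from the shell without opening Vim. Here is a minimal sketch that counts the zero bytes in a file; `count_nuls` and the sample file are hypothetical names introduced only for illustration:

```shell
# count_nuls FILE: print how many NUL (0x00) bytes FILE contains.
# (count_nuls is a hypothetical helper name, not a standard utility.)
count_nuls() {
  # tr -cd '\0' deletes every byte except NUL; wc -c counts what is left.
  # The final tr strips the padding some wc implementations add.
  tr -cd '\0' < "$1" | wc -c | tr -d ' '
}

# Build a small sample file with two embedded NUL bytes.
sample=$(mktemp)
printf 'before\0\0after\n' > "$sample"

count_nuls "$sample"   # prints 2
```

A result of 0 means the ^@ you see is coming from somewhere else, for example a literal caret followed by an at sign.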

If you need to remove all occurrences of that character from a file, you can use sed, and you don't need to type ^@ for that at all, since sed (at least the GNU version, not the BSD one it seems...) supports a different notation for hex values:

sed "s/\x00//g" file.txt

That prints the contents of file.txt to stdout with all zero bytes removed. If you want to remove the bytes in place, use the -i option (be careful: this modifies your original file, and see also (1)):

sed -i "s/\x00//g" file.txt

(1) Check the answer by gniourf_gniourf (and the comments) on the caveats re sed: You will lose the file creation date, and you need to be sure it's really a file you're working on, not a symlink.
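Since the question is about a list of files, the in-place command can be wrapped in a loop that takes caveat (1) into account. This is only a sketch, assuming GNU sed; `strip_nuls_sed` is a hypothetical helper name, and skipping symlinks is one possible policy, not the only sensible one:

```shell
# strip_nuls_sed FILE...: remove NUL bytes in place using GNU sed.
# Hypothetical helper; it skips symlinks and anything that is not a regular file,
# because sed -i would replace a symlink with a regular file.
strip_nuls_sed() {
  for f in "$@"; do
    if [ -L "$f" ] || [ ! -f "$f" ]; then
      printf 'skipping %s\n' "$f" >&2
      continue
    fi
    # \x00 is the GNU sed notation for the zero byte.
    sed -i 's/\x00//g' -- "$f"
  done
}

# Example call (the glob is illustrative):
# strip_nuls_sed ./*.txt
```

Running it on a copy of the files first is a cheap way to confirm the result before touching the originals.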


For completeness, you can of course remove zero bytes without leaving Vim.

:%s/<ctrl-v>x00//g
  • : command mode
  • % range: complete file
  • s/ search
  • <ctrl-v> verbatim
  • x hexadecimal
  • 00 zero
  • / replace with...
  • / ...nothing
  • g globally (multiple times per line)

All this is of course assuming that you are not looking at a UTF-16 file and just being confused by the zero bytes in there. If that's the case, @IgnacioVazquez-Abrams's hint at iconv is of course the better way: iconv -f UTF-16 -t UTF-8 file.txt. But then, Vim shouldn't be showing you ^@ in the first place.
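To illustrate the UTF-16 case, here is a small sketch that builds a UTF-16 sample, shows that it is full of zero bytes, and converts it rather than deleting them. The temporary file names are placeholders, and the exact bytes iconv emits for plain "UTF-16" (byte order, byte-order mark) depend on the implementation:

```shell
# Create a UTF-16 sample from plain ASCII text.
utf16=$(mktemp)
printf 'hello\n' | iconv -f UTF-8 -t UTF-16 > "$utf16"

# Each ASCII character is now paired with a zero byte; count them.
tr -cd '\0' < "$utf16" | wc -c

# Convert the encoding instead of stripping bytes.
utf8=$(mktemp)
iconv -f UTF-16 -t UTF-8 "$utf16" > "$utf8"
cat "$utf8"
```

Deleting the zero bytes from such a file would happen to work for pure ASCII text but would corrupt any character outside that range, which is why conversion is the safer route.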

DevSolar
  • To explain why _BSD_ `sed` (and any POSIX-features-only implementation) cannot do it (do tell me if you know of a way): Seemingly, the only escape sequence supported in a regex is `\n` - no way to represent `NUL`. – mklement0 Oct 26 '15 at 15:35
  • 1
    @mklement0: I have no access to BSD at the moment, but it seems the **AIX** version of `sed` just removes null bytes in any case, whether your regex matches them or not. :-D – DevSolar Oct 26 '15 at 15:47
2

^@ is the null byte (0x00). To remove this from a file, you likely want to use a genuine editor and not a program that will create a temporary file and then mv that temp file to the original one: you'd lose all permissions, ownerships and symlinks.

Here's how you can remove all null bytes from a file with ed, the standard editor:

ed -s file < <(printf ',s/\0//g\nw\nq\n')

If you want to use this with, e.g., find, you'll have to proceed thus:

find ... -exec bash -c 'for f do ed -s "$f" < <(printf ",s/\0//g\nw\nq\n"); done' bash {} +
mklement0
gniourf_gniourf
  • @mklement0: How is this "more robust" than e.g. `sed -i`? – DevSolar Oct 26 '15 at 15:00
  • 1
    @DevSolar I guess he means what I wrote in the answer: with `sed`, _you'd lose all permissions, ownerships and symlinks._ – gniourf_gniourf Oct 26 '15 at 15:01
  • @DevSolar yes, `sed -i` creates a temporary file and then `mv` this temp file on the original one. Try it `;)`. – gniourf_gniourf Oct 26 '15 at 15:02
  • @gniourf_gniourf: I did, and it doesn't. – DevSolar Oct 26 '15 at 15:02
  • @DevSolar: For a comprehensive look, see the bottom half of my answer at http://stackoverflow.com/a/30066428/45375. In short: permissions _are_ preserved, but symlinks are destroyed, the file-_creation_ date is lost, and, on OSX, extended file attributes are lost. – mklement0 Oct 26 '15 at 15:03
  • I stand corrected on the symlinks and the file creation date. I just tested the first part of your claim ("you'd lose all permissions"), which is false, and didn't bother to test further. You correct that, and you'll get my upvote as well, and thanks for teaching me something today. ;-) – DevSolar Oct 26 '15 at 15:06
  • @DevSolar: Also note that GNU `awk -i inplace` and `perl -i` have the same issues. – mklement0 Oct 26 '15 at 15:10
  • Small caveat: even though [`ed` is a POSIX utility](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html), it is sadly not installed on all platforms; Debian is a notable exception; I think there are also platforms where `ed` is a symlink to a different utility. – mklement0 Oct 26 '15 at 15:14
  • @mklement0: At this point I figured as much, but I don't grok `awk`, and don't use `perl` for oneliners. ;-) – DevSolar Oct 26 '15 at 15:14
  • 1
    @DevSolar I'm not going to correct that, my statement was about methods not involving a genuine editor: and this includes explicit `mv` too, like for example the `tr` method (which is actually very good). Regarding `sed -i`: note that it's not specified by POSIX, so nothing guarantees that there isn't an implementation out there that doesn't deal properly with permissions; moreover I'm not sure how `sed -i` handles ACLs (but that's beyond the scope of this question anyway). (Btw, I did upvote your answer). – gniourf_gniourf Oct 26 '15 at 15:16
  • 1
    @mklement0 yeah, this thing about Debian sucks (and their default editor is `nano` too). The first thing to do on a fresh install is `sudo apt-get install ed`. – gniourf_gniourf Oct 26 '15 at 15:17
  • 1
    @mklement0: I don't know about `ed`, but `vim` installs `ex` as an alias to itself, starting in `ex` mode if invoked that way. gniourf_gniourf: Any particular reason why you'd go for `ed` instead of `vim` in `ex` mode? (Just wondering, because `vim` is usually the first thing *I* install on a new box. :-D ) – DevSolar Oct 26 '15 at 15:44
  • 2
    @DevSolar no particular reason to prefer `ed` over `ex` (just that for this task `ex` is overkill). Anyway, in `ex` you'd do: `ex file < <(printf '%%s/\\%%x00//g\nwq\n')` (untested). – gniourf_gniourf Oct 26 '15 at 16:10
1

Note: For in-place editing solutions, see gniourf_gniourf's ed-based answer (most robust) or DevSolar's GNU sed-based answer.

^@ is used to represent NUL (0x00) bytes, both in vi and in cat -v's output.

In case you need to remove the actual NUL characters from your files, you don't need to type ^@; use tr -d '\0' instead:

# Create sample file with embedded NUL chars.
echo 'before NUL' > file;  head -c 2 </dev/zero >> file; echo 'after NUL' >>file

Examining the file with cat -v shows us (note the ^@ representing the NUL chars):

$ cat -v file
before NUL
^@^@after NUL

tr -d '\0' < file will print the contents of file with all NUL chars removed:

$ tr -d '\0' < file | cat -v
before NUL
after NUL
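Because tr only writes to stdout, updating the file itself needs an extra step. One possible sketch filters into a temporary file and then writes the result back into the original file with cat, which keeps the existing inode rather than replacing it; `strip_nuls_tr` is a hypothetical helper name, and the write-back is not atomic, so an interruption at the wrong moment could leave the file truncated:

```shell
# strip_nuls_tr FILE: remove NUL bytes from FILE, writing back into the same file.
# Hypothetical helper, shown as a sketch rather than a hardened tool.
strip_nuls_tr() {
  tmp=$(mktemp) || return 1
  # Filter into a temporary copy first; never read and write FILE at the same time.
  if tr -d '\0' < "$1" > "$tmp"; then
    # Redirecting into FILE truncates and rewrites the existing file instead of
    # replacing it, so a symlink given as FILE stays a symlink.
    cat "$tmp" > "$1"
  fi
  rm -f "$tmp"
}

# Example call on the sample file from above:
# strip_nuls_tr file
```

To process several files, call it in a loop such as `for f in ./*.txt; do strip_nuls_tr "$f"; done`.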
mklement0