1

I got some Postgres table dumps from somebody using pgAdmin3 on Windows. (Blech.) First of all, it has a whole bunch of extra crap at the top of the file that I've had to get rid of-- things like "toc.dat" without comments, etc.

I've resorted to editing them by hand to get them in workable format to be imported, because as it stands they are somewhat garbled; for the most part I've succeeded, but when I open them in emacs, for example, they tend to be littered with the following character:

^@

and sometimes just alot of:

@@@

enter image description here

I haven't figured out how to remove them using sed or awk, mainly because I have no idea what they are (I don't think they are null characters) or even how to search for them in emacs. They show up as red for 'unprintable' characters. (Screenshot above.) They also don't seem to be printed to the terminal when I cat the file or when I open it in my OS X Text editor, but they certainly cause errors when I try to import the file in to postgres using

psql mydatabase < table.backup

unless I edit them all out.

Anybody have any idea of a good way to get rid of these short of editing them by hand? I've tried in place sed and also tried using tr, but to no effect-- perhaps I'm looking for the wrong thing. (As I'm sure you are aware, trying to google for '^@' is futile!)

Just was wondering if anybody had come across this at all because it's going to eat at me unless I figure it out...

Thanks!

HodorTheCoder
  • 254
  • 2
  • 11

1 Answers1

1

Those are null characters. You can remove them with:

tr -d '\000' < file1 > file2

where the -d parameter is telling tr to remove characters with the octal value 000.
I found the tr command on this forum post, so some credit goes to them.

I might suggest acquiring access to a Windows machine (never thought I'd say that), loading the original dumps they gave you, and exporting in some other formats to see if you can avoid the problem altogether. Which to me seems safer than running any for of sed or tr on a database dump before importing. Good luck!

ErlVolton
  • 6,714
  • 2
  • 15
  • 26
  • Ahh. "In caret notation the null character is ^@". Interesting. I actually tried an almost identical tr command yesterday from here http://stackoverflow.com/questions/2398393/identifying-and-removing-null-characters-in-unix but it didn't work on my Mac. Yours seemed to work much better. Now I just need to remove the rest of the garbled text! Thank you. – HodorTheCoder Oct 24 '14 at 16:01
  • It's possible that it didn't work for you because your darwin shell doesn't like how their command was using redirects in the middle of the arguments. – ErlVolton Oct 24 '14 at 18:53