The command below (found at http://dwxchange.blogspot.com/2008_08_01_archive.html) works well for removing carriage returns when converting a DOS file to Unix format, but it handles only one file at a time. How can I modify it, or what other command would you recommend, to process multiple files in a directory? By the way, I'm new to awk :(

awk '{ sub("\r$", ""); print }' dosfile.txt > unixfile.txt
user2008558
  • Use a loop. Try searching a bit, you'll find numerous examples on this site. – devnull Mar 20 '14 at 16:42
  • Moreover, you probably don't need `awk`. You might make use of utilities like `dos2unix`. – devnull Mar 20 '14 at 16:42
  • I once had a similar issue: the DOS carriage returns showed up as ^M, so I used `sed -e 's/^M//g' input > output`, which solved it. Loop over each input and output filename, or edit the same file in place. – Ashish Mar 20 '14 at 16:51
  • Also, `dos2unix` already exists and can do this without replacing the DOS escape characters one by one. – Ashish Mar 20 '14 at 16:53
  • See this [answer](http://stackoverflow.com/a/16768848/970195) for using tools, though `dos2unix` is the easiest if available. – jaypal singh Mar 20 '14 at 16:54
  • I definitely agree that `dos2unix` is the easiest approach. However, I don't have the utility installed on the Unix server. Thanks for recommending that solution. – user2008558 Mar 20 '14 at 16:58
  • If you don't have `dos2unix`, `tr -d '\r'` would work just about as well, as long as you aren't using an encoding where byte value 13 is part of a valid multi-byte code point... – twalberg Mar 20 '14 at 18:18
  • `sed -i 's/\r//' DIR/*.txt` worked for me. The link Jaypal provided also has valuable information. – user2008558 Mar 20 '14 at 19:14
  • @twalberg: Why would that not work with some encodings? In the POSIX specification `tr -d` is specified to work on characters, not on bytes. – Scrutinizer Mar 20 '14 at 20:17
  • @Scrutinizer I was thinking of the not entirely uncommon case where a user's environment is set to a single-byte-encoding locale (C, iso8859, etc.) but is processing files that are multibyte encoded (not UTF8, because 0xd is not a valid extension byte there, but maybe some other encodings that predate the Unicode family or something...). No specific examples in mind, just something to be aware of... In other words, cases where `tr` thinks it's just processing a stream of single-byte characters, but it isn't really... – twalberg Mar 20 '14 at 20:20
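
The loop suggested in the comments could look roughly like the sketch below. It reuses the awk program from the question and rewrites each file in place through a temporary file. The function name `strip_cr` and the directory name `DIR` are placeholders invented for illustration, and the sketch assumes `mktemp` is available, which it may not be on every Unix system.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): strip a trailing carriage
# return from every line of each file named on the command line, in place.
strip_cr() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip directories and unmatched globs
        tmp=$(mktemp) || return 1    # temporary output file (assumes mktemp exists)
        # Same awk program as in the question, written with a regex literal.
        if awk '{ sub(/\r$/, ""); print }' "$f" > "$tmp"; then
            cat "$tmp" > "$f"        # overwrite contents, keeping the file's permissions
        fi
        rm -f "$tmp"
    done
}

# Example call: convert every .txt file in DIR (a placeholder directory name).
# strip_cr DIR/*.txt
```

Writing to a temporary file first avoids truncating the input while awk is still reading it, which is what would happen with `awk ... file > file`. Note that awk always ends its last output line with a newline, so a file that lacked a final newline will gain one.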
