1

I am trying to modify a CSV file by appending the value A to each line. My regex is ,$ where $ anchors the end of the line.

My sed 's/,$/,A/' makes no changes.

However, running sed 's/$/########/' overwrites the first N characters of each line with my replacement string instead of appending it.

Example:

user@HOST:/loc/yearly_files$ head merged19.csv
1900-01-01 09:00, 2084, DCNN, DLY3208, 1, 310, 1011, , , , , , , , 5, , , , , , , , , , , , , , , , , , , , , 1.1, , 0, , ,   , , , , , , , , , , 1, , , , , , , , 1, , 1, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , D, , D, , , , , 79, A, ,
1900-01-01 09:00, 3197, DCNN, DLY3208, 1, 449, 1001, , , , , , , , 4, , , , , , , , , , , , , , , , , , , , , 4.7, , 4.3, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 93.6, A, ,
1900-01-01 09:00, 4813, DCNN, DLY3208, 1, 653, 1001, , , , , , , , 4, , , , , , , , , , , , , , , , , , , , , 2.3, , 1.7, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 89.2, A, ,
1900-01-01 09:00, 4967, DCNN, DLY3208, 1, 687, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 3.2, , 2.8, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 93, A, ,
1900-01-01 09:00, 5399, DCNN, DLY3208, 1, 778, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 5.8, , 5.7, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 98.5, A, ,
1900-01-01 09:00, 6950, DCNN, DLY3208, 1, 1047, 1011, , , , , , , , 6, , , , , , , , , , , , , , , , , , , , , 6.1, , 5.1, , , , , , , , 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, , , , , , , , , , H, , , , B, , B, , , , , 84.8, A, ,
1900-01-01 09:00, 7384, DCNN, DLY3208, 1, 1136, 1001, , , , , , , , 4, , , , , , , , , , , , , , , , , , , , , 2.3, , 1.7, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 89.2, A, ,
1900-01-01 21:00, 2084, DCNN, DLY3208, 1, 310, 1011, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , -.6, , -.6, , ,   , , , , , , , , , , 1, , , , , , , , 1, , 1, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , D, , D, , , , , 99.3, A, ,
1900-01-01 21:00, 3197, DCNN, DLY3208, 1, 449, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 5.7, , 5.6, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 98.5, A, ,
1900-01-01 21:00, 4967, DCNN, DLY3208, 1, 687, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 5.3, , 4.9, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 93.6, A, ,
user@HOST:/loc/yearly_files$ head merged19.csv | sed "s/$/,NULL/"
,NULL01-01 09:00, 2084, DCNN, DLY3208, 1, 310, 1011, , , , , , , , 5, , , , , , , , , , , , , , , , , , , , , 1.1, , 0, , ,   , , , , , , , , , , 1, , , , , , , , 1, , 1, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , D, , D, , , , , 79, A, ,
,NULL01-01 09:00, 3197, DCNN, DLY3208, 1, 449, 1001, , , , , , , , 4, , , , , , , , , , , , , , , , , , , , , 4.7, , 4.3, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 93.6, A, ,
,NULL01-01 09:00, 4813, DCNN, DLY3208, 1, 653, 1001, , , , , , , , 4, , , , , , , , , , , , , , , , , , , , , 2.3, , 1.7, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 89.2, A, ,
,NULL01-01 09:00, 4967, DCNN, DLY3208, 1, 687, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 3.2, , 2.8, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 93, A, ,
,NULL01-01 09:00, 5399, DCNN, DLY3208, 1, 778, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 5.8, , 5.7, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 98.5, A, ,
,NULL01-01 09:00, 6950, DCNN, DLY3208, 1, 1047, 1011, , , , , , , , 6, , , , , , , , , , , , , , , , , , , , , 6.1, , 5.1, , , , , , , , 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, , , , , , , , , , H, , , , B, , B, , , , , 84.8, A, ,
,NULL01-01 09:00, 7384, DCNN, DLY3208, 1, 1136, 1001, , , , , , , , 4, , , , , , , , , , , , , , , , , , , , , 2.3, , 1.7, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 89.2, A, ,
,NULL01-01 21:00, 2084, DCNN, DLY3208, 1, 310, 1011, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , -.6, , -.6, , ,   , , , , , , , , , , 1, , , , , , , , 1, , 1, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , D, , D, , , , , 99.3, A, ,
,NULL01-01 21:00, 3197, DCNN, DLY3208, 1, 449, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 5.7, , 5.6, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 98.5, A, ,
,NULL01-01 21:00, 4967, DCNN, DLY3208, 1, 687, 1001, , , , , , , , 8, , , , , , , , , , , , , , , , , , , , , 5.3, , 4.9, , , , , , , , , , , , , 0, , , , , , , , 0, , 0, , , , , , , , , , , , , , , , , , , , , , , , , , , D, , , , B, , B, , , , , 93.6, A, ,
user@HOST:/loc/yearly_files$
shadoweye14
  • 4
    Looks like your line endings are `\r\n` instead of just `\n`. I'm wondering if that is the problem? – Sean Bright Apr 10 '18 at 14:55
  • 2
    @SeanBright you were right. Could you officially reply to the answer so I can accept it please? Thanks a lot! – shadoweye14 Apr 10 '18 at 14:59
  • 1
    I don't know what I would say. Someone else can take the credit if they want. – Sean Bright Apr 10 '18 at 14:59
  • 3
    Possible duplicate of [Why does my tool output overwrite itself and how do I fix it?](https://stackoverflow.com/questions/45772525/why-does-my-tool-output-overwrite-itself-and-how-do-i-fix-it) – Sundeep Apr 10 '18 at 15:00
  • anyone looking to see the issue with smaller sample: `printf 'foo\r\nbaz\r\n' | sed 's/$/X/'` – Sundeep Apr 10 '18 at 15:01
  • You can get rid of the `\r` characters by piping the content through `tr -d '\r'` before sending it to `sed`: `head merged19.csv | tr -d '\r' | sed "s/$/,NULL/"` – axiac Apr 10 '18 at 15:08
  • @axiac please don't suggest `tr -d '\r'` or at least add a note that it assumes `\r` is not present anywhere else in the file.. see the duplicate question link I posted for comprehensive take on this topic – Sundeep Apr 10 '18 at 15:11
  • @Sundeep If the file contains `\r` in a context other than `\r\n` as EOL, then it is probably not a text file. Why would one use `sed` to change it (and why change it with line-oriented tools in the first place)? – axiac Apr 10 '18 at 15:13
  • @Sundeep is there any tool that is able to read multi-line values from CSV files (assuming the EOL is `\n` and an embedded `\r` is a newline inside a value)? – axiac Apr 10 '18 at 15:18
  • @axiac good points.. I'm not an expert in dealing with csv.. gawk can handle a lot of cases https://stackoverflow.com/questions/45420535/whats-the-most-robust-way-to-efficiently-parse-csv-using-awk ... if I had to choose, I would go with perl/python with csv modules... – Sundeep Apr 10 '18 at 15:21
  • Ignore all the complicated answers you got (two of them are mine). Use `sed 's/,\r\?$/,A/'`. The extra `\r\?` means an optional `\r` before the end of line. It matches both `\r\n` (Windows new lines) and `\n` (Unix new lines) and it doesn't touch the `\r` that are not present at the end of line. – axiac Apr 11 '18 at 08:39
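
The effect discussed in these comments can be reproduced on a tiny sample. This is only a minimal sketch: it assumes GNU sed (for `\r` and `\?` in the pattern), and the sample data and variable names are made up for illustration.

```shell
# Two-line sample with Windows (CRLF) line endings, like the file in the question.
sample=$(printf 'a,b,\r\nc,d,\r\n')

# The last character before the newline is \r, not the comma,
# so ',$' never matches and the text comes back unchanged.
unchanged=$(printf '%s\n' "$sample" | sed 's/,$/,A/')

# Allowing an optional \r before the end of line makes the match succeed.
# The \r is consumed by the match, so the result has Unix line endings.
fixed=$(printf '%s\n' "$sample" | sed 's/,\r\?$/,A/')

printf '%s\n' "$fixed"
# prints:
# a,b,A
# c,d,A
```

The same pattern leaves files that already use `\n` endings working as before, because the `\r` is optional.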

4 Answers

3

The regular expression ,$ matches a comma as the last character of the line. If your delimiter is actually a comma plus a space, then you may have an invisible character there which your regex would not match.

In addition, your NULL experiment appears to indicate that you have \r\n line endings (i.e. your files may have been generated in Windows). You can verify the content of your file using od or hexdump:

$ od -c input.csv | head -18 | tail -4
0000340    ,       ,       ,       ,       D   ,       ,       D   ,
0000360    ,       ,       ,       ,       7   9   ,       A   ,       ,
0000400   \r  \n   1   9   0   0   -   0   1   -   0   1       0   9   :
0000420    0   0   ,       3   1   9   7   ,       D   C   N   N   ,

Note the \r \n.
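
If reading the od dump by eye is tedious, counting the lines that end in a carriage return gives a quicker yes/no answer. The following is a rough sketch, not a definitive check: the helper name count_crlf and the temporary sample file are hypothetical and stand in for your real input.csv.

```shell
# A literal carriage return, built portably with printf.
cr=$(printf '\r')

# Print how many lines of the given file end in CR (0 means Unix endings).
count_crlf() { grep -c "${cr}\$" "$1"; }

# Hypothetical sample file standing in for input.csv.
tmp=$(mktemp)
printf '1, 2, ,\r\n3, 4, ,\r\n' > "$tmp"

crlf_lines=$(count_crlf "$tmp")
echo "$crlf_lines line(s) end in CR"

rm -f "$tmp"
```

A non-zero count suggests the file came from a Windows tool and needs converting before an end-of-line regex will behave as expected.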

You could remove these using dos2unix which may be available for your Linux distribution, or a GNU sed script like:

$ sed -i 's/\r$//' input.csv

or a non-GNU sed script run in bash, like this:

$ sed -i '' -e $'s/\r$//' input.csv

or by using the appropriate options in an FTP file transfer. There are some additional options for this here.

Once your file is converted, try simply matching the end of the line, if you're pretty sure you've got the right number of delimiters:

sed 's/$/A/' input.csv

Or even better, if you know that you really want field 103 to be an A:

awk -F', *' -v OFS=', ' '{$103="A"} 1' input.csv
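
If you would rather not modify the file first, stripping the carriage return and appending can be done in one pass. This is a minimal sketch under the assumption that a trailing CR is the only problem; the function name append_a and the two sample rows are made up for illustration.

```shell
# A literal carriage return, built portably with printf.
cr=$(printf '\r')

# Read lines on stdin, drop a trailing CR if present, then append "A".
append_a() { sed -e "s/${cr}\$//" -e 's/$/A/'; }

result=$(printf '79, A, ,\r\n93.6, A, ,\r\n' | append_a)
printf '%s\n' "$result"
# prints:
# 79, A, ,A
# 93.6, A, ,A
```

Because the first expression only matches a CR at the end of the line, lines that already have Unix endings pass through it untouched and still get the A appended.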
ghoti
0

If you know the .csv file uses \r\n as the EOL sequence, you can get rid of the \r characters by piping the content through tr -d '\r' before sending it to sed:

head merged19.csv | tr -d '\r' | sed 's/$/,NULL/'

Warning: The tr filter removes every \r character in the file, not only those at the end of a line, so it might break something else in your file structure.
However, if your file contains a \r that is not followed by \n, then either the file was generated on classic Mac OS (versions before Mac OS X used \r as the EOL character) or it is a binary file, and using sed to handle it is a bad idea anyway.
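
To make the warning concrete, here is a small comparison of tr -d '\r' with a sed expression that only touches a CR at the end of the line. It is a sketch on invented data: the record with a CR embedded in the middle of a field is hypothetical and may never occur in your files.

```shell
# A literal carriage return, built portably with printf.
cr=$(printf '\r')

# One record with a CR inside a field plus the CR of the CRLF line ending.
line=$(printf 'a,x\ry,\r')

# tr deletes both carriage returns, including the embedded one.
with_tr=$(printf '%s\n' "$line" | tr -d '\r')

# sed removes only the CR that sits directly before the end of line.
with_sed=$(printf '%s\n' "$line" | sed "s/${cr}\$//")

# Show the bytes with od so the remaining CR is visible.
printf '%s\n' "$with_tr" | od -c
printf '%s\n' "$with_sed" | od -c
```

For an ordinary CSV with no embedded carriage returns the two commands give the same result, so the choice only matters for unusual input.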

axiac
0

A PHP solution to replace the Windows end of lines (\r\n) with Unix end of lines (\n) before passing the input to sed:

head merged19.csv |\
php -r 'while(($line=fgets(STDIN))!==false){echo str_replace("\r\n","\n",$line);}' |\
sed 's/$/,NULL/'

The filter is a small PHP program that reads lines from stdin, replaces the \r\n combination with \n and outputs them to stdout.
It doesn't remove the \r characters embedded in the text in other combinations; there should be none but this script plays safe and doesn't make this assumption.

The command is wrapped on multiple lines (the backslash, \, at the end of line tells bash that the command continues on the next line) because the PHP script is quite long for the format used by SO. You can remove the backslash characters (\) and write everything on a single line.

axiac
-1

You might be experiencing a problem with your terminal display.
Try writing the output to a separate file and check whether it is correct there.

head merged19.csv | sed "s/$/,NULL/" > testfile
cat testfile
or even try viewing it in an editor, for example: nano testfile
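
Another way to take the terminal out of the picture is to dump the bytes instead of printing them. This is a minimal sketch on a single made-up row: od -c shows control characters as escapes, so a carriage return appears as \r rather than moving the cursor.

```shell
# Run the substitution from the question on one CRLF-terminated row
# and dump the result byte by byte instead of printing it.
out=$(printf '79, A, ,\r\n' | sed 's/$/,NULL/' | od -c)
printf '%s\n' "$out"
```

If the dump shows \r just before ,NULL, the replacement was appended correctly and only the display was misleading.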

elig