I have a CSV file, and I found out that one of its columns is breaking my whole script. The table looks like this:

cat table.csv | head -n 4

site_id,x_coordinate,y_coordinate,Starting_year,Ending_year,Year_count,Samling_years,Country
FRDE1,52.19387436,-1.76443004,2002,2016,15,12, DE
FRDE2,50.160917,9.318498,2001,2016,16,14, DE
FRDE3,50.037406,9.428786,2001,2015,15,14, DE

Notice that the last column "Country" has a space before the text!

If I run

awk -F',' '{print $8}' table.csv | head -n 3

I get

Country
 DE
 DE

which looks as expected. But if I save the output of that same command in a variable and echo it, I get:

VAR=$(awk -F',' '{print $8}' table.csv | head -n 3)

echo $VAR

DEntry

If I do the same with any other column it works well, but not with that column! Every other awk operation I run on the table also gets messed up if that column is present. If I remove the column, everything works well. I haven't been able to find out what the problem is, and I would like to keep the column.

Any tips are very welcome

Jaime GM
  • Check your file for unprintable characters. Pipe it through `cat -v`, `xxd`, or `hexdump`. – KamilCuk Oct 16 '20 at 09:54
  • You almost certainly have `\r` (carriage return) characters in your file. After `Country` was printed, the cursor returned to the beginning of the line and `DE` overwrote the start of that word twice, so you got `DEntry`. Run `dos2unix` on the input file. – thanasisp Oct 16 '20 at 10:01
  • Thanks a lot. I ran `dos2unix` on the table and the problem was fixed. `cat -v` on the table showed the `^M` after `DE`. – Jaime GM Oct 16 '20 at 10:27
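
The check suggested in the comments above can be sketched roughly as follows. This is only an illustration: `demo.csv` is a hypothetical stand-in for `table.csv`, built here with Windows-style `\r\n` line endings so the snippet runs on its own, and it assumes a `cat` that supports `-v` (GNU and BSD both do).

```shell
# Build a small CRLF sample that mimics the table (demo.csv is a hypothetical name).
tmpdir=$(mktemp -d)
printf '%s\r\n' \
  'site_id,x_coordinate,y_coordinate,Starting_year,Ending_year,Year_count,Samling_years,Country' \
  'FRDE1,52.19387436,-1.76443004,2002,2016,15,12, DE' \
  'FRDE2,50.160917,9.318498,2001,2016,16,14, DE' \
  'FRDE3,50.037406,9.428786,2001,2015,15,14, DE' > "$tmpdir/demo.csv"

# cat -v renders each carriage return as ^M, which makes it visible.
shown=$(cat -v "$tmpdir/demo.csv")
printf '%s\n' "$shown"

# Count the lines that end in a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$tmpdir/demo.csv")
echo "lines ending in CR: $crlf_lines"

rm -rf "$tmpdir"
```

If the count is greater than zero, the file has DOS line endings, and the last field of every row carries an invisible trailing `\r`.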

1 Answer

[akshay@db1 tmp]$ dos2unix table.csv
dos2unix: converting file table.csv to Unix format...

[akshay@db1 tmp]$ cat table.csv
site_id,x_coordinate,y_coordinate,Starting_year,Ending_year,Year_count,Samling_years,Country
FRDE1,52.19387436,-1.76443004,2002,2016,15,12, DE
FRDE2,50.160917,9.318498,2001,2016,16,14, DE
FRDE3,50.037406,9.428786,2001,2015,15,14, DE


[akshay@db1 tmp]$ VAR=$(awk -F',' '{print $8}' table.csv | head -n 3)

# recommended: quote the variable so the newlines are preserved
[akshay@db1 tmp]$ echo "$VAR"
Country
 DE
 DE

# without quotes, word splitting turns the newlines into single spaces
[akshay@db1 tmp]$ echo $VAR
Country DE DE

You can also do this with a single awk command, without head:

VAR=$(awk -F',' 'NR<=3 {print $8}' table.csv)
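
If `dos2unix` is not available, or you would rather leave the file untouched, one possible variation is to strip the carriage return inside awk itself. This is a sketch rather than part of the original fix: `demo.csv` is a hypothetical stand-in for `table.csv`, created here with `\r\n` endings so the example is self-contained.

```shell
# Build a small CRLF sample that mimics the table (demo.csv is a hypothetical name).
tmpdir=$(mktemp -d)
printf '%s\r\n' \
  'site_id,x_coordinate,y_coordinate,Starting_year,Ending_year,Year_count,Samling_years,Country' \
  'FRDE1,52.19387436,-1.76443004,2002,2016,15,12, DE' \
  'FRDE2,50.160917,9.318498,2001,2016,16,14, DE' \
  'FRDE3,50.037406,9.428786,2001,2015,15,14, DE' > "$tmpdir/demo.csv"

# sub(/\r$/, "") removes a trailing carriage return from every record
# before the fields are used; NR<=3 keeps only the first three lines.
VAR=$(awk -F',' '{ sub(/\r$/, "") } NR<=3 { print $8 }' "$tmpdir/demo.csv")

# Quote the variable so the newlines survive.
printf '%s\n' "$VAR"

rm -rf "$tmpdir"
```

Because `sub()` modifies `$0`, awk re-splits the record, so `$8` no longer ends in `\r`. The same `sub()` call is harmless on files that already have Unix line endings.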
Akshay Hegde