0

I have a csv file in my Linux machine, where I'm trying to make a few queries. But some of the columns in the csv have NULLS and it isn't expressed within the delimiters, So I'm finding it difficult to query the other columns because of the in consistency.

The format now is :

col1,col2,col3,col4,col5... col23 
"val1","val2",NULL,"Val3","val4","val5",...,"val23"
"val1","val2","val2",NULL,"val4","val5",...,"val23"

I'm trying to replace the NULL here in different columns to have delimiters around.

awk -F "," 'BEGIN {OFS="\",\""} $3 == "NULL" { $3 = "\"NULL\"" } {print $3}' file1.csv 

So far I've tried playing around with awk, but nothing seems to work. Is there a better approach or can someone help with the above command?

RavinderSingh13
  • 130,504
  • 14
  • 57
  • 93
Emma Vaze
  • 19
  • 6
  • 1
    @EmmaVaze : Why do you want delimiters? Quotes are allowed, but in general not necessary in a CSV field, and a CSV parser should be able to read the field content quoted and unquoted. – user1934428 Oct 06 '20 at 06:26

3 Answers3

3

Could you please try following, considering that your Input_file is same as shown samples.

awk '
BEGIN{
  FS=OFS=","
}
{
  gsub(/NULL/,"\"&\"")
}
1' Input_file
RavinderSingh13
  • 130,504
  • 14
  • 57
  • 93
2

to replace the NULL here in different columns to have quotes around:

$ awk '
BEGIN {
    FS=OFS=","             # set field separators
}
{
    for(i=1;i<=NF;i++)     # loop all fields
        if($i=="NULL")     # if NULL without quotes
            $i="\"NULL\""  # replace with "NULL"
}1' file                   # output

Output:

col1,col2,col3,col4,col5... col23 
"val1","val2","NULL","Val3","val4","val5",...,"val23"
"val1","val2","val2","NULL","val4","val5",...,"val23"
James Brown
  • 36,089
  • 7
  • 43
  • 59
1

You can also use the sed command to replace all the NULL values. It's a simple and short command.

sed -i 's/NULL/"NULL"/g' YourFileName
RavinderSingh13
  • 130,504
  • 14
  • 57
  • 93
Vinit Shah
  • 26
  • 1
  • 4
    Almost, but what if the `Null` is already surrounded by double quotes? So you then need `sed -Ei 's/\([^"])NULL([^"])/\1"NULL"\2/g' file`. However this will not catch consecutive `NULL`s, so you then need `sed -Ei ':a;s/([^"])NULL([^"])/\1"NULL"\2/g;ta' file`. – potong Oct 06 '20 at 11:37
  • 3
    Also what if a field was `"fooNULLbar"`? Even with @potongs improvement you'd end up with `foo,"foo"NULL"bar",bar`. – Ed Morton Oct 06 '20 at 13:53