
I have multiple CSV files in the following format:

"name","last_name","birth_day","register_date"
Michael,Jackson,August 29 - 1958,August 29 - 1958
Claude,Shannon,April 30 - 1916,April 30 - 1916

I want to transform each file into the following format:

"name","last_name","birth_day","register_date",sha256
Michael,Jackson,August 29 - 1958,August 29 - 1958,9949a1af67a3fb465eca01ca884f5ec7cd280078a39a0430a0f352bf19e16685  -
Claude,Shannon,April 30 - 1916,April 30 - 1916,fb464b3ab4f3f3db2384e192135cde97486ce96fe34e391a3294e5076f800aae  -

That means I want to add the "sha256" column with the hash values.

So far I can compute the hash value for each row, but I don't know how to append it to the CSV file as a "sha256" column.

for file in "${DIR}"/csv/*
do
    while IFS='' read -r line || [[ -n "$line" ]]; do
        # This calculates the hash of each row; I want to append it
        # to the end of the row as the "sha256" column.
        echo -n "$line" | shasum -a 256
    done < "$file"
done

How can I do it?

forkfork

2 Answers


You can use awk to do this. It requires GNU awk >= 4.1.0:

awk -i inplace '
function rtrim(s) { sub(/[ \t\r\n]+$/, "", s); return s }
{
    if (FNR > 1){
        cmd = "echo -n \""$0"\" | shasum -a 256"
        while (cmd | getline line) {
            split(line, arr, "-")
            print $0","rtrim(arr[1])
        }
        close(cmd)
    }
    else {
        print $0",sha256"
    }
}' "${DIR}"/csv/*
  • -i inplace is used to edit the files in place
  • FNR is the current record number in the current file
  • see this post for passing a variable to a shell command
  • the shasum output is split on the - delimiter so that only the sha256 value is kept; rtrim removes the trailing whitespace left after the hash
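As a small illustration of that last point: for input read from stdin, shasum prints the hash, two spaces, and a "-" placeholder for the file name. A minimal sketch of an alternative to split plus rtrim is to take the first whitespace-separated field. The helper name extract_hash is hypothetical and not part of the answer above.

```shell
#!/bin/sh
# Hypothetical helper: reads lines shaped like "<hash>  -" on stdin
# and prints only the first whitespace-separated field (the hash).
extract_hash() {
    awk '{ print $1 }'
}

# Example with a made-up placeholder value in the same shape as shasum output.
printf '%s\n' 'abc123  -' | extract_hash
```

Because awk splits fields on runs of whitespace by default, no separate trimming step is needed with this approach.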
Bertrand Martel

Why wouldn't you just echo the hash value after the line?

for file in "${DIR}"/csv/*
do
    while IFS='' read -r line || [[ -n "$line" ]]; do
        # Hash the row, then keep only the first space-separated field
        hash=$(echo -n "$line" | shasum -a 256 | cut -d' ' -f1)
        echo "$line,$hash"
    done < "$file"
done

The cut strips the trailing - from the shasum output. Add quotes around $hash if you like.

You should also consider skipping the header line of each CSV file, so that it gets the column name rather than a hash.
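One way the header handling could look is sketched below. It is only a sketch under a few assumptions: the function names sha256_of_stdin and add_sha256_column are hypothetical, the first line of every file is assumed to be the header, and either sha256sum or shasum is assumed to be installed. The function prints the result to stdout instead of editing the file.

```shell
#!/bin/sh
# Hash stdin and print only the hex digest.
# Assumption: either sha256sum or shasum is available on the system.
sha256_of_stdin() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d' ' -f1
    else
        shasum -a 256 | cut -d' ' -f1
    fi
}

# Hypothetical helper: print the CSV file given as $1 with an extra column.
# The first line is treated as the header and gets ",sha256" appended;
# every other line gets the hash of its original content appended.
add_sha256_column() {
    first=1
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$first" -eq 1 ]; then
            printf '%s,sha256\n' "$line"
            first=0
        else
            hash=$(printf '%s' "$line" | sha256_of_stdin)
            printf '%s,%s\n' "$line" "$hash"
        fi
    done < "$1"
}

# Possible usage (writes to a temporary file, then replaces the original):
#   for file in "${DIR}"/csv/*; do
#       add_sha256_column "$file" > "$file.tmp" && mv "$file.tmp" "$file"
#   done
```

printf '%s' is used instead of echo -n because the behaviour of echo -n differs between shells; the hashed bytes are the row without its trailing newline, as in the loop above.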

Paul Coccoli