I have a bash script and a CSV file with multiple rows containing names, addresses and numbers, and a folder containing files whose names are a few of the names from the CSV file. How can I delete only the rows whose names match the file names that exist in the folder?
1 Answer
First make a script that will retrieve the names from the second folder.
You can use the results for filtering the first file.
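The first step could be sketched as follows. This is only an illustration under assumptions: it treats each file name in the folder as exactly one name from the CSV, and the scratch directory, the demo files `bob` and `carol`, and the list file `names.txt` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical demo data: a scratch folder with one file per name.
workdir=$(mktemp -d)
mkdir "$workdir/secondfolder"
touch "$workdir/secondfolder/bob" "$workdir/secondfolder/carol"

# Write one name per line, taken from the file names in the folder.
for f in "$workdir"/secondfolder/*; do
    [ -e "$f" ] || continue   # empty folder: the glob stayed unexpanded
    basename "$f"
done > "$workdir/names.txt"

names=$(cat "$workdir/names.txt")
printf '%s\n' "$names"

rm -rf "$workdir"
```

If the files carry a common extension, `basename "$f" .txt` would strip it, assuming the extension is `.txt`.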
This approach can be difficult when the names contain special characters (as used in sed), or when the CSV file has quoted fields like
field1,"an other field with a , inside quotes",field3
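A quick way to see the problem with the quoted line above is to count the fields that plain comma splitting produces. This is a small sketch, not a CSV parser; the variable names are made up for the illustration.

```shell
#!/bin/sh
# The sample line from the text: three CSV fields, one of them quoted.
line='field1,"an other field with a , inside quotes",field3'

# Splitting on every comma ignores the quotes.
nf=$(printf '%s\n' "$line" | awk -F',' '{ print NF }')
echo "fields seen: $nf"   # prints "fields seen: 4", although the line has 3 CSV fields
```

The comma inside the quotes is treated as a separator, so any field after the quoted one shifts by one position. When the name sits in field 1, before any quoted field, the simple split still works.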
awk is a nice tool for this. When there are no quoted fields, the field separator is , and the name is in field 1, you might use something like
awk -F"," 'NR==FNR{a[$1];next} !($1 in a) {print}' <(cat secondfolder/allcsvfiles.csv) maincsv_file
# or without the explicit {print}, which is the default action
awk -F"," 'NR==FNR{a[$1];next} !($1 in a)' <(cat secondfolder/allcsvfiles.csv) maincsv_file
(see What is "NR==FNR" in awk? for an explanation)
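To make the idiom concrete, here is a small self-contained run on made-up data. The files `names.txt` and `main.csv` and their contents are hypothetical, and plain file arguments are used instead of process substitution so the sketch also runs in a POSIX sh.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'bob' 'carol' > names.txt
printf '%s\n' 'alice,1 Main St,555-0001' 'bob,2 Oak Ave,555-0002' \
              'carol,3 Elm Rd,555-0003' 'dave,4 Pine Ln,555-0004' > main.csv

# NR counts lines across all input files; FNR restarts at 1 for each file.
# The two are equal only while awk is reading the first file.
counters=$(awk '{ print FILENAME, NR, FNR }' names.txt main.csv)
printf '%s\n' "$counters"

# First file: remember every name and skip to the next line.
# Second file: print only rows whose first field was not remembered.
kept=$(awk -F',' 'NR==FNR { seen[$1]; next } !($1 in seen)' names.txt main.csv)
printf '%s\n' "$kept"

cd / && rm -rf "$workdir"
```

With this data only the `alice` and `dave` rows are printed. One caveat: if the first file is empty, awk reads no lines from it, so `NR==FNR` stays true for the main file as well and nothing is printed. Checking the list with `[ -s names.txt ]` before running awk avoids that.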

Walter A