
I have a file "selected_files.log" containing paths (a couple of thousand lines), and an exclusion file "exclusion.txt" containing paths and extensions that I don't want to see in selected_files.log.

I have been trying with grep and sed, with no luck. This is my latest attempt. Can anybody help? Thanks.

lines=$(cat exclusion.txt)
for x in "$lines";
do
    grep -v "$x" "selected_files.log" > new_file.log
    echo "x is $x"
    #sed `/$x/d` -i "selected_files.log" 
done

comm -23 doesn't work because the files aren't sorted. I've also tried sed "/"$x"/d" -i, but no luck: it fails with "unterminated address regex".

Examples of what my files contain:

selected_files.log

/mnt/user/system/data/S97/001584.bkp
/mnt/user/system/data/S97/00284.bkp
/mnt/user/system/data/S97/0058244.bkp
/mnt/user/system/data/A12/external.log
/mnt/user/system/data/A12/internal.log
/mnt/user/system/input/system_run.sh 
/mnt/user/system/input/user.sh
/mnt/user/system/output/results.dt 
/mnt/user/david/test/test.sh
/mnt/user/david/prod/bdd.bkp
/mnt/user/system/old_bkp.tmp
/mnt/user/system/output/test space/test.tmp

exclusion.txt

external.log
/mnt/user/system/input/
david
.tmp

result wanted:

/mnt/user/system/data/S97/001584.bkp
/mnt/user/system/data/S97/00284.bkp
/mnt/user/system/data/S97/0058244.bkp
/mnt/user/system/data/A12/internal.log
/mnt/user/system/output/results.dt 
Nexius2
  • Try https://stackoverflow.com/questions/4366533/how-to-remove-the-lines-which-appear-on-file-b-from-another-file-a (better than sed) – Haru Suzuki Feb 17 '22 at 19:25
  • Please post some sample data with the related expected output. Don't post them as comments, images, tables or links to off-site services; use text and include them in your original question. Thanks. – James Brown Feb 17 '22 at 19:37
  • Maybe you are quoting the sed expression with ', which is why the variable is not expanding. Use sed "/"$x"/d" -i – Haru Suzuki Feb 17 '22 at 19:42
  • See my answer for a working for loop – Haru Suzuki Feb 19 '22 at 14:55

2 Answers

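# Three alternatives; each removes only the lines that match an exclusion entry exactly.
# grep flags: -F fixed strings, -v invert the match, -x whole-line match, -f read patterns from a file
grep -Fvxf exclusion.txt selected_files.log > tmp && cat tmp > selected_files.log
awk 'NR==FNR{a[$0];next} !($0 in a)' exclusion.txt selected_files.log > tmp && cat tmp > selected_files.log
gawk -i inplace 'NR==FNR{a[$0];next} !($0 in a)' exclusion.txt selected_files.log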

Note: the gawk -i inplace command edits selected_files.log in place, but it also empties exclusion.txt, because in-place editing applies to every file named on the command line.
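
One way to keep exclusion.txt untouched is to send the filtered output to a temporary file and then move it over the original. The following is only a minimal sketch of that idea: the scratch directory, the three sample paths, and the array name skip are made up for illustration, and it still does exact whole-line matching like the commands above.

```shell
#!/usr/bin/env bash
# Sketch: exact-line filtering through a temporary file, so that only
# selected_files.log is rewritten and exclusion.txt is never modified.
set -eu

# Hypothetical demo data in a scratch directory (not the real files).
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' '/mnt/user/a.bkp' '/mnt/user/b.tmp' '/mnt/user/c.log' > selected_files.log
printf '%s\n' '/mnt/user/b.tmp' > exclusion.txt

tmp=$(mktemp)
# First file (NR==FNR): remember every exclusion line.
# Second file: print only the lines that were not remembered.
awk 'NR==FNR { skip[$0]; next } !($0 in skip)' exclusion.txt selected_files.log > "$tmp"
mv "$tmp" selected_files.log

cat selected_files.log
```

mktemp creates the file with restrictive permissions, so cat "$tmp" > selected_files.log may be preferable to mv when the original file mode has to be kept.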

It seems the for loop is causing the problem. Try:

cat exclusion.txt | while read f; do sed "/^${f//\//\\\/}$/d" selected* -i; done

Update: a working for loop

cat selected_files.log

/mnt/user/system/data/S97/001584.bkp
/mnt/user/system/data/S97/00284.bkp
/mnt/user/system/data/S97/0058244.bkp
/mnt/user/system/data/A12/external.log
/mnt/user/system/data/A12/internal.log
/mnt/user/system/input/system_run.sh 
/mnt/user/system/input/user.sh
/mnt/user/system/output/results.dt 
/mnt/user/david/test/test.sh
/mnt/user/david/prod/bdd.bkp
/mnt/user/system/old_bkp.tmp

cat exclusion.txt

external.log
/mnt/user/system/input/
david
.tmp

IFS=$'\n' && for f in `cat exclusion.txt`; do sed "/^"${f//\//\\\/}"$/d" selected_files.log -i; done

OR

lines="$(cat exclusion.txt)"
for x in `echo "$lines"`; 
do  grep -v "$x" "selected_files.log" > new_file.log; 
echo "x is $x"; 
sed "/^"$(echo "${x//\//\\\/}")"$/d" selected_files.log -i; 
done

cat selected_files.log

/mnt/user/system/data/S97/001584.bkp
/mnt/user/system/data/S97/00284.bkp
/mnt/user/system/data/S97/0058244.bkp
/mnt/user/system/data/A12/internal.log
/mnt/user/system/output/results.dt 

Accept the answer if it solves your problem.

Haru Suzuki
  • Hello, thanks for the reply. If the exclusion file contained only words and selected_files only phrases, it would work. But here, my files contain paths. I guess the / is what keeps it from working... – Nexius2 Feb 19 '22 at 12:04
  • Hello, the first cat did nothing. The second almost worked: it only deleted the last line with the .tmp, and if I add .sh to the exclusions, it will also delete it. It doesn't search for a word inside the line. – Nexius2 Feb 21 '22 at 05:56
  • What do you mean? Check and try again; I corrected your command. – Haru Suzuki Feb 21 '22 at 15:06
  • It doesn't work as expected. After searching a bit, I found that some of my export files use CR/LF and others just LF. I'm going to look in that direction. – Nexius2 Feb 21 '22 at 17:45
  • Got it to work; it was the carriage return. But I still have an issue with spaces in the path :-/ – Nexius2 Feb 22 '22 at 16:53
  • Give an example of why spaces are not working. If you provide the regex /usr/mnt/input, do you want to remove both /usr/mnt/input and /usr/mn t/input? – Haru Suzuki Feb 22 '22 at 17:00
  • I added it to the original post, but here it is: /mnt/user/system/output/test space/test.tmp, and I want to remove .tmp. The output of x shows /mnt/user/system/output/test and space/test.tmp on separate lines. – Nexius2 Feb 22 '22 at 20:37
  • Add a constraint to the regex, sed "/^"$(echo "${x//\//\\\/}")"$/d", so that it matches the entire line. – Haru Suzuki Feb 24 '22 at 20:12
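
The two problems raised in these comments (carriage returns in the exclusion file and spaces in paths) can both be handled while reading the file. The sketch below is one possible approach, not the method used in the answer: it swaps sed for grep -F so that "/" and "." need no escaping, and the scratch directory, the sample data and the file name filtered.tmp are made up for illustration. It assumes only exclusion.txt has CR/LF endings.

```shell
#!/usr/bin/env bash
# Sketch: read exclusion entries one whole line at a time, tolerate CR/LF
# endings, and remove every path that contains the entry as a substring.
set -eu

# Hypothetical demo data in a scratch directory (not the real files).
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' \
  '/mnt/user/system/data/A12/internal.log' \
  '/mnt/user/system/output/test space/test.tmp' \
  '/mnt/user/david/test/test.sh' > selected_files.log
# exclusion.txt written with Windows (CR/LF) line endings on purpose.
printf '%s\r\n' '.tmp' 'david' > exclusion.txt

# IFS= and -r keep spaces and backslashes intact; the "|| [ -n ... ]" part
# also processes a last line that has no trailing newline.
while IFS= read -r x || [ -n "$x" ]; do
    x=${x%$'\r'}              # drop a trailing carriage return, if any
    [ -n "$x" ] || continue   # an empty pattern would match every line
    # -F: literal string; -v: keep only the lines that do not contain it.
    # "|| true": grep exits 1 when it prints nothing, which is fine here.
    grep -vF -- "$x" selected_files.log > filtered.tmp || true
    mv filtered.tmp selected_files.log
done < exclusion.txt

cat selected_files.log
```

Because the loop variable is filled by read rather than by word splitting, the path containing "test space" stays on one line.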

Thanks to Haru, my solution was a mix of everything above.

lines="$(cat "exclusion.txt")"
IFS=$'\n'
for x in `echo "$lines"`; 
do echo "x is $x"; 
grep -v "$x" "selected_files.log" > new_file.log; 
sed "/"$(echo "${x//\//\\\/}")"/d" "selected_files.log" -i; 
done
unset IFS
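
For comparison, the same substring filtering can be sketched without a loop, by handing all exclusion entries to a single grep call. This assumes a grep that accepts "-f -" to read patterns from standard input (GNU grep does), and that exclusion.txt has no empty lines, since an empty pattern matches every line. The scratch directory and the shortened sample data are for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: remove every line of selected_files.log that contains any entry
# of exclusion.txt as a literal substring, in a single pass.
set -eu

# Hypothetical demo data in a scratch directory (subset of the question's sample).
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' \
  '/mnt/user/system/data/S97/001584.bkp' \
  '/mnt/user/system/data/A12/external.log' \
  '/mnt/user/system/data/A12/internal.log' \
  '/mnt/user/system/input/user.sh' \
  '/mnt/user/david/prod/bdd.bkp' \
  '/mnt/user/system/output/test space/test.tmp' > selected_files.log
printf '%s\n' 'external.log' '/mnt/user/system/input/' 'david' '.tmp' > exclusion.txt

# tr strips carriage returns in case the exclusion file uses CR/LF endings.
# grep: -v invert, -F literal strings, -f - read the patterns from stdin.
# "|| true": grep exits 1 when nothing is left to print.
tr -d '\r' < exclusion.txt | grep -vFf - selected_files.log > new_file.log || true

cat new_file.log
```

The result goes to new_file.log, so selected_files.log stays available for checking before it is overwritten.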
Nexius2