0

I wrote a small shell script to identify which pages on my website link to each PDF file.

It takes the PDF source URLs from a list, one at a time, and searches the website content recursively for each one.

The problem is that when I run the script, the find results are not appended to the output file. But when I run the same find command manually in the terminal (PuTTY), I can see the results.

Script:

#!/bin/bash
filename="PDF_Search_File.txt"
while read -r line
do
        name="$line"
                echo "*******pdf******** - $name\n" >>output_pdf_new.txt
        find . -type f -exec grep -l "$name" '{}' \; >>output_pdf_new.txt
                echo "*******pdf******** - $name\n" >>output_pdf_new.txt
done < "$filename"

Source URL list input file (PDF_Search_File.txt):

/static/pdf/pdf1.pdf
/static/pdf/pdf2.pdf
/static/pdf/pdf3.pdf
--------------------

Output file (output_pdf_new.txt):

./Search_pdf.sh
*******pdf******** - /static/pdf/pdf1.pdf\n
*******pdf******** - /static/pdf/pdf1.pdf\n
./Search_pdf.sh
*******pdf******** - /static/pdf/pdf2.pdf\n
*******pdf******** - /static/pdf/pdf2.pdf\n
./Search_pdf.sh
*******pdf******** - /static/pdf/pdf3.pdf\n
*******pdf******** - /static/pdf/pdf3.pdf\n
------------------------------------------

In the terminal (PuTTY) I can see the results below when I run find manually:

find . -type f -exec grep -l "/static/pdf/pdf1.pdf" '{}' \;

./en/toyes/zzz/index.xhtml
./en/toyes/kkk/index.xhtml
--------------

But with the script, only the echo lines are written, as shown in the output file above.

Update: when I execute the script with bash -x, it gives the result below:

[user@server1 generated_content]# bash -x Search_pdf.sh
+ filename=PDF_Search_File.txt
+ read -r line
+ name=$'/static/pdf/pdf1.pdf\r'
\n'cho '*******pdf******** - /static/pdf/pdf1.pdf
+ find . -type f -exec grep -l $'/static/pdf/pdf1.pdf\r' '{}' ';'
\n'cho '*******pdf******** - /static/pdf/pdf1.pdf
+ read -r line
+ name=$'/static/pdf/pdf2.pdf\r'
\n'cho '*******pdf******** - /static/pdf/pdf2.pdf
+ find . -type f -exec grep -l $'/static/pdf/pdf2.pdf\r' '{}' ';'

Is something wrong here?

  + find . -type f -exec grep -l $'/static/pdf/pdf2.pdf\r' '{}' ';'

The find command should look like the one below, but it is executed as shown above:

find . -type f -exec grep -l "/static/pdf/pdf1.pdf" '{}' \;
chamara

2 Answers

0

Have you tried the -e option of echo, which enables interpretation of backslash escapes?

Also, why don't you simply do find | grep?

find ./ -type f | grep "$name" >> output_pdf_new.txt

Try the following (./ instead of .) in find:

find ./ -type f -exec grep -l "$name" '{}' \; >>output_pdf_new.txt
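
As a rough illustration of the -e suggestion (not taken from the question's files; the value of name below is a stand-in for one list entry): in its default configuration, bash's builtin echo prints \n literally, which matches the \n visible at the end of the header lines in the output file. printf is one way to get a real newline without depending on echo options.

```shell
#!/bin/bash
# Hypothetical value standing in for one line of the list file.
name="/static/pdf/pdf1.pdf"

# With bash's default settings, echo without -e leaves the two
# characters backslash and n in the output.
plain=$(echo "*******pdf******** - $name\n")

# printf interprets \n in its format string regardless of echo settings.
# (Command substitution strips the trailing newline from the captured value.)
header=$(printf '*******pdf******** - %s\n' "$name")

printf '%s\n' "$plain"
printf '%s\n' "$header"
```

This only affects how the header lines look; it does not change what find or grep matches.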
Calvin Kim
  • The echo output has no issue; it is appended to the file. Only the find results are not appended. Please tell me what I did wrong here. – chamara Jun 14 '18 at 05:03
  • @chamara, edited answer with `find | grep` suggestion. – Calvin Kim Jun 14 '18 at 05:08
  • No, the above command does not work, brother. I just want to find the website pages whose source contains those PDF links; the output should be the URL of each page's index.html. My command works when I test it manually in PuTTY, but I have a huge PDF list. – chamara Jun 14 '18 at 05:17
  • `find | grep` matches the filenames; he wants to match the contents of the files. – Barmar Jun 14 '18 at 05:23
  • Ah, I see what you mean. Your original find command should work though. – Calvin Kim Jun 14 '18 at 05:25
0

Use grep -rl to search for the file name inside your loop:

cd /www/webroot/
grep -rl "${name}" * | while IFS= read -r file_path; do
    # Do something with each matching file; quoting keeps paths with spaces intact
    echo "$file_path"
done

Or, if you just need to write the output to a file:

cd /www/webroot/
grep -rl "${name}" * >> output_pdf_new.txt
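
Putting grep -rl inside the question's loop could look like the sketch below. It also strips a trailing carriage return from each line, because the bash -x trace in the question shows every name ending in \r. That suggests the list file has Windows (CRLF) line endings, but this is an inference from the trace, not something the asker confirmed. The demo directory, file names and contents are all made up so the sketch can run on its own.

```shell
#!/bin/bash
# Throwaway demo tree; every path and file here is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/site/en/a" "$demo/site/en/b"
printf '<a href="/static/pdf/pdf1.pdf">doc</a>\n' > "$demo/site/en/a/index.xhtml"
printf '<p>no links here</p>\n' > "$demo/site/en/b/index.xhtml"

# List file written with CRLF line endings, as the \r in the trace suggests.
printf '/static/pdf/pdf1.pdf\r\n/static/pdf/pdf2.pdf\r\n' > "$demo/list.txt"

out="$demo/output.txt"
cr=$(printf '\r')               # a literal carriage return character
cd "$demo/site" || exit 1

while IFS= read -r line || [ -n "$line" ]; do
    name=${line%"$cr"}          # drop one trailing carriage return, if present
    [ -n "$name" ] || continue  # skip blank lines
    printf '*******pdf******** - %s\n' "$name" >> "$out"
    # -r recurse, -l print only names of matching files,
    # -F treat the name as a fixed string (so "." is not a regex wildcard)
    grep -rlF -- "$name" . >> "$out"
done < "$demo/list.txt"

cat "$out"
```

The output file is kept outside the searched tree here, since the header lines contain the PDF name and grep would otherwise match its own output.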
Mike Q