I have a folder structure as shown below, under ./all_files:

-rwxrwxrwx  reference_file.txt
drwxrwxrwx  file1.txt
drwxrwxrwx  file2.txt
drwxrwxrwx  file3.txt

reference_file.txt has filenames as shown below

$cat reference_file.txt
file1.txt
file2.txt

The data in file1.txt and file2.txt is as shown below:

$cat file1.txt
step_1
step_2
step_3

Now I have to extract a particular step, say step_2, from each file.

Note 1: the file name must be present in reference_file.txt.

Note 2: step_2 is not always on line 2.

Note 3: the search should be performed recursively.

I have used the script below:

#!/bin/sh

for i in cat reference_file.txt;

do
   find . -type f -name $i | grep -v 'FS*' | xargs  grep -F 'step_2'
done<reference_file.txt

After running the above code I got no output.

# bash -x script.sh
+ for i in cat reference_file.txt
+ find . -type f -name **cat**
+ xargs grep -F 'step_2'
+ for i in cat **reference_file.txt**
+ find . -type f -name reference_file.txt
+ xargs grep -F 'step_2'

Added new requirement:

target=step_XX_2, where XX can be anything and should be ignored in the search, so that the desired output would be: step_ab_2 step_cd_2 step_ef_2

user180946
  • Why are you reading `reference_file.txt` twice? You `cat` the file and you also redirect it into the standard input of the loop - which it appears you don't use. What are you trying to achieve using two asterisks? – cdarke Aug 02 '16 at 09:50
  • Could you please show the desired output? – cdarke Aug 02 '16 at 09:57
  • 1
    Your for loop iterates over two elements, `cat` and `reference_file.txt`. Use `while read` instead of `for`. – choroba Aug 02 '16 at 09:58
  • I think you were trying to do `for i in $(cat reference_file.txt);` and not `for i in cat reference_file.txt;`. Using a `while read` loop is the better option. – anishsane Aug 02 '16 at 09:58
  • What are you trying to achieve with `FS*`? In regular expressions that means "F followed by zero or more S's". Not the same as a wildcard (globbing). – cdarke Aug 02 '16 at 10:08
  • The asterisks were added to highlight (bold) text; they are not part of the code. Could someone give me code using a while loop, please? – user180946 Aug 02 '16 at 10:50

2 Answers

I think this is what you are trying to achieve. Please let me know:

EDIT: my previous version did not search recursively. Further edits: Note that using process substitution for find means that this script MUST be run under bash and not sh.

Further edit for change in specification: note the change to target and the -E option to grep instead of -F.

#!/bin/bash

target='step_.*?_?2'

while read -r name
do
   # EDIT: exclude certain directories
   if [[ $name == "old1" || $name == "old2" ]]
   then
        # do the next iteration of the loop
        continue    
   fi

   while read -r fname
   do
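       # find prints paths beginning with ./, so match FS at the
       # start of any path component rather than at the start of $fname
       if [[ $fname != */FS* ]]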
       then
           # Display the filename (grep -H is not in POSIX)
           if out=$(grep -E "$target" "$fname")
           then
               echo "$fname: $out"
           fi
       fi
   done < <(find . -type f -name "$name")

done < reference_file.txt
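
The `if` test at the top of the loop compares `$name`, which holds a file name read from reference_file.txt, so it skips entries in the list rather than directories on disk. One possible way to keep `find` itself out of particular directories is `-prune`. The following is a minimal sketch under assumptions: the directory names `old1` and `old2` are taken from the comments, and the throwaway tree built with `mktemp -d` is invented purely for illustration.

```shell
#!/bin/sh
# Build a throwaway tree (hypothetical layout, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/keep" "$demo/old1" "$demo/old2"
echo step_2 > "$demo/keep/file1.txt"
echo step_2 > "$demo/old1/file1.txt"
echo step_2 > "$demo/old2/file1.txt"

name=file1.txt

# -prune stops find from descending into directories named old1 or old2;
# every other path falls through to the -type f -name test and is printed.
found=$(cd "$demo" && find . -type d \( -name old1 -o -name old2 \) -prune \
        -o -type f -name "$name" -print)

echo "$found"
rm -rf "$demo"
```

In the script above, this `find` expression could take the place of the one feeding the inner loop, with the rest of the loop unchanged.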

Note that your trace (bash -x) uses bash but your #! line uses sh. They are different - you should be consistent with the shell you are using.
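
If the script has to run under plain `sh`, process substitution is not available. A minimal sketch of a portable variant pipes `find` into the inner loop instead. It assumes file names contain no newlines; the function name `search_listed` and the sample tree are made up for illustration and are not part of the original script.

```shell
#!/bin/sh
# search_listed LISTFILE PATTERN  (hypothetical helper, for illustration)
search_listed() {
    while IFS= read -r name
    do
        # A pipe replaces <(find ...). The inner loop runs in a subshell,
        # which is acceptable here because it only prints.
        find . -type f -name "$name" | while IFS= read -r fname
        do
            if out=$(grep -E "$2" "$fname")
            then
                echo "$fname: $out"
            fi
        done
    done < "$1"
}

# Sample data (assumed layout).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir sub
printf 'step_1\nstep_2\nstep_3\n' > sub/file1.txt
printf 'step_1\n' > file2.txt
printf 'file1.txt\nfile2.txt\n' > reference_file.txt

result=$(search_listed reference_file.txt 'step_2')
echo "$result"

cd / && rm -rf "$demo"
```

The trade-off is that variables set inside the piped loop do not survive it, so this form suits loops that only produce output.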

I have dropped the xargs, which reads strings from standard input and executes a program using those strings as arguments. Since we already have the argument strings for grep, we don't need it.

Your grep -v 'FS*' probably doesn't do what you expect. The regular expression FS* means "F followed by zero or more S's", which is not the same as shell pattern matching (globbing). In my solution I have used FS* because I am matching in the shell, not in grep.
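
The difference between the two readings of `FS*` is easy to check on a few names. This is a small sketch with invented sample names; it uses `case` for the shell-pattern side so that it also runs under plain `sh`.

```shell
#!/bin/sh
# Sample names, made up for illustration.
names='FSdata.txt
Foo.txt
bar.txt'

# As a regular expression, FS* is "F, then zero or more S",
# so grep -v drops every line that contains an F anywhere.
regex_kept=$(printf '%s\n' "$names" | grep -v 'FS*')

# As a shell pattern, FS* is "starts with FS".
glob_kept=$(printf '%s\n' "$names" | while IFS= read -r n
do
    case $n in
        (FS*) ;;             # excluded
        (*)   echo "$n" ;;   # kept
    esac
done)

echo "regex kept: $regex_kept"
echo "glob kept:"
echo "$glob_kept"
```

Here the regular expression keeps only bar.txt, while the shell pattern keeps both Foo.txt and bar.txt.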

cdarke
  • thank you, is it possible to display the filename as well... `file1.txt: step_2` `file2.txt: step_2` so that i can check if the file is also present in reference_file.txt – user180946 Aug 02 '16 at 11:14
  • Nice! See a good comparison on [How to loop through file names returned by find?](http://stackoverflow.com/a/9612232/1983854), since `for fname in $(find ...)` doesn't seem very safe. – fedorqui Aug 02 '16 at 11:18
  • @fedorqui could please suggest the changes in code, which uses only while loop please. also the display includes both files name and result – user180946 Aug 02 '16 at 12:29
  • Suggested changes adopted. Please note that using process substitution for `find` means that this script MUST be run under `bash` and not `sh`. – cdarke Aug 02 '16 at 13:08
  • @cdarke the output also showing the all remaining lines in reference_file.txt... `file1.txt: step2` `file2.txt: step2` `file3.txt: file4.txt: file5.txt` and so on – user180946 Aug 03 '16 at 06:09
  • I misunderstood the data. Code amended. – cdarke Aug 03 '16 at 06:32
  • @cdarke please find below details: `step_2 may not be available in every file` `most important point was the file name must exists in reference_file.txt, if present it should go for next search(step_2)` `then search for step_2, if present then display filename: step_2` please help – user180946 Aug 03 '16 at 06:40
  • Like I said, I misunderstood. Yes, I see it now. Did you try my new solution? – cdarke Aug 03 '16 at 06:41
  • @cdarke missed your last update.. thank you.. new code is working fine – user180946 Aug 03 '16 at 06:45
  • @cdarke may I ask for one more update? `target=step_XX_2`, where XX can be anything and should be ignored in the search, so that the desired output would be `step_ab_2` `step_cd_2` `step_ef_2` – user180946 Aug 03 '16 at 06:56
  • Can you update the main question with that please? I'll post another comment when done. – cdarke Aug 03 '16 at 07:03
  • Can't do that if you insist on using the `-F` option to `grep`, since it needs a regular expression. Please get back to me on that. – cdarke Aug 03 '16 at 07:07
  • Change done; note that `target` has changed, and I now use `-E` in `grep`. – cdarke Aug 03 '16 at 07:10
  • @cdarke i tried with the new code but seems the target is not working fine.. `The exact target string was 01.XX.20XX` i am searching the date format with XX can be anything.. please suggest – user180946 Aug 03 '16 at 10:01
  • That's different to what you specified before. You said "`target=step_XX_2`": underscore, not dot. You need to construct a regular expression for the pattern you want. You really should at least give me a full specification of the pattern you want to match. Is that *exactly* two characters when you say "XX"? – cdarke Aug 03 '16 at 12:26
  • @cdarke yes... exactly 2 characters.. the one i specified last is the exact string i am searching for – user180946 Aug 03 '16 at 12:50
  • So your target pattern is `'01\...\.20..'`. The `\.` means a literal dot, whereas an unescaped `.` matches any single character. – cdarke Aug 03 '16 at 13:41
  • @cdarke thank you its works perfect.. i am trying to add one more condition in if condition and i tried below combinations and didn't work `if [[ $fname != FS* && old*]]` `if [[ $fname != "FS*" && "old*"]]` `if [[ $fname != (FS* && old*)]]` `if [[ $fname != ("FS*" && "old*")]]` – user180946 Aug 04 '16 at 05:51
  • got it. added space at the end `if [[ $fname != FS* && old* ]]` – user180946 Aug 04 '16 at 06:39
  • @cdarke is it possible to exclude specific directories in find command? `ex: (find .! (old1, old2) -type f -name "$name")` that means dont look in old1,old2 directories while searching – user180946 Aug 04 '16 at 07:00
  • Just put an `if` statement between the two `while` loops. I'll modify the code to show that. You really should try some of this yourself! – cdarke Aug 04 '16 at 08:03
  • @cdarke tried below things and not getting desired output `if [[ $name == "old1" || $name == "old2" ]]` `if [[ $name != "old1" || $name != "old2" ]]` first statement also searching in the directories old1, old2 second statement exiting the loop there itself. – user180946 Aug 05 '16 at 10:35
  • If you use `!=` then you need `&&` not `||`. – cdarke Aug 05 '16 at 11:57
  • @cdarke even that also failed, even if one arguement `if [[ $name != "old1" && "OLD2" ]]` `if [[ $name != "old1" ]]` `+ read -r name` `+ [[ file1.txt != \o\l\d\1 ]]` `+ [[ -n OLD2 ]]` `+ continue` `+ read -r name` `+ [[ file1.txt != \o\l\d\1 ]]` `+ continue` – user180946 Aug 05 '16 at 12:34
  • `[[ $name != "old1" && "OLD2" ]]` is wrong; it should be `[[ $name != "old1" && $name != "OLD2" ]]`. I can't figure out what your code is from the comment (what are all those `+`s?). Please post and format your code in the question. – cdarke Aug 05 '16 at 15:54
  • @cdarke the + sign is when i execute the script using -x `ex: bash -x script.sh` to debug the script. here i observed the folder name searching as `[[ file1.txt != \o\l\d\1 ]]`. also in my last comment even if i pass single folder name as argument [[ $name != "old1" ]], i didn't get the output. please tell me if i missed any detail. – user180946 Aug 06 '16 at 14:35

I believe this question is a duplicate of this

What you need is

#!/bin/sh

for i in `cat reference_file.txt`
    do  find . -type f -name $i | grep -v 'FS*' | xargs  grep -F 'step_2'  
done 

Note the backticks, and do not read reference_file.txt twice.
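
To see why the loop in the question produced no matches, it helps to print what each loop form actually iterates over. The following is a small sketch; the sample reference_file.txt is created in a temporary directory purely for illustration.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'file1.txt\nfile2.txt\n' > reference_file.txt

# Without command substitution the loop sees two literal words.
literal=$(for i in cat reference_file.txt; do echo "[$i]"; done)

# With command substitution the file's contents are split into words.
substituted=$(for i in $(cat reference_file.txt); do echo "[$i]"; done)

# while read takes one whole line per iteration, and no cat is needed.
by_line=$(while IFS= read -r i; do echo "[$i]"; done < reference_file.txt)

echo "literal:";     echo "$literal"
echo "substituted:"; echo "$substituted"
echo "by line:";     echo "$by_line"

cd / && rm -rf "$demo"
```

The first form yields `[cat]` and `[reference_file.txt]`, which matches the `bash -x` trace in the question. The other two yield the listed file names, though only `while read` keeps a name containing spaces in one piece.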

Pintu
  • 1
    `$(cat reference_file.txt)` syntax is preferred, backticks are considered deprecated. Better yet, you rarely need `cat`. – cdarke Aug 02 '16 at 10:05
  • @cdarke a `for i in $(cat ...)` is not safe. [Why you don't read lines with "for"](http://mywiki.wooledge.org/DontReadLinesWithFor). It is best to say `while IFS= read -r var1 var2 ... ; do ... done < file`. – fedorqui Aug 02 '16 at 10:46
  • 1
    @fedorqui: No argument with that see my post. `cat` is rarely needed in a script. The point of my comment was about backticks. – cdarke Aug 02 '16 at 10:48