0

I am trying to grep for delimiters (comma, pipe, or semicolon) in multiple files.

If a file contains any of these delimiters, it is fine. I want to move the files that don't contain any of these delimiters to the mvfiles directory. The script currently moves all the files, even the ones that contain delimiters.

filename=$(find /opt/interfaces/sample_check -type f \( -name "*message.txt*" -or -name "*comma2*" -or -name "*comma3*" \))
pathname=/opt/interfaces/sample_check

echo $filename
echo $pathname

if `head -1 $filename | grep -o [';']`; then

    echo "Found"
else
    mv $filename /opt/interfaces/sample_check/mvfiles
fi
Paul Hodges
KCR
  • What's the output of $filename? Is that all the found files? In that case, if you enter your else statement, you're moving all the files that are in the $filename variable. – TMNuclear Dec 10 '18 at 14:40
  • Hi Ivan M, that triggers a thought: do I need to use elif instead of else? – KCR Dec 10 '18 at 14:44
  • There's nothing in this code that I see checking for commas or pipes. Also, why are you using backticks? Aside from the fact that you ought to use `$()` instead, you don't appear to need them here at all. – Paul Hodges Dec 10 '18 at 14:49
  • In the code I gave only the semicolon; once that works I can add the other delimiters (pipe and comma). In the first place, I am not able to move the files that don't contain a semicolon as the delimiter. – KCR Dec 10 '18 at 14:54

4 Answers

2

Try adjusting your logic a little. Also, include all your delimiters in your grep pattern, and tweak where you put your quotes for it.

pathname=/opt/interfaces/sample_check
find "$pathname" -type f \( -name "*message.txt*" -or -name "*comma[23]*" \) |
  while read -r filename
  do if sed -n '1d; 2{ /[,;|]/q0 }; q1' "$filename"
     then echo "Delimited: $filename"
     else echo "Moving ==>> $filename"
          mv "$filename" /opt/interfaces/sample_check/mvfiles/
     fi
  done

Since we only want to decide based on delimiters in line 2 of the file, let's use sed instead.

sed -n '1d; 2{/[,;|]/q0 }; q1' 

sed -n says print nothing unless requested - we don't need any output.

1d deletes the first line (we're not editing, just abandoning any further processing on this line so it skips the rest of the program.)

2{...} says do these commands only on line 2. /[,;|]/q0 says if the line has any comma, semicolon, or pipe, then quit with a zero exit code to indicate success.

q1 says if it gets here, quit with an exit code of 1. (An exit code after q is a GNU sed extension.)

These trigger the branching of the if. :)
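
The q0/q1 exit codes depend on GNU sed. As a rough sketch for systems without that extension, the same second-line test can be written by printing only line 2 and letting grep set the exit status. The function name second_line_delimited and the sample files are made up for illustration, and in this variant a file with fewer than two lines counts as undelimited.

```shell
# Hypothetical helper: succeeds (exit 0) when line 2 of the file named in $1
# contains a comma, semicolon, or pipe.
second_line_delimited() {
    # sed prints line 2 and stops reading; grep -q only sets the exit status
    sed -n '2{p;q;}' "$1" | grep -q '[,;|]'
}

# Demo on throwaway files (paths are illustrative only)
demo_dir=$(mktemp -d)
printf 'header\na;b;c\n' > "$demo_dir/good.txt"
printf 'header\nabc\n'   > "$demo_dir/bad.txt"

for f in "$demo_dir"/*.txt; do
    if second_line_delimited "$f"; then
        echo "Delimited: ${f##*/}"
    else
        echo "Would move: ${f##*/}"
    fi
done
rm -rf "$demo_dir"
```

Inside the loop above, the function could stand in for the sed call in the if condition.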

Paul Hodges
  • This will fail if you have file names whose names contain newlines. – tripleee Dec 10 '18 at 15:08
  • True. Kiran, is that a remote likelihood here? If so we can elaborate this a bit. – Paul Hodges Dec 10 '18 at 15:14
  • If there *might* be newlines, try a construct like `find "$pathname" ... -print0 | while read -d $'\0' filename`. I'm pretty sure null bytes are not allowed in filenames. :) – Paul Hodges Dec 10 '18 at 15:22
  • HI Paul,This script works perfectly. Thank you so much. – KCR Dec 10 '18 at 15:46
  • Hi Paul Hodges,I am new to this website still couldn't figure out how to access to solution. Can you please help me? – KCR Dec 13 '18 at 11:05
  • Hi Paul, I am new to this site may I know how I could accept the solution? – KCR Dec 13 '18 at 14:04
  • Good morning. There should be a checkmark on the top left of each proposed solution, below the ranking. If you click one it will mark it as accepted. :) – Paul Hodges Dec 13 '18 at 14:48
  • Hi Paul, I have another question. I am trying to check for the delimiters specifically in the second line of my file, so I changed the grep line to do if grep -q '[,;|]' "$filename" | head -n2. But this doesn't work for me. Can you please help me with this? – KCR Dec 13 '18 at 14:50
  • You specifically just want to know if the 2nd line of the file has one of these delimiters, right? – Paul Hodges Dec 13 '18 at 14:56
  • Yes I am checking how can I check specifically delimiters from the 2nd line of the files – KCR Dec 13 '18 at 15:01
  • Give me a few minutes. – Paul Hodges Dec 13 '18 at 15:01
  • Try that. This will be faster if any of the files are very big, too. – Paul Hodges Dec 13 '18 at 15:16
  • HI Paul, thanks a lot. I will try this script and let you know. :) – KCR Dec 14 '18 at 07:13
  • Hi Paul, The script works pretty well.. thank you so much :) – KCR Dec 14 '18 at 13:21
  • Hi Paul, I have added few more file names to the list then the script is not working. I also had to remove the quotes around $pathname for the script to find the files. After removing the quotes the script is working fine. But the sed command is not working. Can you please help here. – KCR Feb 28 '19 at 16:53
  • Add the changes as an update at the bottom of the original post. I'll take a look. – Paul Hodges Feb 28 '19 at 16:58
  • Hi Paul, added the changes. Could you please check. – KCR Mar 01 '19 at 07:17
  • Abc,123,xyz+Kiran.... this is what the 2nd line in the file looks like. After xyz I have placed + as the delimiter instead of a comma, and this file is not getting rejected. – KCR Mar 01 '19 at 08:40
  • I see no edits to your original post, and still don't understand what is happening. – Paul Hodges Mar 01 '19 at 14:23
1

I would avoid putting all the file names into one variable, because you will run into problems if any file name contains spaces. Instead, I propose reading the file names line by line:

pathname=/opt/interfaces/sample_check
echo "$pathname"

find "$pathname" -type f \( -name "*message.txt*" -or -name "*comma2*" -or -name "*comma3*" \) | while IFS= read -r filename
do

    echo "$filename"

    if head -n 1 "$filename" | grep '[;]' >/dev/null; then

        echo "Found"
    else
        mv "$filename" "$pathname/mvfiles/."
    fi
done

There are surely more options to solve the problem.

You can also use grep -q '[;]' instead of grep '[;]' >/dev/null on modern systems, but on old systems option -q might not work.

Note: I used .../mvfiles/. assuming mvfiles is an existing directory. This avoids overwriting or creating a file of the same name in case the directory should not exist. Additionally I used quotes to avoid problems with file names containing spaces.
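
To illustrate the note about the trailing /., here is a small sketch on throwaway files. The helper name move_into_dir and the paths are hypothetical; the point is only that the move fails when the target directory is missing, rather than renaming the file.

```shell
# Hypothetical helper: move file $1 into directory $2, which must already exist.
move_into_dir() {
    mv "$1" "$2/."
}

demo_dir=$(mktemp -d)
echo data > "$demo_dir/report.txt"

# "$demo_dir/mvfiles" does not exist yet, so mv fails and report.txt stays put.
if move_into_dir "$demo_dir/report.txt" "$demo_dir/mvfiles" 2>/dev/null; then
    echo "moved"
else
    echo "not moved: target directory is missing"
fi

# Once the directory exists, the same call moves the file into it.
mkdir "$demo_dir/mvfiles"
move_into_dir "$demo_dir/report.txt" "$demo_dir/mvfiles" && echo "moved"
rm -rf "$demo_dir"
```

Without the /. suffix, the first call would have renamed report.txt to a plain file called mvfiles.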

Bodo
  • 1
    Try `grep -q` instead of redirection to `/dev/null`, and `$(...)` instead of backticks. – Paul Hodges Dec 10 '18 at 14:45
  • The script is still moving all the files to the directory instead of moving only the files that don't contain a semicolon as the delimiter. – KCR Dec 10 '18 at 14:52
  • @KiranChapidi Sorry, I did not notice that filename is a list of files. I have edited my answer. – Bodo Dec 10 '18 at 14:59
0

Try using grep -L to list files that don't match.

find $pathname -type f \( -name "*message.txt*" -or -name "*comma2*" -or -name "*comma3*" \) | xargs egrep -L ";|,|\|" | xargs  -IX mv  X $pathname/mvfiles

In the above example, I use egrep because of the pipe'd OR conditions; that is, we want to specify multiple regexes to match. If the file contains any of ; , or |, the filename will not be output by egrep. This leaves only the files that don't match being passed to xargs. In the Mac version of xargs, you can specify a replacement string with the -I parameter. For every filename output by egrep, xargs will call mv <filename> $pathname/mvfiles.
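
As a quick way to see what -L does before pointing it at real data, the following sketch runs it on three throwaway files. The file names are invented for the demo, and grep -E is used in place of egrep, which recent GNU grep releases flag as obsolescent.

```shell
demo_dir=$(mktemp -d)
printf 'a,b\n'   > "$demo_dir/comma.txt"
printf 'a|b\n'   > "$demo_dir/pipe.txt"
printf 'plain\n' > "$demo_dir/none.txt"

# -L lists only the files in which no line matches;
# -E enables extended regular expressions, as egrep does.
undelimited=$(grep -L -E ';|,|\|' "$demo_dir"/*.txt)

# Only none.txt should be listed, since the other two contain a delimiter.
echo "$undelimited"
rm -rf "$demo_dir"
```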

As part of a follow up question, I was asked how to only review the second line of the file. Here's a bit of code to do just that:

awk ' FNR == 2 && /[;|,]/ { print FILENAME } ' * 

The above awk will display the current filename (FILENAME) when the file record number (FNR) is 2 (the second line in each file) and the input string matches the regex [;|,].
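
A small sketch of the FNR test on two throwaway files may help. The file names are invented for the demo; the second awk call adds ! to list the files whose second line has no delimiter, which is an assumption about what should eventually be moved.

```shell
demo_dir=$(mktemp -d)
printf 'id\na;b\n'    > "$demo_dir/one.txt"   # delimiter on line 2
printf 'x,y\nplain\n' > "$demo_dir/two.txt"   # delimiter on line 1 only

# Files whose second line contains a delimiter
with_delim=$(awk 'FNR == 2 && /[;|,]/ { print FILENAME }' "$demo_dir"/*.txt)

# Files whose second line does not contain one (note the !)
without_delim=$(awk 'FNR == 2 && !/[;|,]/ { print FILENAME }' "$demo_dir"/*.txt)

echo "with delimiter:    ${with_delim##*/}"
echo "without delimiter: ${without_delim##*/}"
rm -rf "$demo_dir"
```

Because FNR restarts at 1 for every input file, one awk process can check the second line of each file it is given.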

To inject that bit of code in my answer above, negate the match with !, because the files to move are the ones whose second line has no delimiter:

find $pathname -type f \( -name "*message.txt*" -or -name "*comma2*" -or -name "*comma3*" \) | xargs awk ' FNR == 2 && !/[;|,]/ { print FILENAME } '   | xargs  -IX mv  X $pathname/mvfiles

So above, I replaced the 'xargs egrep' with 'xargs awk'. Also, I removed the * from the end of the awk command because that's basically xargs' default function: to take all of the input on stdin and provide it as arguments to the command (in this case, awk).

Since we are using awk, we can actually avoid the last use of xargs to move the files. Instead, we can build the commands and pipe them to bash (or some shell).

Here's a version where we pipe commands to bash:

find $pathname -type f \( -name "*message.txt*" -or -name "*comma2*" -or -name "*comma3*" \) | xargs awk ' FNR == 2 && !/[;|,]/ { print "mv " FILENAME  " '$pathname'/mvfiles" } '   | bash

If you want to just debug the above statement without taking actions, you can remove the bash command at the end to leave this debug version:

Debug version (doesn't move files - only prints mv commands):

find $pathname -type f \( -name "*message.txt*" -or -name "*comma2*" -or -name "*comma3*" \) | xargs awk ' FNR == 2 && !/[;|,]/ { print "mv " FILENAME  " '$pathname'/mvfiles" } '  
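
Generating mv commands as text and piping them to a shell breaks on file names that contain spaces or shell metacharacters. One possible alternative, sketched here with a hypothetical helper called move_listed, is to read awk's list of file names in a loop and call mv directly. It assumes the goal is to move files whose second line lacks a delimiter, and that no file name contains a newline.

```shell
# Hypothetical helper: read file names (one per line) on stdin
# and move each one into the existing directory given as $1.
move_listed() {
    while IFS= read -r file; do
        mv -- "$file" "$1/."
    done
}

# Demo on throwaway files (names are illustrative only)
demo_dir=$(mktemp -d)
mkdir "$demo_dir/mvfiles"
printf 'h\nno delimiter here\n' > "$demo_dir/my report.txt"
printf 'h\na,b\n'               > "$demo_dir/ok.txt"

awk 'FNR == 2 && !/[;|,]/ { print FILENAME }' "$demo_dir"/*.txt |
    move_listed "$demo_dir/mvfiles"

ls "$demo_dir/mvfiles"
rm -rf "$demo_dir"
```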
Mark
  • Hi Mark, If I want to search for the delimiters in just 2nd line of the files.. Where do I need to update the script. Can you please let me know. – KCR Mar 01 '19 at 04:26
  • @KiranChapidi - I've added a little explanation on how to look at only the second line in the files. – Mark Mar 04 '19 at 03:32
-1

From what I understand, you want to loop through a list of filenames that matches your search then check their content to see if it contains either a pipe, comma or semicolon. If that's the case you can use this.

# pathname is your directory you want files to search in
pathname=/opt/interfaces/sample_check

# Add the file name patterns you want to match here, i.e. only take files into the loop whose name contains filename1, or starts with filename2, or starts with filename3.
find "$pathname" -type f \( -name "*filename1*" -or -name "filename2*" -or -name "filename3*" \) -print0 |
while IFS= read -r -d '' file; do
    # -q: only set the exit status, don't print the matching lines
    if grep -q -e '[|;,]' "$file"
      then
        echo "Found in $file."
    else
        echo "Not found in $file. Moving to directory xxx."
        mv "$file" /opt/interfaces/sample_check/mvfiles
    fi
done
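
If only the second line should be checked, one option is to swap the grep test for an awk test and let find hand the file names to a small inline script with -exec, which copes with unusual file names without bash's read -d. This is a sketch rather than a drop-in replacement: the function name move_undelimited is hypothetical, the name patterns are the ones from the question, and a file with fewer than two lines is treated as undelimited and moved.

```shell
# Hypothetical helper. $1 = directory to search, $2 = existing destination directory.
move_undelimited() {
    # ! -path keeps find from revisiting files already moved into the destination
    find "$1" -type f \( -name '*message.txt*' -o -name '*comma[23]*' \) \
        ! -path "$2/*" -exec sh -c '
            dest=$1; shift
            for file do
                # awk exits 0 only if line 2 contains a comma, semicolon, or pipe
                if awk "FNR == 2 { found = /[,;|]/; exit } END { exit !found }" "$file"; then
                    echo "Found in $file."
                else
                    echo "Not found in $file. Moving to $dest."
                    mv "$file" "$dest/."
                fi
            done
        ' sh "$2" {} +
}
```

With the paths from the question, it would be called as move_undelimited /opt/interfaces/sample_check /opt/interfaces/sample_check/mvfiles.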
TMNuclear
  • 1
    You start nicely, but the [useless `cat`](/questions/11710552/useless-use-of-cat), the [broken quoting](/questions/10067266/when-to-wrap-quotes-around-a-shell-variable), and the [unidiomatic `$?` comparison](/a/27501200/874188) ruin it. – tripleee Dec 10 '18 at 15:07
  • 1
    [DON'T USE CAT this way!](http://porkmail.org/era/unix/award.html#cat) `grep -e '[|;,]' $file` will open the file. *Don't* use `cat` when you don't need it, lol... Also, I suspect the `'` should be outside the `[]`, yes? – Paul Hodges Dec 10 '18 at 15:08
  • The quoting is still wrong. You need double quotes around `"$file"` everywhere. – tripleee Dec 10 '18 at 17:42
  • Hi, I would like to check whether the delimiters (comma, pipe, and semicolon) exist in only the 2nd line of each file. If they don't, move the file to the mvfiles folder. Where in the script can I make the changes? Can you please let me know? – KCR Mar 01 '19 at 04:31