
I have the example file below (files.md5):

d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/apersand $ file
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/file[with square brackets]
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/~$tempfile
017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/single quote's

I want to grep on the last part of each line (the file name), and I need an exact match for that file name.

grep FileThree$ files.md5
017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree

This gives back an exact match and doesn't find "FileThreeDays", which is what I'm after. But because some of the file names contain square brackets, I'm having to use grep -F or fgrep. However, using fgrep like the above doesn't work; it returns nothing.

How can I exactly match the last part of the line using fgrep while still treating the special characters above (~, $, ', [ ], etc.) literally? Or is there another method, maybe using awk?

Further....

Using fgrep without the $ returns both of these files. I only want an exact match (as with the $ used with grep above), but $ with fgrep doesn't return anything.

grep -F FileThree files.md5
017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays
AShah
  • You're probably best off using regular grep and escaping the characters you need to escape using \. I posted an alternative answer below. – fzzfzzfzz May 31 '15 at 14:31
  • This would work for your sample file above: `fgrep -w FileThree files.md5`. It would break if you'd also have a directory named `FileThree`. Still, `-w` can make your life simpler in a lot of cases. – lcd047 May 31 '15 at 16:54

2 Answers

I can't tell all the details from your question, but it sounds like you can use grep and just escape the special characters: grep 'File\[Three\]Days$'
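Since the escaping would have to be scripted for many files, here is a minimal sketch of how it could be automated. The helper names `escape_bre` and `match_name` are made up for illustration, and the sample data is copied from the question. It assumes plain grep (basic regular expressions) and escapes only the characters that are special there: [ \ . * ^ $

```shell
#!/bin/sh
# Sample data copied from the question (a temporary stand-in for files.md5).
md5file=$(mktemp)
trap 'rm -f "$md5file"' EXIT
cat > "$md5file" <<'EOF'
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/file[with square brackets]
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/~$tempfile
017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays
EOF

# escape_bre NAME (hypothetical helper): put a backslash in front of each
# character that is special in a basic regular expression: [ \ . * ^ $
escape_bre() {
    printf '%s\n' "$1" | sed 's/[[\.*^$]/\\&/g'
}

# match_name NAME FILE (hypothetical helper): print the lines of FILE
# whose path ends in /NAME. The trailing $ anchors the match to end of line.
match_name() {
    grep -- "/$(escape_bre "$1")\$" "$2"
}

match_name 'FileThree' "$md5file"
match_name 'file[with square brackets]' "$md5file"
match_name '~$tempfile' "$md5file"
```

The leading slash in the pattern keeps a name such as `Three` from matching `FileThree`. File names containing newlines are not handled by this sketch.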

If you want to use fgrep, though, you can use some tr tricks to help you. If all you want is the filename (without the directory name), you can do something like

cat files.md5 | tr '/' '\n' | fgrep FileThreeDays

That tr command replaces slashes with newlines, so it will put each filename on its own line. That means that fgrep will only find the filename when it searches for FileThreeDays.
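As a small sketch of the same tr trick, adding `-x` to `grep -F` requires the whole line to equal the search string, so `FileThree` no longer matches `FileThreeDays`. The function name `exact_component` is hypothetical and the sample data is taken from the question. Note that this prints only the matching path component, not the MD5, and it would also match a directory with the same name.

```shell
#!/bin/sh
# Sample data copied from the question (a temporary stand-in for files.md5).
md5file=$(mktemp)
trap 'rm -f "$md5file"' EXIT
cat > "$md5file" <<'EOF'
017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays
EOF

# exact_component NAME FILE (hypothetical helper): split every line of
# FILE on '/', then keep only the pieces that are exactly NAME.
#   -F  treat NAME as a fixed string, not a regular expression
#   -x  the whole line must match
exact_component() {
    tr '/' '\n' < "$2" | grep -Fx -- "$1"
}

exact_component 'FileThree' "$md5file"
```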

If you want the full filename with directory, it's a little trickier, but a similar approach will work. Assuming that there's always a double space between the MD5 hash and the filename, and that there aren't any filenames with double spaces or tab characters in them, you can try something like this:

sed 's/  /\t/' files.md5 | tr '\t' '\n' | fgrep FileThreeDays

That sed command converts the double spaces to tabs. The tr command turns those tabs into newlines (the same trick as above).

fzzfzzfzz
  • Unfortunately I'm going to be doing this for millions of files, and I have noticed files with double spaces. I will need the md5 as well, so I don't think the first approach would work. Also not sure if escaping is an option, as I will have to script this somehow due to the number of files involved – AShah May 31 '15 at 15:35

I would use awk:

awk '{$1="";print}' file

$1="" cuts the first column to an empty string, and print prints the modified line - which only contains the filename now.

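However, this leaves a blank space at the start of each line. If you care about it and want to remove it, you can set the output field separator to an empty string. Note that this also removes the spaces inside filenames that contain them, because awk rebuilds the line with OFS when a field is assigned:

awk '{$1="";print}' OFS="" file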
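If the whole line (MD5 plus path) is needed, one possible awk approach is to compare the end of each line against the name as a plain string, so no regular expression and no escaping are involved. This is only a sketch: `match_suffix` is a hypothetical helper, the sample data is copied from the question, and the name is passed through the environment on the assumption that backslashes in it should not be interpreted by awk.

```shell
#!/bin/sh
# Sample data copied from the question (a temporary stand-in for files.md5).
md5file=$(mktemp)
trap 'rm -f "$md5file"' EXIT
cat > "$md5file" <<'EOF'
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/file[with square brackets]
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/~$tempfile
017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays
d41d8cd98f00b204e9800998ecf8427e  /home/abid/Testing/FileNamesTest/single quote's
EOF

# match_suffix NAME FILE (hypothetical helper): print every line of FILE
# that ends in the literal string /NAME. The comparison is a plain string
# comparison on the last characters of the line.
match_suffix() {
    name="/$1" awk '
        BEGIN { want = ENVIRON["name"]; n = length(want) }
        length($0) >= n && substr($0, length($0) - n + 1) == want
    ' "$2"
}

match_suffix 'FileThree' "$md5file"
match_suffix 'file[with square brackets]' "$md5file"
match_suffix "single quote's" "$md5file"
```

Because the line is never split into fields, double spaces inside file names are left untouched.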
hek2mgl
  • I probably should have mentioned that I need the whole line as I need the md5 from the filename – AShah May 31 '15 at 15:18
  • Then you want `grep -F` (fixed-strings). You don't need to quote special chars when using `-F` – hek2mgl May 31 '15 at 15:24
  • see above for the issue with grep -F – AShah May 31 '15 at 15:38
  • Check this if you want a script to escape regex metacharacters reliably: http://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed You can use that also for grep. Is this what you are looking for? – hek2mgl May 31 '15 at 15:48