
Here is my grep command:

grep "%SWFPATH%/plugins/" filename 

And its output:

set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
set(hotspot[hs_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/textfield.swf"
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"
url="%SWFPATH%/plugins/textfield.swf"

I'd like to generate a file containing the names of all the files in the 'plugins/' directory that are mentioned in a certain file.

Basically I need to extract the file name and the extension from every line. I can manage to delete any duplicates but I can't figure out how to extract the information that I need.

This would be the content of the file that I would like to get:

textfield.swf
scrollarea.swf
scrollarea.js

Thanks!!!

PS: The thread "Extract filename and extension in bash (14 answers)" explains how to get the file name and extension from a variable. What I'm trying to achieve is extracting these from the contents of a file, which is a different problem.

RafaelGP
  • possible duplicate of [Extract filename and extension in bash](http://stackoverflow.com/questions/965053/extract-filename-and-extension-in-bash) – Marc B Jun 06 '13 at 16:57
  • duplicate http://stackoverflow.com/questions/965053/extract-filename-and-extension-in-bash?rq=1 – blue Jun 06 '13 at 16:59

3 Answers


Using awk (this needs GNU awk, because the three-argument form of match is a gawk extension):

grep "%SWFPATH%/plugins/" filename | \
awk '{ match($0, /plugins\/([^\/[:space:]]+)\.([[:alnum:]]+)/,submatch);
     print "filename:"submatch[1];
     print "extension:"submatch[2];
    }'

Some explanation:

The match function searches each line processed by awk (held in $0) for the regex. Submatches (the parts of the line matched by the parenthesized groups of the regex) are stored in the array submatch, so submatch[1] holds the file name and submatch[2] the extension. Each print statement then writes one of them on its own line.
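Where gawk is not available, the same split can be sketched with grep, sed and POSIX parameter expansion. This is only an illustration: it assumes a grep that supports -o (GNU, BSD and BusyBox grep do), and the variable names line, file, name and ext are made up for the example.

```shell
# One sample line from the question; `line` is an illustrative variable name.
line='set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);'

# grep -o prints only the matched text; sed then drops everything up to the last slash.
file=$(printf '%s\n' "$line" |
  grep -o 'plugins/[^/[:space:]]*\.[[:alnum:]]*' |
  sed 's|.*/||')

# Split at the last dot with parameter expansion.
name=${file%.*}
ext=${file##*.}

printf 'filename:%s\nextension:%s\n' "$name" "$ext"
# prints:
# filename:textfield
# extension:swf
```

The parameter expansions are the same technique described in the linked "Extract filename and extension in bash" thread, applied here after the name has been pulled out of the line.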

blue

For this specific problem:

awk '/\/plugins\// {sub(/.*\//, ""); sub(/(\);|")?$/, "");
   arr[$0] = $0} END {for (i in arr) print arr[i]}' filename
iruvar
  • the second `sub` does not work well with the first two strings – blue Jun 06 '13 at 17:11
  • @blue, added in a different solution – iruvar Jun 06 '13 at 20:50
  • Thanks! Worked like a charm. I didn't have any problems with the second `sub` – RafaelGP Jun 07 '13 at 14:47
  • @RafaelGP, good to know! Blue pointed out an issue with the second `sub` in my original answer that I was able to subsequently fix, so you no longer see the issue in the latest answer. Cheers. – iruvar Jun 07 '13 at 14:48

Use awk to extract the file name (the last /-separated field of each line), then sed to strip the trailing ); or " characters. Here the input file is named a.

 awk -F/ '{print $NF}' a | sed -e 's/);$//' -e 's/"$//'
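To get the deduplicated file the question asks for, the pipeline above can be restricted to the plugin lines and finished with sort -u. The following is a sketch under assumptions: the temporary files input and output are hypothetical stand-ins for the real file names, and the sample data is copied from the question.

```shell
# Hypothetical working files; the sample input is copied from the question.
input=$(mktemp)
output=$(mktemp)
cat > "$input" <<'EOF'
set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
set(hotspot[hs_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/textfield.swf"
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"
url="%SWFPATH%/plugins/textfield.swf"
EOF

# Keep only plugin lines, take the last /-separated field,
# strip a trailing ); or ", then sort and drop duplicates.
awk -F/ '/%SWFPATH%\/plugins\// {print $NF}' "$input" |
  sed -e 's/);$//' -e 's/"$//' |
  sort -u > "$output"

cat "$output"
# prints:
# scrollarea.js
# scrollarea.swf
# textfield.swf
```

Note that sort -u also reorders the names alphabetically. If the order of first appearance matters, collecting the names in an awk array, as in the answer above, is the alternative.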
suspectus