0

i have a script that is pushing out some filesystem data to be uploaded to another system.

it would be very handy if i could tell myself what 'kind' of file each file actually is, because it will help with some querying later on down the road.

so, for example, say that my script is spitting out the following:

/home/myuser/mydata/myfile/data.log
/home/myuser/mydata/myfile/myfile.gz
/home/myuser/mydata/myfile/mod.conf
/home/myuser/mydata/myfile/security
/home/myuser/mydata/myfile/last

in the end, i'd like to see:

/home/myuser/mydata/myfile/data.log log
/home/myuser/mydata/myfile/myfile.gz gz
/home/myuser/mydata/myfile/mod.conf conf
/home/myuser/mydata/myfile/security security
/home/myuser/mydata/myfile/last last

there's gotta be a way to do this with regular expressions and sed, but i can't figure it out.

any suggestions?

EDIT:

i need to get this info via the command line. looking at the answers so far, i obviously have not made this clear. so with the example data i provided, assume that data is all being fed via greps and seds (data is already sterilized). i need to be able to pipe the example data to sed/grep/awk/whatever in order to produce the desired results.

jasonmclose
  • I can't give you the actual solution, but the method of doing this. You have to "split" at the last found "." character and the 1st element will be your extension if we index from 0. If no "." found your result should be the string after the last "/" character. Also, google for "bash get file extension". Good luck. – Dominik Antal Oct 01 '13 at 14:45
  • Rather than looking at the extension, it might be better to use `file`. See `man file` – cdarke Oct 01 '13 at 14:58
  • @fedorqui i don't feel like this is a repeat. i need to be able to get this info via a tool from the command line. maybe i didn't make that clear. the original data is being fed via grep and sed. i need to be able to pipe to something to get the results en masse. also, please ask me for clarification before you immediately downvote my question. if i find that my question is a duplicate, i'm fully capable of marking it as so. – jasonmclose Oct 01 '13 at 15:03
  • @jasonmclose it is not a repeat, but you can get the main idea from it: to get the extension and the name. The rest is a matter of looping through the output you are receiving. – fedorqui Oct 01 '13 at 15:08
  • It's not a repeat because of these examples: /mod.conf conf /security security. Notice how he sometimes want the extension and at other times he wants the file name. – historystamp Oct 21 '14 at 23:33

5 Answers

2

This should work for you:

x='/home/myuser/mydata/myfile/security'
( IFS='/.' && arr=( $x ) && echo "${arr[@]:(-1):1}" )
security

x='/home/myuser/mydata/myfile/data.log'
( IFS='/.' && arr=( $x ) && echo "${arr[@]:(-1):1}" )
log
anubhava
2

Print each line followed by its last field, where fields are separated by any non-alphabetic character.

awk -F '[^[:alpha:]]' '{ print $0,$NF }'
/home/myuser/mydata/myfile/data.log log
/home/myuser/mydata/myfile/myfile.gz gz
/home/myuser/mydata/myfile/mod.conf conf
/home/myuser/mydata/myfile/security security
/home/myuser/mydata/myfile/last last
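
Since the question asks about sed specifically, here is a minimal sketch of the same idea as a sed substitution. It captures the trailing run of characters that are neither "/" nor "." and appends it after a space, so an extension containing digits is kept whole. The function name append_type_sed is made up for illustration, and -E is assumed to be available (it is in GNU and BSD sed).

```shell
# append_type_sed is a hypothetical helper name, not part of any tool.
# Reads one path per line on stdin and appends the last "/"- or "."-delimited piece.
append_type_sed() {
  # ([^/.]+)$ matches the trailing run with no slash or dot; & is the whole match
  sed -E 's|([^/.]+)$|& \1|'
}

# Usage with two of the sample paths from the question
printf '%s\n' \
  /home/myuser/mydata/myfile/data.log \
  /home/myuser/mydata/myfile/security | append_type_sed
```

A line that ends in "/" or "." has no such trailing run, so it passes through unchanged.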
Jotne
1

To extract the last element in a filename path:

filename=${path##*/}

To extract characters after a dot in a filename:

extension=${filename##*.}

But (my comment) rather than looking at the extension, it might be better to use file. See man file.

cdarke
  • sorry. i wasn't clear with my question. i need to get this data en masse, meaning from the command line within bash, not from a bash script. so if the example data i provided is the result of grep'ing and pipe'ing and sed'ing, i need to be able to tack on one more sed/awk/whatever in order to achieve my desired results. – jasonmclose Oct 01 '13 at 15:06
  • @jasonmclose Why not pipe into a bash script, using `read`? – cdarke Oct 01 '13 at 15:14
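
The `read` suggestion can be sketched roughly as follows, combining it with the two parameter expansions from this answer. This is only an illustration: the function name annotate_paths is invented here, and it assumes one path per line on stdin.

```shell
# annotate_paths is a hypothetical helper name.
# Reads one path per line from stdin and prints "path type".
annotate_paths() {
  while IFS= read -r path; do
    filename=${path##*/}          # strip everything up to the last "/"
    # ${filename##*.} strips up to the last "."; with no dot it leaves the name as-is
    printf '%s %s\n' "$path" "${filename##*.}"
  done
}

# Usage: tack it onto the end of an existing pipeline
printf '%s\n' \
  /home/myuser/mydata/myfile/mod.conf \
  /home/myuser/mydata/myfile/last | annotate_paths
```

Because the basename is taken first, a dot in a directory name does not affect the result.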
1

As others have already answered, to parse the file names:

extension="${full_file_name##*.}"   # BASH and Kornshell/POSIX only
filename=$(basename "$full_file_name")
dirname=$(dirname "$full_file_name")

Quotes are needed if file names could have spaces, tabs, or other strange characters in them.

You can also test whether a file is a directory, a regular file, or a link with the test command (which is linked to [, so that test -f foo is the same as [ -f foo ]).

However, you said: "it would be very handy if i could tell myself what kind of file each file actually is".

In that case, you may want to investigate the file command. This command reports the file type as determined by a magic file (traditionally /etc/magic), and newer implementations also let users supply their own magic files. It works from the file's contents rather than its name: it checks the magic number in the file's header, or looks at the first few lines of the file (for example, for a first line matching the regular expression ^#! .*/bash$).
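
As a rough sketch of how file could be used on piped input: the loop below prints each path followed by whatever file reports for it. The function name annotate_with_file is invented for illustration, and the -b (brief) and --mime-type options are assumed to be supported, as they are in the file implementation shipped with most Linux and BSD systems. Unlike the string-based approaches, this needs the paths to exist on the machine where it runs.

```shell
# annotate_with_file is a hypothetical helper name.
# Reads one path per line on stdin; each path must be readable locally.
annotate_with_file() {
  while IFS= read -r path; do
    # -b omits the "path:" prefix; --mime-type asks for a MIME type string
    printf '%s %s\n' "$path" "$(file -b --mime-type "$path")"
  done
}

# Usage: find . -type f | annotate_with_file
```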

David W.
1

This extracts the last component after a slash or a dot.

awk -F '[/.]' '{ print $NF }'
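
To get the exact output asked for in the question (the path followed by its type), the whole line can be printed before the last field. A small sketch, with append_type as a made-up wrapper name:

```shell
# append_type is a hypothetical helper name.
# Splits each input line on "/" or "." and prints the line plus its last field.
append_type() {
  awk -F '[/.]' '{ print $0, $NF }'
}

# Usage with two of the sample paths from the question
printf '%s\n' \
  /home/myuser/mydata/myfile/myfile.gz \
  /home/myuser/mydata/myfile/security | append_type
```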
tripleee