5

I am trying to report on the number of files created on each date. I can do that with this little one-liner:

ls -la foo*.bar|awk '{print $7, $6}'|sort|uniq -c

and I get a list of how many fooxxx.bar files were created on each date, but the month is in the form Aaa (e.g. Apr) and I want xx (e.g. 04).

I have a feeling the answer is in here:

awk '
BEGIN{
   m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
   for(o=1;o<=m;o++){
      months[d[o]]=sprintf("%02d",o)
    }
format = "%m/%d/%Y %H:%M"
}
{
split($4,time,":")
date = (strftime("%Y") " " months[$2] " " $3 " " time[1] " " time[2] " 0")
print strftime(format, mktime(date))
}'

But I have little to no idea what I need to strip out, and no idea how to pass $7 to whatever I carve out of this to convert Apr to 04.

Thanks!

SherpaPsy
  • [Don't parse `ls`](http://mywiki.wooledge.org/ParsingLs). – Dennis Williamson Apr 11 '12 at 17:19
  • You want to get the file time with [`stat`](http://man.cx/stat), and the beauty of that is you can format the date to your liking directly. – glenn jackman Apr 11 '12 at 19:11
  • To elaborate on @DennisWilliamson's great counsel: ls -l displays things differently depending on the age of the file. Fields represent different things if the file/dir is more than a year old, more than 6 months old, etc., so sometimes $7 will not contain the Mmm info. Using stat is best, but on very old systems that do not have it, a very dirty (and SLOW) trick is to parse `tar cf - file | tar tvf - | head -1` (tar fields are more consistent and do not vary with the age of the file). Using stat (or some perl foo) on the file is much better, faster, and more appropriate. – Olivier Dulac Mar 04 '21 at 11:16

5 Answers

22

Here's the idiomatic way to convert an abbreviated month name to a number in awk:

$ echo "Feb" | awk '{printf "%02d\n",(index("JanFebMarAprMayJunJulAugSepOctNovDec",$0)+2)/3}'
02

$ echo "May" | awk '{printf "%02d\n",(index("JanFebMarAprMayJunJulAugSepOctNovDec",$0)+2)/3}'
05

Let us know if you need more info to solve your problem.
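As a rough sketch of how the index() trick might slot into the original pipeline: the lines below are made-up sample `ls -la` output, and the month is assumed to be field 6 with the day in field 7 (the layout varies by system and file age). The function names `sample_listing` and `count_by_date` are hypothetical.

```shell
# Hypothetical sample of `ls -la` output; real listings may lay out fields differently.
sample_listing() {
    printf '%s\n' \
        '-rw-r--r-- 1 web staff 1024 Apr 11 17:19 foo001.bar' \
        '-rw-r--r-- 1 web staff 2048 Apr 11 18:02 foo002.bar' \
        '-rw-r--r-- 1 web staff  512 Mar  4 09:30 foo003.bar'
}

# Assumes field 6 holds an exact three-letter month abbreviation and field 7 the day.
count_by_date() {
    awk '{
        # Position of the month inside the 36-character string, mapped to 1..12.
        mm = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $6) + 2) / 3
        printf "%02d %02d\n", mm, $7
    }' | sort | uniq -c
}

sample_listing | count_by_date
```

With the sample lines this should report one file for `03 04` and two for `04 11`. The mapping is only meaningful when the field is an exact month abbreviation; a fragment such as `anF` would also be found by index().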

Ed Morton
1

Assuming the month names appear only in the month column, you could do this (only Jan and Feb are shown; add one sub() call for each remaining month):

ls -la foo*.bar|awk '{sub(/Jan/,"01");sub(/Feb/,"02");print $7, $6}'|sort|uniq -c
tommy.carstensen
0

Just use the month field as an index into the months array.

print months[$6]

Since ls output differs from system to system, and sometimes on the same system depending on file age, and you didn't give any examples, I have no way of knowing how to guide you further.

Oh, and don't parse ls.
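For illustration only, here is one way the script from the question could be carved down to just the lookup table plus that print. It assumes the month is field 6 and the day field 7, which may not hold on your system, and it is fed a hand-written sample line rather than real `ls` output. The name `day_month_counts` is hypothetical.

```shell
# Hypothetical helper: reads ls-style lines on stdin and counts them per "day month".
day_month_counts() {
    awk '
    BEGIN {
        # Build months["Jan"]="01" ... months["Dec"]="12", as in the question.
        n = split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec", d, "|")
        for (i = 1; i <= n; i++) months[d[i]] = sprintf("%02d", i)
    }
    {
        # Assumed layout: $6 is the month name, $7 the day of the month.
        m = ($6 in months) ? months[$6] : "??"
        print $7, m
    }' | sort | uniq -c
}

# Made-up input lines for demonstration.
printf '%s\n' \
    '-rw-r--r-- 1 web staff 1024 Apr 11 17:19 foo001.bar' \
    '-rw-r--r-- 1 web staff 2048 Apr 11 18:02 foo002.bar' \
    '-rw-r--r-- 1 web staff  512 Mar  4 09:30 foo003.bar' | day_month_counts
```

The BEGIN block is the only part of the original script that is needed here; the strftime/mktime lines can be dropped because no time arithmetic is involved.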

Dennis Williamson
  • We are monitoring the change in files of web content stored in rotating tar files. Files will never be older than 3 or 4 months. As the older tar files become sparse they are "optimized" into newer tar files and removed. I'm sure there are other ways to do this, but I thought using awk was elegant (and far better than cut). I could script the Aaa to xx month conversion using case, but I'd rather do it on the fly as the lines are parsed with awk, if possible. I am just not sure where to put the field number into the index of the month array. – SherpaPsy Apr 12 '12 at 09:13
  • @SherpaPsy: Does your system have `stat`? What OS is it (and version/distribution)? – Dennis Williamson Apr 12 '12 at 11:04
  • AIX 6.1 - I can use istat, but I think that output would be even more difficult to parse! – SherpaPsy Apr 12 '12 at 12:36
  • @SherpaPsy: indeed, on AIX: `TZ=UTC istat /somefile | grep modified | awk 'BEGIN {Mmms="Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"; n=split(Mmms,Mmm," ") ; for(i=1;i<=n;i++){ mm[Mmm[i]]=sprintf("%02d",i) } } ; { printf("%s-%s-%sT%s %s",$NF, mm[$4], $5, $6, $(NF-1) ) }'` ## will output an ISO 8601 date for the modification date of that file, for example: `2019-04-18T14:16:05 UTC` # you can set TZ to anything, for example TZ=UTC-2 to see that date two hours ahead of UTC (POSIX-style TZ offsets are sign-inverted), or TZ=EST, etc. – Olivier Dulac Mar 04 '21 at 11:34
0

To parse AIX istat, I use:

istat .profile | grep "^Last modified" | read dummy dummy dummy  mon day time dummy yr dummy
echo "Month: $mon Day: $day Time: $time Year: $yr"
-> Month: Mar Day: 12 Time: 12:05:36 Year: 2012

To parse the AIX istat month, I use this two-liner (AIX 6.1, ksh88):

monstr="???JanFebMarAprMayJunJulAugSepOctNovDec???"
mon="Oct" ; hugo=${monstr%${mon}*} ; hugolen=${#hugo} ; let hugol=hugolen/3 ; echo "Month: $hugol"
-> Month: 10

A result from 1 to 12 means the month name is valid.

A result less than 1 or greater than 12 means the month name is not valid.

Instead of "hugo", use descriptive variable names ;-))
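The same idea can be wrapped in a small function with the range check spelled out. This is a minimal sketch in plain POSIX sh rather than the exact ksh88 two-liner; the function name `month_number` and the extra length and alignment checks are additions for illustration.

```shell
# Hypothetical helper: prints the two-digit month number for a three-letter
# English month abbreviation, or returns 1 if the name is not recognised.
month_number() {
    name=$1
    all="???JanFebMarAprMayJunJulAugSepOctNovDec???"
    # Strip the month name and everything after it; what remains is the padding
    # plus all earlier months (3 characters each).
    prefix=${all%"$name"*}
    num=$(( ${#prefix} / 3 ))
    # Accept only 3-letter names found on a 3-character boundary, in range 1..12.
    if [ "${#name}" -eq 3 ] && [ $(( ${#prefix} % 3 )) -eq 0 ] \
        && [ "$num" -ge 1 ] && [ "$num" -le 12 ]; then
        printf '%02d\n' "$num"
    else
        return 1
    fi
}

month_number Oct    # prints 10
month_number Xyz || echo "month name not ok"
```

Quoting `"$name"` inside the expansion keeps characters such as `?` from being treated as wildcards, and the alignment check rejects fragments like `anF` that straddle two month names.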

Community
0

Adding a version for AIX that shows how to retrieve all the date elements (in whatever timezone you need them in) and display ISO 8601 output:

tempTZ="UTC" ; TZ="$tempTZ" istat /path/to/somefile \
| grep modified \
| awk -v tmpTZ="$tempTZ" '
   BEGIN {Mmms="Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"; 
          n=split(Mmms,Mmm," ") ; 
          for(i=1;i<=n;i++){ mm[Mmm[i]]=sprintf("%02d",i) } 
   }
         { printf("%s-%s-%sT%s %s",$NF, mm[$4], $5, $6, tmpTZ )  }
  ' ## this will output an iso8601 date of the modification date of that file, 
    ## for ex: 2019-04-18T14:16:05 UTC 
 ## you can set tempTZ to anything your system accepts, for example tempTZ="EST"; note that POSIX-style offsets are sign-inverted, so tempTZ="UTC-2" shows the date two hours ahead of UTC

I show the ISO 8601 version to make it better known and more widely used, but of course you may only need the "mm" portion, which is easily done: mm[$4]
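If only a per-date key is wanted (for counting files by date, as in the question), the same lookup can be trimmed down. The sketch below assumes the "Last modified" line has the field layout implied by the awk above (month in field 4, day in field 5, year last); the sample line is typed by hand, not captured from an AIX system, and the name `istat_date_key` is hypothetical.

```shell
# Hypothetical helper: reads istat-style "Last modified" lines on stdin and
# prints one YYYY-MM-DD key per line.
istat_date_key() {
    awk '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
        for (i = 1; i <= n; i++) mm[name[i]] = sprintf("%02d", i)
    }
    # Assumed layout: $4 = month name, $5 = day, $NF = year.
    /modified/ { printf "%s-%s-%02d\n", $NF, mm[$4], $5 }
    '
}

# Hand-written sample line for illustration.
printf '%s\n' 'Last modified:  Thu Apr  4 14:16:05 UTC 2019' | istat_date_key
```

For the sample line this prints `2019-04-04`; the `%02d` pads single-digit days so the keys sort correctly when piped through `sort | uniq -c`.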

Olivier Dulac