3

I am trying to list all the files we received in one month.

The filename pattern is:

20110101000000.txt

YYYYMMDDHHIISS.txt

The directory contains millions of files, and a single month can have at least 50,000 of them. Splitting them into subdirectories is an idea we have not implemented yet. Is there any way to list a huge number of files whose names are almost identical?

grep -l 20110101*    

I am trying this and it returns an error. I tried PHP, but it took a very long time, which is why I switched to a shell script. I don't understand why the shell is not giving a result either. Any suggestions, please?

codeforester
zod
  • This question, asked in 2011, is marked as a duplicate of a question asked in 2012! – zod Oct 06 '15 at 15:29
  • Related: [Does "argument list too long" apply to shell builtins?](https://stackoverflow.com/questions/47443380/does-argument-list-too-long-restriction-apply-to-shell-builtins) – codeforester Nov 23 '17 at 00:45

6 Answers

5
$ find ./ -name '20110101*' -type f -print0 | xargs -0 grep -l "search_pattern"

You can use find and xargs. xargs collects the file names found by find and runs grep on them in batches, so no single command line gets too long. You can use -P to run several greps in parallel and -n to set the maximum number of files per grep invocation. The -print0 argument to find terminates each file name with a null character, and -0 tells xargs to expect that, which avoids confusion caused by spaces or other unusual characters in file names. If you are sure the names contain no whitespace or quotes, you can drop the -print0 and -0 arguments.
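
A minimal sketch of the -n and -P options mentioned above, run against a throwaway directory. The directory, the sample files, and the search string `needle` are made up for illustration. -P is an extension found in GNU and BSD xargs rather than part of POSIX, so check the man page on your platform.

```shell
#!/bin/sh
# Build a scratch directory with a few sample files (hypothetical data).
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/20110101000000.txt"
printf 'hay\n'    > "$demo_dir/20110101000001.txt"
printf 'needle\n' > "$demo_dir/20110102000000.txt"

# -n 100: pass at most 100 file names to each grep invocation.
# -P 4:   run up to 4 grep processes at once (GNU/BSD extension).
matches=$(find "$demo_dir" -type f -name '20110101*' -print0 |
    xargs -0 -n 100 -P 4 grep -l 'needle')

# Show the matching file(s), then clean up.
printf '%s\n' "$matches"
rm -rf "$demo_dir"
```

With parallel runs the order of the output lines is not guaranteed, so sort the result if order matters.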

Damodharan R
  • Thanks. I want find to return only the file name; right now it returns ./filename. I tried using just . but that did not help either. – zod May 09 '11 at 15:59
  • Avoid the useless xargs. A find only solution would be simpler and slightly faster. – jlliagre May 09 '11 at 20:22
2

This should be faster:

find . -name "20110101*" -exec grep -l "search_pattern" {} +

Should you want to avoid the leading dot:

find . -name "20110101*" -exec grep -l "search_pattern" {} + | sed 's/^\.\///'

or, better, thanks to adl:

find . -name "20110101*" -exec grep -l "search_pattern" {} + | cut -c3-

jlliagre
  • `| cut -c3-` is also a simple way to remove the first two characters of each line. – adl May 26 '11 at 20:17
1

The shell expands 20110101* before the command runs, so the command receives one argument for every file in the directory whose name starts with 20110101. Because grep -l was given no explicit pattern, it also treats the first of those file names as the search pattern.
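
To get a rough feel for when the expanded list becomes too large for one command, you can compare the system's argument limit with the space the names would need. The sketch below assumes 18-character names as in the question, plus one terminating byte each, and ignores the environment and per-argument overhead, so treat the result as an estimate only. The figure of 50000 is the monthly minimum given in the question.

```shell
#!/bin/sh
# Upper bound on the combined size of arguments and environment, in bytes.
arg_max=$(getconf ARG_MAX)

# 50000 names of 18 characters, each followed by one terminating byte.
file_count=50000
name_bytes=$((18 + 1))
needed=$((file_count * name_bytes))

echo "ARG_MAX: $arg_max bytes"
echo "Approximate space for $file_count names: $needed bytes"
```

If the second number approaches or exceeds the first, passing the expanded glob to an external command is likely to fail.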

If you just want a list of matching files you can use find:

find . -name "20110101*"

(Note that this also searches every subdirectory.)
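
Since the question asks for a whole month, here is a sketch that matches January 2011 by its `201101` prefix and stays in the top-level directory. -maxdepth is supported by GNU and BSD find but may be missing from older implementations. The scratch directory and file names are illustrative only.

```shell
#!/bin/sh
# Scratch directory with sample files, one of them in a subdirectory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
: > "$demo_dir/20110101000000.txt"
: > "$demo_dir/20110131235959.txt"
: > "$demo_dir/20110201000000.txt"
: > "$demo_dir/sub/20110115120000.txt"

# -maxdepth 1 keeps find out of subdirectories; cut strips the leading "./".
month_files=$(cd "$demo_dir" &&
    find . -maxdepth 1 -type f -name '201101*.txt' | cut -c3- | sort)

printf '%s\n' "$month_files"
rm -rf "$demo_dir"
```

The pattern is quoted so that find, not the shell, does the matching; that way the list of names never has to fit on a command line.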

jjwchoy
1

Some in-depth information is available here, along with another workaround: for FILE in 20110101*; do grep foo "$FILE"; done. Most people will go with xargs, and more seasoned admins with -exec {} +, which accomplishes exactly the same thing but is shorter to type. One would use the inline shell for construct when running more processes matters less than seeing the results promptly. With the for construct you may end up running grep thousands of times, but you see each match in real time; with find and/or xargs you see batched results, but grep is run far fewer times.
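
The loop form can be sketched as below. The glob is expanded inside the shell itself and no external command ever receives the whole list, so the argument limit does not come into play; the cost is one grep process per file. The scratch directory, the sample contents, and the `match_count` variable are placeholders for illustration.

```shell
#!/bin/sh
# Scratch directory with two sample files (hypothetical data).
demo_dir=$(mktemp -d)
printf 'foo\n' > "$demo_dir/20110101000000.txt"
printf 'bar\n' > "$demo_dir/20110101000001.txt"

match_count=0
for file in "$demo_dir"/20110101*; do
    # If nothing matches, the pattern stays literal; skip that case.
    [ -e "$file" ] || continue
    # -q reports only through the exit status; the name is printed
    # as soon as a file matches.
    if grep -q 'foo' "$file"; then
        echo "$file"
        match_count=$((match_count + 1))
    fi
done

echo "files matching: $match_count"
rm -rf "$demo_dir"
```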

Mel
0

You need to put in a search term, so:

grep -l "search term" 20110101*

If you just want to find the files, use ls 20110101*

ColWhi
0

Just pipe the output of ls to grep:

ls | grep '^20110101'
jwodder
  • Thanks. How can I add a search term to this? – zod May 09 '11 at 15:25
  • @zod: Your question seems to state that you just want a list of all files in the directory whose names start with "20110101", which is what this code does; it is using "`^20110101`" as the pattern to search for in the output from `ls`. – jwodder May 09 '11 at 16:10
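
One possible way to add a content search to the ls approach, as asked in the comment above, is to feed the filtered names to grep through xargs. This is a sketch under assumptions: the names contain no whitespace or quotes (true for the pattern in the question), the search string `needle` and the sample files are invented, and `ls -f` is used so that a very large directory is not sorted first.

```shell
#!/bin/sh
# Scratch directory with sample files (hypothetical data).
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/20110105000000.txt"
printf 'hay\n'    > "$demo_dir/20110106000000.txt"
printf 'needle\n' > "$demo_dir/20110205000000.txt"

cd "$demo_dir" || exit 1
# ls -f lists entries unsorted (and includes dotfiles); grep keeps only
# the January 2011 names; xargs hands them to grep -l in batches.
hits=$(ls -f | grep '^201101' | xargs grep -l 'needle')

echo "$hits"
cd / && rm -rf "$demo_dir"
```

If no name matches the prefix, some xargs implementations still run grep once with no file arguments, so guard against an empty list in real use.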