Score: 3

My use case is to copy all files whose names match the regex [0-9]{10} to a new directory, restricting the copies to the first 100 files. I went through a few sources explaining how to use regular expressions for this, but my limited understanding of bash and Unix kept me from getting it working. I tried something similar to this: How to copy multiple files from a different directory using cp?

Any help will be highly appreciated.

abhinav singh

5 Answers

Score: 4

bash: store all the files in an array, then take a slice of the first 100 elements

all_files=( [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] )  # glob pattern, not regex
cp -t /destination/dir "${all_files[@]:0:100}"
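As the comments note, `cp -t` is specific to GNU coreutils. A portable variant simply puts the destination last; here's a self-contained sketch using a throwaway directory (the paths are made up for illustration):

```shell
# cp -t is GNU-only; the portable form puts the destination last.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dest"
cd "$tmp/src"
touch 0123456789 9876543210 not-ten-digits
# glob matching exactly ten digits, sliced to at most 100 entries
all_files=( [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] )
cp "${all_files[@]:0:100}" "$tmp/dest/"
```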
glenn jackman
    This is beautiful. Though it wouldn't hurt to mention that `cp -t` only works in GNU coreutils (since this question is tagged **[tag:unix]** in addition to **[tag:linux]**). – ghoti Aug 17 '17 at 04:28
  • Nice. I tried it with a 100 from 1000000 files. It was slow but it worked. – James Brown Aug 17 '17 at 05:57
Score: 2

Something like this should work for you:

cp `ls -1 | egrep '[0-9]{10}' | head -100` <destination directory>

(Depending on your system, you might have a different grep command, or one that requires using the -e switch)

radical7
  • [Parsing `ls` is never a good idea.](http://mywiki.wooledge.org/ParsingLs) Even if it seems safe for a particular small use-case, it's a *terrible* habit to get into, and should never be offered as an option to budding shell programmers. – ghoti Aug 17 '17 at 04:18
  • @ghoti While your idol has some points, it's certainly only one view; one which I don't agree with. Do you tell budding chefs not to use a knife because they might cut themselves? Views like that discourage experimentation and learning. And let's face it: people screw up shell programming all the time, particularly when they try to employ the grandiose structures required in your solution. IMHO, it's much better to understand the limitations of a tool, than never wield it at all. – radical7 Aug 17 '17 at 16:41
  • My idol? Not sure what you mean. At any rate, no. One doesn't support "experimentation and learning" by encouraging people to engage in risky behaviour when there's an alternative that achieves the same thing with greater reliability and less risk. You want to learn? Listen to experts. The fact that people screw up shell programming all the time should be incentive for improvement, not a reason to defend bad habits. Also .. "grandiose structures"? Heh. – ghoti Aug 18 '17 at 14:17
Score: 1

While I think Glenn Jackman's answer is one of the nicest I've seen in a while, if you really do need to use a regex, then pathname expansion in an array won't work for you. Instead, you can either use find to find files (and populate your array), or you can step through a directory and use the regex matching built in to bash.


First strategy, using find (per Greg's BashFAQ/020):

unset files i
while IFS= read -r -d $'\0' file; do
  files[i++]="$file"
done < <(find -E ./ -type f -regex '\./[0-9]{10}' -print0)

Note that find's -regex has implicit ^ and $ anchors. I'm using -E to tell find that I want to use ERE instead of BRE (which works in macOS, FreeBSD, other BSDs...). In Linux, you may want to use the -regextype option ... or just express yourself in BRE.
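On Linux (GNU find) there is no `-E`; the ERE flavour is selected with `-regextype posix-extended` instead. A minimal sketch of the same population loop, run against a temporary directory with made-up filenames:

```shell
# GNU find spelling: -regextype instead of BSD's -E.
tmp=$(mktemp -d)
cd "$tmp"
touch 0123456789 9876543210 short abc1234567
unset files i
while IFS= read -r -d '' file; do
  files[i++]=$file
done < <(find . -type f -regextype posix-extended -regex '\./[0-9]{10}' -print0)
echo "${#files[@]} matching files"   # 0123456789 and 9876543210 match
```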

Then select just the first 100 array items as Glenn suggested:

cp "${files[@]:0:100}" /path/to/destination/

The second strategy, using Bash's built-in regex support, might be done with a bit of scripting:

unset n
for file in *; do
  [[ $file =~ ^[0-9]{10}$ ]] &&
  mv -v "$file" dest/ &&
  (( ++n >= 100 )) && break
done

This uses globbing to identify all files, and then for each one that matches your regex, it moves the file and increments a counter. The increment also checks to see if it's exceeded your threshold and if so, breaks out of the loop.

You could make it a one-liner if you like. I did when writing and testing it. And this could of course be written longer if you don't like your scripts terse. :)
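Since the question asks for copies rather than moves, the same idea works with `cp`; a self-contained sketch (directory names here are made up), written in the longer if-form:

```shell
tmp=$(mktemp -d)
cd "$tmp"
mkdir dest
touch 0123456789 2222222222 readme.txt
n=0
for file in *; do
  if [[ $file =~ ^[0-9]{10}$ ]]; then
    cp "$file" dest/
    n=$(( n + 1 ))
    if [ "$n" -ge 100 ]; then break; fi
  fi
done
echo "copied $n files"
```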

ghoti
  • Another way to append to an array is: `files=(); while ...; do files+=("$file"); ` -- this way you don't need the counter variable. – glenn jackman Aug 17 '17 at 11:33
  • @glennjackman, that's true, but it would be further from the example in BashFAQ/020. :-) – ghoti Aug 17 '17 at 11:37
Score: -1

With this command you can find all the files you need (note that find's -regex matches the whole path, and sed-style BRE spells intervals as \{10\}):

sudo find / -regextype sed -regex '.*/[0-9]\{10\}'

*note that you can replace / with any directory that you want to search in for your needle

And with this you can do what you want (it copies every match; it does not stop at 100):

sudo find / -regextype sed -regex '.*/[0-9]\{10\}' -exec cp {} /path/you/want \;
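For reference, here is the command exercised against a throwaway directory: find's -regex matches the whole path (hence the leading `.*/`), and sed-flavoured BRE spells the interval as `\{10\}`. Paths are hypothetical:

```shell
# sed-flavoured BRE: intervals are \{10\}, and -regex matches the
# whole path, hence the leading .*/
tmp=$(mktemp -d)
mkdir "$tmp/dest"
touch "$tmp/0123456789" "$tmp/abc"
find "$tmp" -maxdepth 1 -type f -regextype sed -regex '.*/[0-9]\{10\}' \
  -exec cp {} "$tmp/dest" \;
```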

Arash Khajelou
Score: -1

Thanks for the valuable suggestions. I came up with the following solution (the regex is quoted so the shell doesn't expand it):

ls -1 | egrep '[0-9a-f]{10}' | head -100 | xargs -I{} cp -f {} <your directory>
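If parsing ls is a concern (see the comment below), a null-delimited pipeline built on find behaves similarly; a sketch using GNU head's -z and made-up filenames. Note that find's -regex is anchored to the whole path, unlike the unanchored egrep, so adjust the pattern if partial matches are intended:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/dest"
cd "$tmp"
touch 01234abcde 0123456789 README
# NUL-delimited all the way through, so odd filenames survive intact
find . -maxdepth 1 -type f -regextype posix-extended -regex '\./[0-9a-f]{10}' -print0 |
  head -z -n 100 |
  xargs -0 -I{} cp -f {} dest/
```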
abhinav singh
  • [Here is why](http://mywiki.wooledge.org/ParsingLs) that's a "less than optimal" solution. – ghoti Aug 17 '17 at 04:17
  • Cannot disagree that it's a less than optimal solution. I guess the down votes were more for wrong solutions. For a newbie like me, this solution was a discovery :) – abhinav singh Aug 17 '17 at 18:59