11

I need to copy all *.doc files (but not folders whose names match *.doc) from a network folder \\server\source (including files in all nested folders) to a local folder C:\destination without preserving the nested folder hierarchy (i.e. all files should go directly into C:\destination and no nested folders should be created there). If several files from different subfolders of \\server\source have the same name, only the first one should be copied, and it should never be overwritten afterwards; all conflicting files found later should be skipped. There could be many such cases, and the skipped files should not be transferred over the network, otherwise it would take too much time. Here is my attempt to implement it in PowerShell:

cp \\server\source\* -Recurse -Include *.doc -Container:$false -Destination C:\destination

There are at least two problems with this command:

  • It copies folders whose names match *.doc too.
  • In case of conflicting names any file found later is transferred over the network and overwrites the previous one.

Can you suggest how to fix these problems?
Implementations using copy, xcopy, robocopy, cscript or *.bat, *.cmd are also welcome.
The local OS is Windows 8 and the file system is NTFS.

Vladimir Reshetnikov
  • What is the expected behavior if the script runs twice? Should it still copy everything once? Or should it copy nothing? – Aaron Jensen Jul 09 '13 at 08:08
  • @splatteredbits The destination directory can be assumed to be initially empty. If this precondition fails then the script behavior may be undefined. – Vladimir Reshetnikov Jul 09 '13 at 17:18

5 Answers

16

I would produce the list of files first and validate as you go through the list.

Something like this:

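$srcdir  = '\\server\source\'
$destdir = 'C:\destination\'
# List matching items recursively and drop directories whose names match *.doc
$files = Get-ChildItem $srcdir -Recurse -Filter *.doc | Where-Object { -not $_.PSIsContainer }
$files | ForEach-Object {
    $target = Join-Path $destdir $_.Name
    # Copy only if no file with this name exists yet, so later duplicates are never transferred
    if (-not [System.IO.File]::Exists($target)) {
        Copy-Item -LiteralPath $_.FullName -Destination $target
    }
}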

So, use Get-ChildItem to list files in source folder matching the filter, pipe through where-object to strip directories out.

Then go through each file in a foreach loop and check if the filename (not Fullname) exists in the destination using the Exists method of the system.io.file .NET class.

If it doesn't, copy, using only original filename (dropping original path).

Use the -whatif option on the copy when testing, so it only displays what it would do, in case result is not what you wanted :-)
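The same steps can also be wrapped in a function so that -WhatIf applies to the whole run. This is only a sketch: the function name Copy-FlatFirstWins and its parameters are made up for illustration, and it assumes the destination folder already exists. It also remembers the names it has already claimed, because under -WhatIf nothing is actually written and a destination check alone would not catch duplicates.

```powershell
# Copy-FlatFirstWins is a hypothetical name, not a built-in cmdlet.
function Copy-FlatFirstWins {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        [Parameter(Mandatory = $true)][string]$Source,
        [Parameter(Mandatory = $true)][string]$Destination,
        [string]$Filter = '*.doc'
    )

    # Names already claimed during this run; hashtable keys are case-insensitive.
    $seen = @{}

    Get-ChildItem -Path $Source -Recurse -Filter $Filter |
        Where-Object { -not $_.PSIsContainer } |
        ForEach-Object {
            $target = Join-Path $Destination $_.Name
            # Skip names seen earlier in this run or already present on disk.
            if ((-not $seen.ContainsKey($_.Name)) -and (-not (Test-Path -LiteralPath $target))) {
                $seen[$_.Name] = $true
                # Copy-Item honours the -WhatIf passed to the function.
                Copy-Item -LiteralPath $_.FullName -Destination $target
            }
        }
}
```

A dry run would then be Copy-FlatFirstWins -Source '\\server\source' -Destination 'C:\destination' -WhatIf, and the same call without -WhatIf does the actual copy.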

James Gaunt
Graham Gold
7

The previous answers seem rather overcomplicated to me, unless I'm misunderstanding something. This should work:

Get-ChildItem "\\server\source\" *.doc -Recurse | ?{-not ($_.PSIsContainer -or (Test-Path "C:\Destination\$($_.Name)"))} | Copy-Item -Destination "C:\Destination"

None of the built-in commands - copy, xcopy, or robocopy - will do what you want on their own, but there's a utility called xxcopy that will, conveniently available at http://www.xxcopy.com. It has a number of built-in options specifically for flattening directory trees into a single directory. The following will do what you described:

xxcopy "\\server\source\*.doc" "C:\Destination" /SGFO

However, copying only the first file encountered is just one of the ways xxcopy can handle duplicate filenames. Other options include adding the source directory name to the filename, or adding sequential numerical identifiers to all but the first one, or all but the newest or oldest. See this page for details: http://www.xxcopy.com/xxcopy16.htm

Adi Inbar
  • Hmmm, I really don't understand this downvote, unless someone took exception to the phrasing of the first line. It wasn't intended as an insult, I just felt that the answers already posted were going through extra steps for something that can be done more simply. Aren't simplicity and conciseness considered preferable in coding as long as you don't sacrifice clarity? Anyway, the OP specifically said he'd welcome solutions using copy commands. I provided info about a 3rd party copy utility that's tailored to do exactly what he wants to accomplish with a single switch. How is that "not useful"? – Adi Inbar Jul 11 '13 at 03:01
  • Tried your option (the first one) - it's good, but it flattens the hierarchy – Archeg Sep 17 '14 at 15:05
  • Flattening the hierarchy was the goal of the question: "without preserving the nested folders hierarchy (i.e. all files should go directly into C:\destination and no nested folders should be created in C:\destination)". – Adi Inbar Sep 17 '14 at 16:21
  • Sorry, I missed that. Then your answer is correct; it helped me after some modifications. Thanks – Archeg Sep 17 '14 at 16:24
2
# Get all *.doc files under \\server\source
Get-ChildItem -Path \\server\source *.doc -Recurse |
    # Filter out directories
    Where-Object { -not $_.PsIsContainer } |
    # Add property for destination
    Add-Member ScriptProperty -Name Destination -Value { Join-Path 'C:\destination' $this.Name } -PassThru |
    # Filter out files that exist on the destination
    Where-Object { -not (Test-Path -Path $_.Destination -PathType Leaf) } |
    # Copy; Copy-Item takes -Destination from the property added above
    Copy-Item
Aaron Jensen
1

Why use foreach when you already have a pipeline? Calculated properties for the win!

Get-ChildItem -Recurse -Path:\\Server\Path -filter:'*.doc' | 
    Where { -not $_.PSIsContainer } |
    Group Name |
    Select @{Name='Path'; Expression={$_.Group[0].FullName}},@{Name='Destination'; Expression={'C:\Destination\{0}' -f $_.Name}} |
    Copy-Item
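To see what Copy-Item receives from that pipeline, here is a small sketch that runs the same Group/Select steps on made-up objects instead of real files. The sample names and paths are purely hypothetical; only the Name and FullName properties matter.

```powershell
# Hypothetical sample data standing in for Get-ChildItem output (not real files).
$found = @(
    New-Object psobject -Property @{ Name = 'report.doc'; FullName = '\\server\source\a\report.doc' }
    New-Object psobject -Property @{ Name = 'notes.doc';  FullName = '\\server\source\a\notes.doc' }
    New-Object psobject -Property @{ Name = 'report.doc'; FullName = '\\server\source\b\report.doc' }
)

# One group per file name; Group[0] is the first file found with that name.
$plan = $found |
    Group-Object Name |
    Select-Object @{ Name = 'Path';        Expression = { $_.Group[0].FullName } },
                  @{ Name = 'Destination'; Expression = { 'C:\Destination\{0}' -f $_.Name } }

# Each object carries the Path and Destination that Copy-Item binds by property name.
$plan | Format-List
```

Since each name appears only once in the plan, later duplicates are never read from the network. Note that this approach does not check whether a file already exists in the destination.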
Eris
0
$docFiles = Get-ChildItem -Path "\\server\source" -Recurse | Where-Object {$_.Attributes.ToString() -notlike "*Directory*" -and ($_.Name -like "*.doc" -or $_.Name -like "*.doc?")} | Sort-Object -Property Name -Unique;
$docFiles | ForEach-Object { Copy-Item -Path $_.fullname -Destination "C:\destination" };

The first line reads each *.doc and *.doc? file (so it also covers the Office 2010 .docx format), excluding directories and keeping only one file per name.
The second line copies each item from the source to the destination (the folder C:\destination must already exist).
In general I suggest splitting the command over multiple lines because it makes the code easier to write (in this case, first task: get the files; second task: copy the files).

Naigel