I've got a 6G binary file which needs to be split into 100M chunks. Am I missing the analog for unix's "head" somewhere?
I've seen the answer elsewhere for text files, but I need to do this for a compressed file.
Never mind. Here you go:
function split($inFile, $outPrefix, [Int32] $bufSize) {
    $stream = [System.IO.File]::OpenRead($inFile)
    $chunkNum = 1
    $barr = New-Object byte[] $bufSize
    # Read returns the number of bytes actually read; 0 signals end of file.
    while ($bytesRead = $stream.Read($barr, 0, $bufSize)) {
        $outFile = "$outPrefix$chunkNum"
        $ostream = [System.IO.File]::OpenWrite($outFile)
        $ostream.Write($barr, 0, $bytesRead)
        $ostream.Close()
        Write-Host "wrote $outFile"
        $chunkNum += 1
    }
    # Close the input stream as well so the file handle isn't leaked.
    $stream.Close()
}
Assumption: a buffer of $bufSize bytes fits in memory.
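For example, with hypothetical paths (100MB is PowerShell's numeric-literal suffix, and it fits in an Int32):

split "C:\data\bigfile.bin" "C:\data\bigfile.bin.chunk" 100MB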
The answer to the corollary question: How do you put them back together?
function stitch($inFilePrefix, $outFile) {
    $ostream = [System.IO.File]::OpenWrite($outFile)
    $chunkNum = 1
    $inFileName = "$inFilePrefix$chunkNum"
    # Append each numbered chunk in order until the next one doesn't exist.
    while (Test-Path $inFileName) {
        $bytes = [System.IO.File]::ReadAllBytes($inFileName)
        $ostream.Write($bytes, 0, $bytes.Length)
        Write-Host "read $inFileName"
        $chunkNum += 1
        $inFileName = "$inFilePrefix$chunkNum"
    }
    $ostream.Close()
}
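To reassemble, using the same hypothetical names as the split example above:

stitch "C:\data\bigfile.bin.chunk" "C:\data\bigfile-restored.bin"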
I answered the question alluded to by bernd_k in this question's comments, but in this case I would use -ReadCount instead of -TotalCount, e.g.
Get-Content bigfile.bin -ReadCount 100MB -Encoding byte
This causes Get-Content to read the file a chunk at a time, where the chunk size is a number of lines for text encodings or a number of bytes for byte encoding. Keep in mind that when it does this, you get an array passed down the pipeline, not individual bytes or lines of text.
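So a pipeline-based split could look roughly like this sketch for Windows PowerShell (-Encoding Byte was removed in PowerShell 6+, which uses -AsByteStream instead; the chunk naming here is my own):

$i = 1
Get-Content bigfile.bin -ReadCount 100MB -Encoding Byte | ForEach-Object {
    # $_ is a byte[] of up to 100MB here, not a single byte.
    Set-Content "bigfile.bin.chunk$i" -Value $_ -Encoding Byte
    $i++
}

Expect this to be noticeably slower than the stream-based split above, since Get-Content materializes byte objects rather than filling a raw buffer.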