I've got a huge XML file (0.5 GB), with no line breaks. I want to be able to look at, say, the first 200 characters without opening the whole file. Is there a way to do this with PowerShell?
-
It looks to me that Get-Content is going to effectively load the whole file, so that's not what I'm looking for, unless there's some lazy-evaluation magic in gc that I can't find any documentation for. – Jonny Cundall Sep 21 '13 at 17:38
-
[This answer](http://stackoverflow.com/a/11010158/2707864) to http://stackoverflow.com/questions/1001776/how-can-i-split-a-text-file-using-powershell can be used as a basis. It might work faster than [this answer](http://stackoverflow.com/a/18936628/2707864) below if the fragment to extract is large. This is a conclusion that I obtained from non-systematic tests. Try it as you see fit. – sancho.s ReinstateMonicaCellio Apr 14 '16 at 21:47
5 Answers
PowerShell Desktop (up to 5.1)
You can read at the byte level with Get-Content like so:
$bytes = Get-Content .\files.txt -Encoding byte -TotalCount 200
[System.Text.Encoding]::Unicode.GetString($bytes)
If the file is ASCII you can simplify this to:
[char[]](Get-Content .\files.txt -Encoding byte -TotalCount 200)
PowerShell Core 6.0 and newer
PowerShell Core doesn't support byte encoding. It's been replaced by the -AsByteStream parameter.
$bytes = Get-Content .\file.txt -AsByteStream -TotalCount 200
[System.Text.Encoding]::Unicode.GetString($bytes)
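The two variants above can be folded into one helper that picks the right parameter for the running PowerShell version. This is only a sketch: the function name Get-FileHead and its parameters are made up for illustration, and it assumes the file is ASCII (swap in another [System.Text.Encoding] if it is not).

```powershell
# Hypothetical helper: returns the first $Count bytes of a file as one string.
function Get-FileHead {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $Count = 200
    )
    if ($PSVersionTable.PSVersion.Major -ge 6) {
        # PowerShell Core 6.0 and newer
        $bytes = Get-Content -LiteralPath $Path -AsByteStream -TotalCount $Count
    } else {
        # PowerShell Desktop (up to 5.1)
        $bytes = Get-Content -LiteralPath $Path -Encoding Byte -TotalCount $Count
    }
    # @() keeps an empty file from producing $null; ASCII is an assumption about the file
    [System.Text.Encoding]::ASCII.GetString([byte[]]@($bytes))
}
```

Called as Get-FileHead .\file.txt -Count 200, it should return a single string rather than one character per line.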

-
The file is ASCII, and what worked best was the ASCII version of your first answer. The second answer actually displayed as one line per character - a bit hard to read! – Jonny Cundall Sep 21 '13 at 19:28
-
If you put a () around the whole thing and `-join ''` it will become one string again. – Eris Sep 21 '13 at 19:47
-
@Eris yes that will get it back to string form but the "() around the whole thing" bit is not necessary. – Keith Hill Sep 21 '13 at 20:02
-
This worked great for me. And it does not traverse the whole file as `get-content`, so it is most convenient for large files. – sancho.s ReinstateMonicaCellio Apr 08 '16 at 18:31
Copying binary files via PowerShell cmdlets tends to be a bit slow. You may, however, run the following commands from PowerShell to get decent performance ($nbytes is the number of bytes to keep):
cmd /c copy /b "large file.ext" "first n.ext"
FSUTIL file seteof "first n.ext" $nbytes
Tested in Win 10 PS 5.1
Result: 1.43GB processed in 4 seconds

Get-Content takes a -TotalCount option so you can take only the first X lines (-ReadCount only controls how many lines are sent down the pipeline at a time).
If you really want character granularity, you'll need to use one of the [IO.File]::Read methods from .NET
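As a rough sketch of the .NET route, the snippet below opens the file with [System.IO.File]::OpenText and reads characters through the returned StreamReader, so only the start of the file is touched. The function name Get-FirstChars is hypothetical, and the code assumes PowerShell 5 or newer (for ::new) and a file that OpenText can decode (UTF-8 by default, which also covers ASCII).

```powershell
# Hypothetical helper: returns up to $Count characters from the start of a file.
function Get-FirstChars {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $Count = 200
    )
    # Resolve to a full path first: .NET's current directory can differ from PowerShell's
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $reader = [System.IO.File]::OpenText($fullPath)
    try {
        $buffer = [char[]]::new($Count)
        $total = 0
        # Read may return fewer characters than requested, so loop until full or end of file
        while ($total -lt $Count) {
            $read = $reader.Read($buffer, $total, $Count - $total)
            if ($read -le 0) { break }
            $total += $read
        }
        [string]::new($buffer, 0, $total)
    }
    finally {
        $reader.Dispose()
    }
}
```

For the file in the question this would be called as Get-FirstChars .\file.txt -Count 200. Unlike the byte-level approach, the count here is in characters, which matters for multi-byte encodings.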

-
Sadly, there are no line breaks in the file so this is not an option – Jonny Cundall Sep 21 '13 at 19:06
@keith-hill got me most of the way there.
Here's what I used to get the first characters out of a VMware Virtual Disk. There is important information in the first 1000 or so characters, but I'd never get at it trying to open a 30GB file.
$bytes = Get-Content .\VMwareVirtualDiskFile.vmdk -Encoding byte -TotalCount 1000
[String]::Concat([char[]]($bytes))

(get-content myfile).Substring(0,x)
Where x is the number of characters you want from each line. For example, $lines = (get-content myfile).Substring(0,10) will return an array of strings where each member of the array contains the first 10 characters of the corresponding line in myfile.

-
Welcome to Stack Overflow. Please consider formatting your code differently than your text. You can use ` ` to wrap your code – sao Jan 24 '20 at 14:08
-
This does not answer the original question: they wanted the first X bytes of the entire file, not per line. This method is also extremely inefficient for large files, which was part of the original question. – Justin May 28 '20 at 19:47