My problem is that I have to split a file that is just one line, but a very long one. I tried to run
(cat $filename).split("'")
but it gives me an OutOfMemoryException. Is there a way to go through the file without loading it all at once, so that I can split the single line? For reference, the file in question is 46MB.

Adlis
- Something like this will probably be helpful: http://stackoverflow.com/questions/4533570/in-powershell-how-do-i-split-a-large-binary-file – DejaVuSansMono Dec 16 '16 at 17:03
- It's somewhat odd that a 46MB file is throwing an out-of-memory exception. Is `(cat $filename).split("'")` the entirety of your code, or are there other operations as well? – Mike Garuccio Dec 16 '16 at 17:04
- Sorry, I should have added the rest of what I was doing. Here it is: where { $_ -match "`^EQD|`^MEA"} out-file -encoding default work/templist – Adlis Dec 16 '16 at 17:07
- OK, can you try running only the `(cat $filename).split("'")` portion and see if it errors or spits out a bunch of text? – Mike Garuccio Dec 16 '16 at 17:15
- Yeah, I did that earlier and it gave me the same OutOfMemoryException. – Adlis Dec 16 '16 at 17:29
- Ouch. Same result if you just `cat $filename`? If so, you'll need to drop to .NET's `StreamReader` class and use the `.Read` method. More details [here](https://msdn.microsoft.com/en-us/library/9kstw824(v=vs.110).aspx) – Mike Garuccio Dec 16 '16 at 17:39
- I have found a way around this by using your suggestion of a StreamReader and reading the file character by character, though if I try to store the result in one variable (with each ' replaced by a newline), that variable also throws an out-of-memory exception. Still, this is much better than making no progress and only getting the out-of-memory exception. – Adlis Dec 16 '16 at 18:44
- I have no problem with `('a,' * 23MB) | set-content test.txt` to make a 46MB file, followed by `$x = (get-content test.txt).split(',')` to read it and split it into a 24 million item array. But it does make PS ISE take 1.2GB to do it, and I haven't tried to print it out, which your code will try to do, since you don't assign the result to anything. Even so, virtual memory will take care of 1.2GB on basically any computer around today, so... what's up with your computer? – TessellatingHeckler Dec 16 '16 at 19:46
- You don't need to split the text; instead, use a properly constructed regex with `[regex]::matches` or `select-string -allmatches`. – wOxxOm Dec 16 '16 at 20:25
- Sorry, to clarify: originally I was trying to split up the string and then output it to a file so the result would be easier to work with. – Adlis Dec 16 '16 at 21:22
- Please clarify **in your question**, not in comments. Show a [mcve] of your code. – Ansgar Wiechers Dec 17 '16 at 00:04
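
For readers wondering what the `[regex]::Matches` suggestion in the comments might look like, here is a minimal sketch. It assumes the text fits in memory as a single string (which may not hold on the asker's machine), and the sample text and the pattern are illustrative guesses based on the `EQD`/`MEA` filter mentioned in the comments, not the asker's real data.

```powershell
# Hypothetical sample standing in for the file contents. With a real file the
# text could be loaded with [System.IO.File]::ReadAllText($filename), which
# still needs enough memory to hold the whole file as one string.
$Text = "UNB+X'EQD+CN+ABCU1234567'MEA+AAE+G+KGM:1000'NAD+CA+XYZ'EQD+CN+DEFU7654321'"

# Match every segment that starts at the beginning of the text or right after
# a quote, begins with EQD or MEA, and runs up to (not including) the next quote.
$Pattern = "(?<=^|')(?:EQD|MEA)[^']*"

# Collect only the matched text of each segment.
$Segments = foreach ($Match in [regex]::Matches($Text, $Pattern)) { $Match.Value }

$Segments
```

This pulls out only the wanted segments, so the full array of split pieces is never built.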
1 Answer
I had a similar issue working with large files a couple of years ago. Assuming that none of your individual strings exceeds the size limit, this should work:
$InputFileName = 'C:\Temp\Temp.txt'
$StreamReader = New-Object System.IO.StreamReader($InputFileName, [System.Text.Encoding]::ASCII)
$Queue = New-Object System.Collections.Generic.Queue[char]
[string[]]$Array = @()

# Read one character at a time so the whole line is never held in memory at once.
while (-not $StreamReader.EndOfStream)
{
    # Read() returns the character code as an int, so cast it to [char].
    $CurrentChar = [char]$StreamReader.Read()
    if ($CurrentChar -eq [char]"'")
    {
        # Delimiter found: drain the queue into one string and store it.
        [string]$Element = ''
        while ($Queue.Count -gt 0)
        {
            $Element += $Queue.Dequeue()
        }
        $Array += $Element
    }
    else
    {
        $Queue.Enqueue($CurrentChar)
    }
}
$StreamReader.Close()
This creates a first-in, first-out (FIFO) collection that queues your characters until a `'` is encountered. The queued characters are then read out into a string, which gets added to the array.
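
If the character-by-character loop turns out to be slow, a variation on the same idea is sketched below. It is not the answer's code: the function name `Split-FileOnDelimiter` and its parameters are made up for illustration. It reads the file in blocks, builds each segment in a `StringBuilder`, and writes segments straight to an output file instead of growing an array with `+=` (which copies the whole array on every append). Unlike the loop above, it also writes any trailing text that follows the last delimiter.

```powershell
# Hypothetical helper; the name and parameters are assumptions for this sketch.
function Split-FileOnDelimiter {
    param(
        [string]$InputPath,
        [string]$OutputPath,
        [char]$Delimiter = "'"
    )

    $Reader  = New-Object System.IO.StreamReader($InputPath, [System.Text.Encoding]::ASCII)
    $Writer  = New-Object System.IO.StreamWriter($OutputPath, $false, [System.Text.Encoding]::ASCII)
    $Builder = New-Object System.Text.StringBuilder
    $Buffer  = New-Object char[] 65536

    try {
        # Read a block of characters per call instead of one Read() per character.
        while (($Count = $Reader.Read($Buffer, 0, $Buffer.Length)) -gt 0) {
            for ($i = 0; $i -lt $Count; $i++) {
                if ($Buffer[$i] -eq $Delimiter) {
                    # Delimiter reached: write the segment as its own line and reset.
                    $Writer.WriteLine($Builder.ToString())
                    $Builder.Length = 0
                }
                else {
                    [void]$Builder.Append($Buffer[$i])
                }
            }
        }
        # Write whatever follows the final delimiter, if anything.
        if ($Builder.Length -gt 0) {
            $Writer.WriteLine($Builder.ToString())
        }
    }
    finally {
        $Writer.Dispose()
        $Reader.Dispose()
    }
}
```

A call might look like `Split-FileOnDelimiter -InputPath 'C:\Temp\Temp.txt' -OutputPath 'C:\Temp\Split.txt'`. Full paths are safest here, because the .NET stream classes resolve relative paths against the process working directory rather than the current PowerShell location.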

SodJax
- Yeah, I ended up doing something similar. It takes a long time, but there's not much I can do, because for some reason the RAM was really low on the computer I was using. – Adlis Dec 20 '16 at 22:19