I have a very large .txt file (6GB) that I need to break into several files. I'd like to run a PowerShell script that copies lines 1 to 100,000 to a new file, then run it again for lines 100,001 to 200,000, and so on. I'm not a developer and I've been unable to figure out the syntax to get the job done. Any help will be appreciated.

I've tried both Get-Content and Get-ChildItem, for example:

Get-Content C:\Users\alind\Downloads\2018\in.txt [5..10] | Set-Content C:\Users\alind\Downloads\2018\mid.txt

I'm looking for lines 6 to 11 in the file in.txt to be copied into the file mid.txt.

Andrew
  • The Microsoft docs have an example of doing this: https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-content?view=powershell-6 – mhhollomon Jan 07 '19 at 18:42
  • Possible duplicate of [How can I split a text file using PowerShell?](https://stackoverflow.com/questions/1001776/how-can-i-split-a-text-file-using-powershell) –  Jan 07 '19 at 18:51
  • mhhollomon, that is the page I'm trying to decipher. For example, I can get lines 1 to 100. However, I don't see how to get lines 10 to 20. – Andrew Jan 07 '19 at 18:52
  • For your sample line, put the Get-Content cmdlet in parentheses: `(Get-Content C:\Users\alind\Downloads\2018\in.txt)[5..10] | Set-Content C:\Users\alind\Downloads\2018\mid.txt`, but that won't work for your giant file. –  Jan 07 '19 at 18:53
  • LotPings, thank you! Your first suggestion is different from what I was looking to do, but the end result gives me just what I need. I tried putting the parentheses in the command line and it works on my small test file. Why won't it work on the giant file? – Andrew Jan 07 '19 at 19:38
  • `Get-Content in.txt | Select-Object -Skip 5 -First 6 | Set-Content out.txt` – Ansgar Wiechers Jan 07 '19 at 19:40
  • @Andrew Well, it depends on your memory: the parentheses force the whole file to be read in first. Ansgar and Andrei showed you alternatives. To have someone notified of your comment, address him/her with an @ in front. –  Jan 07 '19 at 20:39
  • @LotPings, Thank you all for the input and instruction. I'm testing the solutions provided now and will post back. Again, thank you. – Andrew Jan 07 '19 at 20:47
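
The two approaches from the comments can be compared on a small file. This is a minimal sketch, assuming a throwaway file in the temp folder (the paths and variable names are illustrative and not from the question). Indexing into `(Get-Content ...)` reads every line into an array before picking a range, whereas `Select-Object -Skip 5 -First 6` works on the stream and stops the pipeline once it has the six lines it needs.

```powershell
# Build a small sample file with 20 numbered lines (hypothetical temp path).
$tempDir = [System.IO.Path]::GetTempPath()
$sample  = Join-Path $tempDir 'range-demo-in.txt'
$target  = Join-Path $tempDir 'range-demo-mid.txt'
1..20 | ForEach-Object { "line $_" } | Set-Content -Path $sample

# Indexing approach: loads the whole file into memory, then picks indices 5..10.
$byIndex = (Get-Content -Path $sample)[5..10]

# Streaming approach: skips 5 lines, takes the next 6, then stops reading.
$byStream = Get-Content -Path $sample | Select-Object -Skip 5 -First 6

# Both variables hold lines 6 through 11; write the streamed result to a new file.
$byStream | Set-Content -Path $target
```

For the ranges in the question, the same pattern would presumably use `-Skip 100000 -First 100000` for lines 100,001 to 200,000. Each run still reads the file from the beginning up to the end of the requested range, so later ranges take longer than earlier ones.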

1 Answer

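The -ReadCount parameter of the Get-Content cmdlet may help. It sends the specified number of lines down the pipeline at a time as a single array, so each chunk of up to 100,000 lines can be written to its own numbered file: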

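# -ReadCount 100000 passes the file down the pipeline in arrays of up to 100,000 lines.
Get-Content -Path C:\Users\alind\Downloads\2018\in.txt -ReadCount 100000 |
  ForEach-Object -Begin { $i = 0 } -Process {
    # Write each chunk to in00000001.txt, in00000002.txt, and so on.
    Set-Content -Path ('C:\Users\alind\Downloads\2018\in{0:d8}.txt' -f ++$i) -Value $_
  }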
Andrei Odegov
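
To see how -ReadCount groups lines before trying it on a 6GB file, the chunking can be checked on a tiny input. This is a rough sketch rather than part of the answer above; the temp folder and file names are assumptions made up for the demonstration.

```powershell
# Hypothetical temp folder and file name, used only for this demonstration.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$source = Join-Path $dir 'in.txt'
1..10 | ForEach-Object { "line $_" } | Set-Content -Path $source

# -ReadCount 4 sends arrays of up to 4 lines down the pipeline,
# so $_ inside ForEach-Object is a whole chunk, not a single line.
$chunkSizes = Get-Content -Path $source -ReadCount 4 |
    ForEach-Object { $_.Count }

# Ten lines in chunks of four give sizes 4, 4 and 2.
$chunkSizes -join ', '
```

The last chunk is simply shorter when the line count is not a multiple of the -ReadCount value, so the final output file of a real split will usually hold fewer than 100,000 lines.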