To split a text file into an array of lines:
Note that Get-Content does this by default, i.e. it streams a text file's lines one by one (with each line's trailing newline removed).
If you capture this stream of lines in a variable:
- If there are two or more lines, PowerShell automatically creates an array (of type [object[]]) for you.
- If there's only one line, PowerShell captures it as-is, as a [string]; to ensure that even a file with a single line is captured as an array, enclose the Get-Content call in @(...), the array-subexpression operator.
Therefore, the following captures the individual lines of file $file in an array (but see the faster alternative below):
# See faster alternative below.
$lines = @(Get-Content $file)
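The difference between the two capture forms can be checked with a throwaway single-line file. This is only a minimal sketch: the temporary file and the variable names ($tmp, $asIs, $asArray) are illustrative assumptions, not part of the original answer.

```powershell
# Create a temporary file containing exactly one line (illustrative setup).
$tmp = [IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value 'only line'

$asIs    = Get-Content $tmp      # single line: captured as-is, as a [string]
$asArray = @(Get-Content $tmp)   # always an array, here with one element

$asIs.GetType().Name      # String
$asArray.GetType().Name   # Object[]
$asArray.Count            # 1

Remove-Item $tmp
```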
However, if the intent is to capture Get-Content's output in full, it isn't actually necessary to stream the lines one by one; the -ReadCount parameter allows reading the file in batches of lines, as arrays of the specified size; -ReadCount 0 reads all lines into a single array, and does so unconditionally, i.e. even if there's only one line. Therefore, the following is a much faster alternative:
# Faster alternative to the above.
$lines = Get-Content -ReadCount 0 $file
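To make the batching behavior of -ReadCount concrete, here is a small sketch that assumes a throwaway 5-line file; the file and the variable names ($tmp, $batchSizes, $all, $single) are hypothetical and chosen only for illustration.

```powershell
# Create a temporary 5-line file (illustrative setup).
$tmp = [IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value 'a', 'b', 'c', 'd', 'e'

# -ReadCount 2: each pipeline object is an array of up to 2 lines.
$batchSizes = Get-Content -ReadCount 2 $tmp | ForEach-Object { $_.Count }
$batchSizes -join ', '   # 2, 2, 1

# -ReadCount 0: the whole file arrives as a single array.
$all = Get-Content -ReadCount 0 $tmp
$all.Count               # 5

# Even a one-line file is captured as an array.
Set-Content -Path $tmp -Value 'only line'
$single = Get-Content -ReadCount 0 $tmp
$single -is [array]      # True
$single.Count            # 1

Remove-Item $tmp
```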
If your intent is to split a multiline string into an array of individual lines, use PowerShell's -split operator:
- Its regex capabilities make it easy to match newlines (line breaks) in a cross-platform manner: regex \r?\n matches both Windows-format CRLF newlines (\r\n) and Unix-format LF newlines (\n).
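As a quick check of the cross-platform claim, the following sketch splits a string that deliberately mixes both newline formats; the input string and the variable names ($mixed, $lines) are made up for illustration.

```powershell
# A string that mixes CRLF and LF newlines (illustrative input).
$mixed = "line 1`r`nline 2`nline 3"

$lines = $mixed -split '\r?\n'
$lines.Count                     # 3
$lines -join '|'                 # line 1|line 2|line 3

# Splitting on LF alone would leave a stray CR at the end of 'line 1'.
($mixed -split '\n')[0].Length   # 7
```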
The following example splits a multiline string, defined as a here-string, into lines and visualizes each resulting line by enclosing it in [...]:
@'
line 1
line 2
line 3
'@ -split '\r?\n' |
ForEach-Object { "[$_]" } # -> '[line 1]', '[line 2]', '[line 3]'
If you want to avoid an empty final array element resulting from a trailing newline (as you might get if you read a text file as a whole into a string, e.g. with Get-Content -Raw), first remove that newline by applying regex \r?\n\z via the -replace operator:
# Multiline string with trailing newline
# (the empty line before the closing '@ makes the string end in a newline).
@'
line 1
line 2
line 3

'@ -replace '\r?\n\z' -split '\r?\n' |
ForEach-Object { "[$_]" } # -> '[line 1]', '[line 2]', '[line 3]'
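The same effect can be observed with actual file content read via Get-Content -Raw. The sketch below assumes a throwaway file written with Set-Content, which terminates every line, including the last, with a newline; the variable names ($tmp, $raw, $withEmpty, $trimmed) are illustrative.

```powershell
# Create a temporary 3-line file; Set-Content adds a trailing newline.
$tmp = [IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value 'line 1', 'line 2', 'line 3'

# Read the whole file into a single string.
$raw = Get-Content -Raw $tmp

# Splitting directly yields an extra, empty final element.
$withEmpty = $raw -split '\r?\n'
$withEmpty.Count      # 4

# Removing the trailing newline first avoids it.
$trimmed = $raw -replace '\r?\n\z' -split '\r?\n'
$trimmed.Count        # 3

Remove-Item $tmp
```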
If you want to filter out empty lines, use the filtering capabilities of PowerShell's comparison operators such as -ne, which, with an array as the LHS, return the sub-array of matching elements rather than a Boolean:
# Multiline string with empty lines.
@'
line 1

line 2

line 3
'@ -split '\r?\n' -ne '' |
ForEach-Object { "[$_]" } # -> '[line 1]', '[line 2]', '[line 3]'
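The role of the LHS type can be seen in isolation with literal values. This is a minimal sketch; the sample values and the variable names ($filtered, $one) are hypothetical.

```powershell
# Array LHS: -ne acts as a filter and returns the matching elements.
$filtered = 'a', '', 'b' -ne ''
$filtered -join ','        # a,b

# Scalar LHS: -ne returns a Boolean instead.
'a' -ne ''                 # True

# @(...) ensures filtering semantics even for a single string.
$one = @('a') -ne ''
$one.Count                 # 1
```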
If you want to filter out empty or blank (all-whitespace) lines, you can use the -match operator (which likewise has filtering capabilities) with regex \S, which matches any non-whitespace character and therefore only lines that are neither empty nor composed exclusively of whitespace characters:
# The 2nd line is nonempty, but composed of spaces only.
@'
line 1
    
line 2
line 3
'@ -split '\r?\n' -match '\S' |
ForEach-Object { "[$_]" } # -> '[line 1]', '[line 2]', '[line 3]'