Last week I asked how to replace a string with a newline character using a .bat script. I have since realized that my file already contains some carriage return and newline characters, which I need to remove before doing the replacement. To replace '#@#@#' with a line feed I am using the line below.

(gc $Source) -replace "#@#@#", "`r`n"|set-content $Destination

So I tried to use the same logic to remove \r and \n as well, but it did not work.

(gc $Source) -replace "`n", ""|set-content $Destination

My file looks like:

abc|d  ef|123#@#@#xyz|tuv|567#@#@#

and I need it to look like:

abc|def|123
xyz|tuv|567

As I said, replacing the row delimiter with a newline works, but I need to remove all CR and LF characters before I do that.

For small files the script below works, but my file is larger than 1.5 GB and the script throws an OutOfMemoryException.

param
(
  [string]$Source,
  [string]$Destination
)

echo $Source
echo $Destination

$Writer = New-Object IO.StreamWriter $Destination
$Writer.Write( [String]::Join("", $(Get-Content $Source)) )
$Writer.Close()
yasemin
  • I believe it is not duplicate, since I am trying to remove all \r and \n. not the combination. Also I want to replace them with space only. – yasemin Dec 20 '16 at 00:15
  • So you're trying to remove `\r` and `\n` but keep `\r\n`? – SomethingDark Dec 20 '16 at 01:26
  • No, I am trying to remove all \r and \n. The file is not supposed to have linefeed characters, since '#@#@#' is the end-of-line string. Having said that, is there any way to set my row delimiter to #@#@# and then read line by line in PowerShell to remove the \r and \n characters? – yasemin Dec 21 '16 at 15:22
  • You confused me when you said you wanted to remove all \r and \n but not the combination. I don't know if you can use delimiters larger than a single character in PowerShell. My idea was going to be to convert the file to hexadecimal, remove any instances of `0d` and `0a`, and convert back to ASCII. – SomethingDark Dec 21 '16 at 15:27
  • Considering it is a big file, is it time-efficient to convert the file and remove the unwanted characters? And what scripting would you recommend for that conversion? It has to be automated. – yasemin Dec 21 '16 at 15:31

2 Answers

This is VBScript. Windows isn't consistent about line endings. Most built-in programming languages break lines on CR and discard the LF, but Edit controls (i.e. Notepad) break on LF and ignore CR (unless it precedes an LF).

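' Read standard input, strip every CR and LF, then turn the
' "#@#@#" row delimiter into a line feed and write to standard output.
Set Inp = WScript.StdIn
Set Outp = WScript.StdOut
Do Until Inp.AtEndOfStream
    Text = Inp.ReadAll                  ' reads the whole stream, so the loop runs once
    Text = Replace(Text, vbCr, "")
    Text = Replace(Text, vbLf, "")
    Text = Replace(Text, "#@#@#", vbLf)
    Outp.Write Text
Loop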

This uses redirection of StdIn and StdOut.

Filtering the output of a command

YourProgram | Cscript //nologo script.vbs > OutputFile.txt

Filtering a file

Cscript //nologo script.vbs < InputFile.txt > OutputFile.txt

See my CMD Cheat Sheet about the Windows command line: Command to run a .bat file

So this removes the line endings in win.ini and prints the now single-line win.ini to the screen.

cscript //nologo "C:\Users\David Candy\Desktop\Replace.vbs" < C:\windows\win.ini
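Since ReadAll pulls the entire input into memory, a file over 1.5 GB (as in the question) may run into the same memory limit. A chunked read is one way around that. The following is a minimal PowerShell sketch, not a tested production script: the function name Convert-DelimitedFile, its parameter names, and the 1 MB chunk size are illustrative assumptions, and it relies on the default StreamReader/StreamWriter encoding (UTF-8). It keeps back the last few characters of each chunk so that a '#@#@#' delimiter split across two chunks is still recognized.

```powershell
# Hypothetical helper: streams the file in fixed-size chunks so memory use
# stays bounded regardless of file size. Pass full paths, because the .NET
# stream classes do not follow PowerShell's current location.
function Convert-DelimitedFile {
    param(
        [string]$Source,
        [string]$Destination,
        [string]$Delimiter = '#@#@#',
        [string]$Filler = '',        # what each stray CR/LF becomes ('' or ' ')
        [int]$ChunkChars = 1MB       # assumed chunk size, tune as needed
    )
    $reader = $null
    $writer = $null
    try {
        $reader = New-Object System.IO.StreamReader $Source
        $writer = New-Object System.IO.StreamWriter $Destination
        $buffer = New-Object char[] $ChunkChars
        $keep   = $Delimiter.Length - 1   # longest possible partial delimiter
        $carry  = ''
        while (($read = $reader.Read($buffer, 0, $buffer.Length)) -gt 0) {
            $chunk = New-Object string ($buffer, 0, $read)
            # Strip stray line breaks from the new chunk only.
            $text = $carry + $chunk.Replace("`r", $Filler).Replace("`n", $Filler)
            # Turn every complete delimiter into a real line break.
            $text = $text.Replace($Delimiter, "`r`n")
            # Hold back the tail in case a delimiter continues in the next chunk.
            $cut = [Math]::Max(0, $text.Length - $keep)
            $writer.Write($text.Substring(0, $cut))
            $carry = $text.Substring($cut)
        }
        $writer.Write($carry)   # flush whatever was held back
    }
    finally {
        if ($writer) { $writer.Dispose() }
        if ($reader) { $reader.Dispose() }
    }
}
```

It could be called as Convert-DelimitedFile -Source C:\data\in.txt -Destination C:\data\out.txt (both paths are placeholders). Pass -Filler ' ' to replace each stray CR or LF with a space instead of deleting it.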
Community
  • Thank you Noodles. When we say do until, does it loop line by line? – yasemin Dec 20 '16 at 01:29
  • When we say do until, does it loop line by line? My file is huge and has only few carriage return line feed characters. How does this loop read the file? When we say Do Until Inp.AtEndOfStream, what does it mean? – yasemin Dec 20 '16 at 01:31
  • It once did, but I changed it to a `ReadAll`, which makes the `Do ... Loop` unnecessary (it goes through it once). Put a space between the `""` in the replace statements to replace with a space rather than just remove. You could also read line by line with `Text = Text & Inp.ReadLine` and do the write outside the loop. –  Dec 20 '16 at 01:32
  • Sorry, I haven't used any VB scripts before. Where do my source and target go here? – yasemin Dec 20 '16 at 13:40
  • It's a command line thing not a vbscript thing. There are examples in my answer. –  Dec 20 '16 at 21:35

Use the function below to remove special characters. Put whatever characters you want to remove in $SpecChars and call the function with the text as a parameter.

Function Convert-ToFriendlyName
{
    param ($Text)
    # Characters to remove (includes spaces and '-'); add whatever you need
    # to this comma-separated list.
    $SpecChars = '\', ' ', '\\', '-'
    # Escape each entry and join them into a single regex alternation.
    $remspecchars = [string]::join('|', ($SpecChars | % {[regex]::escape($_)}))
    # Convert the given text to title case.
    $name = (Get-Culture).textinfo.totitlecase("$Text".tolower())
    # Remove the unwanted characters.
    $name = $name -replace $remspecchars, ""
    $name
}
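Applied to the question's case, the same -replace idea only works on text that still contains its line breaks. Get-Content returns an array of lines with the terminators already stripped, and Set-Content writes a line ending after each one again, which is why replacing the newline character on its output finds nothing to match. A small sketch for a string that is already in memory follows; the function name Remove-StrayLineBreaks is a hypothetical one chosen for illustration, and the approach suits small files only, since it holds the whole text at once.

```powershell
# Hypothetical helper: works on a single in-memory string, so it is not
# suitable for the 1.5 GB file from the question.
function Remove-StrayLineBreaks {
    param(
        [string]$Text,
        [string]$Delimiter = '#@#@#'
    )
    # Drop every carriage return and line feed first.
    $clean = $Text -replace '[\r\n]', ''
    # Then turn the row delimiter into a real line break.
    $clean -replace [regex]::Escape($Delimiter), "`r`n"
}

# Possible usage with the question's variables (PowerShell 3.0+ for -Raw,
# which reads the file as one string instead of an array of lines):
#   $result = Remove-StrayLineBreaks (Get-Content -Path $Source -Raw)
#   [System.IO.File]::WriteAllText($Destination, $result)
```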

Hope it helps...!!!

Ranadip Dutta