I am a new user and am struggling with the following:

I need to read a file and write it back with some modifications. The issue is:

  1. “ReadToEnd” works fine for smaller files (approx. 100 MB), and everything I want to do works. But for bigger files (300 MB+), it bombs out.
  2. Then I tried “ReadLine” (reading line by line). It works on both smaller and bigger files, but it takes a very long time to save back.

I have included both versions of the code below, "ReadToEnd" and "ReadLine". For testing, you would need to create "100MB-File.txt" and "300MB-File.txt" files in c:\temp\.

I would really appreciate your help with this.

'----------------Reading whole File ReadToEnd
Dim sr As New StreamReader("C:\temp\100MB-File.txt")
Dim path As String = "C:\temp\myFileNew1.txt"
Dim oneLine As String
oneLine = sr.ReadToEnd
Using sw As StreamWriter = File.CreateText(path)
    sw.WriteLine(oneLine)
End Using
sr.Close()

''---------------Reading Line by Line
Dim sr As New StreamReader("C:\temp\300MB-File.txt")
Dim path As String = "C:\temp\myFileNew2.txt"
Dim oneLine As String
oneLine = sr.ReadLine

Using sw As StreamWriter = File.CreateText(path)
    sw.WriteLine(oneLine)
End Using

Do Until sr.EndOfStream
    Console.WriteLine(oneLine)
    oneLine = sr.ReadLine()

    Using sw As StreamWriter = File.AppendText(path)
        sw.WriteLine(oneLine)
    End Using
Loop
sr.Close()
djv
RDM
  • If you comment out the `Console.WriteLine(oneLine)` line you may find it runs *much* faster. – Andrew Morton Nov 09 '16 at 20:31
  • See http://stackoverflow.com/questions/2161895/reading-large-text-files-with-streams-in-c-sharp , you probably won't be able to make many improvements past what @AndrewMorton suggested – djv Nov 09 '16 at 20:33

1 Answer

In addition to the Console.WriteLine mentioned in the comments, you are opening and closing the output file for each line you are writing. If you just open the file once, it should be much faster:

Dim path As String = "C:\temp\myFileNew2.txt"

' Open the reader and the writer once; both are closed when the Using block ends.
Using sr As New StreamReader("C:\temp\300MB-File.txt"),
      sw As StreamWriter = File.CreateText(path)
    Do Until sr.EndOfStream
        Dim oneLine As String = sr.ReadLine()
        'Console.WriteLine(oneLine)
        sw.WriteLine(oneLine)
    Loop
End Using

You could also simplify the code using File.ReadLines:

Dim inPath As String = "C:\temp\300MB-File.txt"
Dim outPath As String = "C:\temp\myFileNew2.txt"

Using sw As StreamWriter = File.CreateText(outPath)
    For Each oneLine As String In File.ReadLines(inPath)
        ' modify line here
        sw.WriteLine(oneLine)
    Next
End Using
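
As a rough sketch of what the `' modify line here` step might look like, the example below replaces one piece of text on each line and passes the encoding explicitly for both reading and writing. The module name, the `CopyWithReplace` helper, the "foo"/"bar" texts, and the temporary file paths are all illustrative assumptions, not part of the original code.

```vbnet
Imports System.IO
Imports System.Text

Module LineFilterSketch
    ' Hypothetical helper: copies inPath to outPath line by line,
    ' replacing oldText with newText on every line that contains it.
    Sub CopyWithReplace(inPath As String, outPath As String,
                        oldText As String, newText As String, enc As Encoding)
        ' False = overwrite the output file instead of appending to it.
        Using sw As New StreamWriter(outPath, False, enc)
            ' File.ReadLines streams the file, so only one line is held at a time.
            For Each oneLine As String In File.ReadLines(inPath, enc)
                Dim outLine As String = oneLine
                If oneLine.Contains(oldText) Then
                    outLine = oneLine.Replace(oldText, newText)
                End If
                sw.WriteLine(outLine)
            Next
        End Using
    End Sub

    Sub Main()
        ' Small self-contained demo using temporary files (assumed paths).
        Dim inPath As String = Path.GetTempFileName()
        Dim outPath As String = Path.GetTempFileName()
        File.WriteAllLines(inPath, {"alpha foo", "beta"}, Encoding.UTF8)

        CopyWithReplace(inPath, outPath, "foo", "bar", Encoding.UTF8)

        For Each line As String In File.ReadLines(outPath, Encoding.UTF8)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

Because the input is streamed and the output file is opened only once, memory use should stay roughly constant regardless of file size, as long as individual lines are not themselves huge.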
Mark
  • Thank you very much, Mark. Your suggestion with "File.ReadLines" works great. At the ' modify line here comment I have added: If line.Contains("") Then. Your comments, please! This saves the whole stream back to the variable, and it's "superfast". Thanks again! – RDM Nov 10 '16 at 14:51
  • @RDM Yep, just update the `oneLine` variable, using whatever conditional logic makes sense for your use case, before the `sw.WriteLine` call. You may also need to consider what encoding to use for reading and writing - the default for both `CreateText` and `ReadLines` is UTF-8. – Mark Nov 10 '16 at 16:31
  • Thank you once again, Mark. This is very helpful and a learning experience for me. I would like to take the opportunity to ask you another question, if you don't mind: I would like to use the value in the oneLine variable outside the "For Each" block. I want to do some work below that block and can't figure out how to get the value of the variable outside the For Each block. – RDM Nov 11 '16 at 19:09
  • Actually, please ignore my question above about getting the variable outside the "For Each" block. I figured out that I was not initializing the variable at the top. Thanks once again. – RDM Nov 11 '16 at 19:23
  • Hello Mark, I have another question, please: the file I am reading from "inPath" is very large, 300 MB to 1 GB+. I need to load the file into a variable using "oneLine" as shown in the above program. Files of approximately 200 MB work fine, but larger files bomb out. The purpose is that, once the file is loaded into the variable, I need to run a RegEx, pick out certain sections of the file, and save them somewhere else. Thanks once again for your kind attention. – RDM Nov 12 '16 at 01:56
  • I have added the variable `Dim wholeFile As String = ""` and then added `wholeFile = wholeFile & vbCrLf & oneLine` where you mentioned ' modify line here. – RDM Nov 12 '16 at 02:17
  • @RDM You should ask that in a new question, since it is quite different from your original question, but you will probably have to avoid reading the entire file into a string if the files are that large. Perhaps you can read line by line and, when you find the start of a block you are interested in, start writing to the output file and continue until you find the end of the block. That assumes a single line is enough to recognize the start of a block; otherwise you may have to buffer multiple lines, e.g. in a queue. – Mark Nov 12 '16 at 02:48
  • Thank you once again, Mark. I really appreciate your help and advice. I plan to try your suggestions this week. Regards. – RDM Nov 14 '16 at 13:41
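
The line-by-line block extraction suggested in the comments could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the `<<START>>` and `<<END>>` marker texts, the `ExtractBlocks` helper, the module name, and the temporary file paths are hypothetical, and it assumes a single line is enough to recognize where a block starts and ends.

```vbnet
Imports System.IO

Module BlockExtractSketch
    ' Hypothetical helper: writes every block that starts at a line containing
    ' startMarker and ends at a line containing endMarker (both lines included).
    ' Returns the number of lines written to outPath.
    Function ExtractBlocks(inPath As String, outPath As String,
                           startMarker As String, endMarker As String) As Integer
        Dim written As Integer = 0
        Dim inBlock As Boolean = False

        Using sw As StreamWriter = File.CreateText(outPath)
            ' Stream the input so the whole file is never held in one string.
            For Each oneLine As String In File.ReadLines(inPath)
                If Not inBlock AndAlso oneLine.Contains(startMarker) Then
                    inBlock = True
                End If

                If inBlock Then
                    sw.WriteLine(oneLine)
                    written += 1
                    ' A line containing the end marker closes the current block.
                    If oneLine.Contains(endMarker) Then inBlock = False
                End If
            Next
        End Using

        Return written
    End Function

    Sub Main()
        ' Small self-contained demo using temporary files (assumed paths).
        Dim inPath As String = Path.GetTempFileName()
        Dim outPath As String = Path.GetTempFileName()
        File.WriteAllLines(inPath, {"a", "<<START>>", "b", "<<END>>", "c"})

        Dim count As Integer = ExtractBlocks(inPath, outPath, "<<START>>", "<<END>>")
        Console.WriteLine(count & " line(s) written to " & outPath)
    End Sub
End Module
```

If recognizing a block needs more than one line of context, the same loop could keep the most recent lines in a queue, as mentioned above, instead of testing each line on its own.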