0

I must import a big text file into sql database. The text file contains some 5 million of rows. :D My pc is old ed I use vb.net, but is only for one shot. When import started, the code read and entered 100 lines for second. Is slowly but for one shot is not a problem. Now, at 300.000 lines readed, it work ar 8 lines for second. :( The goal is far away.... The strange thing is, if I start from first line and in the database have 300.000 record, the speed is fast (100 L/s) even though I check that I don't have the record. In sql I created a cluster index. This is the vb.net code into BackgroundWorker DoWork:

  Private Sub Bw_DoWork(sender As Object, e As DoWorkEventArgs)

        Dim d = DirectCast(e.Argument, BwData).Path
        Dim s = DirectCast(e.Argument, BwData).Start

        Dim reader As IO.StreamReader = My.Computer.FileSystem.OpenTextFileReader(d)
        Dim l As String
        Dim lineIndex As Integer = 0
        Dim lineStart As Integer = 0
        Dim part As String() = Nothing

        Try
            Using ctx = New ContentsDBEntities

                Dim rowN = s 'Math.Max(ctx.xHamster.Count - 100, 1)

                Do
                    l = reader.ReadLine
                    If l Is Nothing Then
                        Exit Do
                    End If
                    part = l.Split("|")

                    If Bw.CancellationPending Then Exit Do

                    If lineIndex >= rowN AndAlso part.Length >= 14 Then
                        If lineIndex = rowN Then
                            lineStart = lineIndex
                            startAt = Now
                        End If

                        Dim vId = part(0)

                        If IsNumeric(vId) Then
                            Dim there = (From c In ctx.xHamster Where c.VideoId = vId Select c).FirstOrDefault

                            If there Is Nothing Then
                                Dim r = New xHamster With {
                                            .VideoId = vId,
                                            .Url = part(1),
                                            .Umbed_Url = part(2),
                                            .Title = part(3),
                                            .Duration = StringToSec(part(4)),
                                            .dateAdded = Date.Parse(part(5)),
                                            .Thumb = part(6),
                                            .Channels = part(7),
                                            .Pornstars = part(8),
                                            .MaxResolution = part(9),
                                            .Orientation = part(10),
                                            .Title_RU = part(11),
                                            .Username = part(12),
                                            .Description = part(13)}

                                ctx.xHamster.Add(r)

                                If lineIndex Mod 100 = 0 Then
                                    ctx.SaveChanges()
                                End If
                            End If
                        End If
                    End If
                    lineIndex += 1

                    DirectCast(sender, BackgroundWorker).ReportProgress(lineIndex, lineStart)

                Loop Until l Is Nothing

                reader.Close()
                ctx.SaveChanges()
            End Using

        Catch ex As Exception
            Debug.Print(ex.Message)
        End Try

    End Sub

Do you have a trick for me? Why does the speed decrease?

Thank you

Luca

Luca_Fly
  • 21
  • 5
  • 1
    [SqlBulkCopy](https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy?view=dotnet-plat-ext-5.0) I would stay away from EF for this kind of task – Steve Dec 26 '20 at 18:09
  • And here an example loading a CSV file. A context looking quite like yours https://stackoverflow.com/questions/20759302/upload-csv-file-to-sql-server – Steve Dec 26 '20 at 18:12
  • 2
    Interesting coulmn names ... ;) – SteveCinq Dec 26 '20 at 18:35
  • I solved with SqlBulkCopy, it work at 7000 L/s :D Thank you @Steve – Luca_Fly Dec 27 '20 at 14:18

0 Answers0