
If anyone has tried this before in Go, please share the idea with code; it would be really appreciated.

I wrote a few lines, but they are slow:

// This reads the CSV file and sends each row on the channel

func usersFileLoader(filename string, channel chan User) {
    defer close(channel)
    file, err := os.Open(filename)
    if err != nil {
        panic(err)
    }
    defer file.Close()
    var user User
    reader := csv.NewReader(file)
    for {
        // Unmarshal is a helper (not shown in the question) that decodes one CSV record into user
        err := Unmarshal(reader, &user)
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        channel <- user
    }
}

// This reads users from the channel and inserts them into the database

func saveUser(channel <-chan User, db *sql.DB) {
    stmt, err := db.Prepare(`
        INSERT INTO Users (id, name, address) VALUES (?, ?, ?)`)
    if err != nil {
        log.Fatal(err)
    }
    defer stmt.Close()

    for usr := range channel {
        _, err := stmt.Exec(
            usr.ID,
            usr.Name,
            usr.Address,
        )
        if err != nil {
            log.Fatal(err)
        }
    }
}

// Here is the User struct

type User struct {
    ID      int    `csv:"id"`
    Name    string `csv:"name"`
    Address string `csv:"address"`
}

// Here is my main func

func main() {
    db := DBconnect(ConnectionString(dbConfig()))
    defer db.Close()
    channel := make(chan User)
    go usersFileLoader("../user.csv", channel)
    saveUser(channel, db)
}

This code works, but it is too slow for me. Please share your thoughts and ideas.

  • How slow? How fast do you expect it to be? I'm sure a bulk insert would speed things up a bit, rather than an insert for each line. – AJ X. May 04 '18 at 03:27
  • There are some answers here that might send you in the right direction: https://stackoverflow.com/questions/12486436/golang-how-do-i-batch-sql-statements-with-package-database-sql – AJ X. May 04 '18 at 03:27
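
As a rough sketch of the bulk-insert idea from the comment above (not part of the original post), rows from the channel could be grouped into multi-row INSERT statements. This reuses the User type and Users table from the question; the batch size of 500 is arbitrary, and it needs the strings package in addition to database/sql and log:

// Sketch: collect users from the channel and insert them in batches.
func saveUsersBatched(channel <-chan User, db *sql.DB) {
    const batchSize = 500
    batch := make([]User, 0, batchSize)

    flush := func() {
        if len(batch) == 0 {
            return
        }
        // Build "INSERT INTO Users (id, name, address) VALUES (?, ?, ?),(?, ?, ?),..."
        placeholders := make([]string, 0, len(batch))
        args := make([]interface{}, 0, len(batch)*3)
        for _, u := range batch {
            placeholders = append(placeholders, "(?, ?, ?)")
            args = append(args, u.ID, u.Name, u.Address)
        }
        query := "INSERT INTO Users (id, name, address) VALUES " + strings.Join(placeholders, ",")
        if _, err := db.Exec(query, args...); err != nil {
            log.Fatal(err)
        }
        batch = batch[:0]
    }

    for usr := range channel {
        batch = append(batch, usr)
        if len(batch) == batchSize {
            flush()
        }
    }
    flush() // insert any remaining rows
}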

1 Answer


I wouldn't attempt to use Go's built-in standard library functions for loading a very large CSV file into MySQL (unless, of course, you are simply trying to learn how they work).

For best performance I would simply use MySQL's built-in LOAD DATA INFILE functionality.

For example:

// LOAD DATA cannot be run as a prepared statement, so the SQL is built directly;
// only do this with a trusted filename. Users and its columns come from the question.
query := fmt.Sprintf("LOAD DATA INFILE '%s' INTO TABLE Users FIELDS TERMINATED BY ',' (id, name, address)", filename)
result, err := db.Exec(query)
if err != nil {
    log.Fatal(err)
}
rows, _ := result.RowsAffected()
log.Printf("%d rows inserted\n", rows)

If you haven't used LOAD DATA INFILE before, note carefully the documentation regarding LOCAL. Depending on your server configuration and permissions, you might need to use LOAD DATA LOCAL INFILE instead. (If you intend to use Docker containers, for instance, you will absolutely need to use LOCAL.)
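
For what it's worth, here is a minimal sketch of the LOCAL variant, assuming the github.com/go-sql-driver/mysql driver and a server with local_infile enabled (neither is stated in the question); the loadUsersLocal name is made up for illustration:

import (
    "database/sql"
    "fmt"
    "log"

    "github.com/go-sql-driver/mysql"
)

// loadUsersLocal streams a local CSV file to the server with LOAD DATA LOCAL INFILE.
func loadUsersLocal(db *sql.DB, filename string) {
    // The driver only sends files that have been allow-listed
    // (or you can set allowAllFiles=true in the DSN, which is less safe).
    mysql.RegisterLocalFile(filename)

    // LOAD DATA cannot be prepared, so build the statement from a trusted filename.
    query := fmt.Sprintf(
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE Users FIELDS TERMINATED BY ',' (id, name, address)",
        filename)
    result, err := db.Exec(query)
    if err != nil {
        log.Fatal(err)
    }
    rows, _ := result.RowsAffected()
    log.Printf("%d rows inserted", rows)
}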

  • Thanks for your reply. I modified the above code a little and was able to load 3.1 GB of data in 1 hr 15 min. Basically, I split the huge file into multiple small files and dumped the data into multiple tables; to dump the data into multiple tables I ran the dumping function in parallel, and once the dumping completed I merged all the data together into one table. – Biplav Pokharel May 06 '18 at 17:41
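
For reference, a minimal sketch of that split-and-load-in-parallel idea, assuming comma-separated chunk files and per-chunk staging tables named users_chunk_N (both are assumptions, not details from the comment); it needs database/sql, fmt, log and sync:

// Sketch: load pre-split chunk files in parallel, one goroutine per file,
// each into its own staging table, then merge the staging tables afterwards.
func loadChunksInParallel(db *sql.DB, chunks []string) {
    var wg sync.WaitGroup
    for i, file := range chunks {
        wg.Add(1)
        go func(i int, file string) {
            defer wg.Done()
            query := fmt.Sprintf(
                "LOAD DATA INFILE '%s' INTO TABLE users_chunk_%d FIELDS TERMINATED BY ',' (id, name, address)",
                file, i)
            if _, err := db.Exec(query); err != nil {
                log.Println("load failed:", file, err)
            }
        }(i, file)
    }
    wg.Wait()
    // Once every chunk is loaded, merge the staging tables into Users,
    // e.g. INSERT INTO Users SELECT * FROM users_chunk_0; and so on.
}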