I have a CSV file of 1M rows that I'm ingesting into Laravel with the following seeder:
public function run()
{
    DB::table("terms")->truncate();

    DB::beginTransaction();

    $csvFile = fopen(base_path("database/terms.csv"), "r");

    while (($data = fgetcsv($csvFile, 100, ",")) !== false) {
        DB::table("terms")->insert([
            "serial" => $data[0],
            "output" => $data[1],
        ]);
    }

    fclose($csvFile);

    DB::commit();
}
And here's the migration that runs the seeder:
public function up()
{
    Schema::create('terms', function (Blueprint $table) {
        $table->mediumIncrements("id");
        $table->string("serial")->unique();
        $table->string("output");
    });

    $seeder = new ItemsSeeder();
    $seeder->run();
}
I can't use chunking because it doesn't make sense to create an Eloquent model for this data, so I'm using a database transaction to improve performance (which brings ingestion time down from an hour to half an hour). This works fine except that it adds an extra few hundred rows to the table. Even weirder, when I reverse the sort order to check the last row, it has an id of 1000000 (1 million) as expected, which suggests no extra rows were added.
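Roughly how I'm checking both of these (illustrative only; the exact queries I ran may differ slightly, but the table and column names match the schema above):

// The row count comes back a few hundred higher than 1,000,000
$count = DB::table("terms")->count();

// Yet the highest id is exactly 1000000, as expected
$maxId = DB::table("terms")->orderByDesc("id")->value("id");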
What's going on here?