I came across a post discussing how to use PowerShell to bulk import massive amounts of data relatively quickly. I have a typical CSV file with about 5 million rows, formatted in the usual way.
I keep getting the same error messages regardless of whether I import a .txt or a .csv file. Playing around with the $csvdelimiter/$firstRowColumnNames settings also created its own issues.
I've spent hours trying to figure out how to get it to work with MY CSV files, and I keep getting the same error messages no matter what I try. All fields accept NULL, and the column names are identical between the table and the CSV file. I do not have a primary key on the table.
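For reference, the destination table's column types can be double-checked with a quick query against INFORMATION_SCHEMA.COLUMNS. This is only a minimal sketch, reusing the server, database, and table names from the import script further down:

# Sketch: list the destination table's columns, data types, and nullability.
Add-Type -AssemblyName System.Data
$conn = New-Object System.Data.SqlClient.SqlConnection("Data Source=SERVERNAMEHERE;Integrated Security=true;Initial Catalog=autos;")
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = "SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'AgedAutos' ORDER BY ORDINAL_POSITION"
$result = $cmd.ExecuteReader()
while ($result.Read()) {
    "{0}  {1}  nullable={2}" -f $result["COLUMN_NAME"], $result["DATA_TYPE"], $result["IS_NULLABLE"]
}
$result.Close()
$conn.Close()

The import script itself: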
# Database variables
$sqlserver = "SERVERNAMEHERE"
$database = "autos"
$table = "AgedAutos"
# CSV variables
$csvfile = "C:\temp\aged.csv"
$csvdelimiter = "',"
$firstRowColumnNames = $true
################### No need to modify anything below ###################
Write-Host "Script started..."
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()
[void][Reflection.Assembly]::LoadWithPartialName("System.Data")
[void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient")
# 50k worked fastest and kept memory usage to a minimum
$batchsize = 50000
# Build the sqlbulkcopy connection, and set the timeout to infinite
$connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;"
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
$bulkcopy.DestinationTableName = $table
$bulkcopy.bulkcopyTimeout = 0
$bulkcopy.batchsize = $batchsize
# Create the datatable, and autogenerate the columns.
$datatable = New-Object System.Data.DataTable
# Open the text file from disk
$reader = New-Object System.IO.StreamReader($csvfile)
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)
if ($firstRowColumnNames -eq $true) { $null = $reader.readLine() }
foreach ($column in $columns) {
    $null = $datatable.Columns.Add()
}
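# Note: Columns.Add() with no arguments creates string-typed columns (named Column1, Column2, ...),
# and with no explicit ColumnMappings SqlBulkCopy matches them to the destination columns by ordinal position.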
# Read in the data, line by line
while (($line = $reader.ReadLine()) -ne $null) {
    $null = $datatable.Rows.Add($line.Split($csvdelimiter))
    $i++; if (($i % $batchsize) -eq 1) {
        $bulkcopy.WriteToServer($datatable)
        Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())."
        $datatable.Clear()
    }
}
# Add in all the remaining rows since the last clear
if ($datatable.Rows.Count -gt 0) {
    $bulkcopy.WriteToServer($datatable)
    $datatable.Clear()
}
# Clean Up
$reader.Close(); $reader.Dispose()
$bulkcopy.Close(); $bulkcopy.Dispose()
$datatable.Dispose()
Write-Host "Script complete. $i rows have been inserted into the database."
Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())"
# Sometimes the Garbage Collector takes too long to clear the huge datatable.
[System.GC]::Collect()
Error message listed below.
Exception calling "WriteToServer" with "1" argument(s): "The given value of type String from the data source cannot be converted to
type date of the specified target column."
At C:\powershell_scripts\batch_csv_import-code1-working-test for auto table.ps1:43 char:3
+ $bulkcopy.WriteToServer($datatable)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : InvalidOperationException
340000 rows have been inserted in 00:00:03.5156162
I have no idea what that error means, and I cannot find anything useful on Google. I'm thinking one of the columns might be defined with the wrong data type in SQL Server, but I could be wrong.
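If it helps narrow things down, here is a rough sketch of a check I could run to see whether the values that end up in the date column actually parse as dates. The comma delimiter, the column position, and the number of rows sampled are all placeholders/guesses, not part of the failing script:

# Sketch: spot-check whether one column of the CSV parses as a date.
# $dateColumnIndex is a placeholder -- set it to the (zero-based) position of the date column.
$csvfile = "C:\temp\aged.csv"
$dateColumnIndex = 3
$lineNumber = 1   # line 1 is the header row, which is skipped below
Get-Content $csvfile | Select-Object -Skip 1 -First 1000 | ForEach-Object {
    $lineNumber++
    $value = $_.Split(",")[$dateColumnIndex]
    $parsed = [datetime]::MinValue
    if (-not [datetime]::TryParse($value, [ref]$parsed)) {
        "Line $lineNumber does not parse as a date: '$value'"
    }
}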
Please help me figure out the problem. Thanks.