0

Trying to complete step one of a project where I create a new table with

CREATE TABLE Customers (customerID SMALLINT,firstName VARCHAR(25),lastName VARCHAR(25),street VARCHAR(50),city VARCHAR(50),state VARCHAR(25),zipCode VARCHAR(10),telephone VARCHAR(15),PRIMARY KEY (customerID));

Then, I am supposed to upload a csv file from the application.

LOAD DATA INFILE "/home/codio/workspace/customers.csv" INTO TABLE Customers FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

or

LOAD DATA INFILE '/home/codio/workspace/customers.csv' INTO TABLE Customers FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';

but both of these provide the error code

ERROR 1062 (23000): Duplicate entry '32767' for key 'PRIMARY'

Opening the file and searching for 32767 only yields one result in the telephone line. There is no duplicate. I have DELETE FROM Customers and DROP TABLE Customers. Both times confirming the table is empty by

mysql> SELECT * FROM Customers;
Empty set (0.00 sec)

I have been arguing with this for hours and am unsure what else to do. The worst part is I feel like I am not doing anything wrong. But that is where you come in. Fingers crossed.

I have deleted the table, cleared the table, recreated the table, at one point, went so far as creating a brand new file.

Progman
  • 16,827
  • 6
  • 33
  • 48
  • have you considered the [`IGNORE`](https://stackoverflow.com/a/13425727) keyword? – Bagus Tesa Jul 23 '23 at 08:39
  • Do you have more than 32767 rows in your CSV file you try to import? Do you have ids which are bigger than the value 32767? – Progman Jul 23 '23 at 08:51

1 Answers1

1

You have defined your primary key as

customerID SMALLINT

which means that it can only hold values between -32768 and 32767. When you use the LOAD DATA statement any value outside this range gets truncated as mentioned in the manual for the LOAD DATA statement:

  • Invalid values produce warnings rather than errors, and are converted to the “closest” valid value for the column data type. Examples:

    • [...]

    • An out-of-range numeric or temporal value is clipped to the closest endpoint of the range for the column data type.

This means that your first new row with a larger customerId value than 32767 gets clipped to 32767. The second new row with a larger customerId value than 32767 gets clipped to 32767 as well, resulting in a primary key conflict.

The solution is to use a bigger integer type for the customerId column like INT or BIGINT.

Progman
  • 16,827
  • 6
  • 33
  • 48