1

I have a CSV file that has data that looks like this:

54861,54850,Doe,John,NULL,-87.1181407064,30.3773576858
54862,54851,Doe,Linda,Lee,None,None

The last two columns are longitude and latitude

I'm using mysql.connector to insert into the database. It handles the first record, okay but, because the last two columns are floats, it croaks when it gets to the second record with the values set to "None".

I tried programmatically to set the values to NULL but it won't accept that either.

I've tried a couple of different things but can't figure it out.

This has to be done in Python.

Here is the code:

import sys
import mysql.connector
import csv
import os
from mysql.connector import Error
from mysql.connector import errorcode


#Specify the import file
try:


    inputCSV = 'geocoded_test.csv'

#Open the file and give it a handle

    csvFile = open(inputCSV, 'r')

#Create a reader object for the input file

    reader = csv.reader(csvFile, delimiter = ',')

except IOError as e:
    print("The input file ", inputCSV, " was not found", e)
    exit()


try:
   mydb = mysql.connector.connect(host='localhost',
                             database='wordpress',
                             user='wp_user',
                             password='XXXXXXXX!'
                             )

   mycursor = mydb.cursor()

except mysql.connector.Error as error:
    print( "Failed to connect to database: {}".format(error))
    exit()

try:

    record_count = 0

    for row in reader:
        contact_id,address_id,last_name, first_name, middle_name, longitude, latitude = row
        print(row)

# It is here that I want to convert to NULL. 

        if longitude == "None":
            longitude = -1.0
        if latitude == "None":
            latitude = -1.0

   #Update single record now

        mycursor.execute("""
            update civicrm_address
            set 
            geo_code_1 = %s,
            geo_code_2 = %s
        where
            id = %s 
        and
            location_type_id = %s 
        """,
        (latitude, longitude, address_id, 6)
        )
        mydb.commit()
        print(mycursor.rowcount)
        record_count +=1
        print("Record", record_count, " updated successfully")


finally:
    print(record_count, " records updated")
    #closing database connection.
    if(mydb.is_connected()):
        mydb.close()
        print("connection is closed")
Dennis M. Gray
  • 332
  • 1
  • 3
  • 17

1 Answers1

-1

One option would be to use LOAD DATA with custom logic which catches the None string values and then converts them to NULL:

LOAD DATA LOCAL INFILE 'your_file.csv'
INTO TABLE yourTable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n' 
(col1, col2, last, first, middle, @lat, @lng)
SET lat = CASE WHEN @lat = 'None'
               THEN NULL
               ELSE CAST(@lat AS DECIMAL(10,8)) END,
SET lng = CASE WHEN @lng = 'None'
               THEN NULL
               ELSE CAST(@lng AS DECIMAL(10,8)) END;

I assume above that your latitude column is called lat, and your longitude column lng. I just dummy placeholder names for the other columns, but you would need to use the actual column names to make the above load work.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • You can certainly call `LOAD DATA` from Python, [see here](https://stackoverflow.com/questions/1231900/mysql-load-data-local-infile-example-in-python). – Tim Biegeleisen Jul 17 '19 at 08:03
  • I'm looking for a pure Python solution with the mysql.connector module. I could also load the data as a table then join it to the table I'm trying to update. – Dennis M. Gray Jul 17 '19 at 08:19
  • This is a pure Python solution. `LOAD DATA` is basically the fastest way to load CSV data into a MySQL table, and it should be considered as an option for any programming language (Python included). – Tim Biegeleisen Jul 17 '19 at 09:32
  • I agree it is fast but I'm still looking for a means to convert the data to NULL. Collumns in the csv file which are strings but have a value of NULL are updated as NULL in the database. The float type is causing the problem. – Dennis M. Gray Jul 17 '19 at 10:40