1

I have a csv events.csv:

"PATIENT ID,PATIENT NAME,EVENT TYPE,EVENT VALUE,EVENT UNIT,EVENT TIME"
"1,Jane,HR,82,beats/minute,2021-07-07T02:27:00Z"
"1,Jane,RR,5,breaths/minute,2021-07-07T02:27:00Z"

Then I use python csv to read it:

import csv
with open(r'/Users/williaml/Downloads/events.csv') as csvfile: 
    spamreader = csv.DictReader(csvfile, delimiter=',' ,quotechar=' ')
    for row in spamreader:            
        print(row)

Output:

{'"PATIENT ID': '"1', 'PATIENT NAME': 'Jane', 'EVENT TYPE': 'HR', 'EVENT VALUE': '82', 'EVENT UNIT': 'beats/minute', 'EVENT TIME"': '2021-07-07T02:27:00Z"'}

{'"PATIENT ID': '"1', 'PATIENT NAME': 'Jane', 'EVENT TYPE': 'RR', 'EVENT VALUE': '5', 'EVENT UNIT': 'breaths/minute', 'EVENT TIME"': '2021-07-07T02:27:00Z"'}

And I tried to insert these rows into database:

import psycopg2
conn = psycopg2.connect(host='localhost', dbname='patientdb',user='username',password='password',port='')
cur = conn.cursor()
import csv
with open(r'apps/patients/management/commands/events.csv') as csvfile:
        spamreader = csv.DictReader(csvfile, delimiter=',' ,quotechar=' ')
        for row in spamreader:
                cur.execute(f"""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES
  ({row['"PATIENT ID']},{row['EVENT TYPE']},{row['EVENT VALUE']},
   {row['EVENT UNIT']},{row['EVENT TIME"']})""")

Error:

psycopg2.errors.UndefinedColumn: column "1,HR,82,
   beats/minute,2021-07-07T02:27:00Z" does not exist
LINE 2:   ("1,HR,82,
           ^

However if I directly run the following sql in database command terminal it works:

INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES('1','HR','82','beats/minute','2021-07-07T02:27:00Z');

So I think it seems this part of code is incorrect:

cur.execute(f"""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES
      ({row['"PATIENT ID']},{row['EVENT TYPE']},{row['EVENT VALUE']},
       {row['EVENT UNIT']},{row['EVENT TIME"']})""")

Any friend can help?

William
  • 3,724
  • 9
  • 43
  • 76
  • Can you see anything in `{'"PATIENT ID': '"1',` which might be the reason for this. This is from your Output mentioned above which you are writing into db. – avats Nov 11 '21 at 17:35
  • 1
    Why not to use [COPY](https://www.postgresqltutorial.com/import-csv-file-into-posgresql-table/) ? – balderman Nov 11 '21 at 17:48
  • @balderman Not every column is needed.For example the PATIENT NAME is not needed can I still use copy ? – William Nov 11 '21 at 17:51
  • @William I think COPY knows how to handle this situation. – balderman Nov 11 '21 at 17:53
  • See [this](https://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table#comment19538681_2987451) for how to use `copy` for specific columns. – avats Nov 11 '21 at 17:56
  • Or more generally [this](https://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table) question on SO. – avats Nov 11 '21 at 17:57
  • 1
    Does the CSV file really have double quotes at the beginning and end of each line, and nowhere else? That would make each line a single field, which seems highly suspicious. That should be fixed upstream, if possible. And once you have sorted that out, please please please use [proper parameter passing](https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries) and not string formatting to pass values to SQL queries. – Ture Pålsson Nov 11 '21 at 18:16
  • Second @TurePålsson comment. This starts with improperly formatted `CSV` file. Fix that and the rest will follow. – Adrian Klaver Nov 11 '21 at 19:02

2 Answers2

1

Use this:

cur.execute("""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES ({1},{2},{3},{4},{5})"""
            .format(row['"PATIENT ID'][1:], row['EVENT TYPE'], row['EVENT VALUE'], row['EVENT UNIT'], row['EVENT TIME"'][:-1]))

So, this basically handles your extra quotes in the output dict which I have mentioned in the comment here which is causing this issue.

And, that's why

INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES('1','HR','82','beats/minute','2021-07-07T02:27:00Z');

passes on db terminal as you can see the difference between values inserted here in both ways.

UPDATE: Avoid using python's string formatting for queries as it can lead to wrong queries or vulnerable points for sql injection. See parameters to know the correct ways to do this, as Adrian have mentioned in comments below.

avats
  • 437
  • 2
  • 10
  • Thank you for your answer,after using your code,the error change to:psycopg2.errors.SyntaxError: syntax error at or near "T02" LINE 3: beats/minute,2021-07-07T02:27:00Z) – William Nov 11 '21 at 17:48
  • Do not use formatted strings to do this. Read this section of docs [Parameters](https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries) for correct way to do this. This should not be an accepted answer as it promotes bad coding. – Adrian Klaver Nov 11 '21 at 18:54
  • @AdrianKlaver psycopg has nowhere mentioned explicitly that one should avoid using string formatting for queries. Also, they have used `.format` in the last code block in your mentioned [reference](https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries). – avats Nov 11 '21 at 21:17
  • That is the `format` provided by the [psycopg2.sql](https://www.psycopg.org/docs/sql.html) and is a method off `sql.SQL()`. It is an different thing, in particular because it does correct escaping. [f strings](https://www.python.org/dev/peps/pep-0498/#specification) are just a form of string interpolation and are not suitable for building SQL queries. – Adrian Klaver Nov 11 '21 at 21:29
  • Ok, my bad! But, I think the question is more about where OP was wrong and what a solution in his perspective would be. And I answered accordingly. I'll update my answer. – avats Nov 11 '21 at 21:38
  • Well I got turned around also as I referred to the `f string` the OP was using not the `str.format()` you used. Though the advice is the same as `str.format()` is yet another form of string interpolation. – Adrian Klaver Nov 11 '21 at 21:40
  • To make the answer acceptable and to return the vote correct your example to use parameters. – Adrian Klaver Nov 11 '21 at 22:33
0

so one problem with CSV is " at the beginning and end of each line. The way you are interpreting it is causing it to become part of SQL expression.

           here 
LINE 2:   ("1,HR,82

And this is causing error, as it's not correctly closed" and actually it not intended for it to be in the generated SQL.

matkk
  • 115
  • 1
  • 8