I have written a program that reads words from a text file, inserts them into a SQLite database, and also uses them as strings. Some of the words contain German umlauts: ä, ö, ü, ß.
Here is a prepared piece of code. I tried both # -*- coding: iso-8859-15 -*- and # -*- coding: utf-8 -*- as the first line; it made no difference(!).
# -*- coding: iso-8859-15 -*-
import sqlite3
dbname = 'sampledb.db'
filename ='text.txt'
con = sqlite3.connect(dbname)
cur = con.cursor()
cur.execute('''create table IF NOT EXISTS table1 (id INTEGER PRIMARY KEY,name)''')
#f=open(filename)
#text = f.readlines()
#f.close()
text = u'süß'
print (text)
cur.execute("insert into table1 (id,name) VALUES (NULL,?)",(text,))
con.commit()
sentence = "The name is: %s" %(text,)
print (sentence)
con.close()
The above code runs well. But I need to read 'text' from a file containing the word 'süß'. When I uncomment the three lines (f=open(filename) ...) and comment out text = u'süß', I get this error:
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.
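A possible contributor to this error, stated as an assumption rather than a confirmed diagnosis: f.readlines() returns a list of lines, and sqlite3 cannot bind a list as a single parameter. The minimal sketch below reproduces that situation with an in-memory database; the names lines, list_rejected and stored are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database, used only for this sketch (not the sampledb.db file).
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table IF NOT EXISTS table1 (id INTEGER PRIMARY KEY, name)")

# Stands in for what f.readlines() returns: a list with one entry per line.
lines = [u"süß\n"]

try:
    # Binding the whole list as one parameter is rejected by sqlite3.
    cur.execute("insert into table1 (id,name) VALUES (NULL,?)", (lines,))
    list_rejected = False
except sqlite3.Error:
    # The exact exception subclass differs between Python versions.
    list_rejected = True

# Binding a single line, with the trailing newline stripped, is accepted.
cur.execute("insert into table1 (id,name) VALUES (NULL,?)", (lines[0].strip(),))
con.commit()

stored = cur.execute("select name from table1").fetchone()[0]
con.close()
```

If this assumption holds, reading with f.read() or inserting one line at a time would avoid the binding error, independent of the encoding question.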
I tried the codecs module to read the file as utf-8 and as iso-8859-15, but I could not decode the contents to the string 'süß', which I need to complete my sentence at the end of the code.
Once I tried decoding as utf-8 before inserting into the database. That worked, but I could not use the result as a string.
Is there a way to read süß from a file and use it both for inserting into sqlite and as a string?
More detail:
Here I add more details for clarification. I have used codecs.open before.
The text file containing the word süß is saved as utf-8. Using f=codecs.open(filename, 'r', 'utf-8') and text=f.read(), I read the file as the unicode string u'\ufeffs\xfc\xdf'. Inserting this unicode string into sqlite3 works smoothly: cur.execute("insert into table1 (id,name) VALUES (NULL,?)",(text,)).
The problem is here: sentence = "The name is: %s" %(text,) gives u'The name is: \ufeffs\xfc\xdf', and I also need print(text) to output süß, but print(text) raises this error: UnicodeEncodeError: 'charmap' codec can't encode character u'\ufeff' in position 0: character maps to <undefined>.
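The leading u'\ufeff' in the value above is a byte order mark (BOM) at the start of the file. A minimal sketch, assuming the file really is UTF-8 with a BOM: Python's 'utf-8-sig' codec decodes UTF-8 and drops a leading BOM, so the same value can be used for the insert and for string formatting. The temporary file and the names path and stored are hypothetical stand-ins for text.txt and the real database.

```python
import io
import os
import sqlite3
import tempfile

# Create a UTF-8 file that starts with a BOM, standing in for text.txt.
path = os.path.join(tempfile.mkdtemp(), "text.txt")
with io.open(path, "w", encoding="utf-8-sig") as f:
    f.write(u"süß")

# 'utf-8-sig' decodes UTF-8 and removes a leading BOM if one is present.
with io.open(path, "r", encoding="utf-8-sig") as f:
    text = f.read().strip()

# In-memory database, used only for this sketch.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table IF NOT EXISTS table1 (id INTEGER PRIMARY KEY, name)")
cur.execute("insert into table1 (id,name) VALUES (NULL,?)", (text,))
con.commit()

# The same unicode value is used to build the sentence.
sentence = u"The name is: %s" % (text,)

stored = cur.execute("select name from table1").fetchone()[0]
con.close()
```

Whether print(text) then succeeds still depends on the encoding of the console, which this sketch does not address; it only removes the BOM character that the error message names.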
Thank you.