I would like to know how to write not only the 128 ASCII characters but also Unicode characters to a txt file. I want to create the file and start writing, and later reopen that file and continue writing below the existing content.

Keep in mind that the file must contain the actual characters. For example, I should not see a literal

 \n 

but a real line break, and likewise for every Unicode character that can be entered. The point is not for a program to read the file, but to have a txt file that I can read comfortably in Notepad.

def write_file(text, name_table):
    file = open("llenado_tabla_" + name_table + ".txt","a")
    file.write(text)
    file.close()

def create_file(name_table, atributos):
    # Create (or overwrite) the file and write the INSERT header
    file = open("llenado_tabla_" + name_table + ".txt","w")
    file.write("-- Llenando tabla " + name_table + '\n')
    file.write("INSERT INTO\n")
    file.write(name_table + '(')
    for i in range(len(atributos)):
        if i == len(atributos) - 1:
            file.write(atributos[i] + ') \n VALUES \n')
        else:
            file.write(atributos[i] + ',')
    file.close()

When I run it, I get this error:

Traceback (most recent call last):
  File "D:\Projects2.0\4to semestre\Bases de datos\creador_tablas.py", line 96, in <module>
    main()
  File "D:\Projects2.0\4to semestre\Bases de datos\creador_tablas.py", line 95, in main
    introducir_registros(atributos, tipo_dato, cantidad_id, id_generado, nombre_tabla)
  File "D:\Projects2.0\4to semestre\Bases de datos\creador_tablas.py", line 62, in introducir_registros
    insertar_datos_txt(datos_tabla, False, nombre_tabla)
  File "D:\Projects2.0\4to semestre\Bases de datos\creador_tablas.py", line 29, in insertar_datos_txt
    escribe_fichero(texto, nombre_tabla)
  File "D:\Projects2.0\4to semestre\Bases de datos\creador_tablas.py", line 3, in escribe_fichero
    archivo.write(texto)
  File "C:\Python39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2212' in position 25: character maps to <undefined>
FJGJFOR

1 Answer

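Python will handle this for you transparently. Python, especially Python 3, works natively with Unicode characters: every str is a sequence of Unicode code points. When you write them to a file, Python encodes them with the platform's default text encoding, which on many systems (including macOS and most Linux setups) is UTF-8. The same applies when reading.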

Here's a sample:

data = "3 unicode chars > ☀️‼️ <"

with open('/tmp/data.uni', 'w') as f:
    f.write(data)

with open('/tmp/data.uni') as f:
    read_data = f.read()

print(read_data)

Result:

3 unicode chars > ☀️‼️ <

This gets somewhat mangled on the Stack Overflow site, but on my Mac in PyCharm there are three emojis in my editor and in the output when I run the program.

Here's what it looks like on my Mac:

[Image: emojis on my Mac]
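
Which encoding open() uses when none is given can be checked directly. The following is a minimal sketch and not part of the original sample: it assumes the Python 3.9 behaviour shown in the question's traceback, where text-mode open() defaults to locale.getpreferredencoding(False), and it uses a hypothetical helper named can_encode to compare UTF-8 with cp1252 (the codec named in the traceback).

```python
import locale

# Encoding that text-mode open() uses when no encoding argument is passed
# (Python 3.9 behaviour). The value depends on the platform and its settings.
default_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_encoding)


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding.

    Hypothetical helper, for illustration only.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "a \u2212 b"  # U+2212 MINUS SIGN, the character from the traceback

print(can_encode(sample, "utf-8"))   # UTF-8 can represent any Unicode character
print(can_encode(sample, "cp1252"))  # cp1252 has no mapping for U+2212
```

If the first line printed is cp1252 rather than UTF-8, a plain open(path, 'w') on that machine would be expected to fail on such characters in the same way as in the question.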

CryptoFool
  • I removed the print because I am running it from the console, but it still shows the following error: Traceback (most recent call last): File "D:\Projects2.0\4to semestre\Bases de datos\ejem.py", line 3, in <module> f.write(data) File "C:\Python39\lib\encodings\cp1252.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode characters in position 18-22: character maps to <undefined> – FJGJFOR Nov 29 '20 at 04:43
  • Line 3: f.write(data) – FJGJFOR Nov 29 '20 at 04:44
  • I can see it works great for you, but in my case I do not use an IDE; I run the script from the terminal. I took out the print in case that was the problem, so the script only has to create the file and I can check whether those special characters show up, but the file is still not created – FJGJFOR Nov 29 '20 at 04:50
  • @FJGJFOR - well, the problem isn't Python. That's what my example shows. It is able to deal with Unicode source code and read, process, and write Unicode without the programmer having to do anything special. That's all Python can do. It couldn't do it any better or make it any easier. What your system is able to do with Unicode is a totally separate matter. Python can be made to store Unicode in different file formats. So if it turns out your system doesn't like the one it uses by default, you could use a more compatible one. The main thing to know is "string" == "Unicode" in Python 3. – CryptoFool Nov 29 '20 at 05:06
  • The answer seems great to me, but in my specific case it does not work. Thank you very much anyway – FJGJFOR Nov 29 '20 at 05:21
  • -1: the problem with this answer is that it does not account for varying environments. The default text encoding will depend on how the computer is set up, and some - not including yours, but including OP's - will not be able to handle all characters. The solution is to *explicitly specify a text encoding*, as in the duplicate I linked. – Karl Knechtel Nov 29 '20 at 05:29
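
The explicit-encoding approach from the last comment could look like the sketch below. It reuses the question's write_file and create_file names and the llenado_tabla_ file prefix; the encoding="utf-8" argument, the returned path, and the example table name and row are assumptions added for illustration, not the asker's final code.

```python
def create_file(name_table, atributos):
    """Create (or overwrite) the file and write the INSERT header as UTF-8."""
    path = "llenado_tabla_" + name_table + ".txt"
    with open(path, "w", encoding="utf-8") as file:
        file.write("-- Llenando tabla " + name_table + "\n")
        file.write("INSERT INTO\n")
        file.write(name_table + "(" + ",".join(atributos) + ") \n VALUES \n")
    return path


def write_file(text, name_table):
    """Reopen the same file in append mode and add text, encoded as UTF-8."""
    path = "llenado_tabla_" + name_table + ".txt"
    with open(path, "a", encoding="utf-8") as file:
        file.write(text)
    return path


if __name__ == "__main__":
    # Hypothetical table name and row; the row contains U+2212,
    # the character that triggered the error in the question.
    create_file("ejemplo", ["id", "valor"])
    write_file("(1, '3 \u2212 2'),\n", "ejemplo")
```

Because the encoding is stated in both calls, the result no longer depends on the machine's default. If an older editor misreads the file, encoding="utf-8-sig" (UTF-8 with a byte order mark) is a possible alternative to try.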