I have an Example.txt file like the following:

Fu
ih
✌
te
eirt
a
nq
awi
oq
qu
acisrrtr
qopa

My real .txt file is much longer.

Now I want to read lines 2, 5, 8, 11, ... into a list with Python.

I tried to read in every line and then take only the relevant lines from the list, but the problem is that I can't read symbols like ✌ (which occur only in lines 3, 6, 9, 12, ...).

I tried the following Python code to do this but it didn't work:

Column1 = []

pfad = r"C:/Users/.../"

with open(pfad + "Example1.txt", "r") as f:
    reader = csv.reader(f, delimiter = '\n')
    for row in reader:
        Column1.append(row[0])

I also tried

f.readlines()

instead of

csv.reader

but that doesn't work either.

Can someone please help me?

Best regards

Fab_Freak

  • how exactly is readlines() not working? – Psytho Sep 09 '22 at 14:18
  • Because the symbols can't be decoded with the default 'charmap' codec. You need to specify a different encoding when opening the file: ```with open('yourfile.txt', encoding="utf8") as f``` – Stefan Sep 09 '22 at 14:24
  • Does this answer your question? [UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to &lt;undefined&gt;](https://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character) – Stefan Sep 09 '22 at 14:25

3 Answers

0
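line_numbers = []  # zero-based indices of the lines to keep

with open("file.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()  # read once; a second readlines() call would return []

data = [lines[num] for num in line_numbers]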
AsukaMinato
  • Assuming the line_numbers list is not empty, this would result in exactly the same error since the encoding is wrong for the symbols – Stefan Sep 09 '22 at 14:34
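
As a rough sketch of the same idea for larger files, the lines can be picked out while streaming through the file instead of loading everything with readlines(). The function name `read_selected_lines`, the zero-based numbering, and the UTF-8 encoding are assumptions made for illustration; they are not part of the answer above.

```python
def read_selected_lines(path, line_numbers, encoding="utf-8"):
    """Return the lines whose zero-based index appears in line_numbers."""
    wanted = set(line_numbers)
    if not wanted:
        return []
    last = max(wanted)
    selected = []
    # The explicit encoding is what lets symbols such as ✌ be decoded.
    with open(path, "r", encoding=encoding) as f:
        for index, line in enumerate(f):
            if index in wanted:
                selected.append(line.rstrip("\n"))
            if index >= last:
                break  # nothing left to collect, stop reading early
    return selected
```

The result is always in file order, regardless of the order of `line_numbers`, and reading stops as soon as the highest requested line has been seen.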
0

Instead, you can try it like this:

**Convert txt to csv**

**Csv to df**

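import pandas as pd

# header=None keeps the first line ("Fu") as data instead of using it as the column name
df = pd.read_fwf('path_to_text_file_with_emojis.txt', header=None)
df.to_csv('output.csv')  # writes the index and the text as two columns

df = pd.read_csv('output.csv')
df.columns = ['num', 'text']
one_string = ' '.join(df['text'].tolist())
print(one_string)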

output

'Fu ih ✌ te eirt a nq awi oq qu acisrrtr qopa'
0

Try passing encoding="utf8" to the open function; that should work.
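
A minimal sketch of what that looks like in practice. Without an encoding argument, open() falls back to the platform's default encoding, which on Windows is often a legacy code page rather than UTF-8. The helper name `read_lines` is hypothetical, and the sketch assumes the file really is UTF-8 encoded.

```python
def read_lines(path, encoding="utf-8"):
    """Read a text file into a list of lines, decoding it with an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        # Strip only the trailing newline so the text itself is left untouched.
        return [line.rstrip("\n") for line in f]
```

If the file was saved with a byte order mark, passing encoding="utf-8-sig" instead should strip it from the first line.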

  • See the answer in this link: https://stackoverflow.com/questions/53046443/read-txt-with-emoji-characters-in-python – karam yakoub agha Sep 09 '22 at 14:30
  • Thanks, this works. Sometimes it's that easy. Nevertheless, I would like to know how to read in only certain lines, because if I had really large .txt files it would probably be quicker. – Fab_Freak Sep 09 '22 at 14:56
  • Please check this article on reading certain lines: https://www.geeksforgeeks.org/how-to-read-specific-lines-from-a-file-in-python/ If your purpose is to get the lines that have emojis, or to erase the emojis from the lines, then please check the emoji library. You can install it with pip install emoji. See this sample code:
      import emoji
      file = open("t.txt", encoding='utf8')
      for i, line in enumerate(file):
          for ch in line:
              if emoji.is_emoji(ch):
                  print(line)
    – karam yakoub agha Sep 09 '22 at 15:37
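
For the regular pattern asked about in the question (lines 2, 5, 8, 11, ...), one possible sketch uses itertools.islice to step through the file. The function name `read_every_nth_line` and its defaults are assumptions chosen to match the question's example. Note that the file is still read sequentially, since lines have variable length; the gain is that only the wanted lines are kept in memory. The skipped lines are still decoded, so the explicit encoding is needed either way.

```python
from itertools import islice


def read_every_nth_line(path, start=1, step=3, encoding="utf-8"):
    """Return every step-th line, beginning at the zero-based index start.

    With start=1 and step=3 this yields lines 2, 5, 8, 11, ... in
    one-based numbering.
    """
    with open(path, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in islice(f, start, None, step)]
```

Called on the example file from the question, this would return the lines "ih", "eirt", "awi" and "acisrrtr".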