I have a txt file with the following data:

"a" "b" "c" "d" "e" "f" "g" "h"
"2" "222" "0.111" "b1" "19.11" "17.96" "2.85" ""
"3" "333" "0.123" "b2" "42.79" "26.68" "2.85" ""
"4" "444" "0.105" "b3" "17.28" "16.50" "2.85" ""
"5" "555" "0.106" "b4" "18.74" "19.78" "2.88" ""
"6" "666" "0.107" "b5" "5.79" "9.93" "2.88" ""

I am trying to read the file with pandas using the following code:

df = pd.read_csv("test.txt",sep="\t",quotechar='"',quoting=csv.QUOTE_NONE)

When I try to get the columns with room_file.columns, it prints this:

Index(['"uid" "raum_code" "tuerschild" "raumbezeichung" "raumflaeche" "raumumfang" "raumhoehe" "Comments"'], dtype='object')

But the df output has just 1 column:

    "a" "b" "c" "d" "e" "f" "g" "h"
0   "2" "222" "0.111" "b1" "19.11" "17.96" "2.8...
1   "3" "333" "0.123" "b2" "42.79" ...
2   "4" "444" "0.105" "b3" "17.28" "16.50" "2...
3   "5" "555" "0.106" "b4" "18.74" "19.78" "2...
4   "6" "666" "0.107" "b5" "5.79"...

5 rows × 1 columns

But it is supposed to be 5 rows × 8 columns.

I have already tried:

Solution 1 Solution 2

And I also tried this:

df = pd.read_csv("test.txt",sep='\t',header=0)

But the result is the same every time. Can you please help me read the txt file as a dataframe?

Murat Demir
  • 2
    Pandas thinks your file should look like `"2"delimiter"222"delimiter...` because you told it that there will be the word `delimiter` between each column; since your columns are separated by a space, not by the word `delimiter`, it is all one column. `\t` would work if there is a tab character. Your "CSV" would be read with `delimiter=' ', quotechar='"'`. – Amadan Feb 22 '22 at 15:14
  • `delimiter` was a variable, sorry. I fixed it in the question. @Amadan – Murat Demir Feb 22 '22 at 15:17
  • @BigBen I mentioned in the question that I already tried that – Murat Demir Feb 22 '22 at 15:18
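
To make the point from the comments concrete, here is a minimal sketch that reads the sample data from the question twice, once with the tab separator and once with a space separator. It assumes the file really is space-separated, as it appears in the post; `io.StringIO` stands in for `test.txt`, and the names `sample`, `narrow`, and `wide` are only illustrative.

```python
import csv
import io

import pandas as pd

# In-memory stand-in for test.txt (assumed to be space-separated, as posted).
sample = (
    '"a" "b" "c" "d" "e" "f" "g" "h"\n'
    '"2" "222" "0.111" "b1" "19.11" "17.96" "2.85" ""\n'
    '"3" "333" "0.123" "b2" "42.79" "26.68" "2.85" ""\n'
    '"4" "444" "0.105" "b3" "17.28" "16.50" "2.85" ""\n'
    '"5" "555" "0.106" "b4" "18.74" "19.78" "2.88" ""\n'
    '"6" "666" "0.107" "b5" "5.79" "9.93" "2.88" ""\n'
)

# The call from the question: there is no tab in the data, so each line
# is treated as a single field.
narrow = pd.read_csv(io.StringIO(sample), sep="\t", quotechar='"',
                     quoting=csv.QUOTE_NONE)

# Same data, split on spaces, with the double quotes stripped from each field.
wide = pd.read_csv(io.StringIO(sample), sep=" ", quotechar='"')

print(narrow.shape)
print(wide.shape)
print(list(wide.columns))
```

If the real file turns out to contain tabs after all, the first call would be the one to start from, so it is worth checking the raw bytes of the file (for example with `open("test.txt", "rb").readline()`) before settling on a separator.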

2 Answers

Let the content of file.txt be

"uid" "raum_code" "tuerschild" "raumbezeichung" "raumflaeche" "raumumfang" "raumhoehe" "Comments"
"2" "222" "0.111" "Büro" "19.11" "17.96" "2.85" ""
"3" "333" "0.123" "Besprechungsraum" "42.79" "26.68" "2.85" ""
"4" "444" "0.105" "Büro" "17.28" "16.50" "2.85" ""
"5" "555" "0.106" "Büro" "18.74" "19.78" "2.88" ""
"6" "666" "0.107" "Fernmeldetechnik" "5.79" "9.93" "2.88" ""

Observe that spaces are used as separators, that every value (both string and numeric) is quoted, and that the first line is the header. Given that, a suitable way to read the file is

import csv
import pandas as pd
df = pd.read_csv("file.txt",sep=' ',quotechar='"',quoting=csv.QUOTE_ALL)
print(df.shape)  # (5, 8)
print(df)

output

   uid  raum_code  tuerschild    raumbezeichung  raumflaeche  raumumfang  raumhoehe  Comments
0    2        222       0.111              Büro        19.11       17.96       2.85       NaN
1    3        333       0.123  Besprechungsraum        42.79       26.68       2.85       NaN
2    4        444       0.105              Büro        17.28       16.50       2.85       NaN
3    5        555       0.106              Büro        18.74       19.78       2.88       NaN
4    6        666       0.107  Fernmeldetechnik         5.79        9.93       2.88       NaN
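
A side note on the output above: the quoted numbers are still inferred as numeric columns, and the empty `""` fields in Comments are read as NaN. If empty strings are preferred there, one possible option is `keep_default_na=False`. The sketch below uses a shortened, made-up three-column version of the file, so `sample`, `default_df`, and `kept_df` are illustrative names only.

```python
import io

import pandas as pd

# Shortened, hypothetical excerpt with the same layout as file.txt.
sample = (
    '"uid" "raumhoehe" "Comments"\n'
    '"2" "2.85" ""\n'
    '"3" "2.88" ""\n'
)

# Default behaviour: empty fields become NaN.
default_df = pd.read_csv(io.StringIO(sample), sep=" ", quotechar='"')

# keep_default_na=False: empty fields stay as empty strings,
# while the numeric columns are still parsed as numbers.
kept_df = pd.read_csv(io.StringIO(sample), sep=" ", quotechar='"',
                      keep_default_na=False)

print(default_df["Comments"].isna().all())
print(kept_df["Comments"].tolist())
print(kept_df["uid"].tolist())
```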
Daweo

I tried this and it gives the correct result:

import pandas as pd

# Split on single spaces; the double quotes around each field are stripped.
df = pd.read_csv("test.txt", header=0, quotechar='"', sep=" ")

print(df.columns)
print(df.shape)
print(df)
Amir Aref