Load data from txt with pandas

Question

I am loading a txt file containing a mix of float and string data. I want to store them in an array where I can access each element. Right now I am just doing:

import pandas as pd

data = pd.read_csv('output_list.txt', header = None)
print(data)

This is the structure of the input file: 1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt.

Now the data are imported as a single column. How can I split it so that the different elements are stored separately (so I can call data[i,j])? And how can I define a header?

score 348 · Answer 1 · edited Aug 03 '17 at 10:07

You can use:

data = pd.read_csv('output_list.txt', sep=" ", header=None)
data.columns = ["a", "b", "c", "etc."]

Add sep=" " to your call, with a single space between the quotes, so that pandas splits each line on spaces and places the values in separate columns. Assigning to data.columns names the columns; the list must contain exactly one name per column.
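
As a minimal sketch of the approach above, the snippet below parses two lines in the format shown in the question. The second line and the six column names are made up for illustration, and an in-memory buffer stands in for output_list.txt.

```python
import io
import pandas as pd

# First line is the sample from the question; the second line is invented.
sample = io.StringIO(
    "1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt\n"
    "2 1 1500.0 65.1 1200.5 /other_file.txt\n"
)

# Split on single spaces; header=None because the file has no header row.
data = pd.read_csv(sample, sep=" ", header=None)

# One name per column (placeholder names, not from the original file).
data.columns = ["id", "flag", "x", "y", "z", "path"]

print(data.shape)
print(data.dtypes)
```

Numeric columns are parsed as numbers and the file path stays a string, so mixed data ends up in separate, typed columns.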

edited Aug 03 '17 at 10:07

Chrisji

answered Feb 04 '14 at 07:53

pietrovismara

Thanks! How can I access an element of the table? – albus_c Feb 04 '14 at 07:57

If you want to select a column, use data.a if you named the column "a". – pietrovismara Feb 04 '14 at 08:01

Or, if you want a single value, you can use data.a[1]. With the default integer index this returns the second row of the column; use data.a[0] for the first. – pietrovismara Feb 04 '14 at 08:20
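
The access patterns discussed in these comments, plus the data[i,j]-style lookup asked for in the question, can be sketched as follows. The sample rows and column names are illustrative assumptions.

```python
import io
import pandas as pd

# Two made-up rows in a simplified version of the question's format.
sample = io.StringIO("1 0 2000.0 /a.txt\n2 1 1500.0 /b.txt\n")
data = pd.read_csv(sample, sep=" ", header=None, names=["a", "b", "c", "path"])

first_a = data["a"][0]          # column "a", row with index label 0
by_position = data.iloc[1, 2]   # row 1, column 2, by integer position
by_label = data.loc[1, "path"]  # row label 1, column "path"

# For NumPy-style data[i, j] indexing, convert to an array first.
as_array = data.to_numpy()
print(as_array[0, 3])
```

data.iloc is the closest equivalent of data[i,j], since it indexes purely by position regardless of column names.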

score 148 · Answer 2 · answered Aug 13 '17 at 06:03

To add to the above answers, you could directly use:

df = pd.read_fwf('output_list.txt')

fwf stands for fixed-width formatted lines. By default the first line is treated as the header, so pass header=None if the file has none.
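
A small sketch of read_fwf on column-aligned text, assuming the file has no header row. The two rows are invented, and an in-memory buffer replaces the real file.

```python
import io
import pandas as pd

# Made-up rows whose fields start at the same character positions.
text = (
    "1 0 2000.0 70.28 /a.txt\n"
    "2 1 1500.0 65.10 /b.txt\n"
)

# Column boundaries are inferred from the whitespace shared by all rows.
df = pd.read_fwf(io.StringIO(text), header=None)

print(df.shape)
print(df)
```

Because the boundaries are inferred from character positions, this works best when the columns really are aligned; for ragged spacing a whitespace separator in read_csv is the safer choice.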

answered Aug 13 '17 at 06:03

Meenakshi Ravisankar

thank you! It solved my long time pending problem. – Deepak Harish Dec 06 '22 at 06:32

score 74 · Accepted Answer · edited Jun 27 '18 at 17:40

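You can do it like this:

import pandas as pd
df = pd.read_csv(r'file_location\filename.txt', delimiter="\t")

(for example, df = pd.read_csv(r'F:\Desktop\ds\text.txt', delimiter="\t"))

The r prefix makes the path a raw string, so backslash sequences such as \t and \f in the path are not treated as escape characters.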
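
Windows paths written with backslashes are safest as raw strings, because Python otherwise interprets sequences such as \t as escape characters. A short sketch of the difference, using an illustrative path fragment only:

```python
# In a normal string literal, "\t" becomes a single tab character.
plain = "ds\text.txt"

# In a raw string, the backslash and the "t" are kept as two characters.
raw = r"ds\text.txt"

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

Forward slashes are another option for paths passed to pandas on Windows.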

edited Jun 27 '18 at 17:40

Rajat Jain

answered Jun 27 '18 at 16:52

tulsi kumar

score 54 · Answer 4 · answered Aug 04 '16 at 03:25

@Pietrovismara's solution is correct but I'd just like to add: rather than having a separate line to add column names, it's possible to do this from pd.read_csv.

df = pd.read_csv('output_list.txt', sep=" ", header=None, names=["a", "b", "c"])

answered Aug 04 '16 at 03:25

Sam Perry

score 34 · Answer 5 · edited Oct 01 '17 at 13:14

You can use this:

import pandas as pd
dataset = pd.read_csv("filepath.txt", delimiter="\t")

edited Oct 01 '17 at 13:14

Marian Nasry

answered Oct 01 '17 at 06:57

ramakrishnareddy

As you can see from this answer, 'sep' and 'delimiter' are the same :) https://stackoverflow.com/a/49533103 – Давид Шико Jul 03 '20 at 05:44
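
A quick check of that equivalence, using a small made-up tab-separated sample held in memory:

```python
import io
import pandas as pd

# Invented tab-separated content standing in for a real file.
text = "1\t2.5\tfoo\n2\t3.5\tbar\n"

with_sep = pd.read_csv(io.StringIO(text), sep="\t", header=None)
with_delimiter = pd.read_csv(io.StringIO(text), delimiter="\t", header=None)

# Both keyword spellings should produce the same DataFrame.
print(with_sep.equals(with_delimiter))
```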

score 28 · Answer 6 · answered Apr 02 '19 at 11:07

If the data has no index column and you are not sure what the spacing is, you can use the following to let pandas assign its own index and split on any run of whitespace.

df = pd.read_csv('filename.txt', delimiter=r'\s+', index_col=False)

answered Apr 02 '19 at 11:07

bfree67

Equivalently you can specify the more verbose argument `delim_whitespace=True` instead of the `'\s+'` delimiter – ALollz Aug 28 '19 at 18:55
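
A sketch of whitespace splitting on irregularly spaced input. The rows are made up. The regex form is used here because delim_whitespace was deprecated in pandas 2.2 in favour of sep=r"\s+".

```python
import io
import pandas as pd

# Invented rows mixing single spaces, multiple spaces and a tab.
text = "1   0\t2000.0  /a.txt\n2 1   1500.0 /b.txt\n"

# r"\s+" matches any run of whitespace, so uneven spacing is handled.
df = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None)

print(df.shape)
print(df)
```

The raw-string prefix keeps Python from treating \s as an (invalid) string escape.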

score 9 · Answer 7 · answered Nov 20 '20 at 19:43

If you want to load the txt file with specified column names, you can use the code below. It worked for me.

import pandas as pd
data = pd.read_csv('file_name.txt', sep="\t", names=['column1_name', 'column2_name', 'column3_name'])

answered Nov 20 '20 at 19:43

mpriya

score 8 · Answer 8 · answered Sep 08 '19 at 20:10

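Based on recent changes in pandas, you can use read_csv instead of read_table. (read_table was deprecated in pandas 0.24, although that deprecation was reversed in 0.25, so both still work.)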

import pandas as pd
pd.read_csv("file.txt", sep = "\t")

answered Sep 08 '19 at 20:10

pari

score 7 · Answer 9 · answered Apr 14 '19 at 17:19

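You can import the text file using the read_table function like so:

import pandas as pd
df = pd.read_table('output_list.txt', header=None)

read_table splits on tabs by default, so a space-separated file is loaded as a single column and preprocessing will need to be done after loading.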

answered Apr 14 '19 at 17:19

Kaustubh J

score 2 · Answer 10 · answered Jun 02 '20 at 13:08

I usually take a look at the data first, or just try to import it and call data.head(). If you see that the columns are separated by \t, you should specify sep="\t"; otherwise use sep=" ".

import pandas as pd
data = pd.read_csv('data.txt', sep=" ", header=None)

answered Jun 02 '20 at 13:08

Mohamed Berrimi

Be careful with "header=None": if you also add an extra row with the maximum number of columns, you will get errors like "pandas.errors.ParserError: Error tokenizing data. C error: Expected N fields in line M", and it is very hard to understand why. Removing "header=None" fixes the problem. – Gustavo Rodríguez Oct 18 '21 at 10:17

score 2 · Answer 11 · answered Oct 26 '20 at 09:51

You can use the following, which I found most helpful. It skips the first two lines of the file and names the two columns.

df = pd.read_csv('data.txt', sep="\t", skiprows=[0, 1], names=['FromNode', 'ToNode'])

answered Oct 26 '20 at 09:51

Sunil Singh
