How to load a tsv file into a Pandas DataFrame?

Question

I'm trying to get a tsv file loaded into a pandas DataFrame.

This is what I'm trying and the error I'm getting:

>>> df1 = DataFrame(csv.reader(open('c:/~/trainSetRel3.txt'), delimiter='\t'))

Traceback (most recent call last):
  File "<pyshell#28>", line 1, in <module>
    df1 = DataFrame(csv.reader(open('c:/~/trainSetRel3.txt'), delimiter='\t'))
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 318, in __init__
    raise PandasError('DataFrame constructor not properly called!')
PandasError: DataFrame constructor not properly called!

For those coming to this answer in 2017+, use `read_csv('path_to_file', sep='\t')`. See [this answer below](https://stackoverflow.com/a/34548894/3707607) — Ted Petrou, Nov 06 '17 at 16:49

score 269 · Accepted Answer · edited Jun 05 '21 at 10:42

The .read_csv function does what you want:

pd.read_csv('c:/~/trainSetRel3.txt', sep='\t')

If the file has a header row, you can pass header=0 to say so explicitly (this is also the default behavior):

pd.read_csv('c:/~/trainSetRel3.txt', sep='\t', header=0)

Note: Prior to pandas 0.17.0, pd.DataFrame.from_csv was used. It was discouraged from 0.17.0, deprecated in 0.21.0, and removed in 1.0; the .from_csv documentation link redirects to the page for pd.read_csv.
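
As a minimal sketch of the call above, the snippet below parses a small in-memory TSV instead of a file on disk. The sample data and the names `tsv_text`, `df`, and `df_explicit` are made up for illustration; with a real file you would pass its path in place of the `io.StringIO` object.

```python
import io

import pandas as pd

# Hypothetical in-memory TSV standing in for a real file on disk.
tsv_text = "id\tname\tscore\n1\tada\t9.5\n2\tlinus\t8.0\n"

# sep="\t" tells read_csv to split fields on tab characters.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# header=0 states explicitly that the first line holds the column names.
df_explicit = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=0)

print(df.columns.tolist())
print(df.shape)
```

Both calls should yield the same frame here, since taking the first line as the header is what read_csv does when no column names are supplied.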

edited Jun 05 '21 at 10:42 by Rick

answered Mar 11 '12 at 06:06 by huon

5

I had some issues with this method - it was very slow and failed indexing at the end. Instead, I used read_table(), which worked much faster and without the extra param. – Yuri Astrakhan Aug 15 '14 at 09:56
I get empty 'columns' and the data are a bunch of mess, can this read tab-separated .txt with the header as first line, I guess not. – imrek Aug 31 '15 at 04:59
26

Note that as of 17.0 `from_csv` is discouraged: use `pd.read_csv` instead! – rafaelvalle Dec 03 '16 at 00:10
2

I had to use the following: DataFrame.read_csv('filepath.tsv', sep=' ', header=0) – Archie Jan 20 '17 at 09:30
3

This is a bad answer; you can read TSV natively with `pd.read_csv/read_table`, you just need to set `delim_whitespace=True` or `sep` – smci Apr 29 '18 at 08:31
1

Downvoted because from_csv is deprecated now and you are still the top answer on Google – azerty Nov 13 '18 at 19:51
3

@rafaelvalle added deprecated notice – Arayan Singh Feb 15 '19 at 21:53
FWIW: Here's a warning message that comes up when using from_csv in jupyter notebook (MacPorts version) /opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ipykernel_launcher.py:3: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls This is separate from the ipykernel package so we can avoid doing imports until – ATutorMe Jul 29 '19 at 02:38
Thanks for sharing this, it works perfectly with `DataFrame.read_csv('c:/~/trainSetRel3.txt', sep='\t')` – Salomon Kabongo Nov 21 '19 at 21:21
Also if your first column is an index, `index_col=0` is pretty important too – jimh Jun 07 '21 at 10:31
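
To illustrate the `index_col=0` suggestion from the last comment, here is a small sketch. The sample data and column names are invented for the example and do not come from the original question.

```python
import io

import pandas as pd

# Hypothetical TSV whose first column holds row labels rather than data.
tsv_text = "sample\theight\tweight\na\t1.7\t60\nb\t1.8\t75\n"

# index_col=0 turns the first column into the DataFrame index.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=0)

print(df.index.tolist())
print(df.columns.tolist())
print(df.loc["b", "weight"])
```

Without `index_col=0`, the `sample` column would be kept as ordinary data and pandas would add a default integer index.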

Kamil Sindi · Answer 2 · 2016-01-07T15:42:04.623

111

As of pandas 0.17.0, from_csv is discouraged.

Use pd.read_csv(fpath, sep='\t') or pd.read_table(fpath).

edited Jan 07 '16 at 15:42

answered Dec 31 '15 at 16:13 by Kamil Sindi

7

Note: read_table is deprecated since version 0.24.0. Use pandas.read_csv() instead. – ManuelSchneid3r Mar 31 '19 at 14:34
2

Apparently `read_table` was later [un-deprecated](https://github.com/pandas-dev/pandas/issues/25220#issuecomment-506848168) in 0.25.0. – yodavid May 05 '22 at 09:26

score 68 · Answer 3 · edited Apr 06 '21 at 08:31

Use pandas.read_table(filepath). The default separator is tab.
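
A short sketch of that default, using invented sample data: `read_table` with no separator argument should give the same result as `read_csv` with `sep="\t"`.

```python
import io

import pandas as pd

# Hypothetical tab-separated data used only for this comparison.
tsv_text = "city\tpopulation\nOslo\t700000\nBergen\t285000\n"

# read_table splits on tabs unless told otherwise.
df_table = pd.read_table(io.StringIO(tsv_text))

# read_csv needs the separator spelled out.
df_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(df_table.equals(df_csv))
```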

edited Apr 06 '21 at 08:31 by Cristian Ciupitu

answered Mar 11 '12 at 15:34 by Wes McKinney

2

read_table doesn't require any parameters. Perfectly working. – Jay Jul 24 '16 at 09:19

Mohsin Ashraf · Answer 4 · 2019-08-04T05:36:25.637

25

Try this:

df = pd.read_csv("rating-data.tsv", sep='\t')
df.head()

The key is to set the sep parameter to the tab character.

edited Aug 04 '19 at 05:36

answered Aug 01 '19 at 05:14 by Mohsin Ashraf

score 9 · Answer 5 · edited Feb 18 '19 at 06:50

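Open the file, save it as .csv, and then apply the following. (Renaming is optional: read_csv does not depend on the file extension.)

df = pd.read_csv('apps.csv', sep='\t')

For any other delimited format, just change the sep argument.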
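
As a rough sketch of changing only the separator, the snippet below reads the same two rows written with three different delimiters. The data and the names `samples` and `frames` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data: the same two rows in three delimited formats.
samples = {
    "\t": "app\trating\nmaps\t4.5\nmail\t4.1\n",
    ";": "app;rating\nmaps;4.5\nmail;4.1\n",
    "|": "app|rating\nmaps|4.5\nmail|4.1\n",
}

# Only the sep argument changes between formats.
frames = {
    sep: pd.read_csv(io.StringIO(text), sep=sep)
    for sep, text in samples.items()
}

for sep, frame in frames.items():
    print(repr(sep), frame.to_dict("records"))
```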

edited Feb 18 '19 at 06:50 by Antonio Correia

answered Feb 10 '18 at 17:28 by ankit srivastava

Đ.J vicky · Answer 6 · 2021-02-16T13:36:56.710

3

data = pd.read_csv('your_dataset.tsv', delimiter = '\t', quoting = 3)

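Here delimiter is an alias for sep, and quoting=3 (that is, csv.QUOTE_NONE) tells the parser to treat quote characters as ordinary text, which helps when the dataset contains stray quotes.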
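
The effect of the quoting argument can be seen in a small sketch. The sample data is invented; `csv.QUOTE_NONE` is the standard-library constant whose value is 3.

```python
import csv
import io

import pandas as pd

# Hypothetical TSV in which one field is wrapped in double quotes.
tsv_text = 'id\ttext\n1\t"hello"\n2\tplain\n'

# Default quoting: the surrounding quotes are interpreted and dropped.
df_default = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# QUOTE_NONE: quote characters are kept as part of the field.
df_raw = pd.read_csv(io.StringIO(tsv_text), sep="\t", quoting=csv.QUOTE_NONE)

print(df_default["text"].tolist())
print(df_raw["text"].tolist())
```

Which behavior is right depends on the data: if quotes are part of the text itself, `QUOTE_NONE` preserves them; if they only wrap fields, the default is usually what you want.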

edited Feb 16 '21 at 13:36

answered Feb 16 '21 at 13:23 by Đ.J vicky

score 2 · Answer 7 · edited Apr 17 '20 at 20:18

df = pd.read_csv('filename.csv', sep='\t', header=0)

You can load the TSV file directly into a pandas DataFrame by specifying the delimiter and the header row.
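
For a file with no header line at all, a sketch along the following lines may help. The data and the column names passed to `names` are assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical header-less TSV: every line is data.
tsv_text = "1\tada\n2\tlinus\n"

# header=None keeps the first line as data; columns get integer labels.
df_unnamed = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=None)

# names supplies column labels when the file does not carry any.
df_named = pd.read_csv(
    io.StringIO(tsv_text), sep="\t", header=None, names=["id", "name"]
)

print(df_unnamed.columns.tolist())
print(df_named.columns.tolist())
```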

edited Apr 17 '20 at 20:18 by Stefan Ollinger

answered Apr 15 '20 at 17:24 by Kofi

score 1 · Answer 8 · answered Jun 21 '22 at 08:22

Use this:

import pandas as pd
df = pd.read_fwf('xxxx.tsv')

answered Jun 21 '22 at 08:22 by Emeka Boris Ama

Why this instead of `read_csv` with `sep='\t'`? – Mutoh Jun 23 '22 at 12:13

score 0 · Answer 9 · edited Feb 21 '21 at 01:20

Try this:

import pandas as pd
# Use a lowercase name so the variable is not confused with the pandas DataFrame class.
df = pd.read_csv("dataset.tsv", sep="\t")

edited Feb 21 '21 at 01:20 by Robert Columbia

answered Feb 21 '21 at 01:17 by peaceloving
