I have a file I'm trying to open in Python with the following line:

f = open("C:/data/lastfm-dataset-360k/test_data.tsv", "r", "utf-8")

Calling this gives me the error:

TypeError: an integer is required

I deleted all other code besides that one line and am still getting the error. What have I done wrong and how can I open this correctly?

tshepang
Jim

5 Answers

From the documentation for open():

open(name[, mode[, buffering]])

[...]

The optional buffering argument specifies the file’s desired buffer size: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size. A negative buffering means to use the system default, which is usually line buffered for tty devices and fully buffered for other files. If omitted, the system default is used.

You appear to be trying to pass open() a string describing the file encoding as the third argument instead. Don't do that.
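To see the positional-argument mix-up in isolation, here is a minimal sketch. The scratch file and its contents are made up for the example; only the `open()` calls mirror the question. The exact wording of the error message differs between Python versions, but it is a `TypeError` in both Python 2 and Python 3.

```python
import os
import tempfile

# Hypothetical scratch file, created only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "test_data.tsv")
with open(path, "w") as f:
    f.write("user\tartist\n")

# The third positional argument is buffering, so a string there is rejected.
try:
    open(path, "r", "utf-8")
    raised = False
except TypeError:
    raised = True

# An integer in the same position is accepted; a negative value
# asks for the system default buffer size.
with open(path, "r", -1) as f:
    first_line = f.readline()
```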

Kristian Glass

You are using the wrong open.

>>> help(open)
Help on built-in function open in module __builtin__:

open(...)
    open(name[, mode[, buffering]]) -> file object

    Open a file using the file() type, returns a file object.  This is the
    preferred way to open a file.  See file.__doc__ for further information.

As you can see, it expects the buffering parameter, which is an integer.

What you probably want is codecs.open:

open(filename, mode='rb', encoding=None, errors='strict', buffering=1)
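A rough sketch of what that could look like, assuming a throwaway file rather than the asker's real dataset (the path and the sample row are invented). On Python 3 the built-in `open()` takes `encoding` directly, so `codecs.open` is mostly of interest for Python 2 code.

```python
import codecs
import os
import tempfile

# Hypothetical scratch file standing in for the real .tsv.
path = os.path.join(tempfile.mkdtemp(), "test_data.tsv")

# Write a row containing a non-ASCII character, encoded as UTF-8.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(u"Bj\u00f6rk\t42\n")

# Read it back; codecs.open decodes the bytes to unicode text for us.
with codecs.open(path, "r", encoding="utf-8") as f:
    line = f.readline()
```

Passing `encoding` by keyword avoids any confusion about argument order.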
Glider

From the help docs:

open(...)
    open(file, mode='r', buffering=-1, encoding=None,
         errors=None, newline=None, closefd=True) -> file object

That is the Python 3 signature. You need to pass encoding='utf-8' as a keyword argument; as written, Python thinks you are passing in an argument for buffering.
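A minimal Python 3 sketch of the keyword form, using a made-up scratch file and sample row in place of the asker's data:

```python
import os
import tempfile

# Hypothetical scratch file standing in for the real .tsv.
path = os.path.join(tempfile.mkdtemp(), "test_data.tsv")

# encoding is passed by keyword, so it cannot be mistaken for buffering.
with open(path, "w", encoding="utf-8") as f:
    f.write("Sigur R\u00f3s\t7\n")

# Read the file back and split each line into its tab-separated fields.
with open(path, "r", encoding="utf-8") as f:
    rows = [line.rstrip("\n").split("\t") for line in f]
```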

ninjagecko

Providing an encoding (UTF-8) when opening the file resolved my issue:

    with open('tomorrow.txt', mode='w', encoding='UTF-8', errors='strict', buffering=1) as file:
        file.write(result)
Hamid

The last parameter to open is the size of the buffer, not the encoding of the file.

File streams are more or less encoding-agnostic (apart from newline translation on files not opened in binary mode), so you should handle the encoding elsewhere. For example, when you get the data from a read() call, you can interpret it as UTF-8 using its decode method.
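One way to sketch that approach, assuming a made-up scratch file: open in binary mode so no translation happens, then decode the bytes yourself.

```python
import os
import tempfile

# Hypothetical scratch file standing in for the real .tsv.
path = os.path.join(tempfile.mkdtemp(), "test_data.tsv")

# Write raw UTF-8 bytes; binary mode leaves them untouched.
with open(path, "wb") as f:
    f.write(u"Mot\u00f6rhead\t3\n".encode("utf-8"))

# Read the raw bytes back, then decode them explicitly.
with open(path, "rb") as f:
    raw = f.read()

text = raw.decode("utf-8")
```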

Matteo Italia