
Background: I have a large file and I want to read only the first few values from it. I have no further use for the rest of the file, so I would rather not read all of it: that would use unnecessary memory and take longer to execute.


From the documentation I am using:

test.txt:

Greetings World :)

test.py:

with open('test.txt', buffering=3) as file:
    a = file.read()
print(a)

This still reads the whole file rather than just part of it.

Is there a way to only read part of a file?

Xantium

2 Answers


From this answer, What is the use of buffering in python's built-in open() function?, you will see that buffering does not limit how much of the file is read. Instead, pass the number of characters you want to file.read(). So:

with open('test.txt') as file:
    a = file.read(3) 
print(a)

This prints Gre, as you would expect.

From the documentation:

To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string (in text mode) or bytes object (in binary mode). size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned;

If you need to read values from the middle of the file, you could call file.seek() first; see seek() function?
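As a rough sketch of the seek() idea, the snippet below jumps to a byte offset and reads a fixed number of bytes from there. The helper name read_slice and the temporary demo file are made up for illustration. It opens the file in binary mode on the assumption that you want byte offsets, since in text mode seek() is only reliable with positions returned by tell().

```python
import os
import tempfile


def read_slice(path, offset, size):
    """Return up to `size` bytes starting `offset` bytes into the file.

    Hypothetical helper for illustration; only the requested bytes are
    returned, the rest of the file is never loaded into memory.
    """
    with open(path, 'rb') as f:
        f.seek(offset)       # move the file position without reading
        return f.read(size)  # read at most `size` bytes from there


# Demo on a throwaway copy of the test.txt contents from the question.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w') as f:
    f.write('Greetings World :)')

chunk = read_slice(path, 10, 5)
print(chunk)  # b'World'
```

Call .decode() on the result if you need a str rather than bytes.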

Xantium
  • So how would I read the first two lines if I knew nothing about the line length beforehand? – roganjosh Jun 08 '18 at 19:29
  • `file.readline()` reads a single line at a time (reads until \n or EOF) – Sam Jun 08 '18 at 19:31
  • @roganjosh you could use 'yield' to read the file in a lazy (not eager) fashion and stop when you know that was the end of the second line – The Fool Jun 08 '18 at 19:32
  • @Sam Yes providing that you don't mind calling `file.readline(1)` and then `file.readline(2)`. – Xantium Jun 08 '18 at 19:32
  • @TheFool I understand that. I'm trying to understand what issue this self-answered question is addressing. It's perfectly valid to set up a question and an answer but I'm not seeing the hole in knowledge it fills. – roganjosh Jun 08 '18 at 19:33
  • @Sam But it certainly works. I guess you could also use `a = file.readlines()[1]`, `a = file.readlines()[0]` – Xantium Jun 08 '18 at 19:37
  • 1
    @Simon but that defeats the purpose of not reading the whole file – Sam Jun 08 '18 at 19:38
  • 1
    @Sam Try running `file.readlines.__doc__` you should see "hint can be specified to control the number of lines read" – Xantium Jun 08 '18 at 19:44
  • @Simon "sizehint − This is the number of bytes to be read from the file." which would also be challenging to use to read for a specific number of lines unless you know line length, +1 didnt know about the sizehint param – Sam Jun 08 '18 at 19:53

You have a couple of options.

file.read() reads the entire file.

file.read(size) reads at most size characters (text mode) or bytes (binary mode).

file.readlines(), list(file), and for line in file: all read the file line by line. The first two load every line into a list, while the for loop reads lazily, so you can break out of it early.

file.readline() returns a single line at a time (it reads until a newline character (\n) or end of file (EOF)).

See the documentation for details.
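To read only the first few lines without knowing their length, one possible approach is to iterate over the file lazily and stop early. This is a minimal sketch, not the only way to do it: read_first_lines and the demo file are hypothetical names, and itertools.islice is simply one convenient way to cap the iteration.

```python
import os
import tempfile
from itertools import islice


def read_first_lines(path, n):
    """Return the first `n` lines of the file as a list of strings.

    Hypothetical helper for illustration. Iterating over the file object
    yields one line at a time, and islice stops after `n` of them, so the
    remainder of the file is never read into a list.
    """
    with open(path) as f:
        return list(islice(f, n))


# Demo on a throwaway four-line file.
path = os.path.join(tempfile.mkdtemp(), 'lines.txt')
with open(path, 'w') as f:
    f.write('first\nsecond\nthird\nfourth\n')

head = read_first_lines(path, 2)
print(head)  # ['first\n', 'second\n']
```

Calling file.readline() twice would give the same two lines; the islice version just generalises to any n.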

Xantium
Sam