4

I need to process a BIG text file that contains space-separated float numbers in ASCII representation:

1.0012 0.63 18.201 -0.7911 92.2869 ...

How do I read these numbers one by one (not the entire file at once and not line by line) using built-in Python tools? As an example, C source code that solves this task looks like:

float number;
FILE *f = fopen ("bigfile.txt", "rt");
/* Test the fscanf result rather than feof, so a failed final read is not processed */
while (fscanf (f, "%f", &number) == 1) {
    /* ... processing the number here ... */
}
fclose (f);
Rabbid76
R0bur
  • Does this answer your question? [Reading a file with a specified delimiter for newline](https://stackoverflow.com/questions/16260061/reading-a-file-with-a-specified-delimiter-for-newline) – tevemadar Jun 18 '21 at 10:35
  • also https://stackoverflow.com/questions/35459765/read-file-up-to-a-character and https://stackoverflow.com/questions/10183784/is-there-a-way-to-read-a-file-in-a-loop-in-python-using-a-separator-other-than-n – tevemadar Jun 18 '21 at 10:37
  • The links you provided are acceptable workarounds, but I'm surprised that the standard Python file object doesn't have a method to do that. Thank you. – R0bur Jun 18 '21 at 13:54

4 Answers

2

You can try reading the file character by character, by setting the chunk size to 1, and then detect when a token is complete.

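from functools import partial

with open('file', 'r') as openedFile:
    # In text mode read() returns str, so the EOF sentinel is '' (not b'')
    for chunk in iter(partial(openedFile.read, 1), ''):
        ...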

Useful links:

https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects
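
To recognize when a token is complete, one option is to accumulate characters until whitespace appears. The following is a minimal sketch and not part of the original answer; the function name iter_floats_by_char and the io.StringIO sample standing in for a real file are illustrative assumptions.

```python
import io
from functools import partial


def iter_floats_by_char(stream):
    """Yield floats from a text stream, reading one character at a time."""
    token = []
    # iter(callable, sentinel) keeps calling stream.read(1) until it returns ''
    for char in iter(partial(stream.read, 1), ''):
        if char.isspace():
            if token:
                # Whitespace ends the current token
                yield float(''.join(token))
                token = []
        else:
            token.append(char)
    if token:
        # The last token may not be followed by whitespace
        yield float(''.join(token))


sample = io.StringIO("1.0012 0.63 18.201 -0.7911 92.2869")
for number in iter_floats_by_char(sample):
    print(number)
```

Because this is a generator, only the characters of the current token are held in memory at any time.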

1

You should be able to just read one line at a time and then split() each line to get the number tokens:

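with open('file.txt') as f:
    # Iterating over the file object reads one line at a time,
    # unlike readlines(), which loads the whole file into memory
    for line in f:
        for token in line.split():
            number = float(token)
            # process number here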
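
The same idea can be wrapped in a generator so that the caller receives numbers one by one. This is only a sketch: the name iter_floats_by_line is hypothetical, and io.StringIO stands in for an opened file. It still assumes the file contains line breaks; with a single huge line, the whole line is read at once.

```python
import io


def iter_floats_by_line(stream):
    """Yield floats lazily, holding only one line in memory at a time."""
    for line in stream:
        for token in line.split():
            yield float(token)


sample = io.StringIO("1.0012 0.63\n18.201 -0.7911 92.2869\n")
for number in iter_floats_by_line(sample):
    print(number)
```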
Tim Biegeleisen
1

If a line-by-line solution is not viable (e.g. the file is just one massive line), you can read one character at a time using read(1).

You can do something like this:

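current = ""
with open("file.txt") as f:
    while True:
        char = f.read(1)
        if char and not char.isspace():
            # Part of a token: digit, sign, decimal point or exponent
            current += char
            continue
        if current:
            # Whitespace or EOF ends the token
            num = float(current)
            # process num however you like
            current = ""
        if char == "":
            # Reached EOF
            break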
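
Calling read(1) once per character can be slow on a large file. A possible variation, sketched here under assumptions, reads larger chunks and carries an incomplete trailing token over to the next chunk. The names iter_floats and chunk_size are hypothetical, and the 64 KiB default is an arbitrary choice.

```python
import io


def iter_floats(stream, chunk_size=64 * 1024):
    """Yield floats from a text stream without loading it all into memory."""
    tail = ''  # possibly incomplete token carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # EOF
        data = tail + chunk
        tokens = data.split()
        if not data[-1].isspace():
            # The chunk may have ended in the middle of a number: hold it back
            tail = tokens.pop()
        else:
            tail = ''
        for token in tokens:
            yield float(token)
    if tail:
        # Last number in a file that does not end with whitespace
        yield float(tail)


sample = io.StringIO("1.0012 0.63 18.201 -0.7911 92.2869")
for number in iter_floats(sample, chunk_size=4):
    print(number)
```

Memory use is bounded by the chunk size plus the length of one token, regardless of whether the file contains line breaks.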
Andrew Eckart
0

You can try using the str.isspace() method to check for spaces:

nums = []
token = ''
char = ' '
with open('file.txt', 'r') as f:
    while char:
        char = f.read(1)
        if char and not char.isspace():
            token += char
        elif token:
            # Whitespace or EOF ends the current token
            nums.append(float(token))
            token = ''
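
Collecting every number in a list keeps the whole data set in memory. If that is a concern, a lazy alternative can be sketched with itertools.groupby, using str.isspace as the grouping key. The function name iter_floats_grouped is hypothetical, and io.StringIO stands in for an opened file.

```python
import io
from functools import partial
from itertools import groupby


def iter_floats_grouped(stream):
    """Yield floats by grouping consecutive non-whitespace characters."""
    # Lazy stream of single characters, ending when read(1) returns ''
    chars = iter(partial(stream.read, 1), '')
    for is_space, group in groupby(chars, key=str.isspace):
        if not is_space:
            # A run of non-whitespace characters is one number token
            yield float(''.join(group))


sample = io.StringIO("1.0012 0.63 18.201 -0.7911 92.2869")
for number in iter_floats_grouped(sample):
    print(number)
```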
Red