How can I go to the next line in .txt file?

Question

How can I read only the first symbol of each line without reading the whole line, using Python? For example, if I have a file like:

apple  
pear  
watermelon

In each iteration I must store only one letter (the first) of the line. The result of the program should be ["a","p","w"]. I tried to use file.seek(), but how can I move it to the next line?

Yam Mesicka · Answer 1 · 2021-04-15T21:31:45.583

score 2

ti7's answer is great, but if the lines might be too long to hold in memory, you may wish to read char-by-char so that a whole line is never stored:

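from pathlib import Path
from typing import Iterator

# Characters that end a line. In text mode, open() already translates
# '\r\n' and '\r' to '\n', so '\r' is only a safeguard here.
NEWLINE_CHAR = {'\n', '\r'}


def first_chars(file_path: Path) -> Iterator[str]:
    """Yield the first character of every non-empty line."""
    with open(file_path) as fh:
        new_line = True  # True while we are at the start of a line
        while c := fh.read(1):  # one character at a time (Python 3.8+)
            if c in NEWLINE_CHAR:
                new_line = True
            elif new_line:
                yield c
                new_line = False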

Test:

path = Path('/some/path/a.py')
easy_first_chars = [line[0] for line in path.read_text().splitlines() if line]
smart_first_chars = list(first_chars(path))
assert smart_first_chars == easy_first_chars

edited Apr 15 '21 at 21:31

answered Apr 15 '21 at 21:28

Unfortunately, I suspect this will be very slow compared to what it could be (it creates a new Python string for every character in the file!), which will be noticeable if the file has so much content on a single line that it cannot practically be read into memory. For such a case, loading very big blocks, splitting them, and handling a `\n` that falls on a block boundary might be ideal; there's some splitting analysis [here](https://stackoverflow.com/a/42373311/4541045). Additionally, the default `open` arguments translate all newlines to `\n` regardless of what they were originally! – ti7 Apr 16 '21 at 07:03
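
The block-based idea from the comment above could be sketched roughly as follows. This is only an illustration, not code from any of the answers: the function name `first_chars_chunked` and the `chunk_size` parameter are made up here, and the file is assumed to be opened in default text mode so that every line ending arrives as `\n`.

```python
from typing import Iterator


def first_chars_chunked(path: str, chunk_size: int = 1 << 20) -> Iterator[str]:
    """Yield the first character of every non-empty line, reading in blocks.

    Hypothetical helper: at most `chunk_size` characters are held at once,
    no matter how long a single line is.
    """
    at_line_start = True  # carries the "just saw a newline" state across blocks
    with open(path) as fh:  # default text mode: line endings become '\n'
        while chunk := fh.read(chunk_size):
            pos = 0
            while pos < len(chunk):
                if at_line_start:
                    if chunk[pos] != "\n":  # skip empty lines
                        yield chunk[pos]
                        at_line_start = False
                    pos += 1
                else:
                    # Jump straight to the end of the current line.
                    nl = chunk.find("\n", pos)
                    if nl == -1:
                        break  # the line continues in the next block
                    pos = nl + 1
                    at_line_start = True
```

Calling `list(first_chars_chunked("input.txt"))` on the question's sample file should give the same letters as the char-by-char version, while `str.find` does the newline search instead of one `read(1)` call per character.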

Joan Puigcerver · Answer 2 · 2021-04-15T21:21:14.210

score 0

You can read one letter at a time with file.read(1):

letters = []
# Initialized to '\n' so that the first letter of the file is stored
previous = '\n'

with open(filepath, "r") as file:  # filepath: path to your text file
    while True:
        # Read only one letter
        letter = file.read(1)
        if letter == '':
            break  # end of file
        elif previous == '\n' and letter != '\n':
            # Store the letter that follows a newline '\n' (skip empty lines)
            letters.append(letter)

        previous = letter

edited Apr 15 '21 at 21:21

answered Apr 15 '21 at 21:05

I don't need all the lines, I need ONLY the first symbol of each; I can't store the whole line – deethereal Apr 15 '21 at 21:08
After getting the lines, iterate over them and get the first character of each with `first_letters = [line[0] for line in lines]` – Joan Puigcerver Apr 15 '21 at 21:09
I can't afford to store the whole line, only one letter – deethereal Apr 15 '21 at 21:12

score 0 · Accepted Answer · answered Apr 15 '21 at 21:14

File-like objects are iterable, so you can use them directly like this:

collection = []

with open("input.txt") as fh:
    for line in fh:  # iterate by-lines over file-like
        # each line keeps its trailing "\n", so it is never empty;
        # a blank line is just "\n"
        if line[0] != "\n":
            collection.append(line[0])  # get the first char in the line
        # else: blank line, consider other handling

# work with collection

You may also consider enumerate() if you care about which line a particular value was on, or yielding line[0] to form a generator (which may allow a more efficient process if it can halt before reading the entire file):

def my_generator():
    with open("input.txt") as fh:
        for lineno, line in enumerate(fh, 1):  # lines are commonly 1-indexed
            if line[0] != "\n":  # a blank line is just "\n"; consider other handling
                yield lineno, line[0]  # first char in the line

for lineno, first_letter in my_generator():
    # work with lineno and first_letter here and break when done
    print(lineno, first_letter)
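
As a rough illustration of halting before the entire file is read, a generator like the one above can be combined with `itertools.islice`. The names `first_letters` and `take_first` are hypothetical and only used for this sketch; the path is passed in as a parameter rather than hard-coded.

```python
from itertools import islice
from typing import Iterator, List, Tuple


def first_letters(path: str) -> Iterator[Tuple[int, str]]:
    """Lazily yield (line number, first character) for each non-blank line."""
    with open(path) as fh:
        for lineno, line in enumerate(fh, 1):
            if line[0] != "\n":  # a blank line is just "\n"
                yield lineno, line[0]


def take_first(path: str, n: int) -> List[Tuple[int, str]]:
    """Collect at most n results; later lines are never iterated."""
    return list(islice(first_letters(path), n))
```

Because the file is buffered, Python may already have read a block past the last line consumed, but no further lines are split off or turned into strings once `islice` stops.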

So there's no way to read only one letter? I need to read the whole line anyway? – deethereal Apr 15 '21 at 21:17
_Sort of_. If you know exactly how many characters are in each line, you can `.seek()` forward by the appropriate amount each time (this moves the file pointer), but this is not a common case, may not really be more efficient, and depends on the file encoding (the file is loaded into memory in blocks of the buffer size, often 4096 bytes, and not all chars are 8 bits wide, etc.). You could also hunt for the `\n` chars yourself, but practically, iterating will do this for you. The file will not be entirely stored in memory in this case, whereas `fh.read()` with no size argument will bring the entire file into memory. – ti7 Apr 15 '21 at 21:20
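
For the uncommon fixed-width case described in the comment above, a `.seek()`-based reader might look like the sketch below. It assumes every line occupies exactly `record_len` bytes including its line ending, and that the first character is a single byte (for example ASCII); both the function name `first_bytes_fixed_width` and the `record_len` parameter are invented for this example.

```python
from typing import Iterator


def first_bytes_fixed_width(path: str, record_len: int) -> Iterator[bytes]:
    """Yield the first byte of each fixed-length record.

    Hypothetical helper: `record_len` counts the newline byte(s) too.
    """
    with open(path, "rb") as fh:  # binary mode: offsets are exact byte counts
        offset = 0
        while True:
            fh.seek(offset)  # jump directly to the start of the next record
            first = fh.read(1)
            if not first:  # empty bytes object means end of file
                break
            yield first
            offset += record_len
```

With three 5-letter lines ending in `\n`, `record_len` would be 6. As the comment notes, the underlying buffer still loads whole blocks from disk, so this mostly avoids creating line strings rather than avoiding I/O.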
