
According to the Python documentation for TextIOBase.seek(offset, whence):

  • SEEK_SET or 0: seek from the start of the stream (the default); offset must either be a number returned by TextIOBase.tell(), or zero. Any other offset value produces undefined behaviour.
  • SEEK_CUR or 1: “seek” to the current position; offset must be zero, which is a no-operation (all other values are unsupported).
  • SEEK_END or 2: seek to the end of the stream; offset must be zero (all other values are unsupported).
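
To make the restriction concrete, here is a minimal sketch of what happens when a non-zero offset is used in text mode. The sample text, the in-memory buffer, and the helper `try_seek` are assumptions made for illustration; the exception type shown is what CPython raises, and other implementations may behave differently.

```python
import io

# Assumed sample data: UTF-8 text containing multibyte characters,
# wrapped in a text-mode stream the same way open(..., "r") would do it.
raw = "héllo wörld\n".encode("utf-8")
f = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")

# Allowed: a zero offset relative to the end of the stream.
f.seek(0, io.SEEK_END)


def try_seek(offset, whence):
    """Return None if the seek succeeds, else the name of the exception raised."""
    try:
        f.seek(offset, whence)
    except OSError as exc:  # io.UnsupportedOperation is a subclass of OSError
        return type(exc).__name__
    return None


end_result = try_seek(-1, io.SEEK_END)  # non-zero end-relative seek
cur_result = try_seek(-1, io.SEEK_CUR)  # non-zero current-relative seek
set_result = try_seek(0, io.SEEK_SET)   # seeking to the start is always allowed

print(end_result, cur_result, set_result)
```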

For SEEK_END and SEEK_CUR, why is only a zero offset supported?
Has it simply not been implemented yet, or does it need a PEP?
Is this by design? If so, why?

Is there a workaround with acceptable performance for UTF-8 encoded files that does not rely on undefined behavior?
All the answers on this question either use undefined behavior or don't handle UTF-8 encoded files.

MarcellPerger
  • The question you linked explains the problem for utf-8 text pretty well: UTF-8 backward seeking is very tricky. You cannot really do it in text mode, but if you want - use binary mode and handle decoding manually. – STerliakov Sep 24 '22 at 19:47
  • You could work with binary files: after a seek, check whether the byte referred to by the file pointer is inside a multibyte character and, if so, adjust the pointer accordingly. – Michael Butscher Sep 24 '22 at 19:47
  • You're asking too many different questions at once here. Do you want an answer to the "Why" or the "Is there a workaround"? Pick one. – Aran-Fey Sep 24 '22 at 19:49
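
The binary-mode approach suggested in the comments above could look roughly like the following. This is only a sketch under stated assumptions: the file contains valid UTF-8, and the helper names `seek_to_char_boundary` and `read_tail` are hypothetical, not part of the standard library. It relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx, so stepping back at most three bytes finds the start of a character.

```python
import io


def seek_to_char_boundary(f, pos):
    """Move binary stream f to the nearest UTF-8 character start at or before pos.

    Returns the adjusted byte position. Assumes the data is valid UTF-8.
    """
    while pos > 0:
        f.seek(pos)
        byte = f.read(1)
        # Stop at end of file or at any byte that is not a continuation byte.
        if not byte or (byte[0] & 0xC0) != 0x80:
            break
        pos -= 1
    f.seek(pos)
    return pos


def read_tail(f, nbytes):
    """Decode roughly the last nbytes of a binary UTF-8 stream.

    The start is moved backwards if needed so no character is split,
    which means slightly more than nbytes may be read.
    """
    size = f.seek(0, io.SEEK_END)  # arbitrary offsets are fine in binary mode
    seek_to_char_boundary(f, max(0, size - nbytes))
    return f.read().decode("utf-8")
```

With a real file, this would be used on a handle opened in binary mode, for example `open(path, "rb")`, where `path` stands for whatever file is being read. Only the requested tail is read and decoded, so the cost does not grow with the size of the file.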
