I need to seek around in large files of between 30GB and 500GB. Unfortunately I need to do this on a 32-bit system.

I've just crashed into PHP's 2GB limit with fseek(), in a way I've not seen documented online.

$ php -r 'var_dump(fseek(fopen("/dev/sda", "r"), 0, SEEK_END));'
int(-1)

So fseek() itself is failing with -1. I wonder why:

$ strace php -r 'var_dump(fseek(fopen("/dev/sda", "r"), 0, SEEK_END));'
...
lstat64("/dev/sda", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 0), ...}) = 0
lstat64("/dev", {st_mode=S_IFDIR|0755, st_size=5460, ...}) = 0
open("/dev/sda", O_RDONLY)              = 3
fstat64(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 0), ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
lseek(3, 0, SEEK_END)                   = -1 EOVERFLOW (Value too large for defined data type)

This is happening on a file only 60GB in size.

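So I can't even seek to the end. The trace shows PHP calling the plain lseek() syscall rather than _llseek(), and on a 32-bit system its signed 32-bit return value apparently can't represent a 60GB file offset, hence the EOVERFLOW.
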
On a separate note, repeatedly seeking forward with SEEK_CUR seems to get stuck at 4GB - I can't move the file pointer past that point. I need to seek to the end of 500GB files.

What are my alternatives? My current plan is to use dd (this is on Linux), and the tool I'm writing might as well be a shell script at this point...
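
For reference, here is a minimal sketch of what the dd route could look like. It assumes GNU coreutils (`dd` with `iflag=skip_bytes,count_bytes`, and `stat -c %s`) plus util-linux's `blockdev`. The helper names `size_of` and `read_at` are made up for illustration. Treat it as a starting point rather than a tested tool.

```shell
#!/bin/sh
# Hypothetical helpers; assumes GNU coreutils (dd, stat) and util-linux (blockdev).

# size_of PATH: print the size in bytes of a regular file or block device.
size_of() {
    if [ -b "$1" ]; then
        # Block devices report st_size 0, so ask the kernel instead (may need root).
        blockdev --getsize64 "$1"
    else
        stat -c %s "$1"
    fi
}

# read_at PATH OFFSET LENGTH: write LENGTH bytes starting at byte OFFSET to stdout.
# skip_bytes/count_bytes make skip= and count= byte counts instead of block counts,
# so bs can stay large for throughput.
read_at() {
    dd if="$1" bs=65536 skip="$2" count="$3" \
       iflag=skip_bytes,count_bytes 2>/dev/null
}
```

Something like `read_at /dev/sda 32212254720 512` would then read 512 bytes starting 30GiB into the device. The offset is handed to dd as a decimal string, so the shell never has to do arithmetic on it.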

i336_
  • Is this any help? https://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2-gb-file-without-external-program – RiggsFolly Jun 04 '17 at 13:58
  • @RiggsFolly: Unfortunately not - that project only uses a bunch of different techniques to get **only** the file *size* - for example it can use libcurl to do a header-only request for the `file://` path and then extract the file size from the `Content-Length` field (yeah). Pretty ingenious if you ask me; I'm just disappointed that PHP was so badly mis-designed back in the day when there was only one chance to get this right. Ah well. – i336_ Jun 04 '17 at 22:25
  • I think I'm going to end up using a shell script for this, because I was already going to shell out to the `xz` utility (the only XZ library for PHP is 4 years old, unmaintained, segfaults far too easily for my liking, offers no low-level control, and only works for files) to do stream compression. For anyone else needing to solve this - libcurl ***might*** work if you do a `file://` request and set the Range header. Untested, tell me if it works! (Just thought of this while typing the above comment). – i336_ Jun 04 '17 at 22:33
