
Background:

I am trying to write a small script that logs JSON records concurrently. When the files are small, things are OK, but when they are large, processes start to overwrite each other. This SO post points in the right direction: How big is the pipe buffer? Windows seems to have PIPE_BUF set to 1024; on Linux it is larger.

PS: I am on WSL2 Ubuntu 20.04

Issue

I tried setting the PIPE_BUF value using the Fcntl constant F_SETPIPE_SZ, but I was not successful:

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Data::Dumper qw(Dumper);
use File::Basename qw(basename dirname);
use File::Spec qw(catfile file_name_is_absolute catdir);
use feature qw(say current_sub);
use Cwd qw(abs_path);
use lib dirname(abs_path $0);
use Fcntl qw(F_SETPIPE_SZ F_GETPIPE_SZ F_GETFL F_GETFD F_SETFL O_NONBLOCK O_WRONLY O_APPEND O_CREAT O_RDWR F_DUPFD F_SETOWN F_GETOWN);

open(my $fileH, ">", File::Spec->catfile(dirname(__FILE__), "blabla.txt")) or die("error: $!");
#does F_GETFD work?
my $val = fcntl($fileH, F_GETFD, 0);
say $val; #outputs 1, works
#does F_GETFL work?
my $val2 = fcntl($fileH, F_GETFL, 0);
say $val2; #outputs 32769, works
#does F_GETOWN work?
my $val3 = fcntl($fileH, F_GETOWN, 0);
say $val3; #outputs 0 but true, so it works too.
my $pid = $$; # process id
#does F_SETOWN work?
my $val4 = fcntl($fileH, F_SETOWN, $pid) or die("error: $!");
say $val4; #outputs 0 but true
#does getting pipe buffer work?
my $val5 = fcntl($fileH, F_GETPIPE_SZ, 0) or die("error: $!"); #"Bad file descriptor"
say $val5; #Use of uninitialized..so $val5 is undef, did not work
#does setting pipe buffer work?
my $val6 = fcntl($fileH, F_SETPIPE_SZ, 1000 * 1000) or die("error: $!"); #"Bad file descriptor"
say $val6; #undef

Several constants like F_GETFD or F_GETFL work, so I am guessing Fcntl is operating properly.

However, F_SETPIPE_SZ and some others do not seem to work at all. Passing fileno($fileH) to fcntl instead of $fileH results in an "unopened file handle" error, so that does not change anything either.

What is the underlying reason for this?

ibrahim tanyalcin
  • This feels very much like an [XY-problem](https://xyproblem.info/). Your main issue is that different processes are writing to the same file...? – TLP Apr 06 '22 at 09:59
  • I do know about flock. Since I can confirm that keeping writes under a reasonable limit like 1024 bytes never results in overwriting (since the writes are atomic), I forego 'Y'. I am particularly interested in why I cannot get Fcntl to work and what the rationale behind the error logs is. "Bad file descriptor" is an awfully generic message. – ibrahim tanyalcin Apr 06 '22 at 10:19
  • If you use `$!` in the wrong context, it can give weird errors. You cannot rely on `$!` to tell when you did something wrong. – TLP Apr 06 '22 at 10:29
  • Try for example `perl -lwe'print $a++ . $!; open F, "a.txt" or die $!; print $a++ . $!;'`. On my system it shows the error `Inappropriate I/O control operation`, even though clearly nothing is wrong. – TLP Apr 06 '22 at 10:31
  • @TLP Updated the example to make sure I catch $! in correct context. Still "Bad file descriptor". Although I can print to that descriptor just fine. – ibrahim tanyalcin Apr 06 '22 at 10:40
  • I am not finding any documentation describing `F_GETPIPE_SZ`. Also, I get `Your vendor has not defined Fcntl macro F_GETPIPE_SZ` if I try to use it. This feels like murky water you are fishing around in. – TLP Apr 06 '22 at 10:50
  • @TLP They're Linux specific. OP linked to the relevant manpage. – Shawn Apr 06 '22 at 10:55
  • @Shawn No, the file is not a named pipe; mkfifo is not used at all. I thought I could control how many bytes are written to ordinary files using fcntl... Oh, if that's not the case you can answer and I will accept. – ibrahim tanyalcin Apr 06 '22 at 11:08

1 Answer


The Linux fcntl() flags F_GETPIPE_SZ and F_SETPIPE_SZ are, as their names suggest, specific to pipes. You're trying to use them with a regular file, hence the failures.

For clarity:

#!/usr/bin/env perl                                                                                                                                                                                                                               
use strict;
use warnings;
use feature qw/say/;
use Fcntl qw/F_GETPIPE_SZ/;

open my $file, ">", "/tmp/foo.txt" or die "Unable to open /tmp/foo.txt: $!\n";
pipe my $reader, my $writer or die "Unable to create pipe: $!\n";

if (my $size = fcntl($file, F_GETPIPE_SZ, 0)) {
    say "filehandle size: $size";
} else {
    say "fcntl of file failed: $!";
}

if (my $size = fcntl($reader, F_GETPIPE_SZ, 0)) {
    say "pipe size: $size";
} else {
    say "fcntl of pipe filed: $!";
}

Possible output:

fcntl of file failed: Bad file descriptor
pipe size: 65536
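
For completeness, here is a minimal sketch of the same calls against an actual pipe, where they do apply (assuming a Linux perl whose Fcntl exposes F_GETPIPE_SZ and F_SETPIPE_SZ, as on the asker's system):

#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
use Fcntl qw/F_GETPIPE_SZ F_SETPIPE_SZ/;

pipe my $reader, my $writer or die "Unable to create pipe: $!\n";

# Default capacity on Linux is typically 65536 bytes.
say "initial capacity: ", fcntl($writer, F_GETPIPE_SZ, 0);

# Request roughly 1 MB; the kernel may round the size up and returns the
# capacity actually set (bounded by /proc/sys/fs/pipe-max-size for
# unprivileged processes).
my $new = fcntl($writer, F_SETPIPE_SZ, 1024 * 1024)
    or die "F_SETPIPE_SZ failed: $!";
say "new capacity: $new";

Note that this adjusts the pipe's capacity (how many unread bytes it can hold), not the PIPE_BUF atomic-write limit.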
Shawn
  • I wish there could be a way to have similar system for OSs to control how many bytes are written to an ordinary file, but I guess that's not possible. Great answer, ty. – ibrahim tanyalcin Apr 06 '22 at 11:12
  • You need to use file locking or some other synchronization method (a minimal flock sketch follows after these comments). – Shawn Apr 06 '22 at 11:14
  • flock is not supported on all systems; having a file with the pid written to it did not work consistently under high stress (500+ processes), maybe because -e is a system call and is not always accurate. Keeping the log size under ~10kb never causes an issue; anything bigger than that and processes start to overwrite each other. Possibly JSON::XS is not thread safe either. So at least I tried.. – ibrahim tanyalcin Apr 06 '22 at 12:01
  • You're unlikely to find an OS in actual use that doesn't support `flock`, either directly or through emulation (it even works on Windows). But with that many processes, a server-based log system might work better to avoid lock contention: writing to a message queue of some kind (POSIX, SysV, ZeroMQ, etc.) or a local datagram socket, with a program on the receiving end that does the actual logging. – Shawn Apr 06 '22 at 12:27
  • Re "*I wish there could be a way to have similar system for OSs to control how many bytes are written to an ordinary file*", That's not what `F_SETPIPE_SZ` does. (I mean, not even for pipes.) It sets the limit on how many unread bytes there can be in the pipe. (Plain files obviously have no concept of unread bytes, which is why it doesn't apply to plain files.) – ikegami Apr 06 '22 at 13:30