Benchmark
Only one way to find out: Benchmark it yourself.
Here are some implementations that come to mind.
gen() { seq "$max"; }
# functions returning 0 (success) iff `gen` prints fewer than `$thold` lines
a() { [ "$(gen | head -n"$thold" | wc -l)" != "$thold" ]; }
b() { [ -z "$(gen | tail -n+"$thold" | head -c1)" ]; }
c() { [ "$(gen | grep -cm"$thold" ^)" != "$thold" ]; }
d() { [ "$(gen | grep -Fcm"$thold" '')" != "$thold" ]; }
e() { gen | awk "NR >= $thold{exit 1}"; }
f() { gen | awk -F^ "NR >= $thold{exit 1}"; }
g() { gen | sed -n "$thold"q1; }
h() { mapfile -n1 -s"$((thold-1))" < <(gen); [ -z "$MAPFILE" ]; }
max=1''000''000''000
for fn in {a..h}; do
    printf '%s: ' "$fn"
    for ((thold=1''000''000; thold<=max; thold*=10)); do
        printf '%.0e=%2.1fs, ' "$thold" "$({ time -p "$fn"; } 2>&1 | grep -Eom1 '[0-9.]+')"
    done
    echo
done
In the script above, gen is a placeholder for your actual command (tsharks output lines). The functions a to h test whether its output has at least $thold lines: each one returns success (0) if the output has fewer than $thold lines and failure otherwise. You can use them like this:
a && echo "tsharks printed less than $thold lines"
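To make the return-value convention concrete, here is a minimal sketch that runs approach b on tiny inputs just below, at, and above the threshold. The helper check and the values 4, 5 and 6 are made up for illustration; gen and b are copied from the script above.

```shell
#!/usr/bin/env bash
gen() { seq "$max"; }                                   # stand-in producer, as in the benchmark
b() { [ -z "$(gen | tail -n+"$thold" | head -c1)" ]; }  # approach b from above

# Hypothetical helper: check MAX THOLD
# Prints which side of the threshold a MAX-line output falls on.
check() {
    max=$1 thold=$2
    if b; then
        echo "fewer than $thold lines"
    else
        echo "at least $thold lines"
    fi
}

out_below=$(check 4 5)   # 4 lines, threshold 5
out_equal=$(check 5 5)   # exactly 5 lines, threshold 5
out_above=$(check 6 5)   # 6 lines, threshold 5
printf '%s\n' "$out_below" "$out_equal" "$out_above"
```

Only the first call reports "fewer than 5 lines"; an output with exactly $thold lines already counts as "at least".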
Results
These are the results on my system:
a: 1e+06=0.0s, 1e+07=0.1s, 1e+08=0.8s, 1e+09=8.9s,
b: 1e+06=0.0s, 1e+07=0.1s, 1e+08=0.9s, 1e+09=8.4s,
c: 1e+06=0.0s, 1e+07=0.2s, 1e+08=1.6s, 1e+09=16.1s,
d: 1e+06=0.0s, 1e+07=0.2s, 1e+08=1.6s, 1e+09=15.7s,
e: 1e+06=0.1s, 1e+07=0.8s, 1e+08=8.2s, 1e+09=83.2s,
f: 1e+06=0.1s, 1e+07=0.8s, 1e+08=8.2s, 1e+09=84.6s,
g: 1e+06=0.0s, 1e+07=0.3s, 1e+08=3.0s, 1e+09=31.6s,
h: 1e+06=7.7s, 1e+07=90.0s, ... (manually aborted)
For example, b: ... 1e+08=0.9s ... means that approach b took 0.9 seconds to find out that the output of seq 1000000000 had at least 1e+08 (= 100'000'000) lines.
Conclusion
Of the approaches presented in this answer, b is the fastest at the largest threshold, closely followed by a. However, the actual results might differ from system to system (there are different implementations and versions of head, grep, ...) and for your actual use case. I recommend benchmarking with your actual data (that is, replace the seq in gen() with your tsharks output lines and set thold to the values you actually use).
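As a rough sketch of what such a benchmark could look like, the script below replays a captured sample from a file instead of calling seq. The temporary file, its 200000 lines, and the threshold of 100000 are assumptions for illustration only; in practice the file would hold real output of your command, or gen would call the command directly.

```shell
#!/usr/bin/env bash
# Hypothetical setup: a temp file stands in for captured output of the real command.
sample=$(mktemp)
seq 200000 > "$sample"        # replace with real captured data

gen() { cat "$sample"; }      # or run your real command here
a() { [ "$(gen | head -n"$thold" | wc -l)" != "$thold" ]; }
b() { [ -z "$(gen | tail -n+"$thold" | head -c1)" ]; }

thold=100000                  # assumed threshold; use the value you really need
for fn in a b; do
    # Same timing idiom as the benchmark above: keep only the "real" time.
    t=$({ time -p "$fn"; } 2>&1 | grep -Eom1 '[0-9.]+')
    printf '%s: thold=%d took %ss\n' "$fn" "$thold" "$t"
done
rm -f "$sample"
```

Replaying from a file keeps the producer cost constant between runs, so differences mostly reflect the counting approach. If the producer itself is slow, timing the real command may be more representative.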
If you need an even faster approach, you can experiment further with stdbuf and LC_ALL=C.
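One possible starting point for such experiments is sketched below. The names c_locale, gen_buffered and b_buffered, the 1M buffer size, and the small values of max and thold are all made up for illustration. Whether either variant is actually faster is something only a benchmark on your system can show, and stdbuf only affects programs that use default stdio buffering.

```shell
#!/usr/bin/env bash
max=1000 thold=500            # small illustrative values, not benchmark sizes
gen() { seq "$max"; }

# Variant of c that runs grep in the C locale, skipping multibyte handling.
c_locale() { [ "$(gen | LC_ALL=C grep -cm"$thold" ^)" != "$thold" ]; }

# Producer with a larger stdout buffer (GNU coreutils stdbuf, if installed).
gen_buffered() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -o1M seq "$max"
    else
        seq "$max"            # fallback: plain producer
    fi
}
# Variant of b that reads from the buffered producer.
b_buffered() { [ -z "$(gen_buffered | tail -n+"$thold" | head -c1)" ]; }

for fn in c_locale b_buffered; do
    if "$fn"; then verdict="fewer than"; else verdict="at least"; fi
    echo "$fn: $verdict $thold lines"
done
```

To compare them against the originals, add these functions to the list that the benchmark loop iterates over.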