1

I'd like to use standard utilities, rather than compiled code, to efficiently fill a file with a constant value repeated N times. In the simple version the value is a single byte; in the more complex version it is an arbitrary string.

Note:

  • If the value is 0, we have dd if=/dev/zero of=/path/to/file bs=XXX count=YYY for appropriate values of XXX and YYY.
  • We can obviously do this with echo $value >> file in a loop, but that would be very slow.
einpoklum
  • Does this answer https://stackoverflow.com/a/5349842/7411306 help? – dmadic Apr 28 '18 at 19:56
  • Why do you need such a file? It would probably be even *more* efficient to generate such data in the program that needs to read from it. – chepner Apr 28 '18 at 19:58
  • @chepner: Suppose I don't control the program. – einpoklum Apr 28 '18 at 20:11
  • Do you want the `yes` program? I'm having trouble imagining what program needs a source of repeated values, other than to autorespond to a series of prompts. – chepner Apr 28 '18 at 20:12
  • @chepner: Hmm, that actually does cover some use-cases, see below. – einpoklum Apr 28 '18 at 20:24
  • Rather than opening, appending to, and closing a file at each iteration of your loop with `>>`, you can simply direct the entire output of the loop in one go... `for ((i=0;i<100;i++)) ; do echo "abc"; done > file` – Mark Setchell Apr 28 '18 at 22:53

2 Answers

2

Let's take

value_to_replicate="whatever"
file_length=65536
output_file="/path/to/file"

for example.

Based on @dmadic's suggestion, we can do:

bash -c "printf '${value_to_replicate}%.0s' {1..${file_length}} > '${output_file}'"

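(The bash -c wrapper is needed because brace expansion happens before parameter expansion, so a bare {1..${file_length}} is never expanded into a range.) This takes about 0.6 seconds per MB on my system. Note that the value is written file_length times, so the resulting file is file_length times the length of the value, in bytes.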
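As a rough sketch of the same printf trick without the bash -c wrapper, the range can come from seq, which is expanded at run time. The helper name repeat_string is hypothetical, and the sketch assumes the value contains no % or backslash characters (printf would interpret them) and that the count is at least 1:

```shell
# Hypothetical helper: print "$1" repeated "$2" times, with no separator.
# Assumes $1 contains no '%' or '\' and that $2 >= 1.
repeat_string() {
    local value=$1 count=$2
    # '%.0s' consumes one argument and prints nothing, so printf emits
    # the format string once per number produced by seq.
    # $(seq ...) is deliberately unquoted so each number is its own argument.
    printf "${value}%.0s" $(seq "$count")
}
```

Usage would then look like repeat_string "$value_to_replicate" "$file_length" > "$output_file". For large counts the argument list can become very long, so this is probably best suited to modest sizes.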

If every copy of the string should be followed by a newline, things become simpler, since yes appends one to each line it prints:

yes "$value_to_replicate" | head -n "$file_length" > "$output_file"

And if the string contains no newlines and none are wanted in the output, we can strip the ones that yes adds:

yes "$value_to_replicate" | head -n "$file_length" | tr -d '\n' > "$output_file"
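If what matters is the exact size of the file in bytes rather than the number of repetitions, one possible variant is to cut the stream with head -c instead of counting lines. This is only a sketch: the helper name fill_bytes is hypothetical, head -c is not in POSIX (though GNU and BSD head both provide it), and the last copy of the string is truncated when the byte count is not a multiple of its length:

```shell
# Hypothetical helper: emit exactly "$2" bytes made of "$1" repeated.
# Assumes $1 is non-empty and contains no newlines of its own.
fill_bytes() {
    local value=$1 nbytes=$2
    # yes repeats the value forever, tr removes the newline yes appends,
    # and head -c stops the pipeline after nbytes bytes.
    yes "$value" | tr -d '\n' | head -c "$nbytes"
}
```

For example, fill_bytes "$value_to_replicate" 65536 > "$output_file" should produce a file of exactly 65536 bytes.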
einpoklum
0

You could use Perl as follows:

time perl -e 'print "Hello"x1000000000' > /dev/null

real    0m2.930s
user    0m1.778s
sys     0m1.151s

So, 5GB in 3 seconds.

Mark Setchell
  • You need to time it when printing into a real file. Also, please mention that a newline is not appended. – einpoklum Apr 28 '18 at 21:02
  • If I time it when printing to a real file, it will be affected by the speed of the disk, whereas I am showing the true sustained bandwidth achieved by the command. If you want linefeeds, just put them in your string! `print "Hello\n"` - you said **arbitrary**, not constrained by having to end in linefeeds. – Mark Setchell Apr 28 '18 at 21:05