
I have a function that writes a text file based on the settings of a rather large form.

In short, I want to compare the function's output to a single file, and only rewrite the file if the destination file differs from that output. As you might guess, this is a performance concern.

Is it doable, BTW?

The process is as follows. I fill in some forms:

  1. A single file is written to contain some "specific" selected options

  2. Some "non-specific" options do not necessarily write anything to the file.

The form can be updated at any time, so the content of the file may grow or shrink depending on the options chosen.

The file only needs to be rewritten in case #1. In case #2, nothing should be written.

This is what I tried:

if ($output !== file_get_contents($filepath)) {
  // save the data
}

But this noticeably delays execution.

I found an almost identical issue here: Can I use file_get_contents() to compare two files?, but mine is different. I am comparing the result of the process to an already existing file, which is simply the result of the previous run of the same process, and I only want to rewrite the file if they differ.

No sensitive data on the form, btw. Any hint is very much appreciated.

Thanks

swan

3 Answers


Rather than load the entire file into memory, it may be faster to read it line-by-line (fgets) and compare it to the input string also line-by-line. You could even go as small as character-by-character, but I think that's overkill.
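
As a rough sketch of the line-by-line idea: the function name `file_matches_string` is made up for illustration, and the up-front size check is an extra assumption of mine, not part of the suggestion above. It assumes the file contains newlines; a file without any would still be read in one go by `fgets`.

```php
<?php
// Hypothetical helper (not from the original post): returns true when the
// file at $filepath has exactly the same bytes as $output.
function file_matches_string(string $filepath, string $output): bool
{
    // Make sure is_file()/filesize() do not report stale cached values.
    clearstatcache(true, $filepath);

    // Cheap early exit: a missing file or a different length can never match.
    if (!is_file($filepath) || filesize($filepath) !== strlen($output)) {
        return false;
    }

    $handle = fopen($filepath, 'rb');
    if ($handle === false) {
        return false;
    }

    $offset = 0;
    $same = true;
    // fgets() returns each line including its trailing newline, so it can be
    // compared with the slice of $output at the same position.
    while (($line = fgets($handle)) !== false) {
        $length = strlen($line);
        if (substr($output, $offset, $length) !== $line) {
            $same = false; // stop at the first line that differs
            break;
        }
        $offset += $length;
    }
    fclose($handle);

    return $same && $offset === strlen($output);
}
```

With this, the check from the question would become `if (!file_matches_string($filepath, $output)) { /* save the data */ }`. Whether it actually beats `file_get_contents()` depends on how early the first difference shows up; for identical content the whole file is still read.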

Explosion Pills
  • They are, but if `filepath` is huge, it will take a long time. – Explosion Pills May 24 '12 at 16:08
  • Yes, you mean the content of filepath. It can be huge, because it aggregates the contents of other files. Most of the form options trigger copying the content of some text files into it. – swan May 24 '12 at 16:11

To compare a whole file with a string (I suppose it's a string, isn't it?), the only way is to read the whole file and do the comparison. To improve performance, you can read the file line by line and stop at the first line that differs, as Explosion Pills said before me.

If your file is really big, and you want to improve performance further, you can do some hashing stuff:

  • Generate the output, let's say $output.
  • Calculate md5($output) and store in $output_md5.
  • Compare $output_md5 with a stored one, let's say in file output.md5.
  • Are they equal?
    • If yes, do nothing.
    • If not, save $output into output.txt and $output_md5 in output.md5.
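
The steps above could be sketched roughly like this. The function name `save_if_changed` and its parameters are hypothetical, and the sketch assumes nothing else modifies `output.txt` behind your back (otherwise the stored hash goes stale).

```php
<?php
// Hypothetical helper: writes $output to $filepath only when its MD5 differs
// from the hash stored in $hashpath. Returns true if the file was written.
function save_if_changed(string $output, string $filepath, string $hashpath): bool
{
    $outputMd5 = md5($output);

    // Reading the stored hash costs 32 bytes, however large $filepath is.
    $storedMd5 = is_file($hashpath) ? trim((string) file_get_contents($hashpath)) : '';

    if ($storedMd5 === $outputMd5 && is_file($filepath)) {
        return false; // unchanged: do nothing
    }

    // LOCK_EX keeps two simultaneous form submissions from interleaving writes.
    file_put_contents($filepath, $output, LOCK_EX);
    file_put_contents($hashpath, $outputMd5, LOCK_EX);

    return true;
}
```

It would be called as `save_if_changed($output, 'output.txt', 'output.md5')`. The big file is never read for the comparison, only the small hash file.
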
lorenzo-s
  • Thanks, this sounds reasonable, like the suggested sha1_file() on the link. – swan May 24 '12 at 16:15
  • Using `sha1_file()` causes PHP to read the whole file every time you want to check it. Calculate the SHA1 or MD5 (as you wish) and **store** it, so you don't need to recalculate it every time. – lorenzo-s May 24 '12 at 16:18

You could always try a combination of what was in the other post, the sha1_file($file) function, with the sha1($string) function, and check the equality of that.
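
A minimal sketch of that check, with a made-up function name (`file_hash_matches`); treat it as an illustration only:

```php
<?php
// Hypothetical helper: true when the file's SHA-1 equals the SHA-1 of $output.
function file_hash_matches(string $filepath, string $output): bool
{
    if (!is_file($filepath)) {
        return false; // nothing to compare against yet
    }

    // sha1_file() does not need the content in a PHP string, but it still
    // has to read every byte of the file to compute the hash.
    return sha1_file($filepath) === sha1($output);
}
```

The file would then only be rewritten when `file_hash_matches($filepath, $output)` returns false.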

dpk2442
  • Using `sha1_file()` causes PHP to read the whole file every time you want to check it, because the hash needs to be calculated. That is no better than checking the file content. If you **store** the hash alongside the file and then use the stored hash for the comparison (as I said in my answer), then, ok, you are on the right track. – lorenzo-s May 24 '12 at 16:20