
Is there a command-line utility, or a PHP or Python script, that generates an HTML diff across multiple files? I need to compare 4 or more files.

Each file has at most 10k lines.

Note: these are plain text files, not HTML. They contain only the characters A-Za-z0-9=., and no HTML tags.

Sairam
  • http://stackoverflow.com/questions/86905/suggestions-on-how-build-an-html-diff-tool – ArK Nov 11 '10 at 06:23
  • That question was about comparing two HTML files. I am comparing plain text files here. – Sairam Nov 11 '10 at 06:30
  • Good question, but won't the HTML output be rather unreadable? Comparing two files at a time is less confusing. A simple way: `diff A B > /tmp/diff.a.b; diff A C > /tmp/diff.a.c; diff A D > /tmp/diff.a.d; cat /tmp/diff.a.*;`. You can beautify the result with HTML later. It may not be exactly what you want, just something to try. – ajreal Nov 11 '10 at 06:56

1 Answer


It depends on what type of data you're comparing or analyzing.

The basic solution is:

  • file_get_contents() returns a file's contents as a string
  • strcmp() does a "binary-safe compare" of two strings and returns 0 when they are equal

You will probably want to split the data on a delimiter with explode() (for example, on newlines) and compare it section by section.
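Since the question also allows a Python script, here is a minimal sketch of the same steps in Python rather than PHP: read each file, split it into lines, and report the line numbers that differ. The function names `read_lines` and `differing_lines` are made up for illustration, and the comparison is purely positional, which is an assumption about what "compare sections" means here.

```python
from itertools import zip_longest
from pathlib import Path


def read_lines(path):
    """Return the file's contents as a list of lines, without newline characters."""
    return Path(path).read_text().splitlines()


def differing_lines(lines_a, lines_b):
    """Return the 1-based line numbers at which the two line lists differ.

    Lines are compared position by position; if one list is shorter,
    its missing lines count as differences.
    """
    return [
        number
        for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    # In-memory sample data; with real files use read_lines("A") and so on.
    base = ["a=1", "b=2", "c=3"]
    other = ["a=1", "b=9", "c=3", "d=4"]
    print(differing_lines(base, other))
```

Because the comparison is positional, one inserted line makes every later line look different. A real diff algorithm handles that case better.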

Another option is to split the data, loop through it, and compute a "comparison coefficient" (cc) that indicates how closely each file matches a reference file. For example, if file 1 has cc=3 and file 4 has cc=8, file 4 would be the closer match.
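The answer does not say how the coefficient is computed, so the following Python sketch simply assumes one possible definition: the similarity ratio from the standard library's `difflib.SequenceMatcher`, where 1.0 means identical and a higher value means a closer match. The names `comparison_coefficient` and `rank_by_similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def comparison_coefficient(reference_lines, candidate_lines):
    """Similarity in [0, 1] between two lists of lines; 1.0 means identical.

    Assumption: SequenceMatcher.ratio() stands in for the "coefficient".
    """
    matcher = SequenceMatcher(None, reference_lines, candidate_lines, autojunk=False)
    return matcher.ratio()


def rank_by_similarity(reference_lines, candidates):
    """Return (name, coefficient) pairs, closest match first.

    `candidates` maps a file name to that file's list of lines.
    """
    scores = {
        name: comparison_coefficient(reference_lines, lines)
        for name, lines in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    reference = ["a", "b", "c", "d"]
    candidates = {
        "file1": ["a", "x", "y", "z"],
        "file4": ["a", "b", "c", "z"],
    }
    for name, score in rank_by_similarity(reference, candidates):
        print(name, score)
```

`autojunk=False` turns off the heuristic that treats very frequent lines as junk, which could otherwise skew the score on files with many repeated lines.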

One more problem you may run into is PHP's memory limit on the server. You can raise it with the memory_limit setting in php.ini.

//EDIT

Just noticed the diff tag, but I'll leave this up anyway in case it helps somehow.
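For the diff route itself, one option is Python's standard `difflib.HtmlDiff`, which renders a side-by-side HTML comparison of two lists of lines. It only handles two inputs at a time, so this sketch assumes the first file is the base and diffs it against each of the others, as suggested in the comments. The function names and the `diff_N.html` output names are made up for illustration.

```python
from difflib import HtmlDiff
from pathlib import Path


def html_diff_pages(base_name, base_lines, others):
    """Return {name: html_page}, one side-by-side diff page per (base, other) pair.

    `others` maps a file name to that file's list of lines.
    """
    differ = HtmlDiff(wrapcolumn=80)  # wrap long lines so the table stays readable
    return {
        name: differ.make_file(base_lines, lines, fromdesc=base_name, todesc=name)
        for name, lines in others.items()
    }


def write_html_diffs(paths, out_dir="."):
    """Diff the first file in `paths` against each of the rest; write one HTML file per pair."""
    base, *rest = [str(p) for p in paths]
    base_lines = Path(base).read_text().splitlines()
    others = {p: Path(p).read_text().splitlines() for p in rest}
    written = []
    pages = html_diff_pages(base, base_lines, others)
    for index, (name, page) in enumerate(pages.items(), start=1):
        out_path = Path(out_dir) / f"diff_{index}.html"
        out_path.write_text(page, encoding="utf-8")
        written.append(out_path)
    return written
```

Calling `write_html_diffs(["A", "B", "C", "D"])` would write `diff_1.html` to `diff_3.html`, each a complete page that opens in a browser. This gives pairwise views against one base file, not a single merged N-way diff.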

Ben