When you need to compare two or more text files in UNIX, the usual tool is the diff program. Included by default in almost all UNIX distributions, it compares files line by line and displays the changes between them in a number of different output formats. Although diff was originally a command-line utility, packages replicating its functionality are available for most development environments and languages, including Perl, JSP, and PHP.

And so we come to Text_Diff, a PEAR class that makes it possible to compare file contents in the PHP environment and render the output in various formats. This tutorial demonstrates the class in action, showing how you can use it to compare file contents dynamically with PHP and render the results as a Web page.

I'll assume here that you have a working Apache and PHP installation and that the PEAR Text_Diff class has been correctly installed.

Note: You can install the PEAR Text_Diff package directly from the Web, either by downloading it or by using the instructions provided.

Setting up test files

Before writing any code, we need to set up the test files used in this tutorial. These are two simple files with some deliberate differences that Text_Diff should be able to pick up. Snippet A is the first file, named data1.txt.

Snippet A

Code:
apple
banana
cantaloupe
drumstick
enchilada
fig
grape
horseradish

And Snippet B is the second file, named data2.txt. Note the three blank lines between pear and zebra; they appear in the comparison results later on.

Snippet B

Code:
apple
bat
cantaloupe
drumstick
enchilada
fig
peach
pear



zebra

Performing basic comparison

Having set up the files, let's begin with a simple illustration of how Text_Diff works. Start with the script in Snippet C.

Snippet C

PHP:
<?php
// adjust file paths as per your local configuration!
include_once "Text/Diff.php";
include_once "Text/Diff/Renderer.php";

// define files to compare
$file1 = "data1.txt";
$file2 = "data2.txt";

// perform diff, print output
$diff = new Text_Diff(file($file1), file($file2));
$renderer = new Text_Diff_Renderer();
echo $renderer->render($diff);
?>

This is fairly simple at first glance. There are two basic classes in the Text_Diff package: Text_Diff(), which performs the comparison and returns the diff output, and Text_Diff_Renderer(), which formats that output so it is easy to understand. The Text_Diff() object, in particular, must be initialized with the actual contents (and not the locations) of the two files to be compared.

The script begins by initializing these two objects, using PHP's file() function to read each file into an array of lines. The Text_Diff_Renderer() object is then used to render the output in standard diff format, producing output which should be familiar to any UNIX developer:

Code:
2c2
<banana
---
>bat
7,8c7,12
<grape
<horseradish
---
>peach
>pear
>
>
>
>zebra

Making differences easier to read

Now, the output above is not particularly easy to read unless you have a lot of experience decoding diff results. That's why Text_Diff comes with a couple of options to reformat this output into something more readable. These options are available as child classes of the Text_Diff_Renderer() class and make it possible to view comparison results in either unified or inline format. The following script (Snippet D) modifies the previous example to demonstrate unified format:

Snippet D

PHP:
<html>
<head></head>
<body>
<pre>
<?php
// adjust file paths as per your local configuration!
include_once "Text/Diff.php";
include_once "Text/Diff/Renderer.php";
include_once "Text/Diff/Renderer/unified.php";

// define files to compare
$file1 = "data1.txt";
$file2 = "data2.txt";

// perform diff, print output
$diff = new Text_Diff(file($file1), file($file2));
$renderer = new Text_Diff_Renderer_unified();
echo $renderer->render($diff);
?>
</pre>
</body>
</html>

Notice the call to the appropriate child class when initializing the renderer. And here's the output:

Code:
@@ -1,8 +1,12 @@
apple
-banana
+bat
cantaloupe
drumstick
enchilada
fig
-grape
-horseradish
+peach
+pear
+
+
+
+zebra

A quick explanation is in order here: in the unified format, the plus (+) prefix indicates added lines, the minus (-) prefix indicates deleted lines, and no prefix indicates unchanged lines. Comparing the output above with the original files, it's fairly easy to see which lines have changed and what the changes are.

Of course, it's possible to make the output even more user-friendly, and that's precisely what inline formatting tries to accomplish. In this format, strikethroughs are used to visually indicate which characters and lines have changed. Snippet E shows you how to use it.

Snippet E

PHP:
<html>
<head></head>
<body>
<pre>
<?php
// adjust file paths as per your local configuration!
include_once "Text/Diff.php";
include_once "Text/Diff/Renderer.php";
include_once "Text/Diff/Renderer/inline.php";

// define files to compare
$file1 = "data1.txt";
$file2 = "data2.txt";

// perform diff, print output
$diff = new Text_Diff(file($file1), file($file2));
$renderer = new Text_Diff_Renderer_inline();
echo $renderer->render($diff);
?>
</pre>
</body>
</html>

And here's the output:

apple
<strike>banana</strike>bat
cantaloupe
drumstick
enchilada
fig
<strike> grape</strike>
<strike>horseradishpeach</strike>
pear
zebra

And that's about it for this tutorial.
Hopefully you now have a clear idea of how Text_Diff can be used to rapidly and efficiently compare files in the PHP environment and how the output can be formatted for easy readability. Happy coding!
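
One practical note on the scripts above: the standard and unified formats contain literal < and > characters, which a browser may treat as HTML tags when the output is echoed into a Web page. Here is a minimal sketch of one way to handle that. It assumes a helper called diff_to_html(), which is a hypothetical name and not part of Text_Diff, and the sample string is copied from the standard-format output shown earlier.

```php
<?php
// Hypothetical helper, not part of Text_Diff: wrap plain-text diff output
// (standard or unified format) so that a browser displays it literally.
function diff_to_html($diffText)
{
    // Escape <, > and & so that a line such as "<banana" is shown as text
    // instead of being parsed as an HTML tag.
    return '<pre>' . htmlspecialchars($diffText, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Sample text taken from the standard-format output in this tutorial.
$sample = "2c2\n<banana\n---\n>bat\n";
echo diff_to_html($sample);
```

Assuming this helper is defined, the last line of Snippet C could become echo diff_to_html($renderer->render($diff)); instead. The inline format is a different case: its output is meant to contain markup, so escaping it this way would display the tags instead of applying them.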