Understanding the Diff Software

By: Sam Miller

Diff software is a file comparison utility that reports the differences between two files. It compares an original file with a modified file and displays a list of the changes made between them. Its output, often called a diff or a patch, can be applied to a copy of the original file, which is how bug fixes and new features are commonly distributed.

The operation of diff is based on solving the Longest Common Subsequence (LCS) problem: finding the longest subsequence common to all the sequences in a given set, which for diff means the lines of the two files being compared. A subsequence is a new sequence produced from the initial sequence by removing some elements without disturbing the relative order of the remaining elements. Lines that belong to the longest common subsequence are treated as unchanged, and everything else is reported as a difference.
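To make the LCS idea concrete, here is a minimal Python sketch of the textbook dynamic-programming solution for two sequences. It is an illustration only, not the more efficient algorithm that diff itself uses, and the function name lcs is chosen here for the example.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b as a list."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start, collecting the matching elements.
    result = []
    i = j = 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # skipping a[i] loses nothing
        else:
            j += 1  # skipping b[j] loses nothing
    return result


if __name__ == "__main__":
    old_lines = ["a", "b", "c"]
    new_lines = ["a", "c", "d"]
    print(lcs(old_lines, new_lines))
```

With the two short lists above, the common lines are "a" and "c". A diff built on this result would report "b" as removed and "d" as added. The table takes time and memory proportional to the product of the two lengths, which is one reason practical diff tools use more refined methods.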

The development of the diff software started during the early 1970s, and the final version was written by Douglas McIlroy. McIlroy’s design of diff was influenced by an earlier comparison program. The two shared several features, such as reporting changes line by line and using angle brackets to mark line insertions and deletions.

However, the method used by the earlier comparison program was deemed unreliable. The potential usefulness of a diff tool prompted McIlroy to research and devise a more efficient one. He collaborated with several individuals, and the research was published in a 1976 paper co-written with James Hunt, who developed an initial prototype of diff.

In the early years of diff, common uses included comparing changes in software source code and in the markup for technical documents, verifying program debugging output, comparing file system listings, and analyzing computer assembly code.

In the conventional output format, the letters ‘a’, ‘d’, and ‘c’ stand for added, deleted, and changed. The line numbers of the original file are shown before the letter and those of the revised file are shown after it. Angle brackets appear at the start of the lines that are added, deleted, or changed: ‘<’ marks a line from the original file and ‘>’ marks a line from the revised file. Added lines are inserted into the original file and appear in the new file. Deleted lines are removed from the original file and do not appear in the new file. Lines that have been moved show up as added in their new location and as deleted from their old location. By default, lines that are common to both files do not appear.
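The format described above can be sketched in a few lines of Python. This is a rough illustration, assuming Python's standard difflib module as the matching engine; the helper names format_range and normal_diff are made up for this example. Because difflib's matcher is not the algorithm diff uses, the hunks it picks may differ from real diff output when a file contains many repeated lines.

```python
import difflib


def format_range(start, end):
    """Format a 1-based inclusive line range as 'N' or 'N,M'."""
    return str(start) if start == end else f"{start},{end}"


def normal_diff(old, new):
    """Return a list of output lines in the conventional a/d/c format."""
    out = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # lines common to both files are not shown
        if tag == "insert":
            # i1 is the original line after which the new lines go.
            out.append(f"{i1}a{format_range(j1 + 1, j2)}")
        elif tag == "delete":
            # j1 is the revised line after which the lines would have been.
            out.append(f"{format_range(i1 + 1, i2)}d{j1}")
        else:  # "replace" is reported as a change
            out.append(f"{format_range(i1 + 1, i2)}c{format_range(j1 + 1, j2)}")
        out.extend("< " + line for line in old[i1:i2])
        if tag == "replace":
            out.append("---")  # separates old lines from new lines
        out.extend("> " + line for line in new[j1:j2])
    return out


if __name__ == "__main__":
    original = ["apple", "banana", "cherry"]
    revised = ["apple", "blueberry", "cherry", "date"]
    print("\n".join(normal_diff(original, revised)))
```

For the sample lists, the sketch reports one change and one addition: line 2 changes from "banana" to "blueberry" (2c2), and "date" is added after line 3 of the original as line 4 of the revised file (3a4).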

The basic function of the diff software has remained unchanged. Later developments include enhancements to the core algorithm, the addition of useful features to the command, and the design of new output formats.

If you want a software application that can compare MS Word, PowerPoint, Excel, PDF, HTML, RTF, HTM, and TXT documents, look for a diff application that supports those formats, since the classic diff utility works on plain text. Users can then identify differences in both the display and the details of the files.
