Posted by Unruh on April 26, 2006, 7:36 pm
>On Wednesday 26 April 2006 00:07, mulangi@gmail.com wrote:
>> The purpose is to detect small changes in a document. Also the original
>> document will not be available when the information that the document
>> has changed is required. And I need to answer the question as to
>> whether the changes are small or large ...
Essentially impossible without having the two documents to compare.
You could do running hashes, i.e. hash subsections of the document, to make
one long hash which contains the hashes of the subsections together with the
hash of the whole thing.
But that of course does not tell you if the change is small. A single-bit
change should completely alter the hash. I.e., a hash simply says "it
changed", not how it changed.
There are of course various error-correcting codes that could be used.
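
A rough sketch of the running-hash idea, in Python. The block size, the
choice of SHA-256, and the function names are assumptions made for
illustration, not anything specified in this thread. It stores one digest per
fixed-size block plus one digest of the whole document, then flips a single
bit to show that only the affected block's digest (and the whole-document
digest) changes.

```python
import hashlib

BLOCK_SIZE = 64  # hypothetical block size in bytes


def running_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Return (per-block digests, digest of the whole document)."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    block_digests = [hashlib.sha256(b).hexdigest() for b in blocks]
    whole_digest = hashlib.sha256(data).hexdigest()
    return block_digests, whole_digest


def changed_blocks(old_digests, new_digests):
    """Indices of blocks whose digests differ or that exist on one side only."""
    n = max(len(old_digests), len(new_digests))
    return [
        i for i in range(n)
        if i >= len(old_digests)
        or i >= len(new_digests)
        or old_digests[i] != new_digests[i]
    ]


original = bytes(range(256))      # stand-in for the stored document
tampered = bytearray(original)
tampered[100] ^= 0x01             # flip one bit inside block 1 (bytes 64..127)
tampered = bytes(tampered)

old_blocks, old_whole = running_hashes(original)
new_blocks, new_whole = running_hashes(tampered)

print(old_whole != new_whole)                   # True: the document changed
print(changed_blocks(old_blocks, new_blocks))   # [1]: only block 1 differs
```

This only localizes a change to a block. It says nothing about how much
changed inside that block, and with fixed-size blocks an inserted or deleted
byte shifts every later block, so all of their digests change too.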
>Find the difference between the original document and new document.
>Use the size of the difference, that is, the number of bytes for which
>they differ.
>There are a number of algorithms to find a small set of changes
>between documents (e.g., Unix "diff" program).
>-paul-
>--
>Paul E. Black (p.black@acm.org)
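
For the quoted suggestion of measuring the size of the difference, here is a
minimal sketch using Python's standard difflib module in place of the Unix
"diff" program. The function name diff_size and the way the size is counted
(the longer side of each non-matching region) are assumptions for
illustration, and it requires both versions of the document to be on hand.

```python
import difflib


def diff_size(old: str, new: str) -> int:
    """Count characters covered by non-matching regions between old and new."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    size = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # For a replacement, count the longer of the two sides.
            size += max(i2 - i1, j2 - j1)
    return size


print(diff_size("abcdef", "abcdef"))   # 0: identical documents
print(diff_size("abcdef", "abXdef"))   # 1: one character replaced
print(diff_size("abc", "abcXYZ"))      # 3: three characters appended
```

What counts as "small" is then a threshold you pick, for example relative to
the length of the document.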